Home - nrbrase/WebScraper GitHub Wiki
Welcome to the WebScraper Wiki!
We are building a web scraper that collects data from a website and stores it in a text file. Users will be able to point the scraper at a website and extract information from it.
1. Project standards: MVP
Using the Java library jsoup, we will scrape the website of a pharmacy company (CVS) for one state to retrieve each store's location, store number, and phone number. This information will be used for updating databases as needed.
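As a rough illustration of the MVP, the sketch below parses store entries out of an HTML page with jsoup. It assumes jsoup is on the classpath and a JDK recent enough for records and text blocks. The `StoreParser` class, the `Store` record, and the CSS selectors (`div.store`, `.address`, `.store-number`, `.phone`) are hypothetical placeholders, not the real CVS markup, so they would need to be replaced after inspecting the actual pages.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

final class StoreParser {

    /** One scraped store: address, store number, and phone number. */
    record Store(String address, String storeNumber, String phone) {}

    /**
     * Extracts stores from the HTML of a store-locator page.
     * The selectors are assumptions made for this sketch, not CVS's real markup.
     */
    static List<Store> parseStores(String html) {
        Document doc = Jsoup.parse(html);
        List<Store> stores = new ArrayList<>();
        for (Element card : doc.select("div.store")) {
            stores.add(new Store(
                    textOf(card, ".address"),
                    textOf(card, ".store-number"),
                    textOf(card, ".phone")));
        }
        return stores;
    }

    /** Returns the text of the first match, or an empty string if nothing matches. */
    private static String textOf(Element parent, String cssQuery) {
        Element match = parent.selectFirst(cssQuery);
        return match == null ? "" : match.text();
    }

    public static void main(String[] args) {
        // Made-up sample page so the sketch runs without network access.
        String sample = """
                <div class="store">
                  <span class="address">1 Example St, Springfield</span>
                  <span class="store-number">1234</span>
                  <span class="phone">555-0100</span>
                </div>
                """;
        parseStores(sample).forEach(System.out::println);
    }
}
```

For a live page, the HTML string could be replaced by a document fetched with `Jsoup.connect(url).get()`, with the same selector logic applied to the result.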
1.1 Feature 1: Parallel running
For the chosen company (CVS), we plan to run the data scraping in parallel. This will shorten the run time so that results are available in a usable time frame.
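One possible way to run the scrape in parallel is a fixed-size thread pool that handles one page per task. This is a minimal sketch rather than the project's actual design: the `ParallelScrape` class, the `scrapeAll` method, and the placeholder URLs are hypothetical, and `String::length` stands in for a real fetch-and-parse function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

final class ParallelScrape {

    /**
     * Applies scrapeOne to every URL on a fixed-size thread pool and
     * returns the results in the same order as the input list.
     */
    static <T> List<T> scrapeAll(List<String> urls, Function<String, T> scrapeOne, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit every page first so the tasks can run concurrently.
            List<CompletableFuture<T>> pending = new ArrayList<>();
            for (String url : urls) {
                pending.add(CompletableFuture.supplyAsync(() -> scrapeOne.apply(url), pool));
            }
            // Then collect the results, waiting for each task in input order.
            List<T> results = new ArrayList<>();
            for (CompletableFuture<T> task : pending) {
                results.add(task.join());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Placeholder URLs and a stand-in task; a real task would fetch and parse each page.
        List<String> urls = List.of("https://example.com/a", "https://example.com/b");
        System.out.println(scrapeAll(urls, String::length, 4));
    }
}
```

A bounded pool keeps the number of simultaneous requests under control, which matters when many pages are requested from the same site.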
1.2 Feature 2: All Locations
Using one pharmacy (CVS), we will scrape every state and gather all of the location information: locations, store numbers, and phone numbers.
1.3 Feature 3: Progress bar
For the selected state, we will show a progress bar while each location is printed. This will be an indeterminate progress bar, shown to indicate that scraping is under way.
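The sketch below shows one way an indeterminate progress bar could be wired up, assuming the GUI is built with Swing (the toolkit is not stated here, so this is an assumption). The `ScrapeProgress` class and the three-second sleep are hypothetical stand-ins for the real scraping task.

```java
import javax.swing.JFrame;
import javax.swing.JProgressBar;
import javax.swing.SwingUtilities;
import javax.swing.SwingWorker;

final class ScrapeProgress {

    /** Builds a bar that animates without showing a percentage. */
    static JProgressBar createIndeterminateBar() {
        JProgressBar bar = new JProgressBar();
        bar.setIndeterminate(true);
        bar.setStringPainted(true);
        bar.setString("Scraping...");
        return bar;
    }

    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JProgressBar bar = createIndeterminateBar();
            JFrame frame = new JFrame("WebScraper");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.add(bar);
            frame.pack();
            frame.setVisible(true);

            // Do the work off the event dispatch thread so the bar keeps animating.
            new SwingWorker<Void, Void>() {
                @Override
                protected Void doInBackground() throws Exception {
                    Thread.sleep(3000); // placeholder for the real scraping work
                    return null;
                }

                @Override
                protected void done() {
                    // Switch to a filled, determinate bar once the work has finished.
                    bar.setIndeterminate(false);
                    bar.setValue(bar.getMaximum());
                    bar.setString("Done");
                }
            }.execute();
        });
    }
}
```

Running the scrape inside a `SwingWorker` keeps the window responsive; doing the same work directly on the event dispatch thread would freeze the bar until scraping finished.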
Roles (subject to change at the company's/group's discretion):
Kim: Scraping, GUI, YouTube uploads
- Scrape CVS for the store information given
- Set up basic GUI
- YouTube uploads for each section of the project
Nick: Scraping, parallel running, dropdown menu
- Scrape CVS for the store information given
- Run the scrape in parallel for a faster, more usable run time
- Create a dropdown menu to select the state of choice
Stephanie: Landing page, progress bar in GUI, general documentation
- Create the landing page per the requirements on the course website
- Add an indeterminate progress bar to the main GUI so users know that it is making progress
- Update general documentation, README, and how-tos