To do

Next Up

Decide: website vs. web app (hosted on Heroku) - a RESTful Django website (Python) vs. a RESTful Flask API (Python)

0.0) To mine extra info about all hackathon main pages, use the tags in the page <head>. Search topics: which tags help with SEO, HTML5 documentation conventions for <head> tags, HTML5 <head> coding standards.
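
As a starting point for the <head>-mining idea, here is a minimal sketch using the third-party `requests` and `beautifulsoup4` libraries. The URL is a placeholder, and which tags are worth keeping (title, meta description, Open Graph tags) is an assumption about what hackathon pages actually expose.

```python
# Minimal sketch: pull SEO-relevant tags from a page's <head>.
# Assumes `requests` and `beautifulsoup4` are installed; the URL is a placeholder.
import requests
from bs4 import BeautifulSoup

def mine_head_tags(url):
    html = requests.get(url, timeout=10).text
    head = BeautifulSoup(html, "html.parser").head
    info = {"title": head.title.string if head.title else None}
    # <meta name="..."> and <meta property="..."> entries (description, og:*, ...)
    # often carry the event name, date, and location in machine-readable form.
    for meta in head.find_all("meta"):
        key = meta.get("name") or meta.get("property")
        if key:
            info[key] = meta.get("content")
    return info

if __name__ == "__main__":
    print(mine_head_tags("https://example.com/hackathon"))  # placeholder URL
```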

0.1) Make a log file to store the date and time when the crawler/scraper was last run (a sketch follows after the list below)

  1. Use a cron job (a Linux daemon) to scrape once every 5th day, or whatever interval suits better (see the crontab sketch after this list).

  2. Save a text-file version every time the crawler runs, e.g. V2018_02_12.txt for 12th Feb. Implement versioning and log files for the DB to track changes etc.; implement this concept at a basic level (see the snapshot sketch after this list).

  3. Batch file, config file, etc. Goal: run the scrawl.py file once per day after the laptop boots or resumes from sleep [executing the script once per day without manual intervention] (see the anacron sketch after this list).

  4. Finally, implement repeated crawling (an IR concept): refresh the database every 24 hours once the project is hosted on the cloud. The script will then reside on the cloud (see the scheduler sketch below).
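
For 0.1, a minimal sketch of the last-run log file; the `last_run.log` path and the ISO timestamp format are assumptions.

```python
# Minimal sketch for 0.1: record when the crawler/scraper last ran.
# The log path and timestamp format are assumptions.
from datetime import datetime
from pathlib import Path

LOG_FILE = Path("last_run.log")

def record_run():
    # Append one ISO-formatted timestamp per crawl.
    with LOG_FILE.open("a") as f:
        f.write(datetime.now().isoformat() + "\n")

def last_run():
    # Return the most recent run time, or None if the crawler never ran.
    if not LOG_FILE.exists():
        return None
    lines = LOG_FILE.read_text().splitlines()
    return datetime.fromisoformat(lines[-1]) if lines else None
```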
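
For item 1, a possible crontab entry (installed with `crontab -e`). The interpreter and script paths are placeholders, and "every 5th day" is approximated with a day-of-month step.

```cron
# Run the scraper at 03:00 on every 5th day of the month (1st, 6th, 11th, ...).
# Note: */5 on day-of-month resets at month boundaries, so the gap near the
# end of a month can be shorter than 5 days.
0 3 */5 * * /usr/bin/python3 /home/user/hackathons-near-me/scrawl.py >> /home/user/scrawl.log 2>&1
```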
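
For item 2, a sketch of the dated snapshot naming scheme from the note (V2018_02_12.txt); what the crawler returns is assumed here to be an iterable of strings, one per hackathon.

```python
# Minimal sketch for item 2: dump each crawl to a dated version file,
# mirroring the V2018_02_12.txt naming from the note above.
from datetime import date

def save_snapshot(records):
    # `records` is assumed to be an iterable of strings (one per hackathon).
    filename = date.today().strftime("V%Y_%m_%d.txt")
    with open(filename, "w") as f:
        f.write("\n".join(records))
    return filename
```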
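
For item 3, plain cron misses jobs while the laptop is off or asleep; anacron catches up missed daily jobs on the next boot, which is closer to the goal here (it does not fire on resume from sleep, so that part remains an open question). A possible /etc/anacrontab line (fields: period in days, delay in minutes, job identifier, command); paths are placeholders.

```
# period  delay  job-identifier  command
1         10     scrawl-daily    /usr/bin/python3 /home/user/hackathons-near-me/scrawl.py
```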
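
For item 4, once the script resides on the cloud, a long-running scheduler process can refresh the database every 24 hours. A sketch using the third-party APScheduler library, one common choice for this on Heroku; `crawl_and_refresh_db()` is a hypothetical stand-in for the real crawl-and-update logic.

```python
# Minimal sketch for item 4: re-crawl and refresh the DB every 24 hours.
# Requires the third-party `apscheduler` package; crawl_and_refresh_db()
# is a hypothetical placeholder for the real crawl + DB update.
from apscheduler.schedulers.blocking import BlockingScheduler

def crawl_and_refresh_db():
    print("Re-crawling and refreshing the hackathon database...")

sched = BlockingScheduler()
sched.add_job(crawl_and_refresh_db, "interval", hours=24)
sched.start()  # blocks; run this as its own worker process
```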