Getting Started - GonzaloUlla/unlp-dbd-newsler GitHub Wiki
Welcome to the Newsler wiki!
Getting Started
- Install Python 3 (you can get it here: https://www.python.org/downloads/release/python-377)
- Check your environment variables and make sure the Python install dir (where python.exe lives) and its Scripts dir are both on your PATH
- Download the project
- Open a PowerShell/CMD window in the project root
- Install Scrapy and the other requirements using pip:
python -m pip install -r requirements.txt
- Create the Newsler Scrapy project:
python -m scrapy startproject newsler
- Copy the spiders from the scrapy_prototype folder to the spiders folder in the Newsler project (cp works in PowerShell; in CMD use copy instead):
cp .\scrapy_prototype\* .\newsler\newsler\spiders
- Run the Newsler spiders from inside the newsler folder created above (the one that contains scrapy.cfg):
cd newsler
python -m scrapy crawl TheGuardian
Also try: CNN, AlJazeera, DW, FoxNews
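
To run all of the spiders listed above in one go, something like the following Python sketch may help. It is not part of the Newsler repository: the function names and the SPIDERS list are made up for illustration, and it assumes the spider names above match each spider's name attribute and that the Scrapy project lives in the newsler folder created by startproject.

```python
import subprocess
import sys

# Spider names as listed in this guide (assumed to match each spider's `name`).
SPIDERS = ["TheGuardian", "CNN", "AlJazeera", "DW", "FoxNews"]


def build_crawl_command(spider_name):
    """Return the argument list equivalent to `python -m scrapy crawl <spider_name>`."""
    # sys.executable is the interpreter running this script, so Scrapy is
    # looked up in the same environment where the requirements were installed.
    return [sys.executable, "-m", "scrapy", "crawl", spider_name]


def run_spiders(spider_names, project_dir="newsler"):
    """Run each spider in turn and return a {name: exit_code} mapping."""
    results = {}
    for name in spider_names:
        # cwd must be the folder that contains scrapy.cfg, otherwise
        # `scrapy crawl` cannot find the project.
        completed = subprocess.run(build_crawl_command(name), cwd=project_dir)
        results[name] = completed.returncode
    return results
```

Saved in the project root (for example as a hypothetical run_all.py), calling run_spiders(SPIDERS) runs the spiders one after another and returns the exit code of each crawl process.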