scrapyDownPic - juedaiyuer/researchNote GitHub Wiki
Image Download Pipeline
From the command line, change into the directory where you want to keep the project, then create it:
$ scrapy startproject mzitu_scrapy
New Scrapy project 'mzitu_scrapy', using template directory '/home/juedaiyuer/anaconda3/envs/py2crawler/lib/python2.7/site-packages/scrapy/templates/project', created in:
    /home/juedaiyuer/opensource/mycode/pythonPro/mzitu_scrapy

You can start your first spider with:
    cd mzitu_scrapy
    scrapy genspider example example.com
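First, define the fields we need in items.py. What do we need? The name of each image set and the URLs of its images.
import scrapy


class MzituScrapyItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    name = scrapy.Field()        # name of the image set
    image_urls = scrapy.Field()  # URLs of the images in the set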
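The field name `image_urls` is the one Scrapy's built-in `ImagesPipeline` reads by default, so these items can be handed to that pipeline without any extra mapping. Below is a minimal `settings.py` sketch, assuming the stock pipeline is used unchanged. The priority `1` and the storage directory `images` are placeholder values, not taken from the original project.

```python
# settings.py (sketch): enable Scrapy's built-in image pipeline.
# Assumption: the stock ImagesPipeline is used as-is; it requires Pillow.
ITEM_PIPELINES = {
    "scrapy.pipelines.images.ImagesPipeline": 1,
}

# Hypothetical output directory; downloaded images are stored under it.
IMAGES_STORE = "images"
```

With these settings, every URL listed in an item's `image_urls` is downloaded and saved under `IMAGES_STORE`.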
spider.py
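First, import the packages we need:
from scrapy import Request
# CrawlSpider and Rule live in scrapy.spiders; the old scrapy.spider module is deprecated.
from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor
from mzitu_scrapy.items import MzituScrapyItem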
error
SyntaxError: Non-ASCII character '\xe7' in file
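This is a source-encoding problem: the file contains non-ASCII characters (here, the Chinese text) but declares no encoding, and Python 2 assumes ASCII by default. To fix it, add the following declaration at the very top of the file (it must be on the first or second line):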
#-*- coding: UTF-8 -*-
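To see how the interpreter picks up this declaration, the standard library's `tokenize.detect_encoding` can be run on a few byte strings. This is a small sketch for Python 3, not the Python 2.7 environment used above. The helper name `detected_encoding` and the sample sources are made up for illustration. Note that Python 3 falls back to UTF-8 when no declaration is found, whereas Python 2 falls back to ASCII, which is what triggers the `SyntaxError` shown above.

```python
import io
import tokenize


def detected_encoding(source_bytes):
    """Return the source encoding Python's tokenizer detects for these bytes."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# Declaration on the first line, exactly as in the fix above.
first_line = b"#-*- coding: UTF-8 -*-\nname = u'\xe5\xa5\x97\xe5\x9b\xbe'\n"

# Declaration on the second line, after a shebang: still honoured.
second_line = b"#!/usr/bin/env python\n# -*- coding: latin-1 -*-\nname = u'x'\n"

# Declaration on the third line, after code: too late, so it is ignored
# and the tokenizer falls back to its default (UTF-8 on Python 3).
third_line = b"import os\n\n# -*- coding: latin-1 -*-\nname = u'x'\n"

print(detected_encoding(first_line))   # utf-8
print(detected_encoding(second_line))  # iso-8859-1
print(detected_encoding(third_line))   # utf-8
```

The third case shows why the declaration belongs at the top of the file: placed any later, it is treated as an ordinary comment.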