[Forum question] Scrapy: UnicodeDecodeError encoding problem when using CrawlSpider?

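Code:
# -*- coding:utf-8 -*-
import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule
from movieheaven.items import MovieheavenItem


class MySpider(CrawlSpider):
    name = 'movie'
    allowed_domains = ['dytt8.net']
    start_urls = ['http://www.dytt8.net/html/gndy/dyzz/index.html']

    # Follow the paginated listing pages (list_23_1.html, list_23_2.html, ...).
    # A raw string keeps \d and \. as regex escapes rather than string escapes.
    page_lx = LinkExtractor(allow=(r'list_23_\d+\.html',))
    rules = [
        Rule(page_lx, callback='myparse', follow=True)
    ]

    def myparse(self, response):
        # Parsing is not implemented yet; the error occurs before this point matters.
        pass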

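Problem:
As far as I can tell, the issue is that when crawling this site the returned response has the encoding 'gb18030', which leads to a UnicodeDecodeError. When testing in scrapy shell, calling response = response.replace(encoding='utf-8') made the error go away. How should this be handled inside the project itself?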
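One possible way to apply the same response.replace(encoding=...) call to every page in a project is a small downloader middleware, so that CrawlSpider's link extraction already sees the re-encoded response. The following is only a sketch under assumptions: the class name ForceEncodingMiddleware, the setting name FORCED_RESPONSE_ENCODING, and the module path movieheaven.middlewares are hypothetical and not part of the original post, and 'utf-8' is used only because it is the value reported to work in scrapy shell.

```python
class ForceEncodingMiddleware:
    """Hypothetical downloader middleware that forces one encoding on text responses."""

    def __init__(self, encoding="utf-8"):
        self.encoding = encoding

    @classmethod
    def from_crawler(cls, crawler):
        # FORCED_RESPONSE_ENCODING is an assumed, project-specific setting name.
        return cls(crawler.settings.get("FORCED_RESPONSE_ENCODING", "utf-8"))

    def process_response(self, request, response, spider):
        # Only text responses expose .encoding; anything else passes through untouched.
        current = getattr(response, "encoding", None)
        if current is not None and current != self.encoding:
            # Same call as in scrapy shell: returns a copy with the new encoding.
            return response.replace(encoding=self.encoding)
        return response


# In settings.py (module path and priority 543 are assumptions):
# DOWNLOADER_MIDDLEWARES = {
#     'movieheaven.middlewares.ForceEncodingMiddleware': 543,
# }
```

With the middleware enabled, the spider code itself should not need to change. Whether 'utf-8' or another value is the right one for this site is worth verifying against the page's actual bytes before relying on it.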

EvanXue · EvanXue



1 reply