scrapy_pymysql.err.IntegrityError: (1062, "Duplicate entry '1' for key 'PRIMARY'")
2019-01-12 15:28
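Problem description:
python:3.6
ubuntu:5.4.0-6ubuntu1~16.04.4
I am using Scrapy as the framework and pymysql to save the scraped data to MySQL running in a virtual machine. Scraping works fine, but the insert fails with the following error: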
[code]Traceback (most recent call last):
  File "e:\anaconda3\lib\site-packages\twisted\internet\defer.py", line 654, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "E:\Scrapy\Jianshu\Jianshu\pipelines.py", line 36, in process_item
    self.update(item)
  File "E:\Scrapy\Jianshu\Jianshu\pipelines.py", line 31, in update
    self.cursor.execute(update_time)
  File "e:\anaconda3\lib\site-packages\pymysql\cursors.py", line 170, in execute
    result = self._query(query)
  File "e:\anaconda3\lib\site-packages\pymysql\cursors.py", line 328, in _query
    conn.query(q)
  File "e:\anaconda3\lib\site-packages\pymysql\connections.py", line 516, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "e:\anaconda3\lib\site-packages\pymysql\connections.py", line 727, in _read_query_result
    result.read()
  File "e:\anaconda3\lib\site-packages\pymysql\connections.py", line 1066, in read
    first_packet = self.connection._read_packet()
  File "e:\anaconda3\lib\site-packages\pymysql\connections.py", line 683, in _read_packet
    packet.check_error()
  File "e:\anaconda3\lib\site-packages\pymysql\protocol.py", line 220, in check_error
    err.raise_mysql_exception(self._data)
  File "e:\anaconda3\lib\site-packages\pymysql\err.py", line 109, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.IntegrityError: (1062, "Duplicate entry '1' for key 'PRIMARY'")
Table definition:
Code:
[code]# -*- coding: utf-8 -*-

# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: https://doc.scrapy.org/en/latest/topics/item-pipeline.html

import pymysql


class XXXXPipeline(object):
    def __init__(self):
        self.conn = pymysql.connect("192.168.0.124", "root", "123456", "a")
        self.cursor = self.conn.cursor()
        self._sql = None
        # Hand-maintained primary key value; it restarts at 1 on every run.
        self.num = 1

    @property
    def get_sql(self):
        # Build the INSERT template once and cache it.
        if not self._sql:
            self._sql = """insert into XXXX_save values({},'{}','{}','{}','{}','{}','{}')"""
        return self._sql

    def update(self, item):
        print("item['content']:")
        print(repr(item['content']))
        # Single quotes are stripped from the content so the formatted SQL stays valid.
        update_time = self.get_sql.format(self.num, item['title'], item['author'],
                                          item['author_img'], item['artical_id'],
                                          item['pub_time'],
                                          item['content'].replace("'", ""))
        print("update_time:", update_time)
        self.cursor.execute(update_time)
        self.conn.commit()
        self.num += 1

    def process_item(self, item, spider):
        self.update(item)
        return item
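Root cause analysis:
The first run of the spider inserted rows into MySQL in the virtual machine. When the program is stopped and then run again, MySQL still holds the rows from the previous crawl. The ID column is the primary key, but it was not declared as auto-increment when the table was created. Instead, the crawler code supplies the value, starting from 1 on every run (this is a flaw that will be fixed later). As a result, the first insert of the second run reuses ID 1 and MySQL rejects it with error 1062. With the code as it stands, the table has to be emptied before each run of the spider.
After emptying the table, run the spider again.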
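The failure mode described above can be reproduced without a MySQL server. The following is a minimal sketch that swaps in the standard-library `sqlite3` module as a stand-in for pymysql and MySQL, so the exception class is `sqlite3.IntegrityError` rather than `pymysql.err.IntegrityError`. The table name `demo_save`, its two columns, and the helper `run_spider_once` are made up for illustration.

```python
import sqlite3


def run_spider_once(conn, titles):
    """Mimic one crawl: the ID counter restarts at 1 on every run."""
    num = 1  # plays the role of self.num in the pipeline
    for title in titles:
        conn.execute("INSERT INTO demo_save VALUES (?, ?)", (num, title))
        num += 1
    conn.commit()


# Hypothetical table: an integer primary key supplied by the caller.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_save (id INTEGER PRIMARY KEY, title TEXT)")

run_spider_once(conn, ["first", "second"])  # first run succeeds

try:
    run_spider_once(conn, ["first", "second"])  # second run reuses id 1
    second_run_failed = False
except sqlite3.IntegrityError as exc:
    conn.rollback()
    second_run_failed = True
    print("second run failed:", exc)

# Workaround from the text: empty the table, then run again.
conn.execute("DELETE FROM demo_save")
conn.commit()
run_spider_once(conn, ["first", "second"])
row_count = conn.execute("SELECT COUNT(*) FROM demo_save").fetchone()[0]
print("rows after clearing and re-running:", row_count)
```

The second call fails on its very first insert because ID 1 is already present, which matches the `Duplicate entry '1'` in the traceback.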
The run succeeded and the data was inserted into the database.
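Emptying the table before every run discards earlier data, and the write-up notes that the hand-maintained ID is a flaw to be fixed. One possible fix, sketched here under assumptions, is to let MySQL generate the ID and to pass values as query parameters instead of formatting them into the SQL string. The sketch assumes the primary key column has been changed to `AUTO_INCREMENT` (for example with `ALTER TABLE XXXX_save MODIFY id INT NOT NULL AUTO_INCREMENT;`, where the column name `id` and type `INT` are guesses, since the table definition is not shown). The names `build_insert` and `AutoIdPipeline` are hypothetical.

```python
def build_insert(item):
    """Return (sql, params) for one item; needs no database connection."""
    # NULL in the first position asks MySQL to generate the next
    # AUTO_INCREMENT value (assumes the ID column is the first column
    # and has been declared AUTO_INCREMENT).
    sql = "INSERT INTO XXXX_save VALUES (NULL, %s, %s, %s, %s, %s, %s)"
    params = (
        item["title"],
        item["author"],
        item["author_img"],
        item["artical_id"],
        item["pub_time"],
        item["content"],  # no manual quote stripping; the driver escapes it
    )
    return sql, params


class AutoIdPipeline:
    """Hypothetical replacement pipeline that does not track IDs itself."""

    def open_spider(self, spider):
        import pymysql  # imported here so build_insert works without it

        # Connection details copied from the original code.
        self.conn = pymysql.connect(host="192.168.0.124", user="root",
                                    password="123456", database="a",
                                    charset="utf8mb4")
        self.cursor = self.conn.cursor()

    def process_item(self, item, spider):
        sql, params = build_insert(item)
        self.cursor.execute(sql, params)  # parameters are escaped by pymysql
        self.conn.commit()
        return item

    def close_spider(self, spider):
        self.cursor.close()
        self.conn.close()
```

With this approach, repeated runs append new rows instead of colliding on ID 1. Whether duplicates of the same article should be stored twice is a separate question that this sketch does not address.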