MySQL: writing data without duplicates (inser…
2016-03-04 10:00
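Recently, while processing some data, I ran into a problem. With an auto-increment id as the primary key, large numbers of duplicate rows (identical except for the ID) were being inserted into the table. This does not change the final result of the processing, but it has one drawback: when the data set is very large, the duplicates slow processing down dramatically. So I had to find a way to avoid inserting duplicates.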
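A Google search turns up plenty of blog posts on the subject, but nearly all of them suggest a statement like this:
INSERT INTO users_roles (userid, roleid)
SELECT 'userid_x', 'roleid_x' FROM dual
WHERE NOT EXISTS (SELECT * FROM users_roles
                  WHERE userid = 'userid_x' AND roleid = 'roleid_x');
I tried it, and it did not solve my problem: I got a SQL syntax error.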
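So I went straight to the official documentation. The INSERT page for MySQL 5.6 is at: http://dev.mysql.com/doc/refman/5.6/en/insert.html
The 5.6 documentation does not present the statement above in that form. It defines three forms of INSERT, which I go through one by one.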
[code]
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name
    [PARTITION (partition_name,...)]
    [(col_name,...)]
    {VALUES | VALUE} ({expr | DEFAULT},...),(...),...
    [ ON DUPLICATE KEY UPDATE
      col_name=expr
        [, col_name=expr] ... ]
[/code]
[From the docs: With INSERT ... SELECT, you can quickly insert many rows into a table from one or many tables. My note: you cannot insert into a table while selecting from that same table in a subquery.]
Or:
[code]
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name
    [PARTITION (partition_name,...)]
    SET col_name={expr | DEFAULT}, ...
    [ ON DUPLICATE KEY UPDATE
      col_name=expr
        [, col_name=expr] ... ]
[/code]
Or:
[code]
INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name
    [PARTITION (partition_name,...)]
    [(col_name,...)]
    SELECT ...
    [ ON DUPLICATE KEY UPDATE
      col_name=expr
        [, col_name=expr] ... ]
[/code]
You can use REPLACE instead of INSERT to overwrite old rows. REPLACE is the counterpart to INSERT IGNORE in the treatment of new rows that contain unique key values that duplicate old rows: The new rows are used to replace the old rows rather than being discarded.
In other words, REPLACE takes the place of INSERT and overwrites existing data, which is effectively an update. When the table has a UNIQUE KEY and the row being inserted carries a value that already exists for that key, REPLACE overwrites the existing row, whereas INSERT IGNORE discards the new row.
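The contrast between the two statements can be sketched as follows. This is only an illustration: the users_roles table name and its two columns come from the statement quoted earlier, but the column types, the id column, and the key name uk_user_role are assumptions made for the example.

```sql
-- Hypothetical schema: types, the id column and the key name are assumptions.
CREATE TABLE users_roles (
    id     INT UNSIGNED NOT NULL AUTO_INCREMENT,
    userid VARCHAR(32)  NOT NULL,
    roleid VARCHAR(32)  NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY uk_user_role (userid, roleid)
) ENGINE=InnoDB;

-- First write: no conflict, the row is inserted.
INSERT INTO users_roles (userid, roleid) VALUES ('userid_x', 'roleid_x');

-- Same (userid, roleid) pair again: the new row is discarded,
-- the existing row is left untouched and no error is raised.
INSERT IGNORE INTO users_roles (userid, roleid) VALUES ('userid_x', 'roleid_x');

-- Same pair again: the existing row is deleted and the new row is
-- inserted in its place, so the row ends up with a new id.
REPLACE INTO users_roles (userid, roleid) VALUES ('userid_x', 'roleid_x');

-- Still exactly one row for this pair.
SELECT id, userid, roleid FROM users_roles;
```

Both statements rely on the UNIQUE KEY (or PRIMARY KEY) to detect the duplicate; without such a key, neither of them prevents duplicate rows.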
Using the keywords:
Note: delayed inserts (DELAYED) will be dropped in later versions, so there is no point in learning them. [Original: As of MySQL 5.6.6, INSERT DELAYED is deprecated, and will be removed in a future release. Use INSERT (without DELAYED) instead.]
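I. LOW_PRIORITY: execution is delayed until no other clients are reading from the table, so in a read-heavy environment the statement may wait a very long time, possibly forever.
II. HIGH_PRIORITY: may cause concurrent inserts not to be used.
Note: LOW_PRIORITY and HIGH_PRIORITY affect only storage engines that use only table-level locking (such as MyISAM, MEMORY, and MERGE).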
III. IGNORE: when data is inserted with this keyword, errors raised while writing are ignored as well. [Original:
1. If you use the IGNORE keyword, errors that occur while executing the INSERT statement are ignored. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted. With IGNORE, the row still is not inserted, but no error occurs. Ignored errors may generate warnings instead, although duplicate-key errors do not.
2. IGNORE has a similar effect on inserts into partitioned tables where no partition matching a given value is found. Without IGNORE, such INSERT statements are aborted with an error; however, when INSERT IGNORE is used, the insert operation fails silently for the row containing the unmatched value, but any rows that are matched are inserted.]
In other words:
1. When the table has a unique key or primary key and the new row duplicates an existing value of that key, the new row is not written and no error is reported.
2. When the supplied value list does not match the table structure (data types and so on), the mismatch is ignored as well and no error is reported.
3. With IGNORE, an invalid value is automatically adjusted to the closest valid value for the column's data type, which means the stored value can differ from the one you supplied.
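A small sketch of the points above, under stated assumptions: the table ignore_demo and its columns are hypothetical and exist only for this example.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE ignore_demo (
    item_id INT              NOT NULL,
    score   TINYINT UNSIGNED NOT NULL,
    PRIMARY KEY (item_id)
) ENGINE=InnoDB;

INSERT INTO ignore_demo (item_id, score) VALUES (1, 10);

-- The row with item_id = 1 duplicates the primary key and is skipped;
-- the row with item_id = 2 in the same statement is still inserted.
INSERT IGNORE INTO ignore_demo (item_id, score) VALUES (1, 20), (2, 30);

-- 300 is out of range for TINYINT UNSIGNED (0 to 255). With IGNORE the
-- value is adjusted to the closest valid value instead of aborting.
INSERT IGNORE INTO ignore_demo (item_id, score) VALUES (3, 300);

-- Inspect what was downgraded from an error to a warning.
SHOW WARNINGS;

SELECT item_id, score FROM ignore_demo ORDER BY item_id;
```

Because IGNORE can silently change stored values, it seems worth checking SHOW WARNINGS after bulk loads rather than assuming every row went in as written.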
IV. ON DUPLICATE KEY UPDATE:
If you specify ON DUPLICATE KEY UPDATE, and a row is inserted that would cause a duplicate value in a UNIQUE index or PRIMARY KEY, an UPDATE of the old row is performed. The affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values. If you specify the CLIENT_FOUND_ROWS flag to mysql_real_connect() when connecting to mysqld, the affected-rows value is 1 (not 0) if an existing row is set to its current values.
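The three affected-rows cases described in the quoted passage can be tried out with a sketch like the one below. The page_hits table and its columns are hypothetical, and the row counts in the comments are the values the documentation describes for a connection that does not set CLIENT_FOUND_ROWS.

```sql
-- Hypothetical counter table used only for this illustration.
CREATE TABLE page_hits (
    page VARCHAR(64)  NOT NULL,
    hits INT UNSIGNED NOT NULL DEFAULT 1,
    PRIMARY KEY (page)
) ENGINE=InnoDB;

-- No conflict: inserted as a new row (documented affected-rows value: 1).
INSERT INTO page_hits (page, hits) VALUES ('home', 1)
    ON DUPLICATE KEY UPDATE hits = hits + 1;

-- Primary key conflict: the existing row is updated (documented value: 2).
INSERT INTO page_hits (page, hits) VALUES ('home', 1)
    ON DUPLICATE KEY UPDATE hits = hits + 1;

-- Conflict, but the row is set to its current values (documented value: 0).
INSERT INTO page_hits (page, hits) VALUES ('home', 1)
    ON DUPLICATE KEY UPDATE hits = hits;

-- One row for 'home', with hits incremented once.
SELECT page, hits FROM page_hits;
```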
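At this point we have found three ways to write data without duplicates.
Method 1: define one or more UNIQUE KEYs and use INSERT IGNORE INTO, so that a new row is discarded when its unique key values already exist. This covers the common case where the table also has an auto-increment ID (an auto-increment column must be indexed, and is usually the primary key). Alternatively, use ON DUPLICATE KEY UPDATE col_name=expr... to update the existing row with the new data (take care not to change the unique key values themselves, or things may go wrong).
Method 2: drop the auto-increment ID and use a composite primary key. Data is written the same way as in Method 1.
Method 3: define a UNIQUE KEY or PRIMARY KEY and use the REPLACE statement.
Why drop the auto-increment ID? In practice, a composite primary key combines several columns into a single primary key: there is still only one key, formed from the values of all its columns. If the auto-increment ID is part of that composite key, every row is already unique by its ID alone, so it is as if no composite key had been defined.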
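Method 2 might look like the following sketch. The table name users_roles_pk and the note column are assumptions for illustration; only userid and roleid appear in the original statement. The VALUES(col_name) function used in the last statement refers to the value that would have been inserted and is available in MySQL 5.6.

```sql
-- Hypothetical table: no auto-increment ID, composite primary key instead.
CREATE TABLE users_roles_pk (
    userid VARCHAR(32) NOT NULL,
    roleid VARCHAR(32) NOT NULL,
    note   VARCHAR(64) NOT NULL DEFAULT '',
    PRIMARY KEY (userid, roleid)
) ENGINE=InnoDB;

-- New (userid, roleid) pair: inserted.
INSERT IGNORE INTO users_roles_pk (userid, roleid, note)
VALUES ('u1', 'r1', 'first');

-- Same pair: skipped, the note stays 'first'.
INSERT IGNORE INTO users_roles_pk (userid, roleid, note)
VALUES ('u1', 'r1', 'second');

-- Same user, different role: a different key value, so it is inserted.
INSERT IGNORE INTO users_roles_pk (userid, roleid, note)
VALUES ('u1', 'r2', 'other role');

-- Same pair as the first row: update the non-key column only,
-- leaving the key columns unchanged.
INSERT INTO users_roles_pk (userid, roleid, note)
VALUES ('u1', 'r1', 'third')
    ON DUPLICATE KEY UPDATE note = VALUES(note);

SELECT userid, roleid, note FROM users_roles_pk ORDER BY userid, roleid;
```

With no auto-increment column in the table, there is no ID sequence to develop gaps, at the cost of a wider primary key.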
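New problem: the inserts now work, but the IDs are no longer consecutive. Specifically, when a duplicate row is submitted with INSERT IGNORE INTO, the duplicate is not inserted, yet the auto-increment counter still advances once, which leaves a gap in the IDs. Each gap corresponds to one rejected duplicate. [b]The same problem occurs with ON DUPLICATE KEY UPDATE. REPLACE deletes the old record and writes the new one at the end of the table, so the IDs are not consecutive either.[/b]
[b]What is the cause? The official documentation explains: http://dev.mysql.com/doc/refman/5.6/en/insert-on-duplicate.html[/b]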
If you specify ON DUPLICATE KEY UPDATE, and a row is inserted that would cause a duplicate value in a UNIQUE index or PRIMARY KEY, MySQL performs an UPDATE of the old row. For example, if column a is declared as UNIQUE and contains the value 1, the following two statements have similar effect:
INSERT INTO table (a,b,c) VALUES (1,2,3) ON DUPLICATE KEY UPDATE c=c+1;
UPDATE table SET c=c+1 WHERE a=1;
(The effects are not identical for an InnoDB table where a is an auto-increment column. With an auto-increment column, an INSERT statement increases the auto-increment value but UPDATE does not.)
With ON DUPLICATE KEY UPDATE, the affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values. If you specify the CLIENT_FOUND_ROWS flag to mysql_real_connect() when connecting to mysqld, the affected-rows value is 1 (not 0) if an existing row is set to its current values.
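The cause is precisely that the ID in my table is an auto-increment column! [Original: If a table contains an AUTO_INCREMENT column and INSERT ... ON DUPLICATE KEY UPDATE inserts or updates a row, the LAST_INSERT_ID() function returns the AUTO_INCREMENT value.] Without an auto-increment column, managing the IDs would be even more trouble... For now I have no solution to this problem.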
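The gap can be reproduced with a sketch along these lines. The gap_demo table is hypothetical, and the exact IDs you see may depend on the storage engine and on the innodb_autoinc_lock_mode setting, so the comments describe the typical InnoDB behaviour rather than a guaranteed result.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE gap_demo (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name VARCHAR(32)  NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY uk_name (name)
) ENGINE=InnoDB;

-- New row: receives the first auto-increment value.
INSERT IGNORE INTO gap_demo (name) VALUES ('a');

-- Duplicate name: the row is skipped, but on InnoDB an auto-increment
-- value is typically still consumed by the attempt.
INSERT IGNORE INTO gap_demo (name) VALUES ('a');

-- New row: its id typically skips the value consumed above.
INSERT IGNORE INTO gap_demo (name) VALUES ('b');

-- Compare the ids of 'a' and 'b' to see whether a gap appeared.
SELECT id, name FROM gap_demo ORDER BY id;

-- The AUTO_INCREMENT=... clause shows the next value to be handed out.
SHOW CREATE TABLE gap_demo;
```

If consecutive IDs matter more than a surrogate key, the composite primary key of Method 2 avoids the counter altogether.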
Notes:
1. Try to avoid using ON DUPLICATE KEY UPDATE on tables with multiple UNIQUE KEYs; the result is sometimes surprising.
2. The DELAYED option is ignored when you use ON DUPLICATE KEY UPDATE.
3. In MySQL 5.6.4 and later, INSERT ... SELECT ON DUPLICATE KEY UPDATE statements are flagged as unsafe for statement-based replication. The same concern applies to tables with more than one unique or primary key.
Prior to MySQL 5.6.6, an INSERT that affected a partitioned table using a storage engine such as MyISAM that employs table-level locks locked all partitions of the table. This was true even for INSERT ... PARTITION statements. (This did not and does not occur with storage engines such as InnoDB that employ row-level locking.) In MySQL 5.6.6 and later, MySQL uses partition lock pruning, so that only partitions into which rows are inserted are actually locked. For more information, see Section 18.6.4, “Partitioning and Locking”.
Reposted from: http://blog.csdn.net/zhanh1218/article/details/21459297