(Repost) The shortest, fastest, and easiest way to compare two tables in SQL Server: UNION
2008-01-01 14:21
Reposted from: http://weblogs.sqlteam.com/jeffs/archive/2004/11/10/2737.aspx
When you have two tables (or result sets from SELECT statements) that you wish to compare, and you want to see changes in any column as well as which rows exist in one table but not the other (in either direction), I have found that the UNION operator works quite well.
UNION allows you to compare all columns very quickly, and it also treats NULL values as equal to other NULLs, which a join condition or a WHERE clause doesn't normally do. It also lets you quickly see which rows are missing from either table, which otherwise only a FULL OUTER JOIN will do, but of course we all know to avoid those at all costs (right?) -- a full outer join is about as “unrelational” as you can get (every column returned is potentially NULL and must be wrapped in a COALESCE function). Best of all, the UNION is quick and easy and short.
The basic idea is: if we GROUP the union of two tables on all columns, then if the two tables are identical every group will have a COUNT(*) of 2. But for any row that is not completely matched on every column in the GROUP BY clause, the COUNT(*) will be 1 -- and those are the ones we want. (This assumes neither table contains duplicate rows across the compared columns, which including the primary key guarantees.) We also need to add a column to each part of the UNION to indicate which table each row comes from; otherwise there is no way to tell the two sources apart.
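For contrast, here is a rough sketch of what the join-based comparison tends to look like. It assumes two tables A and B keyed on ID with two nullable columns, COL1 and COL2; these names are placeholders rather than a real schema. Because `NULL <> NULL` does not evaluate to true, each nullable column needs its own explicit NULL checks:

```sql
-- Sketch only: A, B, ID, COL1 and COL2 are assumed placeholder names.
SELECT COALESCE(A.ID, B.ID) AS ID,
       A.COL1 AS A_COL1, B.COL1 AS B_COL1,
       A.COL2 AS A_COL2, B.COL2 AS B_COL2
FROM A
FULL OUTER JOIN B ON A.ID = B.ID
WHERE A.ID IS NULL                                  -- row exists only in B
   OR B.ID IS NULL                                  -- row exists only in A
   OR A.COL1 <> B.COL1                              -- both non-NULL and different
   OR (A.COL1 IS NULL AND B.COL1 IS NOT NULL)       -- NULL on one side only
   OR (A.COL1 IS NOT NULL AND B.COL1 IS NULL)
   OR A.COL2 <> B.COL2
   OR (A.COL2 IS NULL AND B.COL2 IS NOT NULL)
   OR (A.COL2 IS NOT NULL AND B.COL2 IS NULL)
ORDER BY ID;
```

Every additional column adds three more predicates to the WHERE clause, which is the verbosity the GROUP BY approach avoids.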
So, here's an example, assuming we are comparing tables A and B, and the primary key of both tables is ID:
SELECT MIN(TableName) as TableName, ID, COL1, COL2, COL3 ...
FROM
(
  SELECT 'Table A' as TableName, A.ID, A.COL1, A.COL2, A.COL3, ...
  FROM A
  UNION ALL
  SELECT 'Table B' as TableName, B.ID, B.COL1, B.COL2, B.COL3, ...
  FROM B
) tmp
GROUP BY ID, COL1, COL2, COL3 ...
HAVING COUNT(*) = 1
ORDER BY ID
The above returns all rows in either table that do not completely match all columns in the other. In addition, it returns all rows in either table that do not exist in the other table. It handles nulls as well, since GROUP BY normally consolidates NULL values together in the same group. If both tables match completely, no rows are returned at all.
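To make this concrete, here is a small self-contained sketch. The temporary tables #A and #B and all of their data are made up for illustration; only the query shape comes from the example above.

```sql
-- Hypothetical demo tables; names and data are illustrative only.
CREATE TABLE #A (ID int PRIMARY KEY, COL1 varchar(10) NULL, COL2 int NULL);
CREATE TABLE #B (ID int PRIMARY KEY, COL1 varchar(10) NULL, COL2 int NULL);

INSERT INTO #A (ID, COL1, COL2) VALUES (1, 'x', 10);
INSERT INTO #A (ID, COL1, COL2) VALUES (2, NULL, 20);  -- NULL in both tables
INSERT INTO #A (ID, COL1, COL2) VALUES (3, 'z', 30);   -- COL2 differs in #B
INSERT INTO #A (ID, COL1, COL2) VALUES (4, 'w', 40);   -- missing from #B

INSERT INTO #B (ID, COL1, COL2) VALUES (1, 'x', 10);
INSERT INTO #B (ID, COL1, COL2) VALUES (2, NULL, 20);
INSERT INTO #B (ID, COL1, COL2) VALUES (3, 'z', 31);
INSERT INTO #B (ID, COL1, COL2) VALUES (5, 'v', 50);   -- missing from #A

-- Rows that appear only once in the combined set are the differences.
SELECT MIN(TableName) AS TableName, ID, COL1, COL2
FROM
(
  SELECT 'Table A' AS TableName, ID, COL1, COL2 FROM #A
  UNION ALL
  SELECT 'Table B' AS TableName, ID, COL1, COL2 FROM #B
) tmp
GROUP BY ID, COL1, COL2
HAVING COUNT(*) = 1
ORDER BY ID;

DROP TABLE #A;
DROP TABLE #B;
```

With this data, IDs 1 and 2 are not returned, because they match in every column (ID 2 has a NULL COL1 in both tables, and the NULLs group together). ID 3 is returned twice, once per table, because COL2 differs; ID 4 is returned from Table A only and ID 5 from Table B only.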
The MIN() aggregate function used on the TableName column is just arbitrary -- it has no effect since we are only returning groups of rows in which there has been no consolidation with the GROUP BY (note the HAVING clause).
I've posted an implementation of this technique as a stored procedure in the SQLTeam script library section of the forums (http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=23054). Here it is, below:
CREATE PROCEDURE CompareTables(@table1 varchar(100),
@table2 Varchar(100), @T1ColumnList varchar(1000),
@T2ColumnList varchar(1000) = '')
AS
-- Table1, Table2 are the tables or views to compare.
-- T1ColumnList is the list of columns to compare, from table1.
-- Just list them comma-separated, like in a GROUP BY clause.
-- If T2ColumnList is not specified, it is assumed to be the same
-- as T1ColumnList. Otherwise, list the columns of Table2 in
-- the same order as the columns in table1 that you wish to compare.
--
-- The result is all rows from either table that do NOT match
-- the other table in all columns specified, along with which table that
-- row is from.
DECLARE @SQL varchar(8000);
IF @T2ColumnList = '' SET @T2ColumnList = @T1ColumnList
-- Build the inner UNION ALL, tagging each row with its source table.
SET @SQL = 'SELECT ''' + @table1 + ''' AS TableName, ' + @T1ColumnList +
  ' FROM ' + @table1 + ' UNION ALL SELECT ''' + @table2 + ''' AS TableName, ' +
  @T2ColumnList + ' FROM ' + @table2
-- Group on the compared columns and keep only the unmatched rows.
SET @SQL = 'SELECT MAX(TableName) AS TableName, ' + @T1ColumnList +
  ' FROM (' + @SQL + ') A GROUP BY ' + @T1ColumnList +
  ' HAVING COUNT(*) = 1'
EXEC (@SQL)
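A couple of hedged usage sketches for the procedure follow. The table and column names (A, B, Customers, CustomersArchive, and so on) are hypothetical and only illustrate the parameter order described in the comments above.

```sql
-- Same column names in both tables: the fourth parameter can be omitted.
EXEC CompareTables @table1 = 'A', @table2 = 'B',
     @T1ColumnList = 'ID, COL1, COL2';

-- Different column names: list table2's columns in the matching order.
-- The result set uses the column names from the first list.
EXEC CompareTables @table1 = 'Customers', @table2 = 'CustomersArchive',
     @T1ColumnList = 'CustomerID, Name',
     @T2ColumnList = 'CustID, CustName';
```

Since the procedure concatenates its arguments straight into dynamic SQL, it is best called only with trusted table and column names.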