您的位置:首页 > 其它

批量导入数据(ADO.NET)

2008-11-11 07:41 393 查看
如果你需要在程序中批量插入成千上万行的数据,你会怎么编写代码呢?最近在帮朋友调优这个的时候,总结了几种方法,并对其进行比较。

大概的界面如下,我模拟了一个客户资料表.





数据我是放在一个XML文件的,大约6734行。类似下面的格式

<?xmlversion="1.0"encoding="utf-8"?>
<root>
<Customers>
<CustomerID>ALFKI</CustomerID>
<CompanyName>Sina</CompanyName>
<ContactName>MariaAnders</ContactName>
<ContactTitle>SalesRepresentative</ContactTitle>
<Address>ObereStr.57</Address>
<City>Berlin</City>
<PostalCode>12209</PostalCode>
<Country>Germany</Country>
<Phone>030-0074321</Phone>
<Fax>030-0076545</Fax>
</Customers>
<Customers>
<CustomerID>ANATR</CustomerID>
<CompanyName>AnaTrujilloEmparedadosyhelados</CompanyName>
<ContactName>AnaTrujillo</ContactName>
<ContactTitle>Owner</ContactTitle>
<Address>Avda.delaConstitución2222</Address>
<City>MéxicoD.F.</City>
<PostalCode>05021</PostalCode>
<Country>Mexico</Country>
<Phone>(5)555-4729</Phone>
<Fax>(5)555-3745</Fax>
</Customers>
<Customers>
<CustomerID>ANTON</CustomerID>
<CompanyName>AntonioMorenoTaquería</CompanyName>
<ContactName>AntonioMoreno</ContactName>
<ContactTitle>Owner</ContactTitle>
<Address>Mataderos2312</Address>
<City>MéxicoD.F.</City>
<PostalCode>05023</PostalCode>
<Country>Mexico</Country>
<Phone>(5)555-3932</Phone>
</Customers>
</root>

.csharpcode,.csharpcodepre
{
font-size:small;
color:black;
font-family:consolas,"CourierNew",courier,monospace;
background-color:#ffffff;
/*white-space:pre;*/
}
.csharpcodepre{margin:0em;}
.csharpcode.rem{color:#008000;}
.csharpcode.kwrd{color:#0000ff;}
.csharpcode.str{color:#006080;}
.csharpcode.op{color:#0000c0;}
.csharpcode.preproc{color:#cc6633;}
.csharpcode.asp{background-color:#ffff00;}
.csharpcode.html{color:#800000;}
.csharpcode.attr{color:#ff0000;}
.csharpcode.alt
{
background-color:#f4f4f4;
width:100%;
margin:0em;
}
.csharpcode.lnum{color:#606060;}

下面首先在服务器稍微准备一下环境

USE[tempdb]
GO

SETANSI_NULLSON
GO
SETQUOTED_IDENTIFIERON
GO
CREATETABLE[dbo].[Customers](
[CustomerID][nchar](5)NOTNULL,
[CompanyName][nvarchar](40)NOTNULL,
[ContactName][nvarchar](30)NULL,
[ContactTitle][nvarchar](30)NULL,
[Address][nvarchar](60)NULL,
[City][nvarchar](15)NULL,
[Region][nvarchar](15)NULL,
[PostalCode][nvarchar](10)NULL,
[Country][nvarchar](15)NULL,
[Phone][nvarchar](24)NULL,
[Fax][nvarchar](24)NULL
)ON[PRIMARY]

CREATEPROCEDURE[dbo].[usp_InsertCustomer]
@CustomerIDnchar(5),
@CompanyNamenvarchar(40),
@ContactNamenvarchar(30),
@ContactTitlenvarchar(30),
@Addressnvarchar(60),
@Citynvarchar(15),
@Regionnvarchar(15),
@PostalCodenvarchar(10),
@Countrynvarchar(15),
@Phonenvarchar(24),
@Faxnvarchar(24)
AS

SETNOCOUNTON

INSERTINTO[dbo].[Customers](
[CustomerID],
[CompanyName],
[ContactName],
[ContactTitle],
[Address],
[City],
[Region],
[PostalCode],
[Country],
[Phone],
[Fax]
)VALUES(
@CustomerID,
@CompanyName,
@ContactName,
@ContactTitle,
@Address,
@City,
@Region,
@PostalCode,
@Country,
@Phone,
@Fax
)

.csharpcode,.csharpcodepre
{
font-size:small;
color:black;
font-family:consolas,"CourierNew",courier,monospace;
background-color:#ffffff;
/*white-space:pre;*/
}
.csharpcodepre{margin:0em;}
.csharpcode.rem{color:#008000;}
.csharpcode.kwrd{color:#0000ff;}
.csharpcode.str{color:#006080;}
.csharpcode.op{color:#0000c0;}
.csharpcode.preproc{color:#cc6633;}
.csharpcode.asp{background-color:#ffff00;}
.csharpcode.html{color:#800000;}
.csharpcode.attr{color:#ff0000;}
.csharpcode.alt
{
background-color:#f4f4f4;
width:100%;
margin:0em;
}
.csharpcode.lnum{color:#606060;}

我们在tempdb中创建了一个表和一个存储过程
首先,我们把数据加载到一个DataSet
DataSetds=newDataSet();
privatevoidbtLoadData_Click(objectsender,EventArgse)
{
stringdataFile="CustomersData.xml";
ds.ReadXml(dataFile);
bindingSource1.DataSource=ds;
bindingSource1.DataMember="Customers";
dataGridView1.DataSource=bindingSource1;

}


.csharpcode,.csharpcodepre
{
font-size:small;
color:black;
font-family:consolas,"CourierNew",courier,monospace;
background-color:#ffffff;
/*white-space:pre;*/
}
.csharpcodepre{margin:0em;}
.csharpcode.rem{color:#008000;}
.csharpcode.kwrd{color:#0000ff;}
.csharpcode.str{color:#006080;}
.csharpcode.op{color:#0000c0;}
.csharpcode.preproc{color:#cc6633;}
.csharpcode.asp{background-color:#ffff00;}
.csharpcode.html{color:#800000;}
.csharpcode.attr{color:#ff0000;}
.csharpcode.alt
{
background-color:#f4f4f4;
width:100%;
margin:0em;
}
.csharpcode.lnum{color:#606060;}

然后,我们第一个测试代码是遍历这个DataSet,每一行提交一次

privatestringGetConnectionString()
{
return"server=(local);database=tempdb;integratedsecurity=true;";
}
///<summary>
///直接遍历,一个一个的提交给服务器。时间为265毫秒左右
///每一行都需要写日志
///</summary>
///<paramname="sender"></param>
///<paramname="e"></param>
privatevoidbtOneByOne_Click(objectsender,EventArgse)
{
SqlConnectionconn=newSqlConnection(GetConnectionString());
SqlCommandcmd=conn.CreateCommand();
cmd.CommandText="usp_InsertCustomer";
cmd.CommandType=CommandType.StoredProcedure;
conn.Open();

TimeSpanstartTime=System.Diagnostics.Process.GetCurrentProcess().UserProcessorTime;

foreach(DataRowrowinds.Tables[0].Rows)
{
cmd.Parameters.Clear();
SqlParameter[]param=newSqlParameter[]{
newSqlParameter("@CustomerID",row[0].ToString()),
newSqlParameter("@CompanyName",row[1].ToString()),
newSqlParameter("@ContactName",row[2].ToString()),
newSqlParameter("@ContactTitle",row[3].ToString()),
newSqlParameter("@Address",row[4].ToString()),
newSqlParameter("@City",row[5].ToString()),
newSqlParameter("@Region",row[6].ToString()),
newSqlParameter("@PostalCode",row[7].ToString()),
newSqlParameter("@Country",row[8].ToString()),
newSqlParameter("@Phone",row[9].ToString()),
newSqlParameter("@Fax",row[10].ToString())
};
cmd.Parameters.AddRange(param);
cmd.ExecuteNonQuery();
}
conn.Close();

TimeSpanduration=System.Diagnostics.Process.GetCurrentProcess().UserProcessorTime.Subtract(startTime);
MessageBox.Show("已经全部插入成功,所用时间为"+duration.Milliseconds.ToString()+"毫秒");
}

.csharpcode,.csharpcodepre
{
font-size:small;
color:black;
font-family:consolas,"CourierNew",courier,monospace;
background-color:#ffffff;
/*white-space:pre;*/
}
.csharpcodepre{margin:0em;}
.csharpcode.rem{color:#008000;}
.csharpcode.kwrd{color:#0000ff;}
.csharpcode.str{color:#006080;}
.csharpcode.op{color:#0000c0;}
.csharpcode.preproc{color:#cc6633;}
.csharpcode.asp{background-color:#ffff00;}
.csharpcode.html{color:#800000;}
.csharpcode.attr{color:#ff0000;}
.csharpcode.alt
{
background-color:#f4f4f4;
width:100%;
margin:0em;
}
.csharpcode.lnum{color:#606060;}

接下来,我们使用ADO.NET内置的一个DataAdapter来提交

///<summary>
///这是使用Adapter的方式,其实还是遍历,而且语法也没有简单
///同时,速度甚至更慢。时间为650毫秒左右
///每一行都需要写日志
///</summary>
///<paramname="sender"></param>
///<paramname="e"></param>
privatevoidbtUseAdapter_Click(objectsender,EventArgse)
{
SqlDataAdapteradapter=newSqlDataAdapter();
adapter.AcceptChangesDuringUpdate=false;//为了演示目的,把这个开关关掉,以免它在更新完成后把数据集标记为未更改
adapter.UpdateBatchSize=10;//这个好像也没有什么用

SqlConnectionconn=newSqlConnection(GetConnectionString());
SqlCommandinsertCommand=conn.CreateCommand();
insertCommand.CommandText="usp_InsertCustomer";
insertCommand.CommandType=CommandType.StoredProcedure;

insertCommand.UpdatedRowSource=UpdateRowSource.None;

insertCommand.Parameters.Add("@CustomerID",SqlDbType.NChar,5,"CustomerID");
insertCommand.Parameters.Add("@CompanyName",SqlDbType.NVarChar,40,"CompanyName");
insertCommand.Parameters.Add("@ContactName",SqlDbType.NVarChar,30,"ContactName");
insertCommand.Parameters.Add("@ContactTitle",SqlDbType.NVarChar,30,"ContactTitle");
insertCommand.Parameters.Add("@Address",SqlDbType.NVarChar,60,"Address");
insertCommand.Parameters.Add("@City",SqlDbType.NVarChar,15,"City");
insertCommand.Parameters.Add("@Region",SqlDbType.NVarChar,15,"Region");
insertCommand.Parameters.Add("@PostalCode",SqlDbType.NVarChar,10,"PostalCode");
insertCommand.Parameters.Add("@Country",SqlDbType.NVarChar,15,"Country");
insertCommand.Parameters.Add("@Phone",SqlDbType.NVarChar,24,"Phone");
insertCommand.Parameters.Add("@Fax",SqlDbType.NVarChar,24,"Fax");

adapter.InsertCommand=insertCommand;
TimeSpanstartTime=System.Diagnostics.Process.GetCurrentProcess().UserProcessorTime;
adapter.Update(ds,"Customers");
TimeSpanduration=System.Diagnostics.Process.GetCurrentProcess().UserProcessorTime.Subtract(startTime);
MessageBox.Show("已经全部插入成功,所用时间为"+duration.Milliseconds.ToString()+"毫秒");

}

.csharpcode,.csharpcodepre
{
font-size:small;
color:black;
font-family:consolas,"CourierNew",courier,monospace;
background-color:#ffffff;
/*white-space:pre;*/
}
.csharpcodepre{margin:0em;}
.csharpcode.rem{color:#008000;}
.csharpcode.kwrd{color:#0000ff;}
.csharpcode.str{color:#006080;}
.csharpcode.op{color:#0000c0;}
.csharpcode.preproc{color:#cc6633;}
.csharpcode.asp{background-color:#ffff00;}
.csharpcode.html{color:#800000;}
.csharpcode.attr{color:#ff0000;}
.csharpcode.alt
{
background-color:#f4f4f4;
width:100%;
margin:0em;
}
.csharpcode.lnum{color:#606060;}

最后,我们找到了最快的方法

///<summary>
///使用新的API,批量导入,这个速度很快,大约26毫秒,很显然,这种方式只写一次日志,不会为每一行写日志
///</summary>
///<paramname="sender"></param>
///<paramname="e"></param>
privatevoidbtBCP_Click(objectsender,EventArgse)
{
using(SqlConnectionconn=newSqlConnection(GetConnectionString()))
{
SqlBulkCopybcp=newSqlBulkCopy(conn);
bcp.DestinationTableName="Customers";
bcp.BatchSize=100;//这是批尺寸可以调整
for(inti=0;i<11;i++)
{
bcp.ColumnMappings.Add(i,i);
}
TimeSpanstartTime=System.Diagnostics.Process.GetCurrentProcess().UserProcessorTime;
conn.Open();
bcp.WriteToServer(ds.Tables[0]);
TimeSpanduration=System.Diagnostics.Process.GetCurrentProcess().UserProcessorTime.Subtract(startTime);
MessageBox.Show("已经全部插入成功,所用时间为"+duration.Milliseconds.ToString()+"毫秒");

}
}


.csharpcode,.csharpcodepre
{
font-size:small;
color:black;
font-family:consolas,"CourierNew",courier,monospace;
background-color:#ffffff;
/*white-space:pre;*/
}
.csharpcodepre{margin:0em;}
.csharpcode.rem{color:#008000;}
.csharpcode.kwrd{color:#0000ff;}
.csharpcode.str{color:#006080;}
.csharpcode.op{color:#0000c0;}
.csharpcode.preproc{color:#cc6633;}
.csharpcode.asp{background-color:#ffff00;}
.csharpcode.html{color:#800000;}
.csharpcode.attr{color:#ff0000;}
.csharpcode.alt
{
background-color:#f4f4f4;
width:100%;
margin:0em;
}
.csharpcode.lnum{color:#606060;}

还有一种办法是通过在服务器OPENXML,因为XML反复处理效率很差,所以就没有测试了,可以断定它肯定比其他几种还要慢。

另外提示一下,如果不用编程的方式,那么有其他三个可能的途径去做这个事情

1.BCP工具(这是一个命令行,可以做导入和导出,不过来源文件如果不规范,那么可能很费劲)

2.BULKINSERT语句(这是一个T-SQL语句,只能做导入,我们上面使用的SQLBULKCopy应该和他很类似)

3.XMLBulkLoad(这是一套COM的对象模型,适合导入XML文档)

还有,在做大量的数据导入和导出时,可以考虑微软为SQLServer配套的SSIS(IntegrationService)


                                            
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: