Importing Data into SQL Server

  • October 2, 2007
  • By William R. Vaughn

The SqlBulkCopy Class Properties

Table 2 lists the SqlBulkCopy properties that further control how the BCP operation is handled, including how much feedback it returns while running.

Table 2: The SqlBulkCopy Properties

  • BatchSize (Int; default: 1): How many rows are sent per round trip to SQL Server. Tune this for best performance; around 500 rows per batch is a reasonable starting point.
  • BulkCopyTimeout (Int): How many seconds the application should wait for the operation to complete. Set it to about 1.5 times the length of a normal run.
  • ColumnMappings (SqlBulkCopyColumnMapping collection; default: none): Maps input data columns to columns in the target table. Not required when the source and target column names match.
  • DestinationTableName (String; required): Names the target table in the Connection-specified database. Syntax: database.schema.tablename.
  • NotifyAfter (Int; default: none): Fires the SqlRowsCopied event after every N rows are processed. For best performance, don't set this value at all, or set it high enough that the event fires no more often than about once every 5 seconds.
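
To put these properties in context, here's a minimal sketch (in Visual Basic) that sets them on a SqlBulkCopy instance before calling WriteToServer. The connection string, table name, and column names are placeholders, not the code from the downloadable project.

Imports System.Data.SqlClient

Module BulkCopySetupSketch
    Sub Main()
        Using cn As New SqlConnection("Data Source=.;Initial Catalog=Biblio;Integrated Security=True")
            cn.Open()
            Dim bcp As New SqlBulkCopy(cn)
            bcp.DestinationTableName = "Biblio.dbo.AuthorsImport"   ' database.schema.tablename
            bcp.BatchSize = 500                 ' rows sent per round trip
            bcp.BulkCopyTimeout = 90            ' seconds; roughly 1.5x the normal run time
            bcp.NotifyAfter = 10000             ' raise SqlRowsCopied every 10,000 rows
            ' Map source columns to target columns when the names differ.
            bcp.ColumnMappings.Add("au_name", "Author")
            bcp.ColumnMappings.Add("au_born", "Year_Born")
            ' bcp.WriteToServer(sourceReader) would start the copy here.
            bcp.Close()
        End Using
    End Sub
End Module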

Executing the WriteToServer Method

The data source for the SqlBulkCopy class can be an array of DataRow objects, an existing DataTable object, or any object that implements the IDataReader interface. This includes any of the .NET data provider classes, even an Odbc or OleDb DataReader, which means you can pull data from virtually any source that has a driver. By leveraging ADO.NET's ability to construct data tables from XML, you can also use properly formatted XML documents as your data source; see the DataSet.ReadXml method for more information. If you don't have a driver for your data, you can write your own IDataReader implementation; there are several examples of this on the web.
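
For example, a properly formatted XML document can be loaded into a DataSet and the resulting DataTable handed to WriteToServer. This is only a sketch; the file path and its schema are hypothetical.

Imports System.Data

Module XmlSourceSketch
    ' Builds a DataTable from an XML file; ADO.NET infers the table structure.
    Function LoadXmlTable(ByVal path As String) As DataTable
        Dim ds As New DataSet()
        ds.ReadXml(path)        ' for example, "C:\Data\NewAuthors.xml"
        Return ds.Tables(0)     ' hand this DataTable to WriteToServer
    End Function
End Module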

The SqlBulkCopy class starts the BCP operation once the WriteToServer method is called. It accepts any object that implements IDataReader, a DataTable, or an array of DataRow objects. You can also pass a DataRowState value to specify that you want only "Modified" rows, or rows in some other specific RowState.
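
As a sketch, the call takes one of the following forms; the variable names here are illustrative, not taken from the project, and in practice you would call just one of these overloads.

Imports System.Data
Imports System.Data.SqlClient

Module WriteToServerSketch
    ' Illustrates the WriteToServer overloads; pick the one that matches your source.
    Sub CopyRows(ByVal bcp As SqlBulkCopy, ByVal sourceReader As IDataReader, _
                 ByVal sourceTable As DataTable, ByVal rowArray() As DataRow)
        bcp.WriteToServer(sourceReader)                         ' any IDataReader (Sql, OleDb, Odbc, custom)
        bcp.WriteToServer(sourceTable)                          ' an existing DataTable
        bcp.WriteToServer(sourceTable, DataRowState.Modified)   ' only rows in a specific RowState
        bcp.WriteToServer(rowArray)                             ' an array of DataRow objects
    End Sub
End Module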

Once initiated by WriteToServer, the SqlBulkCopy class fires the SqlRowsCopied event if you specify a NotifyAfter value. In this event handler, you can abort the operation by setting the e.Abort property to True. The problem I found is that it's tough to pass a form-level value into this routine, because those values aren't updated until the WriteToServer call completes. I expect this will take some fancy multi-threading footwork to implement.
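
Assuming the SqlBulkCopy instance is declared WithEvents at form level (as the Part 1 example later in this article does), the handler might look like this sketch; the field name bcp and the abort threshold are mine, not the project's.

' Assumes a form-level declaration such as: Private WithEvents bcp As SqlBulkCopy
Private Sub bcp_SqlRowsCopied(ByVal sender As Object, _
        ByVal e As SqlRowsCopiedEventArgs) Handles bcp.SqlRowsCopied
    Console.WriteLine("{0} rows copied so far...", e.RowsCopied)
    If e.RowsCopied > 1000000 Then
        e.Abort = True      ' cancels the remainder of the bulk copy
    End If
End Sub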

While the operation is running, the Connection State is set to busy—unless you've enabled MARS, in which case other commands are interleaved between the BCP operations. Once you're done, be sure to close the SqlBulkCopy instance. This frees the Connection for subsequent use.
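
As a sketch only, MARS is enabled through the connection string, and Close frees things up when the copy is finished; the connection string here is illustrative.

Imports System.Data.SqlClient

Module MarsCloseSketch
    Sub Main()
        Dim cn As New SqlConnection("Data Source=.;Initial Catalog=Biblio;" & _
            "Integrated Security=True;MultipleActiveResultSets=True")
        cn.Open()
        Dim bcp As New SqlBulkCopy(cn)
        ' ... configure the copy and call WriteToServer here ...
        bcp.Close()     ' frees the Connection for subsequent use
        cn.Close()
    End Sub
End Module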

Tips and Hints

You have to have your ducks in a row before you initiate the BCP operation using the SqlBulkCopy class. Here are a few tips:

  • Make sure you have access to the target table. This means you'll need to use an account that has both SELECT and INSERT rights. In many cases, I simply run a script to build the target table on the fly before I start loading it with data, but this (generally) requires DBO rights.
  • No, it's not a good idea to target an existing production table for reasons I've already mentioned. Import your data into a "work" table and run your validation routines against it. Once the data is pure, fold the data into the production table(s) using stored procedures.
  • To gain maximum efficiency, set your batch size fairly large (as in several hundred to several thousand rows) but not too large because it puts an additional burden on the server. I generally experiment with the BatchSize property to see which gives me the best performance and reliability.
  • Don't set the NotifyAfter property too low—again, you don't want to interrupt the processing any more than necessary.
  • If the SqlBulkCopy operation fails, it's probably a good idea to flush the connection pool, or simply disable pooled connections for this operation; I've seen reports of unusable connections once something goes wrong. (See the sketch after this list.)
  • Over the years, I've also heard of BCP operations crashing SQL Server. Although these reports have dwindled with more recent versions, you might be wise to test these bulk operations thoroughly before betting your company on them.
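
If you do decide to flush the pool after a failure, a minimal sketch might look like the following; SqlConnection.ClearPool discards the pooled connections associated with the given connection's connection string. The routine and variable names are illustrative.

Imports System.Data
Imports System.Data.SqlClient

Module PoolFlushSketch
    Sub BulkCopyWithPoolFlush(ByVal cn As SqlConnection, ByVal bcp As SqlBulkCopy, _
                              ByVal sourceReader As IDataReader)
        Try
            bcp.WriteToServer(sourceReader)
        Catch ex As SqlException
            SqlConnection.ClearPool(cn)     ' flush the pool this connection belongs to
            Throw
        End Try
    End Sub
End Module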

To Recap

Avoid any process that brings data to the client, processes it, and sends it back to the server—even with SqlBulkCopy. 99% of the time you can process the data far more efficiently on the server using TSQL or CLR stored procedures. Do you feel uncomfortable with these approaches? Perhaps they need to be added to your list of skills...

The Example

I have posted a full project to my Hitchhiker's Guide to Visual Studio and SQL Server (7th Edition) support site so registered readers can download the completed code. I'll provide the highlights here to get you started...

Part 1 of the example includes code to establish the SqlConnection to the target database, Biblio in this case (the sample database provided with my book). I then execute a SQL command batch (using a custom class) to create the target table from a TSQL script. Notice that the SqlBulkCopy object is declared "WithEvents" to expose the SqlRowsCopied event.
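
The complete listing is in the downloadable project; as a rough skeleton (with placeholder DDL, table name, and field names of my own choosing), Part 1 looks something like this:

Imports System.Data.SqlClient

Public Class BulkCopyForm
    ' Declared WithEvents so the SqlRowsCopied event can be handled with a Handles clause.
    Private WithEvents bcp As SqlBulkCopy
    Private cn As SqlConnection

    Private Sub SetupTarget()
        cn = New SqlConnection("Data Source=.;Initial Catalog=Biblio;Integrated Security=True")
        cn.Open()
        ' Build the target "work" table on the fly (requires sufficient rights).
        Dim ddl As String = _
            "IF OBJECT_ID('dbo.AuthorsImport') IS NOT NULL DROP TABLE dbo.AuthorsImport; " & _
            "CREATE TABLE dbo.AuthorsImport (Author varchar(50), Year_Born smallint)"
        Using cmd As New SqlCommand(ddl, cn)
            cmd.ExecuteNonQuery()
        End Using
        bcp = New SqlBulkCopy(cn)
        bcp.DestinationTableName = "dbo.AuthorsImport"
    End Sub
End Class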

Part 2 illustrates code to open a connection using the ODBC "text" provider and expose a DataReader on that connection. Sure, you could open a custom file yourself and parse it into a DataTable or an array of DataRows if necessary; the WriteToServer method accepts either.
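
Sketched out, Part 2 looks something like the following; the ODBC text-driver connection string, folder, and file name are illustrative and will vary with your setup.

Imports System.Data.Odbc
Imports System.Data.SqlClient

Module TextSourceSketch
    ' Reads a delimited text file through the ODBC text driver and streams it to the target.
    Sub CopyTextFile(ByVal bcp As SqlBulkCopy)
        Dim odbcCs As String = _
            "Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=C:\Data\;Extensions=asc,csv,tab,txt"
        Using odbcCn As New OdbcConnection(odbcCs)
            odbcCn.Open()
            Using cmd As New OdbcCommand("SELECT * FROM NewAuthors.csv", odbcCn)
                Using rdr As OdbcDataReader = cmd.ExecuteReader()
                    bcp.WriteToServer(rdr)      ' stream the rows straight into the target table
                End Using
            End Using
        End Using
    End Sub
End Module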


