Importing Data into SQL Server, Page 2
The SqlBulkCopy Class Properties
Table 2 lists the SqlBulkCopy properties that further control how the BCP operation is handled, including how much feedback it returns while running:
Table 2: The SqlBulkCopy Properties
| Property | Default | Description |
|---|---|---|
| BatchSize | 0 | (Int) How many rows are sent per round-trip to SQL Server; 0 sends all rows in a single batch. Tune for best performance. I recommend starting at around 500 rows per batch. |
| BulkCopyTimeout | 30 | (Int) How many seconds the application should wait for the operation to complete. Set to about 1.5x the normal operation length. |
| ColumnMappings | Empty | (SqlBulkCopyColumnMappingCollection) Maps input data columns to output target table columns. Not required if the source and target column names are the same. |
| DestinationTableName | Required | (String) Addresses the target table in the Connection-specified database. Syntax: database.schema.tablename. |
| NotifyAfter | 0 | (Int) Fires the SqlRowsCopied event after every N rows are processed. For best performance, leave this unset or use a fairly high value so you're notified no more often than once every 5 seconds or so. |
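These properties can be sketched in code roughly as follows. This is a minimal illustration, not the article's downloadable project; the connection string, table, and column names are assumptions:

```csharp
// Sketch: configuring the key SqlBulkCopy properties before the copy starts.
// "connectionString" and "sourceReader" are assumed to exist already.
using (SqlBulkCopy bcp = new SqlBulkCopy(connectionString))
{
    bcp.DestinationTableName = "Biblio.dbo.ImportWork"; // database.schema.table
    bcp.BatchSize = 500;        // rows per round-trip; tune for throughput
    bcp.BulkCopyTimeout = 90;   // seconds; ~1.5x the normal run time

    // Only needed when source and target column names differ (names assumed here)
    bcp.ColumnMappings.Add("au_id", "AuthorId");
    bcp.ColumnMappings.Add("au_name", "AuthorName");

    bcp.WriteToServer(sourceReader); // any IDataReader will do
}
```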
Executing the WriteToServer Method
The SqlBulkCopy class starts the BCP operation once the WriteToServer method is called. It accepts any object that implements IDataReader, a DataTable, or an array of DataRows. You can also pass in a DataRowState value to specify that you only want "Modified" rows or rows in some other specific RowState.
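The accepted source types look like this in practice (a sketch, assuming `bcp`, `myDataTable`, and `myDataReader` are already set up):

```csharp
// Sketch: the WriteToServer overloads accept several source shapes.
bcp.WriteToServer(myDataTable);                         // all rows in the table
bcp.WriteToServer(myDataTable, DataRowState.Modified);  // only Modified rows
bcp.WriteToServer(myDataReader);                        // rows streamed from any IDataReader
```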
Once initiated by WriteToServer, the SqlBulkCopy class fires the SqlRowsCopied event if you specify a NotifyAfter value. In this event handler, you can abort the operation by setting the e.Abort property to True. The problem I found is that it's tough to pass a form-level value into this routine, as those values are not updated until the WriteToServer thread completes. I expect this will take some fancy multi-threading footwork to implement.
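A minimal handler might look like this. The `cancelRequested` flag is a hypothetical shared value you would set elsewhere, which is exactly where the cross-thread wrinkle described above comes in:

```csharp
// Sketch: reporting progress and optionally aborting via SqlRowsCopied.
bcp.NotifyAfter = 10000;  // don't fire too often; see the tips below
bcp.SqlRowsCopied += (sender, e) =>
{
    Console.WriteLine("{0} rows copied so far.", e.RowsCopied);
    if (cancelRequested)    // hypothetical flag set from another thread
        e.Abort = true;     // stops the bulk copy operation
};
```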
While the operation is running, the Connection is busy—unless you've enabled MARS, in which case other commands can be interleaved between the BCP operations. Once you're done, be sure to close the SqlBulkCopy instance. This frees the Connection for subsequent use.
Tips and Hints
You have to have your ducks in a row before you initiate the BCP operation using the SqlBulkCopy class. Here are a few tips:
- Make sure you have access to the target table. This means you'll need to use an account that has both SELECT and INSERT rights. In many cases, I simply run a script to build the target table on the fly before I start loading it with data, but this (generally) requires DBO rights.
- No, it's not a good idea to target an existing production table for reasons I've already mentioned. Import your data into a "work" table and run your validation routines against it. Once the data is pure, fold the data into the production table(s) using stored procedures.
- To gain maximum efficiency, set your batch size fairly large (as in several hundred to several thousand rows) but not too large because it puts an additional burden on the server. I generally experiment with the BatchSize property to see which gives me the best performance and reliability.
- Don't set the NotifyAfter property too low—again, you don't want to interrupt the processing any more than necessary.
- If the SqlBulkCopy operation fails, it's probably a good idea to flush the connection pool—or simply disable pooled connections for this operation. I've seen reports of unusable connections once something goes wrong.
- Over the years, I've also heard of BCP operations crashing SQL Server. Although these reports have dwindled with more recent versions, you might be wise to test these bulk operations thoroughly before betting your company on them.
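The connection-pool tip above can be sketched as follows. This is one possible recovery pattern, assuming `bcp`, `sourceReader`, and an open SqlConnection `cn`:

```csharp
// Sketch: flushing the pool after a failed bulk copy so later code
// doesn't pick up a possibly-unusable pooled connection.
try
{
    bcp.WriteToServer(sourceReader);
}
catch (Exception ex)
{
    Console.WriteLine("Bulk copy failed: " + ex.Message);
    SqlConnection.ClearPool(cn);  // discard pooled connections for this string
    // Alternatively, add "Pooling=false;" to the connection string
    // for this operation and skip the pool entirely.
}
```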
Avoid any process that brings data to the client, processes it, and sends it back to the server—even with SqlBulkCopy. 99% of the time you can process the data far more efficiently on the server using TSQL or CLR stored procedures. Do you feel uncomfortable with these approaches? Perhaps they need to be added to your list of skills...
I have posted a full project to my Hitchhiker's Guide to Visual Studio and SQL Server (7th Edition) support site so registered readers can download the completed code. I'll provide the highlights here to get you started...
Part 1 of the example includes code to establish the SqlConnection to the target database—Biblio in this case (the sample database provided with my book). I then execute a SqlCommand batch (using a custom class) to create the target database table using a TSQL script. Notice that the SqlBulkCopy object is declared "WithEvents" to expose the SqlRowsCopied event.
Part 2 illustrates code to open a connection using the ODBC text driver and expose a DataReader on this connection. Sure, you could have opened the file yourself and parsed it into a DataTable or an array of DataRows if necessary—the WriteToServer method will accept either.
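The ODBC text-driver approach looks roughly like this. The folder and file names are assumptions for illustration (a schema.ini file in the same folder can describe the column names and types for the driver):

```csharp
// Sketch: exposing a delimited text file as an IDataReader via the
// Microsoft Text Driver, then streaming it straight into SQL Server.
using (OdbcConnection cnText = new OdbcConnection(
    @"Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=C:\ImportFiles\;"))
{
    cnText.Open();
    OdbcCommand cmd = new OdbcCommand("SELECT * FROM [Authors.csv]", cnText);
    using (OdbcDataReader rdr = cmd.ExecuteReader())
    {
        bcp.WriteToServer(rdr);  // rows flow from the file to the target table
    }
}
```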