Managed Extensions: Using the .NET OLE DB Classes to Read Text Files
Welcome to this week's installment of .NET Tips & Techniques! Each week, award-winning Architect and Lead Programmer Tom Archer demonstrates how to perform a practical .NET programming task using either C# or Managed C++ Extensions.
A previous .NET Tips & Techniques article illustrated how to access comma-delimited text files using the .NET ODBC classes. This week's tip covers performing the same task using the .NET OLE DB classes.
Note: If you're trying to decide which database-access technology to use—ODBC or OLE DB—I posted an article on my consulting Web site that compares ODBC to OLE DB.
Using the OLE DB ClassesMuch like any of the database-access classes provided by the .NET BCL (Base Class Library), the OLE DB classes that you'll use most often represent a connection to the data source (OleDbConnection), a command object that will be executed against the data source (OleDbCommand), and a reader that is used to enumerate the results of a command (OleDbDataReader). Therefore, the standard process for reading any type of file using OLE DB classes is the following:
- Instantiate an OleDbConnection object, passing to it a connection string that must include the name of the OLE DB provider for the underlying data source.
- Call the OleDbConnection::Open method.
- Instantiate an OleDbCommand object and specify to it the SQL that you wish to be executed against the data store.
- Call one of the OleDbCommand execute methods—ExecuteNonQuery, ExecuteReader,
- Used for a command that will not return a result set, but will return a value indicating the number of rows affected
- Returns a result set based on the executed command
- Returns only the first column of the first row of the result set, ignoring the balance of the result set
- In the case of using the OleDbCommand::ExecuteReader:
- A reader object is returned. It is enumerated by using the OleDbDataReader::Read method to read all rows returned (indicated by the OleDbDataReader::FieldCount property).
- For each row, the OleDbDataReader::GetName method is used to determine the current column name, and then you can either use the generic OleDbDataReader::GetValue method to get the actual data value or methods specific to the data being read—such as GetBytes, GetChar, GetDateTime, and so on.
Specifying the Connection StringWhen formatting an OLE DB connection string, be aware of the following three main parts:
- Provider—Always set this to the OLE DB provider that you will be using. In the case of accessing text files, this value is Provider=Microsoft.Jet.OLEDB.4.0;
- Data Source—When accessing text files, set this value to the fully qualified file name (e.g., Data Source=myData.txt;)
- Extended Properties—This is where things get a bit interesting. This parameter allows you to specify several properties that are specific to the provider being used. Specifically, you can specify things such as whether the file has a column row (as its first row) and how the data is formatted.
Here are some example Extended Properties settings:
No header row, file is delimited
Header row, file is delimited
The schema.ini FileSince the Jet OLE DB provider uses the Microsoft Text Driver to access text files, you can define a schema.ini file just as you would if you were using ODBC to access the data. This is covered in a previous article so I won't repeat all that here. However, bare in mind that if a conflict exists between what you specify in the Extended Properties part of the OLE DB connection string and a parameter you specify in the schema.ini file, the schema.ini file will take precedence. For example, if you specify HDR=No in the connection string, but specify ColNameHeader=TRUE in the schema.ini file, then the file will be parsed as if the first row is a column header.