http://www.developer.com/net/cplus/article.php/3407551/Managed-Extensions-Using-the-NET-ODBC-Classes-to-Read-Text-Data.htm
Welcome to this week's installment of .NET Tips & Techniques! Each week, award-winning Architect and Lead Programmer Tom Archer demonstrates how to perform a practical .NET programming task using either C# or Managed C++ Extensions.
While data for professional-caliber applications is most typically stored in traditional
database systems, sometimes the data your application must use is in text format. This includes situations where you are accessing a small amount of test data, as well as scenarios where another system is providing a text file for you to use and you have no control over its format. While most people naturally think of using streams to read and write text files, ODBC has for years provided a driver specifically for this purpose.
Why would you use the driver and incur the overhead of ODBC when you can easily stream the data? Well, for starters, ODBC provides a generic SQL-like interface to the data. Secondly, due to ODBC's generic interface, using it instead of directly accessing a file via a stream allows you to more easily use the same code to access data in other formats.
For example, let's say that the data your code will ultimately work with is stored in a traditional RDBMS (relational database management system), such as SQL Server or Oracle. However, you might want to test your application logic against a small amount of data that you can quickly enter into a text file via any editor (such as Notepad). Using ODBC, you would simply specify different DSNs or ODBC drivers (for a DSN-less connection) based on which file format you're using. That way, you wouldn't have to maintain two completely different code bases for accessing your data (one for streaming text files and one for reading from the RDMBS).
This article illustrates how easy it is to read text data using the .NET ODBC classes.
Note that when the Text Driver is used to make a connection to a given path (specified with the DBQ parameter), only files in that specific directory can be accessed—not files in any subdirectories.
By default, the text driver assumes that the data contains a column heading row. Therefore, if your data does not contain this row and you do not define a schema.ini file, you will find that the first row of data being ignored. For example, if your data looked like the following, the reader code above would display only the second and third records (leaving out your favorite author!):
data file
data file
data file
To download the accompanying source code for this article, click here. The founder of the Archer Consulting Group (ACG), Tom Archer has been the project lead on three award-winning applications and is a best-selling author of 10 programming books as well as countless magazine and online articles.
Managed Extensions: Using the .NET ODBC Classes to Read Text Data
September 14, 2004
Reading Data from a DSN-less Text File
The ODBC Desktop Drivers include a driver for reading text called the Microsoft Text Driver. The easiest way to access text data is to simply use the ODBC Admin application (odbcad32.exe) and—specifying the text driver—create a DSN against the desired text file. However, I'll show you the basic steps for using the .NET ODBC classes to access a text file, such that you don't have to perform the extra step of creating a DSN. (The complete code—including basic error handling and clean-up—can be found in the dialog class of this article's demo application.)
StringBuilder* connString = new StringBuilder();
connString->Append(S"Driver={Microsoft Text Driver (*.txt; *.csv)};");
connString->AppendFormat(S"DBQ={0};",
Path::GetDirectoryName(strFileName));
OdbcConnection* connection = new OdbcConnection(connString->ToString());
connection->Open();
CString strSelect;
strSelect.Format(_T("SELECT * FROM [%s]"),
Path::GetFileName(strFileName));
OdbcCommand* command = new OdbcCommand(strSelect, connection);
OdbcDataReader* reader = command->ExecuteReader();
while (reader->Read())
{
for (int iCurrCol = 0; iCurrCol < reader->FieldCount; iCurrCol++)
{
// retrieve values via the various ODBCReader methods—such
// as GetValue
AfxMessageBox((CString)(reader->GetValue(iCurrCol)->ToString()));
}
}
Taking Control of the Process with the schema.ini File
Once you've started working with text files via a DSN-less connection, you might run into situations that will have you asking things like "How do I specify how the file is delimited (e.g., tab vs. comma)?" or "Where can I specify the character set?" These settings and more can be specified via a very simple file named schema.ini that resides in the same directory as the data file. The schema.ini file
is documented on the Microsoft Web site so I won't attempt to cover every possible parameter that can be specified. However, I will cover the most popular question I see on the Internet: how to specify if the data includes (as its first row) the column names of the data.
Tom Archer,Archer Consulting Group
Bradley Jones,Jupitermedia
Bill Gates,Microsoft
In order to specify that the data does not include a column row and that you don't wish to
name the columns, your schema.ini file would look like the following:
[data.txt]
ColNameHeader=FALSE
In terms of specifying the column names for your data, you have two choice:
The column name can be retrieved from the reader using the OdbcDataReader::GetName method. This article's demo application uses the last technique and—while being very simple in scope—allows you to tinker with your data file and schema.ini file so that you can easily test the various configuration combinations until you get it right for your particular application.
Name,Company
Tom Archer,Archer Consulting Group
Bradley Jones,Jupitermedia
Bill Gates,Microsoft
schema.ini file
[data.txt]
ColNameHeader=TRUE
Tom Archer,Archer Consulting Group
Bradley Jones,Jupitermedia
Bill Gates,Microsoft
schema.ini file
[data.txt]
ColNameHeader=FALSE
Col1=Name Char Width 255
Col2=Company Char Width 255
Download the Code
About the Author