SQL Server 2005 Integration Services (SSIS) is a new Extract, Transform, and Load (ETL) tool that ships with SQL Server 2005. As with many of Microsoft’s development tools, you can extend SSIS to perform other operations not already bundled with the product. SSIS provides the SSIS Class Library for just this purpose. The SSIS Class Library Object Model includes all the classes and interfaces you need to extend or augment SSIS.
The best way to learn the capabilities of a new object model is to study the key points-of-interest in the object model. SSIS lends itself well to the approach. The sample program included with this article (ExploreIS) navigates the key points-of-interest in the SSIS Class Library. The following sections walk you through ExploreIS to introduce key classes in the SSIS Object model.
One caveat before delving into the details: The samples were developed using the June 2005 CTP of SQL Server 2005 and Visual Studio 2005.
A Primer for New SSIS Developers
Although this article assumes you’re experienced with SSIS, an introduction to SSIS development is beyond its scope. If you are new to SSIS, read the Developer.com article “SQL Server Integration Services: SQL Server 2005’s New ETL Platform” to gain an understanding of SSIS development.
Now, you’re ready to begin learning about the SSIS object model. The first stop is an overview of ExploreIS.
ExploreIS Under the Hood
ExploreIS is a Visual Studio 2005 Form Desktop application that operates in a straightforward manner. When you pass the path of the Integration Services file (.dtsx) to the program, it navigates the DTSX file and prints information to a listbox on the main Form.
For demonstration, the application has been hard-coded to explore the Data Lineage sample thath ships with the SQL Server 2005 SDK code samples. The Data Lineage sample reads data from a text file, adds some lineage information to the data, and writes the data to the SQL Server 2005 AdventureWorks sample database. Figure 1 shows a view of the package as seen from the Package Explorer tab.
Figure 1. A View of the Data Lineage Package from the Package Explorer Tab
ExploreIS renders some of the contents of the Data Lineage package in a hierarchical fashion (see Figure 2). Thus, items at the lower levels are indented below the containing object.
Figure 2. Results Returned by ExploreIS
Now that you understand the functionality of the sample code and have been introduced to the sample Data Lineage package, you can move on to the SSIS Object Model.
TaskHosts, Tasks, and Packages
When you develop a SSIS package, you drag, drop, and configure separate items from the Task toolbar into your project. Each Task component you drop into you project really consists of two primary classes, a TaskHost class and a Task class, along with other support classes.
The role each class plays in SSIS is partitioned as follows:
- TaskHost handles the behavior of the component in the SSIS environment. TaskHost includes event-handling functions, package variables, and logging.
- Task handles the execution of the component and may include other specialized classes called Data Flow Components.
The Data Lineage Sample contains two tasks, a Data Flow Task and an Execute SQL Task. The Package class maintains all parts of the SSIS package, including events, variables, and tasks. The Package class maintains TaskHosts (along with other classes) in a collection called Executables.
As you can see, the TaskHosts and Tasks have very specialized functions. The functionality of the Data Flow Task in SSIS is further subdivided into classes inheriting from the class PipelineComponent.
Pipeline Components are the muscles of the SSIS Data Flow Task. Pipeline Components work with ConnectionManager objects to move and transform all types of data. (A discussion of the ConnectionManager object will come later.)
In the Data Lineage Sample, you configure PipelineComponents on the Data Flow tab in the SSIS development environment. The Data Lineage Sample uses three PipelineComponents:
- A source component called Flat File Source
- A transformation component called Logged Lineage
- A destination component called OLE DB Destination
Note that the Data Flow Task appears to be implemented with unmanaged code. As you will see later in the article, a number of wrapper classes control access to this underlying unmanaged code. Because the underlying COM is hidden, you don’t need to be a COM expert. Being aware of the underlying infrastructure will be useful should you encounter any difficulties using the SSIS Data Flow Task.
Now that you’ve been introduced to TaskHosts, Tasks, and PipelineComponents, you’re ready to move on to the ConnectionManager.
ConnectionManagers are the classes underlying the Connections you see in the SSIS development environment. The Data Lineage sample package contains two Connection Managers: a SQL Server connection to the AdventureWorks database and a connection to the source data files.
ConnectionManagers are maintained in the Package class in a collection called Connections.
Now, you’re finally ready to delve into the details of the sample program.
Sample Program Classes
As discussed previously, ExploreIS reads some of the contents of a SSIS package. ExploreIS contains three key classes with the following roles:
- DTSPackageWrapper abstracts the navigation of the package class through a ReadNextContent function.
- DTSPackageContent reads the contents of Tasks, TaskHosts, ConnectionManagers, and Data Flow Components. Its overload constructor accepts Executable objects and ConnectionManager objects.
- DTSPackageContentItem stores what the DTSPackageContent class reads.
When ExploreIS executes, DTSPackageWrapper uses the SSIS Package class to open the Package and iterate over the Executable and Connections collections.
The heart of ExploreIS lies in the DTSPackageContent functions LoadExecutableContent and LoadConnectionContent. LoadExecutableContent and LoadConnectionContent work in similar ways, but LoadExecutableContent is much more complicated. The remainder of the article focuses on the LoadExecutableContent function.
Finding the Task
The SSIS Package object’s Executables collection can contain TaskHost as well as other classes that represent other items in a SSIS package. When the DTSPackageContent object is initialized using the Executable constructor, LoadExecutableContent executes in response to the LoadContentItems function call.
The following code snippet is from the LoadExecutableContents function:
private void LoadExecutableContents()
WriteContentItemToCollection(“*-” + _exec.ToString());
if (_exec is TaskHost)
th = _exec as TaskHost;
WriteContentItemToCollection(“**” + th.Name);
if (th.InnerObject is MainPipe)
mp = th.InnerObject as MainPipe;
As discussed previously, TaskHost contains the Task object. ExploreIS outputs the name of the SSIS Task. Data Flow Tasks get special treatment. The MainPipe class encapsulates the functionality of the Data Flow Task in the SSIS object library.
Now, you can turn to the useful collections in the MainPipe class.
Iterating Through the Data Flow Task PipelineComponents
As stated previously, the Data Flow Task is implemented by using unmanaged code and accessed through the MainPipe class. The following code snippet illustrates how information about the PipelineCompents setup in the Data Flow Task can be accessed by using collections in the MainPipe class:
mp = th.InnerObject as MainPipe;
WriteContentItemToCollection(“- MainPipe=” + mp.ToString());
foreach (IDTSPath90 pat in mp.PathCollection)
WriteContentItemToCollection(“– Path90=” + pat.Name);
//The foreach below accesses each component in the pipeline
foreach (IDTSComponentMetaData90 md in
Putting the Object Model to Use
Certainly, if you want to extend SSIS, you must understand the object model. However, the object model does have other uses. For example, one of the motivations for developing ExploreIS was to investigate how to use the object model to extract information for the Meta-data Repository in a third-party tool.
Also, many of the SSIS samples included in SQL Server 2005 show you how to create SSIS packages programmatically. You can use the SSIS object model to make SSIS part of a separate custom application. For example, you could build a tool that executes SSIS without a DTSX file.
Many other classes may be of further use to you. SQL Server 2005 Books Online includes more suggestions and it documents the Object Model in further detail.
No Class Knowledge Required
You don’t have to understand all the classes in an object model to gauge how you should interact with it. ExploreIS illustrates how you can read a SQL Server Integration Services Package using a handful of key classes.
Download the Code
To download the accompanying source code for the examples, click here.