Using SQL Server 2008 Change Data, Page 3
Understanding Change Data CaptureNow that you understand the basics of Change Tracking, you can start learning about the more advanced feature: Change Data Capture or CDC. As you have learned, Change Data Capture is able to tell which records have changed in your tables, but unlike Change Tracking, it can also tell how these columns changed.
The get started with Change Data Capture, you first need to enable it for both your database and table(s). Enabling must be done by executing a stored procedure; you cannot enable Change Data Capture from SQL Server Management Studio's property dialog boxes. Instead, the command you need to execute is:
Executing this command takes a little while, as SQL Server needs to create and prepare several system tables. The following is a list of the tables that are being created:
cdc.captured_columns cdc.change_tables cdc.ddl_history cdc.index_columns cdc.lsn_time_mapping dbo.systranschemas
As you can guess, you do have the possibility to query the contents of these tables directly, but since both the contents and the schemas are designed for internal use, the data in these system tables might change once the next service pack comes along. Thus, the better advice is to use the CDC functions to query these tables just like with Change Tracking data.
After enabling support for the database, the next step is to enable Change Data Capture for a particular table. This can be done by executing another SQL stored procedure, giving as a parameter name the table name for which you wish to enable monitoring. This could go like this for a table named MyTable in the schema dbo:
EXEC sys.sp_cdc_enable_table @source_schema = N'dbo', @source_name = N'MyTable', @role_name = N'AdminRoleOrUser', @supports_net_changes = 1
Here, the sys.sp_cdc_enable_table stored procedure prepares a table for Change Data Capture. The source schema and name parameters indicate which schema and table are to be enabled, and the role name parameter specifies who can read the collected records as it could contain sensitive information. Finally, the "net change support" feature allows you to query combined changes up to certain point in time. The alternative is "all changes support" which means that you will get an entry each time a change has been made. This means that "all changes" is better suited for keeping a log.
So far, you have learned that Change Data Capture works on a similar basis than the simpler Change Tracking: you must enable the feature both on the database level and individual table level. However, Change Data Capture requires you to perform one additional step: you need to enable a capture job for your database. This capture job is a regular SQL Agent job, which you can enable and disable at will. To enable the job for your database, you would need to execute an SQL statement similar to the following:
EXEC sys.sp_cdc_start_job @job_type = N'capture'
When you execute this statement, you might see three or four informational messages that the SQL Server Agent is not running and cannot be contacted. Luckily, this is easy to fix: you just need to start the agent service. Note however that the Change Data Capture collects data even without the agent, but then you cannot use the ready-made functions. The best option then is to enable SQL Server Agent before enabling the job.
Understanding capture instancesWhen reading the SQL Server documentation about Change Data Capture, it often refers to the term capture instance. This refers to a particular instance of collected data, and since you can only enable Change Data Capture on table-basis, each capture instance corresponds to a table in your database.
It is also important to understand how capture instances are named. By default, the capture instance name is formed from the schema and the table name being monitored with an underscore. For example, if you have the table called dbo.MyTable, then the capture instance name will be dbo_MyTable. And just as with table and column names in SQL Server, capture instance names are not case sensitive.
When you enable Change Data Capture for a specific table, SQL Server creates a special table to store the changes made to the original table. This table contains the same columns as the source table and several housekeeping columns. In addition to this internal change table, SQL Server also creates a special function for it. This function is then able retrieve the changes given a start and end time. For example, if you would like to retrieve all changes made to the table, you would use a function beginning with the name cdc.fn_cdc_get_all_changes. Since this stored procedure (and its sister cdc.fn_cdc_get_net_changes, if the source table is enabled for net changes) is created specifically for a given table, the capture instance name is appended to the function name. For instance, the table MyTable would have the following function name:
The function expects three parameters, a from LSN number, a to LSN number, and a row filtering option. Here, LSN stands for Log Sequence Number which is a unique identifier for each transaction. It is defined as binary(10). You can use the functions sys.fn_cdc_get_min_lsn and sys.fn_cdc_get_max_lsn to retrieve the minimum and maximum LSN numbers, respectively.