Some applications are designed to pull data from a central database into a local cache. For example, on-the-road sales personnel only connect occasionally to the central database to get the latest inventory data. Or heavy processing of business logic is offloaded from the production database server, processed locally and sent back to the production database. These applications need a way to track data changes in order to know what data has been changed. Before SQL Server 2008, application developers had to implement custom tracking solutions using triggers or timestamp columns, and creat additional tables in order to track data changes. As we all know, triggers are expensive. Each table involved in the DML operation is checked recursively via an internal function for the presence of a trigger. In addition, because triggers are executed as a part of the transactions that cause them to be invoked, the transactions take longer to commit and introduce complicated blocking issues. SQL Server 2008 provides a new feature, Change Tracking. Change tracking is great for building one-way or two-way synchronization applications and is designed to work with Sync Services for ADO.NET. Application developers can use Change Tracking to synchronize any data between SQL Server databases, or even between SQL Server and non-SQL Server databases. Compared with replication, Change Tracking is more for developers than DBAs as it provides developers with a flexible foundation to build synchronization applications with .NET, but lacks stored procedure support or built-in monitoring tools like Replication Monitor.
Enabling Change Tracking on an existing table does NOT require any changes on the table schema. The only requirement is that the table must already have a primary key. Change tracking information is recorded synchronously on transaction commit time so it presents an accurate sequence of DML operations. Although Change Tracking is executed synchronously with transactions, its performance overhead is very light compared to triggers. Besides, it captures only the values of the primary key columns of the changed rows and records the values in the change tracking table. These values can be joined with the base table to get the changed data. There is less storage overhead compared with Change Data Capture (for more information, see MAKs CDC series) because non-key columns are not captured in the change tracking table. However, because only net changes, not intermediate changes, to data rows can be queried, Change Tracking is not suitable for auditing.
In this article, we will demonstrate how to set up Change Tracking on the table HumanResources.Department in the database AdventureWorks2008 and how to query the changed data. If you havent installed the AdventureWorks2008 sample database, you can download it from http://msftdbprodsamples.codeplex.com/Release/ProjectReleases.aspx?ReleaseId=18407
Before you enable change tracking on a table, you need to enable change tracking for the database. Execute the following command to enable Change Tracking on the AdventureWorks2008 database.
ALTER DATABASE AdventureWorks2008 SET CHANGE_TRACKING = ON (AUTO_CLEANUP=ON, CHANGE_RETENTION=8 hours) GO
When this command runs, it also enables automatic cleanup of change tracking information as the AUTO_CLEANUP option is turned on. The retention period is set to at least 8 hours. An internal task runs every 30 minutes to remove old transactions. SQL Server will also begin to populate an internal system table sys.syscommittab with transaction information. This table will have one row for every transaction that causes data modifications on tracked tables in the database. Although this table cannot be viewed in non-DAC connections, the information in the table is exposed by a DMV, sys.dm_tran_commit_table. Here are the columns in the DMV.
- commit_ts: A monotonically increasing number that serves as a database-specific timestamp for each committed transaction.
- xdes_id: A database-specific internal ID for the transaction.
- commit_lbn: The number of the log block that contains the commit log record for the transaction.
- commit_csn: The instance-specific commit sequence number for the transaction.
- commit_time: The time when the transaction was committed.
If you never enabled change tracking on the database, then the DMV is empty.
select * from sys.dm_tran_commit_table
To enable Change Tracking on the HumanResources.Department table, run the following command.
ALTER TABLE HumanResources.Department ENABLE CHANGE_TRACKING GO
This command creates an internal table that is used to record changes made to the HumanResources.Department table. The table name is change_tracking_[tableObjectID]. Since it is an internal table, it can only be viewed in DAC connection. In our example, the object ID of the HumanResources.Department table is 757577737, therefore, the table name is change_tracking_757577737. The table is empty as shown below because we havent made any modifications to the HumanResources.Department table.
Here are the columns of the table.
- sys_change_xdes_id: Transaction ID of the transaction that modified the row.
- sys_change_xdes_id_seq: Sequence identifier for the operation within the transaction.
- sys_change_operation: Type of operation that affected the row: insert, update, or delete.
- sys_change_columns: List of which columns were modified (used for updates, only if column tracking is enabled).
- sys_change_context: Application-specific context information provided during the DML operation using the WITH CHANGE_ TRACKING_CONTEXT option.
- k_[name]_[ord]: Primary key column(s) from the target table. [name] is the name of the primary key column, [ord] is the ordinal position in the key, and [type] is the data type of the column.
Because the HumanResources.Department table only has one column, DepartmentID, in the primary key, there is only one primary key column k_DepartmentID_00000001 in the change tracking table.
The sys_change_columns column can be used to track which columns were modified for each update operation. (Insert and delete statements always change all of the columns so this column is always NULL for inserts and deletes). By enabling column tracking on the table, you can get only the data from the columns that have been updated. This will limit the amount of data returned and transferred through the network. This will also make consolidation of incremental data changes more efficient as big size columns, such as varbinary(max) and xml, are only returned when they have been updated. To enable column tracking, set the TRACK_COLUMNS_UPDATED option.
ALTER TABLE HumanResources.Department ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON)
You will see an example of using column tracking in the next article.
You can also use the sys_change_context column to track the context in which changes to the table were made. The context is provided by the client that issues DML statements. It can be a constant, such as an application ID. A sample update statement is as follows.
DECLARE @originator_id varbinary(128) SET @originator_id = CAST('MyApplication' AS varbinary(128)) WITH CHANGE_TRACKING_CONTEXT (@originator_id) UPDATE HumanResources.Department SET GroupName='Accounting' WHERE DepartmentID=10
If you want to disable Change Tracking on a database, you will need to first disable all change-tracked tables in the database. You can look for the list of such tables from the sys.change_tracking_tables catalog view. Here is a simple SQL statement to generate all the ALTER TABLE statements.
SELECT 'ALTER TABLE ' + object_name(object_id) + ' DISABLE CHANGE_TRACKING;' FROM sys.change_tracking_tables
After you disabled Change Tracking for the tables, run the following command to turn off Change Tracking in the database.
ALTER DATABASE AdventureWorks2008 SET CHANGE_TRACKING = OFF
This article introduced Change Tracking, and illustrated how to enable Change Tracking on database, table and turn on various options, for example, AUTO_CLEANUP, TRACK_COLUMNS_UPDATED. In the next article, we will show you how to obtain data changes from change-tracked tables.