In the concluding article of this series, we will discuss transaction support in SQL Server 2000 Data Transformation Services. Transactions within DTS packages serve the same purpose as in other SQL Server-based operations - they provide the set of features described collectively by the term ACID, an acronym derived from four transactional characteristics: atomicity, consistency, isolation, and durability. Atomicity means that a particular unit of work is either fully completed or not performed at all, consistency refers to the valid state of the underlying data before and after a transaction, isolation specifies that each unit of work executes independently of others, and durability reflects the permanent status of data changes.
By default, each task performs its operations independently of others within the same DTS package (from the transactional point of view). If an error is encountered, execution of the current task is abandoned. You can use settings in the Error handling section on the Logging tab of the Package Properties dialog box ("Fail package on the first error") to specify whether the entire package fails at this point, or control execution of subsequent tasks with workflow elements (which determine whether the next task executes on success, failure, or completion); however, neither of these methods has any influence on operations completed prior to the failure. To undo such operations as well, you need to configure all related tasks to participate in the same transaction. This way, you can ensure that either all of them complete successfully or all roll back to the original state if any one fails.
At the same time, however, you need to be aware that a package can execute only a single transaction at a time and that not every package is capable of properly handling transactions. The limitations are imposed by the types of connections and tasks employed within a package:
- Connections rely on their data provider for transactional support. This support is inherent to the Microsoft OLE DB Provider for SQL Server, ODBC data sources (as long as the corresponding ODBC driver implements and sets the SQL_ATTR_ENLIST_IN_DTC connection attribute), and Microsoft Data Link (as long as the corresponding OLE DB provider implements the ITransactionJoin interface). You can find more detailed information on the subject in the ODBC and OLE DB documentation.
- Tasks can be grouped in three categories based on their level of transactional support. The first one consists of tasks that are not transaction-aware, by virtue of the way they operate, such as the FTP, Dynamic Properties, Send Mail, and Copy SQL Server Objects tasks. Tasks in the second category (ActiveX Script and Execute Process) can create and manage transactions internally (as part of a script or a process), but cannot participate in DTS transactions involving other tasks. Finally, the third category includes tasks that fully support transactions - Bulk Insert, Data Driven Query, Transform Data, Execute Package, Execute SQL, and Message Queue.
Since DTS relies on the Microsoft Distributed Transaction Coordinator (MS DTC) when processing its transactions, you should ensure that the MS DTC service is running before you launch a package. In addition, you need to consider several configuration options that affect transactional behavior of a package and its tasks (tasks represent units of execution and are frequently referred to as steps, when discussing their role in DTS-based transactions):
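Before launching a package, you can confirm the state of the MS DTC service directly from Query Analyzer. The sketch below relies on the undocumented xp_servicecontrol extended stored procedure (assumed to be present on a default SQL Server 2000 installation), so treat it as a convenience check rather than a supported interface:

```sql
-- Query the state of the Distributed Transaction Coordinator service
EXEC master..xp_servicecontrol 'QueryState', 'msdtc'
-- If the service is reported as stopped, it can be started the same way:
-- EXEC master..xp_servicecontrol 'Start', 'msdtc'
```

Alternatively, you can run net start msdtc from a command prompt or start the Distributed Transaction Coordinator service from the Services applet.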
- "Use transactions" checkbox on the Advanced tab of the Package Properties dialog box is required to turn on transaction functionality within the package. By default, this setting is enabled.
- "Commit on successful package completion" checkbox on the Advanced tab of the Package Properties dialog box commits any transaction that is still open when the package completes. This does not affect in any way transactions that have already been committed (as part of task-level settings).
- "Transaction isolation level" drop-down list on the Advanced tab of the Package Properties dialog box determines (as you can expect) the transaction isolation level (which directly affects the locking mechanism) and can take one of the following values:
- "Read Committed" - default setting for transaction isolation, which indicates that changes caused by other transactions executing in parallel with the current one are not visible until these transactions commit.
- "Read Uncommitted" - allows viewing uncommitted changes caused by other transactions executing in parallel with the current one.
- "Repeatable Read" - ensures that data being read by the current transaction stays consistent until this transaction commits.
- "Serializable" - the most restrictive isolation level, ensuring complete independence of the current transaction (highest locking level).
- "Chaos" - like "Read Uncommitted," permits viewing uncommitted changes, but without holding update locks until the end of the transaction. Due to its lack of support on the SQL Server platform and its inability to roll back, it is rarely used.
The less restrictive options (read uncommitted and chaos) improve concurrency (the ability to execute multiple transactions simultaneously) but increase the possibility of inconsistent reads (such as non-repeatable reads, dirty reads, and phantom reads), while the more restrictive options (read committed, repeatable read, or serializable) have the opposite effect.
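For a SQL Server connection, these settings correspond to the Transact-SQL isolation levels, which you can experiment with directly. A minimal sketch (run against the sample Northwind database) illustrating the two ends of the trade-off described above:

```sql
-- With READ UNCOMMITTED, the SELECT can read rows modified by other,
-- still-uncommitted transactions (a "dirty read"), but does not block on their locks
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
BEGIN TRANSACTION
SELECT ShipperID, CompanyName FROM Northwind.dbo.Shippers
COMMIT TRANSACTION

-- With SERIALIZABLE, locks are held until commit, preventing dirty,
-- non-repeatable, and phantom reads at the cost of concurrency
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
SELECT ShipperID, CompanyName FROM Northwind.dbo.Shippers
COMMIT TRANSACTION
```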
- "Join transaction if present" checkbox on the Options tab of the Workflow Properties dialog box (accessible by selecting the Workflow -> Workflow Properties option from a task's context-sensitive menu) causes a task to be included in a transaction, provided one is already in progress and has not yet committed. Otherwise, executing the task starts a new transaction.
- "Commit transaction on successful completion of this step" checkbox on the Options tab of the Workflow Properties dialog box triggers a transaction commit when the task completes successfully (assuming that the "Use transactions" checkbox on the package level is selected).
- "Rollback transaction on failure" checkbox on the Options tab of the Workflow Properties dialog box triggers a rollback of the transaction whenever the task fails (again, assuming that the "Use transactions" checkbox on the package level is selected).
Now let's look at a very simple example illustrating the use of transactions in DTS packages. We will create two tasks, configure them as part of the same transaction and examine the results. As our data stores, we will use a text file containing a single entry:
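Assuming, for illustration, a semicolon-delimited row matching the three Shippers columns (ShipperID, CompanyName, Phone) - the exact values below are arbitrary placeholders:

```
4;Speedy Delivery;(503) 555-9831
```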
and the Shippers table in the Northwind database. Start by creating the text file (e.g. C:\Temp\Shippers.txt) and copying the line listed above into it. Next, launch DTS Designer and insert a Microsoft OLE DB Provider for SQL Server connection pointing to the Northwind database into a new package. From the Task menu (or the toolbar on the left-hand side), select Bulk Insert Task and configure its properties by typing a description (e.g. BULK INSERT Shippers), selecting the existing Northwind connection from the Existing connection listbox, pointing to Shippers as the destination table, and specifying C:\Temp\Shippers.txt as your source data file. Ensure that a semicolon is used as the column delimiter. Close the Bulk Insert Task Properties dialog box, right-click on the task icon, and select the Workflow -> Workflow Properties context-sensitive menu option. In the Workflow Properties dialog box, switch to the Options tab and select the "Join transaction if present" checkbox. This will trigger the creation of a new transaction when the task is launched. Close the dialog box. Create the second task by selecting Execute SQL Task from the Task menu. In the Task Properties dialog box, type in a Description (e.g. "INSERT ShipperID 1"), select Northwind in the Existing connection listbox, and enter the following SQL statement:
SET IDENTITY_INSERT Northwind.dbo.Shippers ON
GO
INSERT INTO Shippers (ShipperID, CompanyName, Phone)
VALUES (1, 'Road Runner', '(503) 555-4567')
GO
SET IDENTITY_INSERT Northwind.dbo.Shippers OFF
GO
This statement will attempt to insert a duplicate row into the Shippers table, which, of course, will result in failure. Since we are planning to run both tasks as part of the same transaction, the bulk insert changes will be rolled back and the content of the Shippers table will remain the same as prior to package execution. To complete our configuration, bring up the Workflow Properties dialog box for the Execute SQL task, select the "Join transaction if present" and "Rollback transaction on failure" checkboxes, and confirm the configuration by clicking on the OK command button. From the DTS Designer interface, insert the "On Success" workflow element between the Bulk Insert and Execute SQL tasks. Save and execute the package. In the Executing DTS Package dialog box, you should be able to see the progress of the execution, indicating success of the first task and failure of the second. Finally, check the content of the Shippers table and verify that it has not changed.
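A quick way to perform that final check is to run a query such as the following from Query Analyzer; on an unmodified Northwind database, it should still return only the three original rows (ShipperID values 1 through 3), confirming that the bulk insert was rolled back:

```sql
-- Verify that the rolled-back bulk insert left no trace
SELECT ShipperID, CompanyName, Phone
FROM Northwind.dbo.Shippers
ORDER BY ShipperID
```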
You can experiment with additional configuration options (e.g. by setting "Commit transaction on successful completion of this step" for the first task) to further explore the transactional capabilities built into DTS. You can also refer to Books Online for a discussion of more elaborate package designs, such as checkpointing package transactions, inherited transactions, or parallel execution.
Even though this article concludes our introductory overview of SQL Server 2000 Data Transformation Services, we will revisit this subject when discussing DTS Security in our upcoming SQL Server 2000 Security series.