SQL Server 2000 DTS Part 1

Tuesday Oct 14th 2003 by Marcin Policht

As an introduction to his new series, 'SQL Server 2000 DTS,' Marcin Polict provides an overview of the main concepts of this technology.

As a database administrator, besides manipulation of the data residing in SQL Server 2000 databases, you are frequently faced with tasks that deal with transfer of data between diverse data sources. Frequently, such data needs to be also transformed or altered during the transfer. In the past (in SQL 6.5 and earlier), such issues presented a significant challenge and required substantial investment in development time. Data Transformation Services (DTS), introduced in SQL Server 7.0 and further improved in SQL Server 2000 drastically changed this situation. Our goal in this series of articles will be presenting the most relevant concepts of SQL 2000 DTS.

Data Transformation Services is a technology providing the ability to exchange and modify data among any OLE DB compliant data sources. SQL Server DTS is implemented as a set of programmable objects (forming DTS object model) accessible through programming and scripting, and through a much simpler to use graphical interface. Regardless of the way of interfacing with the DTS object model, activities involved in transition and transformation of data are organized into units called packages. Every DTS package functions as a container for four types of components:

  • connections - represent data source or target and are implemented using OLE DB providers. SQL 2000 DTS has built-in connections for SQL Server, Access, Excel, Visual FoxPro, text and html files, as well as third party databases such as Oracle, Paradox, and dBase. Others can be installed using additional OLE DB providers.
  • tasks - represent individual actions such as data transition or transformation. Tasks are also used to provide status notifications.
  • precedence constraints - determine conditions necessary for a task to execute, creating additional logic in the package workflow.
  • global variables - values or sets of values (such as entire rowsets) that can be shared among components of the same package.

There are different ways of creating DTS packages. The simplest one is based on available Wizards (DTS Export Wizard and DTS Import Wizard listed in the Tools -> Wizards -> Data Transformation Services in the SQL Enterprise Manager), which take you through a sequence of steps leading to creation of a new package. The trade-off for this simplicity is the limited number of configuration options. This also is the case when using Copy Database Wizard (listed in the Tools -> Wizards -> Management node). The recommended tool that provides balance between ease of use and range of available configuration options is DTS Designer (you can launch it by selecting the New Package option from the context sensitive menu of the Data Transformation Services node in the Enterprise Manager window). This is the most common way of creating complex packages, which you can also use to refine wizard-generated packages further. Finally, the most flexibility is available by accessing the DTS COM programmable interface; however, this requires familiarity with scripting or programming languages and DTS object model and takes the most time and effort.

Even though you do not have to save a package before you execute it, typically it is a good idea. This not only gives you a chance to reconfigure or rerun it later (if the data transformation process takes place multiple times), but also allows you to schedule it for execution. Packages can be stored in several types of locations, each one with a number of benefits and drawbacks:

  • sysdtspackages table of the MSDB database, which corresponds to the Local Packages node under Data Transformation Services folder of SQL Enterprise Manager. The main benefit of this option is support for versioning - for any local package you can access and edit each of its previously saved versions, not just the most current one. In addition, during save, you can assign user and owner passwords, required, respectively, in order to execute and edit this package.
  • Microsoft Repository, which corresponds to the Meta Data Services Packages node under Data Transformation Services folder of SQL Enterprise Manager. This option allows you to keep track of package history, including version number (providing access to previous versions, just as with Local Packages) and lineage information. Lineage provides the means for keeping track of changes that take place during package execution. In particular, you can collect lineage information on a per-row or per-column level. Row-level lineage keeps track of changes to individual rows as well as package execution details (user name, server name, as well as date and time). Column-level lineage maintains a listing of tables and their columns (for both source and destination) referenced in each package. Such information is typically used for auditing purposes. Both options are disabled by default. Enabling column-level lineage is possible once the package is saved in the Repository; row-level lineage requires that you create an additional column in the data source, containing row identifier. As you can imagine, overhead associated with keeping track of changes during package execution affects its speed. In fact, packages stored in the Microsoft Repository are the slowest, when compared with other storage types.
    Unlike with the Local Packages option, it is not possible to provide individual security settings for a Repository-resident package.
  • structured storage files (with extension .DTS) are ideal for copying packages between SQL servers. You can store any existing package (or a number of packages) within a .DTS file and use traditional file transfer methods (FTP, e-mail, or Windows copy and paste) to deliver it to a target SQL server. Packages within structured storage files can be secured with user and owner passwords. They can also be executed directly (without importing them back into SQL Server) using the DTSRUN.EXE command line utility (DTSRUNUI.EXE GUI applet provides a friendly way of configuring all DTSRUN parameters).
  • Visual Basic module files (with extension .BAS) are intended for Visual Basic developers, who want to be able to use programming methods of DTS object model to modify existing packages. Note that such packages are not intended to be re-imported into DTS Designer or Repository.

Packages can be run either interactively or as scheduled jobs. Interactive execution can be triggered from the DTS Designer window (e.g. by selecting the Execute option from the Package menu) or from the Command Prompt by running the DTSRUN utility. You can also use for this purpose xp_cmdshell extended stored procedure using the syntax:

master..xp_cmdshell 'DTSRUN /S "servername" /U "username" /P "password" /N "DTSPackageName"'

The Schedule Package option is available from the context sensitive menu of packages listed under the Data Transformation Services node in the Enterprise Manager. Alternatively, you can schedule DTSRUN command with appropriate parameters using Windows Scheduled Tasks (available from the Programs -> Accessories -> System Tools menu) or the DTSRUNUI utility. When scheduling packages with Enterprise Manager or DTSRUNUI, make sure that the SQL Server Agent service is configured properly and running. Finally, you can also execute a package from within another package with Execute Package task, which we will cover in one of our future articles (in this case, the way the package runs will depend on the execution mode of its parent).

When scheduling packages, it is recommended to enable logging (configurable from the Logging tab of the Package properties dialog box). This greatly simplifies troubleshooting in case of a job failure (since error messages in the job history are fairly generic). Logging provides status and progress information, as well as error messages for each step. Logging can be recorded into a text file or MSDB database tables (sysdtspackagelog and sysdtssteplog). You should remember to periodically clear them (by either deleting files or truncating tables), since new logging information is always appended to existing logs.

The way a package is launched also affects its security context. During interactive execution, a package operates using the same account as the currently logged-on user (the one who launched it). The only exception is running DTSRUN utility from within the xp_cmdshell extended stored procedure (as illustrated above), in which case, the outcome depends on settings on the Job System tab of the SQL Server Agent Properties dialog box (accessible from the Management folder in the Enterprise Manager window). Using this tab, you can either limit the ability to run xp_cmdshell to users with System Admininistrator privileges, in which case xp_cmdshell executes using SQL Server Agent service account or, otherwise, specify a proxy account that will be used instead. Scheduled packages, on the other hand, operate always as the SQL Server Agent service account, unless you use the DTSRUN utility with Windows Scheduled Tasks, where you can arbitrarily assign user account information.

Note also that SQL Server Agent scheduled jobs run on the SQL Server, while interactive ones execute on the same system on which they were launched. This might have significant implications if a package includes such potentially data and processor intensive tasks as ActiveX Scripts. In order to take advantage of the processing power of the SQL Server system, either use scheduling or connect to the server via Terminal Services. In any case, you should try avoiding ActiveX Script tasks, as they are notorious for negative impact on package execution speed.

In our first article of the series dedicated to the Data Transformation Services, we provided an overview of the main concepts of this technology. This will serve as the starting point to the discussion on DTS wizards, which will be presented in the next article.

» See All Articles by Columnist Marcin Policht

Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved