SQL Server 2000 Administration in 15 Minutes a Week: More Database Creation Topics

Friday Jun 7th 2002 by Michael Aubert

Michael Aubert's sixth article in the SQL Server 2000 Administration in 15 Minutes a Week series continues last week's coverage of database creation topics with an in-depth look at SQL and T-SQL, as well as transaction logs.



Welcome to the sixth article in my series SQL Server Administration in 15 Minutes a Week. Last week we created our first database in SQL Server. We looked at what files make up a SQL Server database, how data is stored inside these files, and how filegroups can be used to manage files. This week we have a lot to cover, so let's get going. The topics for this article include:

- SQL and T-SQL
- Understanding Transaction Logs


SQL and T-SQL

Structured Query Language, also known as SQL, is a query and programming language. It can be used for accessing, updating, deleting, and adding data in a database. SQL can also be used for managing the RDBMS (Relational Database Management System) itself. Different databases may use versions of SQL that vary slightly, but most comply with the standard ANSI SQL-92 implementation of SQL, commonly called ANSI SQL. You can group SQL statements into two main categories: Data Definition Language (or DDL) and Data Manipulation Language (or DML).

DDL statements, as the name suggests, allow you to define the database structure. Most DDL statements begin with CREATE, ALTER, or DROP. The two DDL statements we are going to cover today are CREATE DATABASE (used for creating new databases) and ALTER DATABASE (used for altering existing databases). We will look at the exact syntax of these two statements later on in this article.

DML statements, on the other hand, are used for manipulating the data inside database objects. For example, the SELECT statement allows you to query the data inside a database, the INSERT statement allows for the addition of new data, the UPDATE statement updates selected data, and the DELETE statement allows you to remove data. As this series progresses we will cover these statements as well as many more DDL and DML statements in greater detail.
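To give you a taste of what these look like, here is a minimal sketch of all four statements run against a hypothetical Customers table (the table and column names are purely illustrative):

-- Query existing rows
SELECT CustomerName, City
FROM Customers
WHERE City = 'Seattle'

-- Add a new row
INSERT INTO Customers (CustomerName, City)
VALUES ('Acme Corp', 'Portland')

-- Change existing rows
UPDATE Customers
SET City = 'Tacoma'
WHERE CustomerName = 'Acme Corp'

-- Remove rows
DELETE FROM Customers
WHERE CustomerName = 'Acme Corp'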

We now know what SQL is, but what is T-SQL? Simply, T-SQL is SQL Server's enhanced version of the standard SQL programming language. T-SQL in SQL Server 2000 allows for such things as stored procedures, IF and WHILE statements, and additional functions/data types (we will cover data types when we start creating tables) that are not available in standard SQL.
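To give you a feel for these extensions, here is a small sketch using T-SQL's IF and WHILE statements along with a local variable (the logic is purely illustrative):

DECLARE @Counter int
SET @Counter = 1

-- Loop five times, printing whether the counter is even or odd
WHILE @Counter <= 5
BEGIN
    IF @Counter % 2 = 0
        PRINT 'Even: ' + CAST(@Counter AS varchar(10))
    ELSE
        PRINT 'Odd: ' + CAST(@Counter AS varchar(10))

    SET @Counter = @Counter + 1
END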

To use T-SQL statements we need some way of sending them to the RDBMS. One way is the OSQL command line utility. If you are a DBA who has worked with SQL Server 6.5, you will notice that OSQL has replaced the older ISQL utility, which does not support some of the new features of SQL Server 2000. While OSQL fully supports SQL Server 2000, it is a command line tool and is therefore not very user friendly. If you would like to learn more about this utility, look up "osql utility" in the SQL Server Books Online.
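For the curious, here is roughly what using it looks like from a command prompt (the server name and file name are placeholders; the -E switch requests a trusted Windows authentication connection, while -Q runs a single query and exits):

REM Run a single query against a named server using Windows authentication
osql -S MYSERVER -E -Q "SELECT @@VERSION"

REM Run the statements in a script file using SQL Server authentication
osql -S MYSERVER -U sa -P MyPassword -i MyScript.sql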

The tool we are going to use throughout this series to write, edit, and optimize our T-SQL scripts is called the SQL Query Analyzer. This graphical user interface tool lets you run T-SQL statements, create scripts that can be saved and edited, and will even provide information that can be used to optimize both databases and T-SQL statements.

There are two main ways to open the SQL Query Analyzer. First, you can open it directly by selecting "Query Analyzer" from the Microsoft SQL Server group on the Start Menu. When the SQL Query Analyzer opens, the first thing you will be asked to do is select a server and provide your logon credentials; for example, you might connect to the local installation of SQL Server using Windows authentication.



Alternatively, you can access the Query Analyzer from inside the Enterprise Manager. Let's look at how to do this by creating a new database using T-SQL and the Query Analyzer. To start, open Enterprise Manager and connect to your SQL Server by selecting it in the tree pane.



Next, on the Tools menu, select Query Analyzer. The first thing you will notice is that we are not asked to select a server or provide login information. When you start Query Analyzer from within the Enterprise Manager, Query Analyzer automatically uses the server you have selected in the tree pane. Additionally, it passes along the logon credentials that you used to connect to the server in Enterprise Manager.



The status bar at the bottom of the Query window gives us important information about the server we are connected to, the user we are connected as, the database we are using, as well as other status information.
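Much of this status information can also be retrieved with T-SQL itself; for example, this small query uses a few built-in functions to report the server, the login, and the current database:

SELECT @@SERVERNAME AS ServerName,  -- the server we are connected to
       SUSER_SNAME() AS LoginName,  -- the login we are connected as
       DB_NAME()     AS CurrentDB   -- the database we are using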


Creating a Database with T-SQL


To create a database we use the CREATE DATABASE statement. Let's look at the syntax for this statement:

CREATE DATABASE database_name
[ ON
    [ < filespec > [ ,...n ] ]
    [ , < filegroup > [ ,...n ] ]
]
[ LOG ON { < filespec > [ ,...n ] } ]
[ COLLATE collation_name ]
[ FOR LOAD | FOR ATTACH ]

< filespec > ::=

[ PRIMARY ]
(
[ NAME = logical_file_name , ]
    FILENAME = 'os_file_name'
    [ , SIZE = size ]
    [ , MAXSIZE = { max_size | UNLIMITED } ]
    [ , FILEGROWTH = growth_increment ] ) [ ,...n ]

< filegroup > ::=

FILEGROUP filegroup_name < filespec > [ ,...n ]

Confusing, isn't it? To break it down line by line, have a look at the following link: Transact-SQL Reference: CREATE DATABASE. You can also find this information in the SQL Server Books Online under "CREATE DATABASE".

As an example, we are going to create a database called DBbySQL with a 5 MB data file and a 1 MB log file. Enter a statement along the following lines into the query window (the logical file names and physical paths shown here are illustrative; adjust them to suit your server):




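USE master                           -- database creation is done from master
GO
CREATE DATABASE DBbySQL
ON
(
  NAME = DBbySQL_data,               -- logical file name (illustrative)
  FILENAME = 'c:\DBbySQL_data.mdf',  -- physical path (illustrative)
  SIZE = 5                           -- 5 MB; MB is the default unit
)
LOG ON
(
  NAME = DBbySQL_log,
  FILENAME = 'c:\DBbySQL_log.ldf',
  SIZE = 1                           -- 1 MB
)
GO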
There are a few things you should note here. First, the USE statement changes the database context (the database we are working with). In this case, we are adding a new database, so we use the master database. Next, the GO command tells SQL Query Analyzer to execute the current batch of Transact-SQL statements. A batch is simply the set of Transact-SQL statements since the last GO command (or since the start of the script). It is important to know that GO is not a SQL statement; rather, it is a command that tells SQL Query Analyzer or OSQL (and ISQL, for that matter) to send the current batch of Transact-SQL statements to SQL Server.

One last thing to notice about this script is the CREATE DATABASE statement itself. More specifically, we don't have to specify every option available. For the options we did not specify, SQL Server will use default values. Also, when we set the SIZE of the files, we don't have to specify MB because MB is the default unit for the SIZE value (although for readability you may want to add MB anyway).

Once you have entered the statement, select Execute from the Query menu. The output from Query Analyzer should look something like the following:
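The CREATE DATABASE process is allocating 5.00 MB on disk 'DBbySQL_data'.
The CREATE DATABASE process is allocating 1.00 MB on disk 'DBbySQL_log'.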



That's it! We now have a new database created on our SQL Server.
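If you want to double-check, the sp_helpdb system stored procedure will list the new database along with its files and sizes:

EXEC sp_helpdb DBbySQL   -- shows the database's size, owner, files, and options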



You can also use the ALTER DATABASE statement to modify an existing database. For example, the following statement would add a new data file to the DBbySQL database with a size of 5 MB and a maximum size of 50 MB:

USE master
GO
ALTER DATABASE DBbySQL
ADD FILE
(
  NAME = DBbySQL_data2,
  FILENAME = 'c:\DBbySQL_data2.ndf',
  SIZE = 5MB,
  MAXSIZE = 50MB
)
GO

For more information on the ALTER DATABASE statement, see "ALTER DATABASE" in the SQL Server Books Online. This information can also be found on the MSDN website at: Transact-SQL Reference: ALTER DATABASE.

Note: I plan to cover the Query Analyzer in more detail in later articles.




Understanding Transaction Logs

In my last article I gave a brief description of what the transaction log is. However, we did not go over what a transaction is. The simplest definition of a transaction is a "logical unit of work." In other words, it is a grouping of SQL statements that perform one logical task. The classic example used to describe transactions is a bank transfer. Say you have two bank accounts (Account A and Account B) and you want to transfer funds from one to the other. If you execute the two SQL statements individually (one to remove the funds from Account A and one to add funds to Account B), it is possible for the first statement to remove funds from Account A successfully but for the second statement to fail and never add the funds to Account B. In the real world this would make for some very unhappy customers. To overcome this problem we can group statements into transactions. When statements are grouped into a transaction, either they all complete successfully and are "committed," or, if any one of the statements fails, they are all "rolled back" and the data is left in the condition it was in before the transaction began. Until we get into data integrity, this definition will do.
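In T-SQL, the bank transfer could be sketched roughly as follows (the Accounts table, column names, and amounts are purely illustrative; SQL Server 2000 relies on @@ERROR checks for this kind of error handling):

BEGIN TRANSACTION

-- Remove the funds from Account A
UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 'A'
IF @@ERROR <> 0
BEGIN
    ROLLBACK TRANSACTION   -- undo everything; no funds have moved
    RETURN
END

-- Add the funds to Account B
UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 'B'
IF @@ERROR <> 0
BEGIN
    ROLLBACK TRANSACTION
    RETURN
END

COMMIT TRANSACTION         -- both statements succeeded; make the changes permanent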

The transaction log stores all the changes as they are made to the database. Once all the statements in a transaction have made their modifications to the database and the commit is recorded in the transaction log, the transaction is said to be committed. But once the transaction is committed, why does its information need to stay in the log? To understand, you need to know what is going on under the hood of SQL Server.

As data is requested from a database, the pages containing that data are cached in memory. Because accessing memory is so much faster than accessing the disk subsystem, this caching improves SQL Server's performance dramatically.

First request for data from a database (request is made, pages are loaded into memory, data is sent to the client):



 

Request for data that is already in memory (pages are cached in memory, so they are used directly):




When you make a modification to the data in a database, SQL Server makes the changes to the data stored in memory (not on the disk) and an entry is made in the transaction log. Pages in memory that contain data that has been changed are called "dirty pages."



In order to have pages written back to disk and to release memory, SQL Server uses a thread called the lazywriter. When the lazywriter writes pages to disk, it uses the least recently used (LRU) page list to determine which pages should be written. Pages that have not been used recently are at the beginning of the LRU list, while recently accessed pages are at the end. Because of this, very heavily accessed pages may not be written to disk for a long time (or, in theory, ever).





Because pages that have been recently changed may exist only in memory, they are susceptible to system crashes and power outages. To overcome this problem, SQL Server can use the log information to restore the modifications to the database if one of these events occurs. For example, if the power goes out (and our UPS has failed!), we lose the updated pages that were stored in memory. In order to recover these changes, SQL Server starts in recovery mode: by using the transaction log, it can reapply the changes to the copy of the database stored on the hard drive.

This brings up another problem, however: what happens if we have pages that are heavily updated, and how long will this recovery process take? The lazywriter may never get around to writing these heavily accessed pages to disk, and the number of modifications that would need to be reapplied from the transaction log could be very high. Therefore, the amount of time needed to recover the database after a failure could be enormous. To solve this problem, SQL Server uses checkpoints to shorten the time needed to recover a database in the event of a failure. When a checkpoint occurs, all pages that were dirty when the checkpoint started are written to disk.

In addition, the checkpoint writes a record of the transactions still outstanding to the transaction log. In the event a failure occurs, the recovery process can start from the last checkpoint, and it only needs to reapply the transactions that happened after the last checkpoint or that were outstanding at the time of the last checkpoint.
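You can also force a checkpoint manually with the T-SQL CHECKPOINT statement, which is handy for experimenting; it checkpoints whichever database you are currently using:

USE DBbySQL
GO
CHECKPOINT   -- write all of this database's dirty pages to disk now
GO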

So when do checkpoints occur? The checkpoint interval is based on the recovery interval, a setting global to an instance of SQL Server that specifies the maximum number of minutes per database that SQL Server should need to perform the recovery process. To change the recovery interval, right-click your SQL Server in the Enterprise Manager tree pane and select Properties. Once the properties window appears, select the Database Settings tab.



The default value for the recovery interval is 0, which means SQL Server will set the recovery interval for you automatically; this usually results in a checkpoint interval of less than one minute. If your server has lots of memory and your database has many inserts and updates, you may find the default value causes an excessive number of checkpoints to occur. To improve performance you may want to set the recovery interval to 15 or even 30 minutes, depending on the maximum downtime you can accept.
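If you prefer T-SQL, the same setting can be changed with the sp_configure system stored procedure; here is a sketch (recovery interval is an advanced option, so advanced options must be enabled first):

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO
EXEC sp_configure 'recovery interval', 15   -- maximum recovery time, in minutes
RECONFIGURE
GO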

In addition to the checkpoint interval, a checkpoint also occurs when SQL Server or the computer is shut down properly. For more information on the recovery interval, see "recovery interval Option" in the SQL Server 2000 Books Online.

Now, with a good understanding of what a transaction is, how the transaction log is used in the recovery process, and how dirty pages get written to disk, let's take a closer look at how transactional information is stored in a transaction log. Unlike data files, transaction logs are not organized into 8 KB pages, nor do they use filegroups.

When transactional information needs to be logged, it is written to disk in whatever size is needed. For example, if the information being logged is small, SQL Server does not need to write a whole 8 KB page. Likewise, if a large amount of information needs to be logged, it can be written as a larger block, such as 16 KB or 32 KB.

Because logging is a serial operation, putting a log file on its own hard drive or RAID array can improve performance. The only time the heads of the disk drive need to move is when a rollback occurs; the rest of the time, the drive heads are in place (or nearly in place) for the next write to the log.
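Placing the log on its own drive is simply a matter of where the FILENAME in the LOG ON clause points; here is a sketch assuming a hypothetical dedicated E: drive (database and file names are illustrative):

CREATE DATABASE SalesDB
ON
(
  NAME = SalesDB_data,
  FILENAME = 'c:\SalesDB_data.mdf',
  SIZE = 10
)
LOG ON
(
  NAME = SalesDB_log,
  FILENAME = 'e:\SalesDB_log.ldf',   -- dedicated drive/array for serial log writes
  SIZE = 5
)
GO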

That wraps things up for this week. So, where do we go from here? Now that you have a good understanding of what the transaction log is used for, next week we can look at the different backup strategies available.

As always, if you have any technical questions, please post them on the SQL message board. Please send any non-technical questions, comments, and feedback to my email. I hope you are finding this series to be a useful one, and I'm looking forward to your feedback.

Mike
maubert@databasejournal.com
www.2000trainers.com



