SQL Server 2000 Administration in 15 Minutes a Week: Table Basics (Part 1)

Thursday Aug 22nd 2002 by Michael Aubert
Share:

Michael Aubert's eleventh article in the SQL Server 2000 Administration in 15 Minutes a Week series switches gears by taking a look at how to create database objects in SQL Server, with topics including relational database design concepts, SQL Server Data Types, and the basics of creating tables.


Welcome to the eleventh article in my series SQL Server Administration in 15 Minutes a Week. Over the last few weeks we have been looking at backup and recovery processes. Although information on backups could have waited until the very end of this series, I feel backups are an important topic that get over looked too much. Therefore I decided to cover backup information at the beginning of this series rather than at the end. Having covered backups for the most part, we are now going to switch gears and look at just how we create database objects in SQL Server. The topics for this week include:

- Relational Database Design Concepts
- SQL Server Data Types
- Creating Tables


Relational Database Design Concepts

Up to this point we have focused on the physical design of a database which includes things like files and filegroups. The physical design is focused on how storing data and the most efficient way to access data. Now we are going to start to look at implementing the logical design of a database which includes objects like tables and the relationships between them. The logical design is only concerned with modeling a real-world data scenario -- it is unaware of and non-dependent on any one particular Database Management System. For example, I could use the same logical design to create a database in both SQL Server and Access, and I would come up with the exact same database as far as the logical design is concerned. However, the physical design for the two databases could be very different -- the Access database would be made up of only one file, whereas the SQL Server database could be made up of many different files spread across several hard drives.

Another example is a database on SQL Server that gets another data file added to it. While the physical design has changed by adding another file, the logical design is still the same. The important point to get here is that the physical implementation of a database and the logical implementation of a database are two distinct concepts that are, for the most part, independent of one another. 

I think I've said this already, but if not, SQL Server is a Relational Database Management System (RDBMS). So it makes sense that the databases that we are going to be working with are relational databases. Although understanding how to design a relational database is an important topic, this series is aimed at SQL Server Administration, the 70-228 exam, and physical design issues -- not the logical design of a relational database (that is covered in the 70-229 exam).

If this is your first time working with a relational database of any kind or if you never got a good grasp on the design concepts, I encourage you to stop here and get a book on designing relational databases. While you could probably complete the rest of this series -- and the 70-228 exam for that matter -- without knowing how to create the logical design for a relational database, you will have an easier time understanding the rest of this series and be a much better admin if you understand the design concepts. Check out the following links to get started learning about logical database design:

ACC2000: Database Normalization Basics (Q209534)

If you have worked with other relational databases (such as Microsoft Access) before and words like normalization and foreign key are not new to you, read on.

Page 2: Data Types


 » See All Articles by Columnist Michael Aubert


Every column in your table must have a "data type," which is simply a property that defines what type of data is stored in that column. In addition, a data type will reject data that is not of the correct type (i.e. attempting to store a letter in a data type designed for numbers). SQL Server has over 25 different data types -- some with more options than others. Let's look at the different data types and the options for each:

 

Data Type Size Description
Integers    
bigint 8 bytes Holds integer (whole number) data from -2^63 (-9,223,372,036,854,775,808) through 2^63-1 (9,223,372,036,854,775,807).
int 4 bytes Holds integer (whole number) data from -2^31 (-2,147,483,648) through 2^31 - 1 (2,147,483,647).
smallint 2 bytes Holds integer data from 2^15 (-32,768) through 2^15 - 1 (32,767).
tinyint 1 byte Holds integer data from 0 through 255.
bit 1 byte for up to 8 bit columns Holds integer data with either a 1 or 0 value. Each set of up to 8 bit columns requires 1 byte. So if there are anywhere from 1 to 8 bit columns in a table, the storage space you will need is 1 byte. If there are anywhere from 9 to 16 bit columns in a table, the storage space you will need is 2 bytes. And so on...
Decimal    
decimal Anywhere from 5 to 17 bytes depending on the precision Holds fixed precision and scale numbers. When maximum precision is used, valid values are from - 10^38 +1 through 10^38 - 1. The Precision specifies the maximum total number of decimal digits that can be stored, both to the left and to the right of the decimal point. The precision must be a value from 1 through 38. The Scale specifies the maximum number of decimal digits that can be stored to the right of the decimal point. Scale must be a value from 0 through Precision. Examples:
if precision is set to 10 and scale is set to 3 the smallest (other than 0)/ largest number we could store would be 0.001 / 9999999.999
if precision is set to 8 and scale is set to 6 the smallest (other than 0)/ largest number we could store would be 0.000001 / 99.999999 
numeric - Same as decimal data type
Money    
money 8 bytes Holds monetary data values from -2^63 (-922,337,203,685,477.5808) through 2^63 - 1 (922,337,203,685,477.5807), with accuracy to one ten-thousandth of a monetary unit
smallmoney 4 bytes Holds monetary data values from -214,748.3648 through +214,748.3647, with accuracy to a ten-thousandth of a monetary unit.
Approximate    
float(n) Anywhere from 4 to 8 bytes depending on the precision Holds floating precision number data from -1.79E + 308 through 1.79E + 308. The value n is the number of bits used to store the mantissa of the float number and can range from 1 to 53
real 4 bytes Holds floating precision number data from -3.40E + 38 through 3.40E + 38. Real is the same as float(24).
Date and Time    
datetime 8 bytes Holds date and time data from January 1, 1753, through December 31, 9999, with an accuracy of three-hundredths of a second, or 3.33 milliseconds.
smalldatetime 4 bytes Date and time data from January 1, 1900 through June 6, 2079, with accuracy to the minute.
Strings - non-Unicode  
char(n) n bytes Holds fixed-length non-Unicode character data with length of n characters, where n is a value from 1 through 8000. If less than n number of characters are entered n bytes are still required because unused character spaces get padded to the end to make them n bytes long.
varchar(n) Depends on the length of data entered - 1 byte per character Holds variable-length non-Unicode character data with a length of n characters, where n is a value from 1 through 8000. The storage size is the actual length in bytes of the data entered, not n bytes.
text 16 bytes for the pointer Variable-length non-Unicode data with a maximum length of 2^31 - 1 (2,147,483,647) characters. A text column entry can hold up to 2^31 - 1 characters. It is a pointer to the location of the data value, the data is stored separately from the table data.
Strings - Unicode  
nchar(n) 2 bytes * n Holds fixed-length Unicode character data of n characters, where n is a value from 1 through 4000. Unicode characters use 2 bytes per character and can support all international characters. If less than n number of characters are entered n bytes are still required because unused character spaces get padded to the end to make them n bytes long.
nvarcher(n) Depends on the length of data entered - 2 byte per character Holds variable-length Unicode data of n characters, where n is a value from 1 through 4000. Unicode characters use 2 bytes per character and can support all international characters. The storage size is the actual length in bytes * 2 of the data entered, not n bytes.
ntext 16 bytes for the pointer Holds variable-length Unicode data with a maximum length of 2^30 - 1 (1,073,741,823) characters. The column entry for ntext is a pointer to the location of the data. The data is stored separately from the table data
Binary    
binary(n) n + 4 bytes Holds fixed-length binary data of n bytes, where n is a value from 1 through 8000. Use binary when column data entries are consistent in size.
varbinary(n) Depends on the length of data entered + 4 bytes Holds variable-length binary data of n bytes, where n is a value from 1 through 8000. Use varbinary when column data entries are inconsistent in size.
image 16 bytes for the pointer Used for variable-length binary data longer than 8000 bytes, with a maximum of 2^31 - 1 bytes. An image column entry is a pointer to the location of the image data value. The data is stored separately from the table data
Other    
sql_variant size varies A column of type sql_variant may contain rows of different data types. For example, a column defined as sql_variant can store int, binary, and char values. The only types of values that cannot be stored using sql_variant are text, ntext, image, timestamp, and sql_variant.
timestamp 8 bytes Timestamp is a data type that exposes automatically generated binary numbers, which are guaranteed to be unique within a database. Timestamp is used typically as a mechanism for version-stamping table rows. Each table can have only one timestamp column.
uniqueidentifier 16 bytes Stores a 16-byte binary value that is a globally unique identifier (GUID).

Page 3: Data Types (Continued)


 » See All Articles by Columnist Michael Aubert


That about covers it for the various data types. There are two data types I intentionally left out (cursor and table) as they are really not applicable here because you can't have a column that is made up of those data types.

There are a few things I feel are important to point out about the data types in the preceding table. First, back in "Database Creation Basics" I said the maximum size a row could be was 8060 bytes. In order to calculate how big each row of your table will be and see if it is under 8060 bytes, typically you can just add up the size of each column to get the answer. Although it is true that a row can't span more than one page, there is a way to get around the 8060 byte limitation - the text, ntext, and image data types (aka "BLOB" or Binary Large OBject ). By default, the text, ntext, and image data types don't store their data in the row like the other fields of that row do. Instead, they store a 16 byte pointer that redirects SQL Server to another 8 K page(s) where the data can be found. By having only a pointer to the data stored in the row, you can get around the 8060 byte limitation. But be warned, the text, ntext, and image data types have some limitations that we will see over the next few articles. One downside is that since SQL Server must retrieve two or more pages (verses only one page for rows without any text/image data) using the text, ntext, and image data types slows SQL Server down. One good practice is only to use/retrieve from the text, ntext, and image data types when it is absolutely necessary.

Speaking of variables that have a non 'n' version and a 'n' version, let's talk about non-Unicode strings vs. Unicode strings. If you remember back to "Installation Basics" one of the options when installing SQL Server was to select a collation that included the selection of an ASCII code page. The non-Unicode data types -- char, varchar, and text -- store the ASCII values from the code page for each character. Because there are only 256 ASCII values per code page, the space required to store an ASCII value is only 1 byte (2^8). But what if you need to have multiple language support or there is not an ASCII code page that has all the characters you need? Well, by using the Unicode data types nchar, nvarchar, and ntext, you can store multiple languages and have access to over 256 characters. In order to achieve this, unlike the non-Unicode data types, the Unicode data types are made up of a 2 byte number for each character that gives a possible 65,536 characters (2^16) -- more than enough to include all the characters you would need for multilingual support. (Check out http://www.unicode.org/ for more info on Unicode)

The next point I would like to make about data types is fixed-length and the variable-length data types. If you use the char or nchar data types, no matter how many characters you enter, the field will always use n number (or 2*n for nchar) of bytes. For example, if I set the length (i.e. n) of a column to 10 and had three rows: 'Dan', 'Mike', and 'Jason' -- the space needed to store those three rows would be 30 bytes even though there is only 9 characters worth of data. On the other hand, if I used the variable-length version of char or varchar, the space needed to store the three example rows is only 9 bytes -- a 21 byte savings. Now that may not seem like much, but multiply that by several columns of data and 100,000 rows and it adds up quick. So it may seem like varchar and nvarchar are the way to go all the time, but they also have a downside - the time needed to deal with the variable length nature of varchar and nvarchar slows SQL Server down. As a rule of thumb, it is best to use char or nchar whenever the strings will be about the same size from row to row. Whereas if the length of the strings varies greatly between rows, it is typically best to use varchar or nvarchar. Now for the three row example I gave it may be best just to leave it as char -- looking at a larger sample from that table may show more names are greater than 5 characters or that the speed increase by using a fixed-length data type outweighs the extra disk space that gets used. Like I said earlier...when working with SQL Server most things are a balancing act; you just have to see what gives you the best performance for your situation.

Finally, there are some more generalized points about data types that are important. When you choose a data type the goal should be to choose one that is as small in size as possible, but one that will still accept all the values that may be possible over the lifetime of the database. A good example of this is selecting the size of integer to use. If you know you are only going to store a yes/no answer, bit is the perfect size and anything larger, such as smallint or int, will only waste storage space. Another example of wasting space would be using datetime when you only need the range and accuracy of smalldatetime. By selecting the smallest data type possible, you not only save disk space but you also reduce the number of pages that makeup the table -- which in turn lowers the time needed to run a query (i.e. SQL Server has less pages to retrieve from the hard disk and scan).

You should also try and let your data types do as much work for you as possible when dealing with data integrity. A good example of this is to use datetime or smalldatetime to store date information - those data types may take up more space and use more processing power then making your own date from a 6 character string, but in the long run using the proper data type for the job will save you time.

Page 4: Creating Tables


 » See All Articles by Columnist Michael Aubert


Creating Tables

Once you have a logical design to work from, the process of creating your tables is quite simple. Start by opening Enterprise Manager, expand the server and database you would like to add a table too, then right click "Tables" and select "New Table..." from the menu.



Click to Enlarge



The New Table window appears and should look something like the following:



Each row represents an individual column of the table being created/edited.

"Column Name" is used to enter the name of the column.

"Data Type" sets the type of data the column will contain.

"Length" specifies the length (or 'n') of the column. For some data types, such as char and varchar, this value can be changed and specifies the maximum number of characters the column can store. For other data types, such as int and decimal, that don't have a length, "Length" shows the size (in bytes) the column will need.

"Allow Nulls" determines whether a particular column can be left blank. A check indicates nulls are allowed in that column. That is, if a row is added/edited, the row is not required to have a value in that field. If there is no checkmark in Allow Nulls, any rows that are added or edited are required to have a value entered for that field. Remember: 1, 0, and <NULL> (or 'ABC', '   ', <NULL>) are three different things. For example, while 0 and <NULL> may seem similar, 0 typically indicates "False" and <NULL> indicates "No Value." It is important to keep this concept in mind; it will have implications for us further down the road.

A table that holds employee data may look something like:



There are a few more things to note here.

First, if you have ever used Access before, you should recognize the key icon on the left of the EmployeeID row. This indicates what row(s) of the table make(s) up the primary key (i.e. the column(s) that are used to uniquely identify every row in a table). To set the primary key, select the appropriate row (you can hold down the Ctrl key to select more than one row at a time) and then click the "Set primary key" icon (looks like a key) in the toolbar.

Next, as you move from row to row, you will notice that the available options on the "Columns" tab change. Lets take a look at these options individually:

Description - A space for you to enter any comments about what the column is used for. Anything entered in this box will have no effect on SQL Server or the table.

Default Value - The value that will be entered if no other value is specified when a row is added to the table. For example, I could set the default value of the Title column to "staff." Each time an employee gets added and no title is entered, "staff" will automatically be substituted in place of <NULL>. In addition, you are not limited to just text for the default value, you can also use functions. Some more common ones include GETDATE() that returns the current system date/time and NEWID() that returns a new globally unique identifier. Note that if you set the column's "Identity" property to yes, the default value is unavailable.

Precision/Scale - Used to set the precision and scale values of the decimal and numeric data types. These options are not available for other data types. For more information on precision and scale, see the decimal data type in the data types table at the beginning of this article.

Identity/Seed/Increment - Similar to the AutoNumber option in Access. If a column's Identity option is set to true, SQL Server will automatically generate a new number for each row added to the table in this column. A table can have only one column defined with the Identity property set to yes, and that column must use the decimal, int, numeric, smallint, bigint, or tinyint data type. The Seed value specifies what number SQL Server should start at. The Increment value specifies the number that should be added to the Seed value to determine successive identity numbers. Note that Identity columns are only guaranteed to generate unique numbers within one table - two tables both with an Identity column may (or may not) generate the same numbers. Another thing to keep in mind is that if the table experiences many deletions, large numbering gaps may occur because deleted identity values are not reused.

Is RowGuid - Specifies that this column contains the GUID for the rows in this table. Only one table can have its Is RowGuid value set to yes, and the data type for this column must be set to uniqueidentifier. In addition, you should also set the default value for this column to use the NEWID() function. Unlike an Identity value, GUID values are guaranteed to (aka "should") be unique for every row, in every table, in every database, on every computer in the world.

Formula - Used to enter a formula to make this a computed column. Instead of just storing data, a computed column takes a formula (like LastName & ', ' & FirstName) and generates a value depending on the formula you entered. In addition, computed columns are generated on the fly, so updating one column that the formula references will update the computed column automatically.

Collation - Gives the ability to set the collation for each individual column of a table. <database default> will set the column to use the same collation setting as the database the table is located in.

Page 5: Creating Tables (Continued)


 » See All Articles by Columnist Michael Aubert


Once you have added all the columns you can click the save icon (left most, looks like a disk) on the toolbar and you will be prompted to enter a name for this table:




Click to Enlarge



Enter a name and click OK. To close the New Table screen click the lower, inner, X at the upper right. You can also use the "Window" menu to switch between windows or arrange them.

After refreshing the Tables list (F5) the table we just created appears in SQL Server Enterprise Manager:



You can always go back and edit a table by right clicking it in Enterprise Manager and selecting "Design Table":



One last thing, using enterprise manager is not the only way to create a table -- you can also use the CREATE TABLE statement. For more information, see CREATE TABLE and ALTER TABLE in the SQL Server books online.

Well that about wraps it up for this week. Next week we are going to continue to talk about creating tables -- more specifically, we will look at additional table properties, creating relationships, and creating check constraints. As always...If you have any technical questions please post them on the SQL message board. Please send any non-technical questions, comments, and feedback to my email. I hope you are finding this series to be a useful one, and I'm looking forward to your feedback.

Mike
maubert@databasejournal.com
www.2000trainers.com


» See All Articles by Columnist Michael Aubert

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved