Implementing CRUD Operations Using Stored Procedures: Part 2

Tuesday Nov 4th 2003 by Andrew Novick
Share:

Andy Novick gets down and dirty with stored procedures, exploring why you might want to use them instead of SQL script to implement CRUD operations.

In database terms, CRUD stands for the four essential database operations: Create, Read, Update and Delete. To create a high performance system, these four operations should be implemented by stored procedures, each procedure implementing one of the four operations. This is the second in a two part series of articles about why and how to go about writing the stored procedures. Last month's article covered why to use stored procedures and delved into the technical issues that dictate how the procedure should be written with emphasis on SQL Server features that affect the design. This month's article discusses the procedures themselves, goes into detail about each one and discusses code generation options for creating the procedures with a program instead of by hand.

There are several reasons for using stored procedures instead of SQL script to implement the CRUD operations. The reasons are:

  • The best possible performance
  • Removes the SQL code from the other layers of the application
  • Prevents SQL injection attacks
  • Prevents casual table browsing and modifications

These reasons were examined in last month's article. Which you will find at this link: http://databasejournal.com/features/mssql/article.php/3082201

Overall, I think the case is pretty strong for using stored procedures.

The CRUD Stored Procedures

The four CRUD operations should be implemented by four stored procedures. I generally generate them instead of writing them by hand so I always create all four procedures for every table. Options for generating the procedures are discussed near the end of this article, but generation does not affect how they are written.

Last month's article introduced the Product table that follows. 'It is similar to Product tables that 'you will find in Northwind and other databases but 'it has been created with some features that illustrate how specific SQL Server features must be handled in the stored procedures. Let's take a look at the CREATE TABLE script:

setuser
GO

EXEC sp_addtype N'AppUSERID', N'varchar (20)', N'not null'
GO

CREATE TABLE Product (
    ProductID int NOT NULL identity (1,1) PRIMARY KEY 
  , ProductName varchar(100) DEFAULT 'Product unnamed'
  , [Description] varchar(2000) NULL
  , UnitCost money NOT NULL-- Cost of making it
  , UnitsInStock int NOT NULL -- units in the inventory
  , InventoryCost as UnitCost * UnitsInStock -- Computed column
  , CreatedDT datetime NOT NULL DEFAULT getdate()
  , CreatedByUSERID AppUSERID DEFAULT user_name()
  , LastUpDT datetime NULL
  , LastUpdByUSERID AppUSERID NULL
  , RowVrsn rowversion NOT NULL
  )

Some of the features to watch out for as we examine each of the CRUD procedures are:

  • The identity column
  • The rowversion column
  • Default values
  • The computed column
  • The special fields: CreatedDT, CreatedByUSERID, LastUpDT, and LastUpdByUSERID

Last month's article explained why these, with the exception of the default values, present special issues for generating the stored procedures. Instead of repeating that discussion, the issues will be pointed out as the procedures are implemented. However, before creating the procedures we will need a naming convention for their names, which is discussed next.

Naming the Procedures

All stored procedures need a name that should tell the developer and DBA what the procedure does. Creating the right name makes everyone's life easier because the procedure is easier to find in the SQL Server tools.

To differentiate the procedures that are used for the CRUD operations from other stored procedures that might be in the database, I like to name them staring with the characters "dp_". Any prefix will do. I stay away form "sp_" and "usp_". The former because of the performance hit when a user stored procedure has that prefix and the latter because that's the prefix I use for most manually written stored procedures in the database.

I follow the prefix with the table name. Using the table name right after the prefix insures that the four CRUD procedures for the same table are grouped together in Query Analyzer and Enterprise Manager.

Finally, the procedure name typically ends with the name of the CRUD operation that it implements. I use "_ins", "_sel", "_upd", and "_del" because these names reflect the four SQL Statements that do most of the work in the procedure. You can use any set of suffixes that mean something significant to you. For example, a good set of alternate suffixes might be "_C", "_R", "_U", and "_D".

Using the above formula, the four procedures are shown in the following table along with an alternative name.

Operation

Stored Procedure Name

Alternative Name

Create

dp_Product_ins

dp_Product_C

Read

dp_Product_sel

dp_Product_R

Update

dp_Product_upd

dp_Product_U

Delete

dp_Product_del

dp_Proudct_D

Of course, these aren't the only possible names and you may have your own convention. Now on to the first procedure: Create.

Create Procedures

The Create operation performs a SQL INSERT statement. It is going have one parameter for almost every column in the table. Most of these columns are used in the INSERT statement to create the row. However, these types of columns are only for output:

  • The identity column
  • Computed columns
  • The rowversion column

By supplying these columns as OUTPUT parameters, the application code does not have to perform a read operation before it can start using the new row. I have seen several data access tools do just that. Insert a row and then perform a select to get the computed columns and rowversion. That wastes a round trip to the database.

The use of SCOPE_IDENTITY() to get the identity value was discussed in the Issues section of last month's article. It is a new system function in SQL Server 2000 that is more robust than using @@IDENTITY.

Rowversion is an alternate name for timestamp. The new name is more compatible with the SQL-92 standard, which uses timestamp for a different data type. I generally use a rowversion in every table where concurrent updates are possible. This will not come into play in the Create procedure but we will see it later when updating the database. All the Create proc has to do is return the timestamp of the row so that it can be updated if necessary.

Let's take a look at the Create stored procedure for our Product table.

CREATE PROCEDURE dp_Product_ins
	@ProductID	int OUTPUT ,
	@ProductName	varchar(100) = NULL  OUTPUT ,
	@Description	varchar(2000) = NULL ,
	@UnitCost	money,
	@UnitsInStock	int,
	@InventoryCost int = NULL  OUTPUT,
	@CreatedByUSERID	varchar(20) = NULL  OUTPUT ,
	@RowVrsn	timestamp OUTPUT 
AS

  Set NoCount On
  IF @ProductName Is Null
	SET @ProductName = ('Product unnamed')
   IF @CreatedByUSERID Is Null
	SET @CreatedByUSERID = (user_id())

  INSERT INTO [Product] WITH (ROWLOCK)  (
	[ProductName],
	[Description],
	[UnitCost],
	[UnitsInStock],
	[CreatedByUSERID])
    Values (
	@ProductName,
	@Description,
	@UnitCost,
	@UnitsInStock,
	@CreatedByUSERID)

  SET @ProductID = SCOPE_IDENTITY()

  SELECT  @ProductName = [ProductName]
        , @InventoryCost = [InventoryCost]
        , @CreatedByUSERID = [CreatedByUSERID]
        , @RowVrsn = [RowVrsn] 
	FROM [Product] 
	WHERE 	[ProductID] = @ProductID

The first thing to notice is the OUTPUT parameters for the identity column, the rowversion column, and the computed column, InventoryCost. However, any column that has a default also is an OUTPUT parameter because the stored procedure could be setting its value.

The default values are set before the INSERT statement with these lines:

  IF @ProductName Is Null
	SET @ProductName = ('Product unnamed')
   IF @CreatedByUSERID Is Null
	SET @CreatedByUSERID = (user_id())

They have to be set by code instead of using SQL Server's defaulting mechanism because when they are included in the INSERT statement, SQL Server does not apply the default. It allows the NULL value to be inserted into the table.

The identity value is captured with the line:

  SET @ProductID = SCOPE_IDENTITY()

and then used in the SELECT statement that grabs the OUTPUT parameters that aren't already set before the INSERT statement is executed. That does not include LastUpDT or LastUpdByUSERID because the row has not been updated yet.

Even though there is a default value for the CreatedByUSERID, it is only there as a last resort. Because most of my procedures are called from ASP or ASP.Net applications, the identity of user_id() doesn't really give useful information about who requested the update. Instead, the calling application is responsible for setting this value to something that is meaningful to the application, such as the ID of the currently logged in user. In practice, I have found that this works acceptably.

Once the procedure returns, the caller has the up-to-date values for all columns in the new row. It does not have to read them from the database. At some other time when it needs to read the row, it must invoke the Read procedure, which is discussed next.

Read Procedures

The read procedure has one OUTPUT column for every column in the table. Let's take a look at the procedure for the Product table:

CREATE PROCEDURE dp_Product_sel
  @ProductID	int OUTPUT,
  @ProductName	varchar(100) OUTPUT,
  @Description	varchar(2000) OUTPUT,
  @UnitCost	money OUTPUT,
  @UnitsInStock	int OUTPUT,
  @InventoryCost	money OUTPUT,
  @CreatedDT	datetime OUTPUT,
  @CreatedByUSERID	varchar(20) OUTPUT,
  @LastUpDT	datetime OUTPUT,
  @LastUpdByUSERID	varchar(20) OUTPUT,
  @RowVrsn	timestamp OUTPUT
AS

    Set NoCount On
DECLARE @myRowCount int, @myError int

SELECT @ProductID      = [ProductID],
  @ProductName      = [ProductName],
  @Description      = [Description],
  @UnitCost      = [UnitCost],
  @UnitsInStock      = [UnitsInStock],
  @InventoryCost      = [InventoryCost],
  @CreatedDT      = [CreatedDT],
  @CreatedByUSERID      = [CreatedByUSERID],
  @LastUpDT      = [LastUpDT],
  @LastUpdByUSERID      = [LastUpdByUSERID],
  @RowVrsn      = [RowVrsn]
FROM [Product]
WHERE 	[ProductID] = @ProductID 
OPTION (FAST 1) 

Select @myRowCount = @@RowCount, @myError = @@Error
 IF @myRowcount <> 1 RETURN 100 
 RETURN @myERROR

The SELECT statement retrieves all the columns in the table. Since they are all OUTPUT parameters, they will travel back to the caller without the need to create a rowset. This also means that on the receiving end there is not any need to create an ADO recordset or ADO.Net DataReader or DataAdapter to receive the rowset. An ADO.Command object or ADO.NET SQLCommand object is all that is required. An additional benefit of using only OUTPUT parameters is that other stored procedures can use this procedure without having to INSERT INTO a temporary table to get column values. That is what they would have to do if data was returned in a rowset.

One of my design decisions was what to do when the row was not found. I decided to use the return code to indicate this to the caller. I chose code 100 because that is the code that Oracle uses. I have not found a better choice but suggestions are welcome.

If the table has additional unique indexes, I have sometimes created an additional Read procedure that allows the caller to request a row by that index. The need for such alternative procs depends on your application.

Update Procedures

Now that we can Create and Read rows, we might have to update them. Here is the update procedure for the Product table:

CREATE PROCEDURE dp_Product_upd
	@ProductID	int,
	@ProductName	varchar(100),
	@Description	varchar(2000),
	@UnitCost	money,
	@UnitsInStock	int,
	@LastUpDT	datetime,
	@LastUpdByUSERID	varchar(20),
	@RowVrsn	timestamp
AS

    Set NoCount On
UPDATE [Product] WITH (ROWLOCK)  SET
	[ProductName] = @ProductName,
	[Description] = @Description,
	[UnitCost] = @UnitCost,
	[UnitsInStock] = @UnitsInStock,
	[LastUpDT] = @LastUpDT,
	[LastUpdByUSERID] = @LastUpdByUSERID
WHERE	[ProductID] = @ProductID AND
	[RowVrsn] = @RowVrsn OPTION (FAST 1) 
If @@ROWCOUNT = 0 RETURN 100 
RETURN @@ERROR

The update procedure has just one UPDATE statement. The WHERE clause of the statement specifies the key of the record and the value that the caller has for RowVrsn. Checking RowVrsn prevents overlapping updates that would wipe out a user's changes.

Notice that the procedure does not have any OUTPUT parameters. I found that this made sense because once my client-side objects updated their values, they were always destroyed. This is a choice that you might make differently, depending on your application. If a row can be updated multiple times in succession by the same caller, then you will want to return any computed columns and the new value of the rowversion column. You will need a SELECT statement to accomplish this.

I do not allow changes to the row's key. Of course, in this case, the key is an identity column but I prohibit this as a general proposition. I find that primary key changes are usually a mistake and the application should handle key changes by creating a new row and avoiding the possibility of cascading updates.

Another design choice that I made was to allow the caller to set the LastUdDT and LastUpdByUSERID columns. They could have been given values in this procedure. However, just as the Create procedure expects the caller to supply values for CreateDT and CreatedByUSERID; it's the applications responsibility to set these columns to meaningful values.

Delete Procedures

The delete procedure is the simplest of the bunch. All it has to do is delete the row. There aren't any OUTPUT parameters. Here is the Delete proc for the Product table.

CREATE PROCEDURE dp_Products_del
	@ProductID	int
AS

    Set NoCount On
    DELETE [Products] WITH (ROWLOCK) 
        WHERE ProductID = @ProductID 
        OPTION (FAST 1)

That concludes the design of the four CRUD stored procedures. Are you looking forward to writing them on your next project with, let's say, eighty tables? I doubt it. There is a better way: code generation.

Generating the Procedures

Several years ago, I found myself faced with writing about 360 procedures by hand. That did not seem to me like a good idea so I looked around for a way to generate them. While there are several products on the market, there were none available at the time really seemed to do the job.

I ended up writing my own generator based on code in the article Automate Writing Stored Procedures published in the June 2001 issue of Visual Basic Programmer's Journal. The program uses SQL-DMO to read the structure of the database and then writes a script to create the CRUD stored procedures. I modified it based on the need to handle the SQL Server 2000 features that I have mentioned in this article and on the particular needs of my application. I liked the approach so much that I extended the code to generate Visual Basic 6.0 data access classes for each table and to wrap views and stored procedures for easy access by the client-side code. This relieved the client-side developers of the need to write any SQL.

My interest in code generation was recently renewed by a presentation by Rockford Lhokda from the material in his new book Expert One-on-One Visual Basic.Net Business Objects. It is a fabulous book, which I have reviewed at this link: http://www.novicksoftware.com/BookReviews/Visual%20Baisc.Net%20Business%20Objects.htm I am currently researching alternatives for generating the stored procedures and .Net classes based on Lhokda's CLSA framework. In the process, I have compiled a list of code generators that you might be interested in: http://www.novicksoftware.com/TipsAndTricks/tips-object-relational-mapping-for-.net.htm

Conclusion

Parts I and II of this series have shown how to write stored procedures to implement the basic CRUD database operations in SQL Server. Features of SQL Server 2000 such as identity columns, rowversion columns, computed columns, defaults, and some special purpose columns raise issues that must be handled in the design process. Those decisions and the requirements of your client side code dictate how the four procedures are written.

Writing the procedures by hand is possible but tedious. I have been investigating code generation solutions and even implemented one of my own. Code generation is definitely the way to go.

» See All Articles by Columnist Andrew Novick

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved