SQL Sam and the Never-Ending Delete - Part 1 (with clues)

Tuesday Aug 24th 1999 by Steve Hontz
Share:

We're just trying to do a simple delete, Sam, and the system just keeps banging away at the disks for hours and never finishes!

Environment: SQL Server 6.5, SP5a


Turn OFF clues

"Sam! Stop this crazy thing!" pleaded Chuck Watson, Data Center Manager, as SQL Sam strode into the server room. Next to Chuck stood an official-looking man and woman, dressed in neat suits. Sam recognized them as Dana Nully and Fox Nullder, the DBAs sent from the consulting agency.

"Don't worry, Chuck, whatever it is, we'll get it figured out," soothed Sam. "What's going on?"

"We're just trying to do a simple delete, Sam, and the system just keeps banging away at the disks for hours and never finishes!"

Nullder stepped forward. "What we have here is a clear case of alien abduction," he said.

Nully rolled her eyes. "Nullder, we have no evidence that aliens were involved here," she reminded him.

"Well, how else do you explain it, Nully? There's no way a simple DELETE statement could run so long!"

"Hmmm," said SQL Sam. "I believe the truth is in here. Let's see what we can find out. Chuck, tell me exactly what you're trying to do."

"Well, Sam, we have a customers table in our accounts database that uses a char(25) column as the primary key. It holds the customer's phone number, left-padded with zeroes, like this:

0000000000000008002781234

"For our testing, we populated the table with 5 million rows of data. Now we want to remove all the dummy customer numbers that look like 800-9XX-XXXX. That's about A CLUE!a million rows of data. We made a simple DELETE query like this:

DELETE FROM customers WHERE cust_num LIKE '%8009______'

"But this simple DELETE has been a nightmare! First, we ran out of space in the log, even though Truncate log on checkpont was turned on. So we increased the size of the log. Now we don't run out of space, but the DELETE still runs forever. We've even killed the process, even rebooted, but then the database takes forever in recovery mode. So, we're trying it again. But the delete never stops!"

"Aliens," muttered Nullder.

"Let's see if there are any blocking processes," Sam said, ignoring Nullder. Sam ran sp_who2 and got this result:

SPID  Status     Login HostName    BlkBy DBName                      Command          CPUTime DiskIO LastBatch      ProgramName SPID  
----- ---------- ----- ----------- ----- --------------------------- ---------------- ------- ------ -------------- ----------- ----- 
1     sleeping   sa      .           .   master                      MIRROR HANDLER   0       0      08/10 14:24:22             1     
2     sleeping   sa      .           .   master                      LAZY WRITER      0       0      08/10 14:24:22             2     
3     sleeping   sa      .           .   master                      CHECKPOINT SLEEP 0       37     08/10 14:24:22             3     
4     sleeping   sa      .           .   master                      RA MANAGER       0       24     08/10 14:24:22             4     
12    RUNNABLE   sa    TESTSYS       .   accounts                    INSERT           12768   2765   08/10 14:30:40 MS SQLEW    12

Continue with SQL Sam and the Never-Ending Delete

 

Go to the solution now!

 

Back to SQL Sam Cases


Steve Hontz is President of The Bit Corner, Inc., a Phoenix-based consulting firm specializing in Windows NT, C++, and SQL Server development. You can reach him at steve@bitcorner.com.


SQL Sam and the Never-Ending Delete - Part 2

Environment: SQL Server 6.5, SP5a


Turn OFF clues

"Hmmm," said Sam. There's only one process in your accounts database, and it's not blocked. But there is something interesting here- A CLUE!the command shows up as "INSERT", not "DELETE".

"Could be a bug in sp_who2," suggested Nully.

"Maybe," said Sam, "but I doubt it. I think there's more going on here than meets the eye. Is it okay if we stop this delete, Chuck?"

"Sure, Sam. We're not in production now. I just want to get to the bottom of this."

SQL Sam killed the process and then waited for the DELETE operation to rollback. It took a long time because of the large number of records involved.

"Okay, it's time to look at the query plan," said Sam. Sam brought up a query window and entered the following:

set rowcount 1
DELETE FROM customers WHERE cust_num LIKE '%8009______'

"What's the 'set rowcount 1' for?" asked Chuck.

"That's so we can delete just one row and see what happens," said Sam. He then turned on the Show Query Plan option in the Query Options dialog and ran the query. This is what he saw:

STEP 1
The type of query is DELETE
The update mode is deferred
FROM TABLE
customers
Nested iteration
Table Scan
TO TABLE
customers
STEP 1
The type of query is COND
STEP 1
The type of query is SELECT
FROM TABLE
dbo.customers DELETED
Nested iteration
Using Dynamic Index
STEP 1
The type of query is COND
STEP 1
The type of query is INSERT
The update mode is deferred
FROM TABLE
dbo.customers deleted
Nested iteration
Using Dynamic Index
NOT EXISTS : nested iteration
FROM TABLE
Updated_Customers 
Nested iteration
Table Scan
TO TABLE
Updated_Customers 

"Okay," said Sam. "I can now see that your simple DELETE isn't so simple. I can see several reasons why this delete runs so long."

What did SQL Sam see? Go to the solution now!

 

Review SQL Sam and the Never-Ending Delete - Part 1

 

Back to SQL Sam Cases


Steve Hontz is President of The Bit Corner, Inc., a Phoenix-based consulting firm specializing in Windows NT, C++, and SQL Server development. You can reach him at steve@bitcorner.com.


SQL Sam and the Never-Ending Delete - Solution!

"It's not alien abduction, psycho-kinesis, or some other unexplained phenomona?" said Nullder, sounding disappointed.

"No, it's really much simpler than that," said Sam. "First, you said you were deleting about a million rows from a five-million row table. You're doing it with one DELETE, which makes it an implicit transaction that must succeed in its entirety. So, SQL Server has to log the deletion of a million records before it can actually commit. That's why you ran out of log space at first."

"But we had Truncate on Checkpoint turned on," said Chuck.

"True," said Sam. "But the truncate process can only remove committed transactions. Your single transaction can't be committed until all 1 million rows are logged and written to the database, so Truncate on Checkpoint won't help you in this case.

"Second, the way the query is structured, using a LIKE clause that begins with a % wildcard, guarantees that the query optimizer cannot use any index on the table. So, it has to do a table scan to find the rows to delete. With a large table like this, that adds to the time it takes to run the query. The query plan confirmed this."

"What about that sp_who2 that showed an INSERT, not a DELETE?" asked Nully.

"That's why I ran the query plan," said Sam. "The plan proved that the DELETE wasn't just a simple DELETE. There's a trigger firing, too."

Sam pulled up the trigger defined for the table. It looked like this:

CREATE TRIGGER Update_Customers ON dbo.Customers
FOR UPDATE, DELETE
AS
DECLARE @deleted_num char (25)

SELECT @deleted_num = cust_num FROM DELETED

if @deleted_num IS NOT NULL
    INSERT INTO Updated_Customers (cust_num)
      (SELECT cust_num FROM deleted
      WHERE cust_num NOT IN (SELECT cust_num FROM Updated_Customers))

"That's why the query plan was so complicated. As rows are being deleted, they are added to the Updated_Customers table. So, not only does SQL Server have to log the deletion of the records in the customers table, it has to see if they exist in the updated_customers table and then log those changes, too."

"Well, how do we make this work, Sam?" asked Chuck.

"If you don't need the trigger for this test environment, we can drop it. And, we can make the DELETE operation happen in smaller chunks so we don't overrun the log."

Chuck agreed. SQL Sam dropped the trigger, then rewrote the DELETE operation like this:

set rowcount 10000
while 1=1
    begin
      DELETE FROM customers WHERE cust_num LIKE '%8009______'
      IF @@rowcount = 0
          BREAK
      ELSE
          CHECKPOINT
    end

This broke the delete up into deletes of 10,000 rows each, giving SQL Server a chance to commit the changes and write them out. With this change, and the trigger dropped, the million test rows were deleted in 15 minutes.

Go read more exciting SQL Sam Cases!

See the story behind the story in Behind the SQL: The Making of SQL Sam!


Steve Hontz is President of The Bit Corner, Inc., a Phoenix-based consulting firm specializing in Windows NT, C++, and SQL Server development. You can reach him at steve@bitcorner.com.


Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved