What do you do when your application logic requires that you INSERT a record if the record doesnt exist, or UPDATE the record if it does exist? If you are using SQL Server 2005 or below you would have to write a series of T-SQL statements that check to see if the record exists, and then write an UPDATE or INSERT statement depending on your existence checks. With SQL Server 2008, Microsoft has simplified the amount of T-SQL code it requires to perform INSERT/UPDATE logic, by implementing the MERGE statement. This article will explore how to use the MERGE statement.
What can you do with the MERGE Statement
You can do more with the MERGE statement then just inserting and updating records as described above. You can also use it to delete records. Another thing you can do with the MERGE statement is to perform a specific number of merge actions based on a TOP clause. The MERGE statement can also generate output that will identify which records where inserted, updated, or deleted. The best way to describe how the new MERGE statement works is to go through some examples, and examine how they work.
Basic INSERT or UPDATE Example
For the first example, I will perform updates or inserts to a target table based on a source table. The source table will contain the records you either want to insert or update, and the target table will be the table where the inserts or updates will actually be performed. This kind of operation is sometimes referred to as an upsert.
For this example I will be performing inserts and updates to a Customers table. My Customer table will be considered the target table. The records I want to insert or update will be contained in a table named NewCustomers. This table is known as the source table.
Here is the code to create and populate the target and source tables:
SET NOCOUNT ON; -- Create Target Table CREATE TABLE Customers (LastName VARCHAR(50), FirstName VARCHAR(50)); INSERT INTO Customers VALUES ('Doe', 'Jane'); -- Create Source Table CREATE TABLE NewCustomers(LastName VARCHAR(50), FirstName VARCHAR(50)); INSERT INTO NewCustomers VALUES ('Doe', 'John'); INSERT INTO NewCustomers VALUES ('Smith', 'Doris');
Ive created a Customers table with a single record, and a NewCustomers table with two different records. The records in the NewCustomers table will be used to update and insert records in to the Customers table. The first record in the NewCustomers table is going to be used to update the single existing Customers record, where as the second record in the NewCustomers table is going to be inserted into the Customers table. To do the insert and update, I will use the following MERGE statement:
-- Update and Insert into Customers MERGE Customers AS C USING NewCustomers AS NC ON C.LastName = NC. LastName WHEN MATCHED THEN UPDATE SET C.FirstName = NC.FirstName WHEN NOT MATCHED THEN INSERT (LastName, FirstName) VALUES (NC.LastName,NC.FirstName);
Here the target table is identified by naming it immediately following MERGE statement, in my example that would be Customers. Next, the MERGE statement uses the USING clause to identify the source of the updates and inserts, which in this case is the NewCustomers table. In order to identify which records are updates and which ones are inserts I need to identify a key column to match on between the Customers and NewCustomers table. To accomplish this matching of records between the two tables the MERGE statement uses the ON clause just as you would do when you are joining two tables. Here I am joining the source and target tables based on the LastName column in source and target tables. The different WHEN criteria of the MERGE statement identify the conditions for when an update or insert is to be performed. When a record is found in both source (NewCustomers) and target (Customers) tables then an UPDATE statement is performed to update on the FirstName column. This condition meets the WHEN MATCHED criteria above. When a join does not find a record in the target (Customers) table for a record in the source (newCustomer), then this is the WHEN NOT MATCHED criteria is executed. When this occurs an INSERT statement is performed to insert the LastName and FirstName value into the Customers table. After I have run my MERGE statement, my updated Customers table contains the following records:
LastName FirstName ------------------------------------------ -------------------------------------------------- Doe John Smith Doris
By looking at the output, you can see the MERGE statement changed the FirstName column of the Doe record from Jane to John. The MERGE did an UPDATE to accomplish this by using the MATCHED criteria. The second record was created when the NOT MATCH criteria inserted a record.
Basic Insert, Update and Delete Example using MERGE
The MERGE statement can be used to maintain a target table based on information in a source table. By maintaining a target table, I mean the MERGE statement can perform the following actions:
- Insert a new record from the source if the record is not in the target.
- Update a target record if the record exists in the source table.
- Delete a record in the target table if the record is not found in the source.
To show how this works let me go through an example. My example will use the MERGE statement to maintain a student roster for the students enrolled in the current year. For this example, my target table will be Students. Here is the code to create and populate my Students table:
CREATE TABLE Students (LastName VARCHAR(50), FirstName VARCHAR(50), Address VARCHAR(100), Age INT); INSERT INTO Students VALUES ('Fritz', 'David', '181 Kline Street', 14) ,('Reese', 'Paul' , '4429 South Union', 14) ,('Brown', 'Jake' , '5401 Washington Ave',14);
My student table needs to be updated every year with a list of all the new students as well as the returning students. To do this I need a source table that will contain those records that need to be inserted, or updated. Here is the code to create and populate my source table:
CREATE TABLE NewYearRoster(LastName VARCHAR(50), FirstName VARCHAR(50), Address VARCHAR(100), Age INT); INSERT INTO NewYearRoster VALUES ('Fritz', 'David', '181 Kline Street', 15) ,('Reese', 'Paul', '1950 Grandview Place', 15) ,('Adam', 'Wilbur', '4231 W. 93rd', 15);
My merge example will be using the source table (NewYearRoster) to maintain the target table (Students) by inserting new students, updating students information if they were enrolled last year (meaning already in Student table), and delete student records if they are not enrolled for the new year (meaning not in NewYearRoster table). To accomplish this here is my MERGE statement:
MERGE Students AS T USING NewYearRoster AS S ON S.LastName = T. LastName and S.FirstName = T.FirstName WHEN MATCHED THEN UPDATE SET T.Address = S.Address, T.Age = S.Age WHEN NOT MATCHED THEN INSERT (LastName, FirstName, Address, Age) VALUES (S.LastName, S.FirstName, S.Address, S.Age) WHEN NOT MATCHED BY SOURCE THEN DELETE;
After I run the above command, my Student table contains the following records:
LastName FirstName Address Age -------------- ---------------- ------------------------------------------------ ----------- Fritz David 181 Kline Street 15 Reese Paul 1950 Grandview Place 15 Adam Wilbur 4231 W. 93rd 15
If you review this with my original Students table, you will find that the above MERGE statement did the following:
- David Fritzs record got the Age columns updated (also the address was updated but it was the same in both the source and target tables)
- Paul Reeses record got the Address and Age columns updated
- Jake Browns student record was deleted
- Adam Wilburs record was inserted
Adding Additional MERGE Search Conditions
You can add additional conditions to the MATCH or NOT MATCH clauses to make your MERGE search conditions more complicated. Below I have modified my above simple insert, update, or delete example to only update existing records if the address in the NewYearRoster is different from the current Students address record:
MERGE Students AS T USING NewYearRoster AS S ON S.LastName = T. LastName and S.FirstName = T.FirstName WHEN MATCHED and T.Address <> S.Address THEN UPDATE SET T.Address = S.Address, T.Age = S.Age WHEN NOT MATCHED THEN INSERT (LastName, FirstName, Address, Age) VALUES (S.LastName, S.FirstName, S.Address, S.Age) WHEN NOT MATCHED BY SOURCE THEN DELETE;
I added my additional MERGE search condition to the WHEN MATCHED clause, so now it reads WHEN MATCHED and T.Address <> S.Address THEN UPDATE . By doing this the David Fritzs Students record does not get updated because this record has the same address in both the target and source tables.
Output Clause
If you want to find out what records are being inserted, updated, or deleted when you run a MERGE statement you can use the OUTPUT clause. This clause will output information you specify for every record inserted, updated, and/or deleted. Ive modified my basic insert, update, and delete example above to output the type of action taken, plus the First and Last name. Here is that modified MERGE statement.
MERGE Students AS T USING NewYearRoster AS S ON S.LastName = T. LastName and S.FirstName = T.FirstName WHEN MATCHED THEN UPDATE SET T.Address = S.Address, T.Age = S.Age WHEN NOT MATCHED THEN INSERT (LastName, FirstName, Address, Age) VALUES (S.LastName, S.FirstName, S.Address, S.Age) WHEN NOT MATCHED BY SOURCE THEN DELETE OUTPUT $action, Inserted.LastName, Inserted.FirstName, Deleted.LastName, Deleted.FirstName;
When I run this statement, I get the following output:
$action LastName FirstName LastName FirstName ---------- ------------ ------------- ------------- ------------------- INSERT Adam Wilbur NULL NULL UPDATE Fritz David Fritz David UPDATE Reese Paul Reese Paul DELETE NULL NULL Brown Jake
Now you can tell this statement performed 4 different actions, 1 insert, 2 updates and 1 delete statement.
Using Top Clause in MERGE Statement
You can add a TOP clause to the MERGE statement. When you do this, the MERGE statement will limit the number of actions performed to the number specified in the TOP clause. Below I have modified my basic insert, update, and delete example to contain a TOP clause to request that only 2 actions be taken:
MERGE TOP (2) Students AS T USING NewYearRoster AS S ON S.LastName = T. LastName and S.FirstName = T.FirstName WHEN NOT MATCHED THEN INSERT (LastName, FirstName, Address, Age) VALUES (S.LastName, S.FirstName, S.Address, S.Age) WHEN NOT MATCHED BY SOURCE THEN DELETE WHEN MATCHED THEN UPDATE SET T.Address = S.Address, T.Age = S.Age OUTPUT $action, Inserted.LastName, Inserted.FirstName, Deleted.LastName, Deleted.FirstName;
Here is the output produced by the OUTPUT clause when I run this MERGE statement:
$action LastName FirstName LastName FirstName ---------- ------------ ------------- ------------- ------------------- UPDATE Fritz David Fritz David UPDATE Reese Paul Reese Paul
Here the MERGE statement only performed 2 UPDATE statements. If you look at my MERGE statement I arranged the WHEN clause to have the UPDATE statement as the last set of actions to be taken. However, you can see that my MERGE statement did not perform the actions in the order they where specified in the MERGE statement. If you play with this statement you will find the updates, are performed first, then inserts, and lastly delete statements will be performed.
Benefits of Using MERGE
The MERGE statement simplifies the amount of work it takes to implement insert and update logic. Now with a single statement you can easily implement insert, update, and delete logic to handle criteria for maintaining a table. The MERGE statement handles all the joining of the source and target. This minimizes the amount of code you need to write to handle merge logic. The MERGE statement is just one of the many enhancements that Microsoft has included in SQL Server 2008.