MERGE is a new statement introduced in the SQL:2003 standard for performing multiple DML (Data Manipulation Language) statements against a target table at once. In this article well look into ways to take advantage of this powerful addition to SQL Server 2008.
What is MERGE
In SQL Server 2005 and prior, if you needed to insert, update, and delete from a table based on changes to another table, you had to write all three statements separately. With SQL Server 2008, all the changes to the target table can be combined into one MERGE statement. The advantage being that the tables being used for the MERGE are processed one time instead of one per INSERT, UPDATE, or DELETE.
MERGE takes a source query to use as a basis for determining what differences exist between it and the target table. Match conditions are established to direct when and what data are affected in the target table by the existence of a match or the lack of a match.
Below is the syntax for MERGE from SQL Server 2008 Books Online. I believe the syntax is best understood by looking at examples.
[ WITH <common_table_expression> [,...n] ] MERGE [ TOP ( expression ) [ PERCENT ] ] [ INTO ] <target_table> [ WITH ( <merge_hint> ) ] [ [ AS ] table_alias ] USING <table_source> ON <merge_search_condition> [ WHEN MATCHED [ AND <clause_search_condition> ] THEN <merge_matched> ] [...n ] [ WHEN NOT MATCHED [ BY TARGET ] [ AND <clause_search_condition> ] THEN <merge_not_matched> ] [ WHEN NOT MATCHED BY SOURCE [ AND <clause_search_condition> ] THEN <merge_matched> ] [...n ] [ <output_clause> ] [ OPTION ( <query_hint> [ ,...n ] ) ] ;
There are many potential uses for the MERGE statement such as with integration of systems, data warehousing, and even to facilitate movement of reference data changes from development through testing and into production. Here well examine the first one, integration between systems.
Assume you work for the e-commerce division of a large pet store corporation. That company has many brick-and-mortar pet stores as well as your online storefront for pet supplies. Your job is to make sure that certain data is properly maintained between your storefront and the pet stores in order to ensure the customer has a consistent experience across the corporation. The most important piece of information to be concerned with is the customer data. If John Smith walks into the pet shop in Chesapeake and purchases a puppy, you want to know near real time. So when he returns home and realizes he needs squeaky toys to substitute for his now maimed tennis shoes, he can come straight to your online pet supply store and get a warm fuzzy when he realizes the site already knows him and has recommended products he might find useful for improving his experience with his new billy goat..er puppy.
When John Smith purchased that puppy, the Chesapeake store collected data about him and sent it off to a service which in turn calls your stored procedure to update the online stores database. Your stored procedure used to look like the following:
CREATE PROCEDURE SaveCustomer ( @custID INTEGER, @custName VARCHAR(100), @custAddr1 VARCHAR(100), @custAddr2 VARCHAR(100), @custCity VARCHAR(50), @custState CHAR(2), @custZip VARCHAR(9), @custPhone VARCHAR(10), @custEmail VARCHAR(100)) AS BEGIN IF EXISTS (SELECT 1 FROM customer WHERE custID = @custID) UPDATE customer SET custName = @custName, custAddr1 = @custAddr1, custAddr2 = @custAddr2, custCity = @custCity, custState = @custState, custZip = @custZip, custPhone = @custPhone, custEmail = @custEmail WHERE custID = @custID ELSE INSERT INTO customer (custID, custName, custAddr1, custAddr2, custCity, custState, custZip, custPhone, custEmail) VALUES (@custID, @custName, @custAddr1, @custAddr2, @custCity, @custState, @custZip, @custPhone, @custEmail) END
Now that youve discovered MERGE, you re-write your stored procedure to look like this.
ALTER PROCEDURE SaveCustomer ( @custID INTEGER, @custName VARCHAR(100), @custAddr1 VARCHAR(100), @custAddr2 VARCHAR(100), @custCity VARCHAR(50), @custState CHAR(2), @custZip VARCHAR(9), @custPhone VARCHAR(10), @custEmail VARCHAR(100)) AS BEGIN MERGE dbo.customer AS cust USING (SELECT @custID, @custName, @custAddr1, @custAddr2, @custCity, @custState, @custZip, @custPhone, @custEmail) AS src (custID, custName, custAddr1, custAddr2, custCity, custState, custZip, custPhone, custEmail) ON cust.custID = src.custId WHEN MATCHED THEN UPDATE SET custName = src.custName, cust.custAddr1 = src.custAddr1, cust.custAddr2 = src.custAddr2, cust.custCity = src.custCity, cust.custState = src.custState, cust.custZip = src.custZip, cust.custPhone = src.custPhone, cust.custEmail = src.custEmail WHEN NOT MATCHED THEN INSERT (custID, custName, custAddr1, custAddr2, custCity, custState, custZip, custPhone, custEmail) VALUES (src.custID, src.custName, src.custAddr1, src.custAddr2, src.custCity, src.custState, src.custZip, src.custPhone, src.custEmail) END
Lets break this down. The first thing after the keyword MERGE is the target table. The USING clause specifies the source result set. In this case, the source comes from the arguments passed into the stored procedure. The source result set is aliased as src and the columns are given aliases as well. ON specifies the condition to be used to determine a match between the source and target tables. WHEN MATCHED THEN tells the engine what action should be performed on the target table should the ON clause result in a match between the tables. In our case, if the customer is found by custID, we want to uspdate that row in our customer table with the values passed in. If a matching row is not found, we drop into the action found under WHEN NOT MATCHED. This time, the customer is not found by custID, so we want to insert him into our customer table. Simple as that.
Another example of a use for MERGE, as I mentioned above is the movement of reference data changes through your development, test, and production environments. Over the course of application maintenance, there is often need to change or add to the reference data supplied to the application. This reference data may be the products you offer, the countries you do business in, part numbers, etc.
You write the script up that includes the data changes you are making for the next release and its packaged up and sent to testing. Something else breaks in testing that is part of the larger build. Now your script has to be smart enough to know if its already been run in the test database. Lets take a look at this. Heres your original script for a new product entry. Youve been bitten before by your script being run more than once in the same database, so youve wrapped it in an IF condition.
IF NOT EXISTS (SELECT 1 FROM Products WHERE ProductID = 201005) INSERT INTO Products (ProductID, ProductName, ManufName, ActiveFlag) VALUES (201005, 'Big Bounce Squeaky Toy', 'Fetch Boy, LLC', 'Y') GO IF NOT EXISTS (SELECT 1 FROM Products WHERE ProductID = 201006) INSERT INTO Products (ProductID, ProductName, ManufName, ActiveFlag) VALUES (201006, 'Fuzzy Bear', 'ChewThis, Inc', 'Y') GO
This same script would look like this using MERGE.
MERGE Products AS p USING (VALUES (201005, 'Big Bounce Squeaky Toy', 'Fetch Boy, LLC', 'Y'), (201006, 'Fuzzy Bear', 'ChewThis, Inc', 'Y') ) AS src (srcProductID, srcProductName, srcManufName, srcActiveFlag) ON p.ProductID = src. ProductID WHEN NOT MATCHED THEN INSERT (ProductID, ProductName, ManufName, ActiveFlag) VALUES (srcProductID, srcProductName, srcManufName, srcActiveFlag)
Personally, I find this syntax easier to write when there are many new data values to add. You can put all the new data together in the USING clause instead of in multiple VALUES clauses seperated by INSERTS as in the previous script.
SQL Server 2008 offers a new DML statement that allows you to combine INSERT, UPDATE, and DELETE operations into one. By processing the source and target tables on time using MERGE you can realize performance gains in addition to script simplification.
For More Information
MSDN - MERGE
TechNet - Optimizing MERGE Statement Performance