SqlCredit - Part 9: Message Authentication Codes

Thursday Sep 27th 2007 by Rob Garrison

This month's installment of "Developing a Complete SQL Server OLTP Database Project" covers searching encrypted data, dictionary attacks, and look-ups by hashed value.

Download the files for this article.

This is part 9 of a series. If you have not read part 8, you can find it here.

The Problem: Searching Encrypted Data

Once we have encrypted the data in our table, we cannot efficiently search that data without making further changes. Remember from part 7 that “EncryptByCert, EncryptByKey, EncryptByAsymKey, and EncryptByPassPhrase all create different results (ciphertext) each time they are called.”

The solution is to use Message Authentication Codes. Note that this is a special use of the general term used for messaging.

Hashing is different from encryption in that it always returns the same ciphertext. If I run this code:

	SELECT HashBytes('MD5', 'secret')

I get this ciphertext:


Running it again returns the same result. Running it on a different machine also returns the same result.

Different algorithms return different ciphertext. For example, if I run this code:

	SELECT HashBytes('SHA1', 'secret')

I get this ciphertext:


HashBytes always returns varbinary results with the following length:

Algorithm      Length
MD2, MD4, MD5      16
SHA, SHA1      20

I created two UDTs (HashResultMD and HashResultSHA) based on the result lengths of the different algorithms. It would be very reasonable to use the maximum length and have only one type. Another option is to use just some of the bytes of the result. I have chosen here to use the complete result. It is important to understand that, even using the full result, there is not a guarantee that you will not have collisions.

What About Dictionary Attacks?

Because hashing creates consistent ciphertext, it would be possible to run a dictionary attack against a table and find values that match a given string. From the example above, we can see that any entry with ciphertext that matches “0x5EBE2294ECD0E0F08EAB7690D2A6EE69” was created by hashing the string “secret” using the MD5 algorithm.

The way to fix this is by using a “salt” value. If we can unencrypt a salt value using DecryptByKey (which we have seen is fast) and then pre-pend (or append) the string we’re trying to store, then a hacker can’t just hash a set of values and compare the ciphertext against the result.

The salt should be a reasonable length, but I don’t think you should expose the exact length in your code. If you create a UDT that allows for the storage of a hashed value prepended with a secret, the UDT should allow for some variance in the secret length.

Look-Up by Hashed Value

Reference the attached file EncryptByKeyPerfTest_Setup.sql.

Given a value (for example, CardNumber or SecureString), we can now perform efficient searches. Since hashing consistently returns the same ciphertext, we can first hash the look-up value and then find a record or records that match that value.

Note that SecurityCode1 does not have a corresponding MAC column in CardEncryptByKey. This is because there is no use case for looking up a record by its security code without its card number. If we use a MAC column to allow efficient look-ups of records, the security code column is used only for validation that the proper record has been found.

Create MAC columns only for columns that will be searched, and remember that the MAC supports efficient look-up and not exact filtering.

A Simple(ish) Example

Reference the attached file SimplestEncrWithMAC.sql.

It is easier to show code than explain everything behind it. Here, we build a table to store the salt and another to store data records:

	CREATE TABLE dbo.SecretSalt (
	    SaltID        tinyint        NOT NULL,
	    Salt          varbinary(100) NOT NULL
	CREATE TABLE dbo.Hash1 (
	    RecordID      int            NOT NULL,
	    EncryptResult varbinary(116) NOT NULL,  -- Encrypted
	    HashResult    varbinary(20)  NOT NULL   -- Hashed

First, let’s add the “salt” record.

	INSERT INTO dbo.SecretSalt (

Next, add records to Hash1. We run this code within a loop to create n records.

	SELECT @string1 = 'Simple' + CONVERT(varchar(11), @ct);
	INSERT INTO dbo.Hash1 (
	    EncryptByKey(Key_GUID (N'TestKey1'), @string1),
	    HashBytes('SHA1', @string1)

The code will add records that look like this:

	RecordID EncryptResult              HashResult
	-------- -------------------------- -------------------
	       1 0x00EC5555DE58...FFEDCA8C7 0xF71C033...4C307B5
	       2 0x00EC5555DE58...D73349886 0x7A66A82...A6B4DBC

Now, we can look up a single record that matches an expected string. We pick the string “5” which should match the fifth record and return a RecordID of 5.

	DECLARE @lookUp int;
	SELECT @lookUp = 5;     -- Pick some record to search for
	    CONVERT(int, CONVERT(varchar(11), DecryptByKey(h.EncryptResult)))
	        AS 'Decrypted Value'
	FROM dbo.Hash1            AS h
	CROSS JOIN dbo.SecretSalt AS s
	WHERE s.SaltID = 1
	  AND h.HashResult = HashBytes (
	    CONVERT(varchar(63), DecryptByKey(s.Salt)) + CONVERT(varchar(11), @lookUp)
	  )	-- This uses the index to find the record
	  AND CONVERT(int, CONVERT(varchar(11), DecryptByKey(h.EncryptResult))) = @lookUp;

The WHERE clause has three parts:

1.      Even though there is only one record in the SecretSalt table, tell it which one to use.

2.      Allow the optimizer to use the index on the varbinary hash column.

3.      Validate that the correct record or records are returned.

All these parts are required for an efficient and reliable look-up. This returns the exact record we expected.

What do you think? If you agree, fine. If not, provide feedback in the forum. I have added a new thread called MAC Code. This code is complex. If you find a more straightforward way of doing this, please share.

Limitations and Options

Message authentication codes work only for finding an exact match. It does not support range queries or wildcards.

Using a message authentication code column allows a hacker to determine values that are the same. As an example, if we salted and then hashed two credit card numbers that were the same, the ciphertext would be exactly the same.

You may choose to not use the full ciphertext returned. For instance, the varbinary result (16 or 20 bytes) can be converted to an integer (4 bytes) or bigint (8 bytes).

Whether you use the full ciphertext or a portion of it, you still have to validate that the encrypted value matches the value you are searching for because of the possibility of hash collisions. An option is to SELECT records that match the hashed value and only validate the encrypted data if the SELECT returns more than one record.

For Next Time

Next month, I will discuss the performance results of the combination of encryption and hashing. I will also update the SqlCredit code to include all these changes.


Download the files for this article.

Discuss this article in the SQLCredit Forum.

» See All Articles by Columnist Rob Garrison

SqlCredit - Developing a Complete SQL Server OLTP Database Project
- Performance Testing SQL 2008's Transparent Data Encryption
- SQL Server 2008's Change Data Capture - Tracking the Moving Parts
- Performance Testing - SQL Server 2008 versus SQL Server 2005
- Exploring SQL Server's Index INCLUDEs
- Exploring SQL 2005's Ranking Functions - NTILE() and ROW_NUMBER()
- SqlCredit - Part 19: Exploring SQL 2005's Ranking Functions - RANK() and DENSE_RANK()
- SqlCredit, Part 18: Exploring the Performance of SQL 2005's OUTPUT Clause
- SqlCredit - Part 17: Exploring SQL 2005's OUTPUT Clause
- SqlCredit - Part 16: The Cost of Bloat
- SqlCredit - Part 15: The Cost of Distribution
- SqlCredit - Part 14: The Cost of Translation
- SqlCredit - Part 13: More on Indexed Persisted Computed Columns
- SqlCredit - Part 12: Exploring Indexed Persisted Computed Columns
- SqlCredit - Part 11: Change Tracking Using History Records
- SqlCredit - Part 10: MAC Performance and Updating SqlCredit
- SqlCredit - Part 9: Message Authentication Codes
- SqlCredit Part 8: Comparing Encrypt/DecryptByCert and Encrypt/DecryptByKey
- SqlCredit Part 7: Performance Impact of EncryptByCert and DecryptByCert
- SqlCredit Part 6: Exploring EncryptByCert and DecryptByCert
- SqlCredit - Part 5: Adding Card, Vendor, and Purchase, Plus Much Refactoring
- SqlCredit - Part 4: Schema and Procedure Security
- SqlCredit - Part 3: Schema/Proc Updates and Automated Unit Testing
- SqlCredit - Part 2: Creating the Database, Tables, CRUD Procedures
- SqlCredit - Developing a Complete SQL Server OLTP Database Project

Mobile Site | Full Site