This is a continuation of string manipulation techniques. If you are interested, you can read the other articles,
though you do not need to read it before this one. These are mostly beginning
programming articles, but advanced T-SQL programmers may still find something useful here.
Part 1 deals with SUBSTRING and how it can be used to extract some information
from a field of data
Part 2 deals with CHARINDEX and how it can be used to extract some information
from a field of data when the data is delimited rather than stored in a particular format.
Part 3 deals with REPLACE and how it can be used to remove unwanted information
from a field of data when the data is not in a known format.
Part 4 deals with numeric conversions.
Continuing on with taming strings...
The first three parts of this series dealt with manipulating strings in response to some business issue, usually dealing with phone numbers or zip codes. This article continues in the same vein with another alternative for dealing with the formatting issues that occur with these types of data.
Usually I receive data in a variety of formats and need to force it into a standard format. In past articles, I have used various string techniques to remove unnecessary data and reformat the data into a standard format. A lack of validation of input data, or no control over the data being input is the cause, but it happens. However, once I remove all extraneous formatting to standardize the data, I sometimes need to add additional formating to standardize the data for a front end application(s).Suppose that I had scrubbed and cleansed all my phone number data from Part 1 of this series to look like the following:
phone ---------- 6055552862 5615552700 9045555680 N/A 5805555371 2815558368 2545558430 3365552797 3365557233 5925553181x4951 96615551222x249 447930555271Now I wish to format this data so that all phone numbers are stored in the following format:
(999) 999-9999for US phone numbers. For this article, assume that I have some method of determining which rows are US phone numbers and which are not. This leaves me with the following rows
phone ---------- 6055552862 5615552700 9045555680 5805555371 2815558368 2545558430 3365552797 3365557233 5925553181x4951 96615551222x249that I want to convert to:
phone ---------- (605) 555-2862 (561) 555-2700 (904) 555-5680 (580) 555-5371 (281) 555-8368 (254) 555-8430 (336) 555-2797 (336) 555-7233 (592) 555-3181x4951 (966) 155-51222x249
Once again, I decided to search my handy-dandy Books Online (for those of you with young kids, you will get the joke. For the rest of you look here) in the string functions area and I decided to use STUFF. This is a string function that allows you to delete characters from a string and insert a string value into another string with control over the positioning of the insertion. The difference from REPLACE is that this function uses a position in the string to make replacement rather than a pattern.
The format for STUFF is as follows:
STUFF ( character_expression 1, start , length , character_expression 2)
where the parameters are:
- Character_expression 1 - The string expression in which to insert data
- Start - The start position for the insertion
- Length - The number of characters in character_expression 1 to delete.
- Character_expression 2 - the string to insert into character_expression 1
The trick is to not delete any characters. If I use the following code:
declare @d char( 14) select @d = '6055552862' select stuff( @d,1 ,1,'('), @dI get
(055552862which is not what I want. Instead, I want to change the length parameter to a 0, so no characters are deleted. In this case, an insertion occurs with no characters being deleted. The code now looks like:
declare @d char( 14) select @d = '6055552862' select stuff( @d,1 , 0,'('), @dI get
(6055552862By expanding this to include all the characters that I need, I can reformat the entire string in one (messy) SQL statement. Here is it:
declare @d char( 14) select @d = '6055552862' select stuff( stuff( stuff( stuff( @d,1 ,0, '('), 5, 0, ')'), 6, 0, ' '), 10, 0, '-')which yeilds
This statement could have easily been done with the SUBSTRING command and a few concatenations. I think the STUFF command is cleaner and easier to read. Once you get used to using it's syntax. Of course, to update the table of information, I would have to be sure that all rows were in the same format or use a WHERE clause to limit the update to those rows that match the criteria.
I hope and am sure this function will really spark some ideas in many of you. Hopefully I will come in handy for you in solving some problem.
As always, I welcome feedback and please rate this article below (and any you read on Swynk). It helps to motivate and assist us authors in writing better columns.Steve Jones