SQL Server – Do You Know Soundex Functions?

Hi Folks,

I am sure, many of you would be wondering what this is all about, and some might have heard it also. So you don’t have to search it further. I am going to give you some heads up:

  • Soundex is a phonetic pattern-matching system created mainly for the American census. It was developed earlier by Margaret K. Odell and Robert C. Russell and also was awarded for their efforts.
  • The main function of Soundex is to order the same sounding names together, which also turns out to be very useful for dealing with the database application containing the contact informaion.
  • For e.g., was it Smith or Smythe?  Chapple, Chapel or Chapelle? The Soundex Functions makes it easy to recognize these difficulties.
  • This system uses a simple phonetic algorithm to reduce each name to a four character alphanumeric code.
  • The first letter of the code corresponds to the first letter of the name. The remainder of the code consists of three digits derived from the syllables of the word according to the following code:
    • 1 = B, F, P, V
    • 2 = C, G, J, K, Q, S, X, Z
    • 3 = D, T
    • 4 = L
    • 5 = M,N
    • 6 = R
  • The double letters with the same Soundex code, A, E, I, O, U, H, W, Y, and some prefixes are being disregarded. Like “Piyush” becomes “P200” by the following method:
    • The P is stored.
    • The i, y and u are disregarded.
    • The s is stored as Soundex code 2.
    • The h is disregarded.
    • As you know to reduce each name to a 4-character alphanumeric code, then in the rest two places put 0’s
  • You can see more examples like:
    • Elephant = E415 (l=4, p=1, n=5)
    • Computer = C513 (m=5, p=1, t=3)
    • Database = D312 (t=3, b=1, s=2)
  • SQL Server includes two Soundex-related functions, SOUNDEX () and DIFFERENCE (). We discuss each of them briefly:

Using the SOUNDEX () function:

  • The SOUNDEX (string) function calculates the Soundex code for a string in SQL Server as follows:

1_SQL_Server_Do_You_Know_Soundex_Functions

  • So we can use it easily in the SELECT statement and in the WHERE clause, as follows:
USE TEST  
SELECT FName, LName FROM dbo.Students
WHERE SOUNDEX('Pareshan') = SOUNDEX(FName);

The result can be seen as follows where similar sounding Names are in one column:

2_SQL_Server_Do_You_Know_Soundex_Functions

   
  • This implementation has the smaller impact on the data schema, but it will cause performance issues as the data size grows because the SOUNDEX() function must execute for every row in the database, due to which an index on the name column cannot be used with an efficient seek operation, but only with a much more expensive scan.
  • Thus, this issue can be cured if the name with the same first letter is selected in a WHERE clause. This allows SQL Server to use any indexes to narrow the search, so fewer rows must be read and the SOUNDEX() function must be performed only for rows selected by the index as follows:
USE TEST  
SELECT FName, LName FROM dbo.Students
WHERE SOUNDEX('Piyush') = SOUNDEX(FName)
AND FName LIKE 'P%';

The results can be seen as same as the above what the big difference these two queries are in performance terms.

The first query have taken of about 40 miliseconds to execute while second one have taken just about 7 milliseconds. So you can see the difference between these two queries, which can be turned out to be very handy while accessing big databases.

Using the DIFFERENCE () Soundex function:

  • The DIFFERENCE(String)  function returns the Soundex difference between the two strings in the form of a ranking from 0 to 4, with 4 representing a perfect Soundex match:
USE TEST  
SELECT FName, DIFFERENCE('Piyush',FName) AS DIFF_NAME 
FROM dbo.Students
ORDER BY DIFFERENCE('Piyush',FName) DESC;

The result can be seen as:

3_SQL_Server_Do_You_Know_Soundex_Functions

  • The main advantage of DIFFERENCE() function is that it broadens the search keyword beyond the first letters.

Well this was all about Soundex function.

Hope You like it 🙂

 

Regards

Piyush Bajaj

Like us on FaceBook Follow us on Twitter | Join the fastest growing SQL Server group on FaceBook

Follow me on Twitter  |  Follow me on FaceBook

   

About Piyush Bajaj

I am very passionate about SQL Server. I also did certification on MCSA – SQL Server 2012, Querying and Administering; MCTS – SQL Server 2008, Database Development; and MCTS – SQL Server 2005, Implementation & Maintenance, which helped me to get more knowledge and interest on this field.Please feel free to drop me any question online or offline, I will try to give you the best possible answer from my side.Right now I am working as a SQL Server developer in TCS. I have an experience of just 2.6 years, well I can only say that “If you have an interest and passion, experience might become a very small thing”.

View all posts by Piyush Bajaj →

10 Comments on “SQL Server – Do You Know Soundex Functions?”

  1. Useful post, thanks. I’m still trying to find the most effective way to create a “Google-like” search feature so this may assist in my efforts. FTI is another option but has its drawbacks.

  2. Hi piyush,
    Iam working in cmc limited , i have a doubt here,is there any function to implement “metaphone”(similar to soundex) by using sql

Leave a Reply

Your email address will not be published.