SQL Server: Some thoughts on Data Mining in SQL Server

Hi Friends,

Data Mining is seldom talked about. We all focus a lot on the SQL Server DB engine, and rightly so, from the perspective of programming, manageability, availability, etc. Today, I would like to talk a bit about the Data Mining offering in SQL Server. What exactly is Data Mining? Why do you need it? How does it benefit an organization? I plan to write a series of articles on data mining, but let me start with the “WHY” part of it… I hope I am successful in writing a series 🙂

Data Mining is the process of identifying hidden relationships and patterns in your data. I know that sounds very bookish, so let me explain. Sometimes organizations have difficult questions that need intelligent answers. For example, suppose your company sells products to millions of customers and wants to know why a customer purchases your product. What are the decision-making criteria before a customer decides to buy? Which factors influence the decision the most?

With millions of customers in your database, how would you answer such questions? Can human beings comprehend or analyze such a huge amount of data? Will you have the capacity to browse transaction by transaction (row by row) to fathom the attribute values in your customers/orders tables? For sure, this is not practical. Data mining comes to the rescue: the company needs it because the data is too complex, too large, or both, and casual human observation cannot do the job.

The company needs answers to these difficult questions because it wants to take some action based on the discoveries, that is, on the hidden relationships. This is the actionable information that the company wants. There are many scenarios where data mining can be used, such as:

   

• Forecasting sales
• Targeting mailings toward specific customers
• Determining which products are likely to be sold together
• Finding sequences in the order that customers add products to a shopping cart
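
To make the "products likely to be sold together" scenario a little more concrete, here is a minimal Python sketch of the underlying idea: count how often each pair of products shows up in the same order. The orders and product names are made up for illustration, and this simple pair counting is not a description of how SQL Server's mining algorithms work internally; it only shows the kind of pattern they look for.

```python
from collections import Counter
from itertools import combinations

# Hypothetical orders: each inner list holds the products in one order.
orders = [
    ["bike", "helmet", "gloves"],
    ["bike", "helmet"],
    ["helmet", "gloves"],
    ["bike", "bottle"],
]

def count_pairs(orders):
    """Count how many orders contain each unordered pair of products."""
    pair_counts = Counter()
    for order in orders:
        # sorted(set(...)) drops duplicates and gives each pair a stable order
        for pair in combinations(sorted(set(order)), 2):
            pair_counts[pair] += 1
    return pair_counts

pair_counts = count_pairs(orders)
# Show the pairs that are bought together most often.
for pair, count in pair_counts.most_common(3):
    print(pair, count)
```

With four orders this is trivial, but the number of candidate pairs grows quickly with the number of products and orders, which is one reason casual inspection stops being practical on real data.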

There are certainly more examples: US agencies use Data Mining to detect fraud, financial institutions use it to identify potential defaulters, and so on.

The subject I am talking about here is Predictive Analysis. Data Mining is also known as the KDD (Knowledge Discovery in Databases) process.

In my next post, I shall talk more about the Data Mining process and the project life cycle. Stay tuned.

 

 

   

About Amit Bansal

Amit Bansal is always brainstorming around SQL Server. Despite working with SQL since 1997, he is amazed that he keeps learning new things every single day. SQL Server is AB's first love, and his wife does not mind that. He tries to share as much as he can and spread the SQL goodness. Internals and Performance Tuning excite him, and also give him sleepless nights at times, simply because he is not a genius, but quite a hard worker who does not give up. It has been a long and exciting journey since 1997; you can read about it here: http://sqlmaestros.com/amit-bansal/ He is on Twitter: https://www.twitter.com/A_Bansal


2 Comments on “SQL Server: Some thoughts on Data Mining in SQL Server”

  1. Hi Amit, this will be a really interesting series (over 350 views in a few hours), and as usual I am sure you will present this so-called “complicated” topic (as many people think of it) in simple, lucid language.

    All the best

  2. Yes Amit, I will start working on a project probably so I will write more on DM. Your SSRS blog has received 200+ views – great going !!! Write more !
