Hi Friends,

Data mining is seldom talked about. We all focus a lot on the SQL Server database engine, and rightly so, from the perspective of programming, manageability, availability, etc. Today, I would like to talk a bit about the data mining offering in SQL Server. What exactly is data mining? Why do you need it? How does it benefit an organization? I plan to write a series of articles on data mining, but let me start with the “WHY” part of it… I hope I am successful in writing the series :)

Data mining is the process of identifying hidden relationships and patterns in your data. I know that sounds very bookish, so let me explain. Sometimes organizations have difficult questions that need intelligent answers. For example, say your company sells products to millions of customers and wants to know why a customer purchases your product. What are the decision-making criteria before a customer decides to buy? Which factors influence the decision the most? With millions of customers in your database, how would you answer such questions?

Can human beings comprehend or analyze such a huge amount of data? Would you have the capacity to browse transaction by transaction (row by row) to make sense of the attribute values in your customers and orders tables? Surely that is not practical. Data mining comes to the rescue: the company needs it because the data is too complex, too large, or both, and casual human observation cannot do the job. The company needs answers to these difficult questions because it wants to take action based on what is discovered, that is, on the hidden relationships. This is the actionable information the company is after. There are many scenarios where data mining can be used, such as:

• Forecasting sales
• Targeting mailings toward specific customers
• Determining which products are likely to be sold together
• Finding sequences in the order that customers add products to a shopping cart

There are certainly more examples: US agencies use data mining to detect fraud, financial institutions use it to identify potential defaulters, etc.
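
To make one of the scenarios above a little more concrete, here is a minimal sketch of the "which products are likely to be sold together" idea. This is plain Python with made-up order data (the `orders` list and the `count_pairs` helper are my own illustrations), and it only counts how often two products appear in the same order. It is not how SQL Server's data mining algorithms work internally; it just shows the kind of pattern that is hard to spot by reading rows one at a time.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each inner list is the products in one order.
orders = [
    ["bike", "helmet", "gloves"],
    ["bike", "helmet"],
    ["helmet", "gloves"],
    ["bike", "bottle"],
    ["bike", "helmet", "bottle"],
]


def count_pairs(transactions):
    """Count how many transactions contain each unordered pair of products."""
    pair_counts = Counter()
    for items in transactions:
        # sorted(set(...)) drops duplicates and gives every pair a stable order,
        # so ("bike", "helmet") and ("helmet", "bike") are counted as one pair.
        for pair in combinations(sorted(set(items)), 2):
            pair_counts[pair] += 1
    return pair_counts


pair_counts = count_pairs(orders)

# The pair that shows up together in the most orders.
top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)
```

With five orders you could find the answer by eye. With millions of orders and thousands of products, you cannot, and that is exactly the gap data mining is meant to fill.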

The subject I am talking about here is predictive analysis. Data mining is also often referred to as the KDD (Knowledge Discovery in Databases) process.

In my next post, I shall talk more about the data mining process and the project life cycle. Stay tuned.