What is Data Mining? – Simple Definition
The amount of raw data stored in corporate databases is exploding. From trillions of point-of-sale transactions and credit card purchases to pixel-by-pixel images of galaxies, databases are now measured in gigabytes and terabytes. (One terabyte = one trillion bytes. A terabyte is equivalent to about 2 million books!) For instance, every day, Wal-Mart uploads 20 million point-of-sale transactions to an AT&T massively parallel system with 483 processors running a centralized database. Raw data by itself, however, does not provide much information. In today’s fiercely competitive business environment, companies need to rapidly turn these terabytes of raw data into significant insights into their customers and markets to guide their marketing, investment, and management strategies.
What is Data Mining?
Data mining, or knowledge discovery, is the computer-assisted process of digging through and analyzing enormous sets of data and then extracting meaning from them. Data mining tools predict behaviors and future trends, allowing businesses to make proactive, knowledge-driven decisions. Data mining tools can answer business questions that traditionally were too time-consuming to resolve. They scour databases for hidden patterns, finding predictive information that experts may miss because it lies outside their expectations.
Data mining derives its name from the similarities between searching for valuable information in a large database and mining a mountain for a vein of valuable ore. Both processes require either sifting through an immense amount of material, or intelligently probing it to find where the value resides.
Although data mining is still in its infancy, companies in a wide range of industries – including retail, finance, health care, manufacturing, transportation, and aerospace – are already using data mining tools and techniques to take advantage of historical data. By using pattern recognition technologies and statistical and mathematical techniques to sift through warehoused information, data mining helps analysts recognize significant facts, relationships, trends, patterns, exceptions and anomalies that might otherwise go unnoticed.
For businesses, data mining is used to discover patterns and relationships in the data in order to help make better business decisions. Data mining can help spot sales trends, develop smarter marketing campaigns, and accurately predict customer loyalty. Specific uses of data mining include:
- Market segmentation – Identify the common characteristics of customers who buy the same products from your company.
- Customer churn – Predict which customers are likely to leave your company and go to a competitor.
- Fraud detection – Identify which transactions are most likely to be fraudulent.
- Direct marketing – Identify which prospects should be included in a mailing list to obtain the highest response rate.
- Interactive marketing – Predict what each individual accessing a Web site is most likely interested in seeing.
- Market basket analysis – Understand what products or services are commonly purchased together; e.g., beer and diapers.
- Trend analysis – Reveal the difference between a typical customer this month and last.
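As an illustration of one item from the list above, market basket analysis can be sketched in a few lines of Python. This is a minimal sketch, not how any particular data mining product works: the transactions are invented, and the helper names `pair_support` and `confidence` are hypothetical. It counts how often each pair of items appears in the same basket (support) and how often one item appears given another (confidence).

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, invented purely for illustration.
transactions = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "bread"},
    {"beer", "chips"},
    {"diapers", "milk", "beer"},
]


def pair_support(baskets):
    """Return the fraction of baskets that contain each unordered item pair."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives every pair a canonical order, e.g. ("beer", "diapers").
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: count / total for pair, count in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Fraction of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


support = pair_support(transactions)
top_pair = max(support, key=support.get)
print(top_pair, support[top_pair])
print(confidence(transactions, "beer", "diapers"))
```

On this toy data the most frequent pair is beer and diapers. Tools built for real workloads typically use dedicated algorithms to avoid counting every possible pair across millions of baskets.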
Data mining technology can generate new business opportunities by providing two capabilities:
- Automated prediction of trends and behaviors – Data mining automates the process of finding predictive information in a large database. Questions that traditionally required extensive hands-on analysis can now be directly answered from the data. A typical example of a predictive problem is targeted marketing. Data mining uses data on past promotional mailings to identify the targets most likely to maximize return on investment in future mailings. Other predictive problems include forecasting bankruptcy and other forms of default, and identifying segments of a population likely to respond similarly to given events.
- Automated discovery of previously unknown patterns – Data mining tools sweep through databases and identify previously hidden patterns. An example of pattern discovery is the analysis of retail sales data to identify seemingly unrelated products that are often purchased together. Other pattern discovery problems include detecting fraudulent credit card transactions and identifying anomalous data that could represent data entry keying errors.
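The targeted-marketing example above can be made concrete with a rough sketch. It assumes past mailings are available as (segment, responded) records. The segment names, the data, and the `response_rates` helper are all hypothetical, and real tools would normally fit a predictive model rather than compare raw rates.

```python
from collections import defaultdict

# Hypothetical past-mailing records: (customer segment, did the customer respond?).
past_mailings = [
    ("young_urban", True), ("young_urban", False), ("young_urban", True),
    ("retired", False), ("retired", False), ("retired", True), ("retired", False),
    ("suburban_family", True), ("suburban_family", True),
]


def response_rates(records):
    """Return the observed response rate for each segment."""
    sent = defaultdict(int)
    responded = defaultdict(int)
    for segment, did_respond in records:
        sent[segment] += 1
        responded[segment] += int(did_respond)
    return {segment: responded[segment] / sent[segment] for segment in sent}


rates = response_rates(past_mailings)
# Segments ordered from most to least responsive in the historical data.
ranked = sorted(rates, key=rates.get, reverse=True)
print(ranked)
```

Ranking segments by historical response rate is the simplest form of the idea: the next mailing goes first to the segments that responded best in the past.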
HOW IT WORKS (EXAMPLE):
Data mining, so called because of the way it digs through information, is carried out by software applications that employ a variety of statistical and artificial intelligence methods to uncover hidden patterns and relationships among sets of data. For instance, a data mining program might uncover a relationship between high sales volumes and poor weather conditions.
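One simple statistical method for testing a relationship like the weather example is the Pearson correlation coefficient. The sketch below uses invented daily rainfall and sales figures, and the `pearson` helper is written out by hand for illustration. A value near +1 would suggest that sales tend to rise with rainfall, and a value near -1 the opposite.

```python
from math import sqrt

# Hypothetical daily observations, invented for illustration.
rainfall_mm = [0, 2, 5, 12, 20, 0, 8]
units_sold = [110, 118, 131, 160, 185, 105, 142]


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


r = pearson(rainfall_mm, units_sold)
print(f"correlation between rainfall and sales: {r:.3f}")
```

A strong correlation only flags a pattern worth investigating. It does not show that the weather causes the sales.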
WHY IT MATTERS:
Data mining software can perform complex calculations and analyses on sets of data in a very short time. For this reason, companies use data mining in strategic planning.