Copyright © Philip M. Parker, INSEAD. Terms of Use.

| Domain | Definition |
| Computing | Data mining |
Source: compiled by the editor from various references; see credits.
(From Wikipedia, the free Encyclopedia)
Data mining is the practice of automatically searching large stores of data for patterns. To do this, data mining uses computational techniques from statistics and pattern recognition.
In the technical context of data warehousing, the term is neutral. However, it also has a wider, more pejorative usage that implies imposing patterns (particularly causal relationships) on data where none exist.
Data mining has been defined as "The nontrivial extraction of implicit, previously unknown, and potentially useful information from data" [1] and "The science of extracting useful information from large data sets or databases" [2].
Data mining is also known as knowledge discovery in databases (KDD).
Used in the pejorative sense, "data mining" implies scanning the data for any relationship and then, when one is found, devising an interesting explanation for it. The problem is that large data sets invariably contain some striking relationships that are peculiar to that data. Conclusions reached this way are therefore likely to be highly suspect. Even so, some exploratory work is always required in applied statistical analysis to get a feel for the data, so the line between good statistical practice and data mining is sometimes less than clear.
Here is an example. The insurance industry has found that people with poor credit records tend to be more likely to make car insurance claims, and insurers have modified their pricing accordingly. While this appears to be a legitimate finding, politicians in the United States have questioned it on the 'common-sense' grounds that how a person handles their credit card does not affect how they handle a car. So a finding that is statistically legitimate might not hold up to public scrutiny.
A more significant danger is finding correlations that do not really exist. One example comes from the investment website The Motley Fool. In the late 1990s the website offered a suggested investment portfolio known as the Foolish Four, which was based on a data mining analysis of trends in the stock market. Research in the early 2000s showed that the correlations were an artifact of the particular data set used rather than a reflection of reality. This experience is one of many similar false findings linked to the stock market.
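The way a scan over many variables can turn up a correlation that is only an artifact of one data set can be illustrated with purely random numbers. The sketch below is not a reconstruction of the Foolish Four analysis; the function names, the column counts, and the sample sizes are all assumptions chosen for illustration. It picks, out of 200 random columns, the one that best matches a random target on 20 "mined" rows, then re-tests the same column on 200 rows that were held back.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mine_then_recheck(n_mined=20, n_fresh=200, n_columns=200, seed=0):
    """Mine random data for the best correlation, then re-test it on fresh rows.

    Every value is independent noise, so no real relationship exists.
    Returns (correlation on the mined rows, correlation on the held-back rows).
    """
    rng = random.Random(seed)
    total = n_mined + n_fresh
    target = [rng.gauss(0, 1) for _ in range(total)]

    best_r, best_column = 0.0, None
    for _ in range(n_columns):
        column = [rng.gauss(0, 1) for _ in range(total)]
        # Only the first n_mined rows are visible during the "mining" step.
        r = pearson(column[:n_mined], target[:n_mined])
        if abs(r) > abs(best_r):
            best_r, best_column = r, column

    # Re-test the winning column on rows it was not selected on.
    fresh_r = pearson(best_column[n_mined:], target[n_mined:])
    return best_r, fresh_r


if __name__ == "__main__":
    mined_r, fresh_r = mine_then_recheck()
    print(f"best correlation on mined rows:  {mined_r:+.2f}")
    print(f"same column on held-back rows:   {fresh_r:+.2f}")
```

Under these assumptions the winning column typically looks strongly related to the target on the mined rows and close to unrelated on the held-back rows, which is the pattern expected when a relationship exists only in the data set that was searched.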
There are also privacy concerns associated with data mining. For example, if an employer has access to medical records, it may screen out people who have diabetes or have had a heart attack. Screening out such employees will cut insurance costs, but it creates ethical and legal problems.
There are many legitimate uses of data mining. For example, a database of all prescription drugs taken by people can be used to find combinations of drugs that cause an adverse reaction. Because a given combination may occur in only 100 people, with the reaction in 10 of them, a single case may not raise a red flag. Mining such a database could reveal these reactions and save lives. However, there is huge potential for abuse of such a database.
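The drug-combination search described above can be sketched as a simple co-occurrence count. This is a minimal illustration on synthetic records, not a real pharmacovigilance method: the record layout, the drug names `drug_a` and `drug_b`, the `min_patients` cutoff, and both function names are assumptions made for the example. The synthetic data mirrors the figures in the text, with 100 patients on the combination and 10 reactions among them.

```python
from collections import Counter
from itertools import combinations


def pair_reaction_rates(records, min_patients=50):
    """Return {drug pair: (patients, reactions, rate)} for well-supported pairs.

    Each record is (set_of_drug_names, had_reaction). Pairs taken by fewer
    than min_patients people are dropped as too sparse to judge.
    """
    patients = Counter()
    reactions = Counter()
    for drugs, had_reaction in records:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(drugs), 2):
            patients[pair] += 1
            if had_reaction:
                reactions[pair] += 1
    return {
        pair: (count, reactions[pair], reactions[pair] / count)
        for pair, count in patients.items()
        if count >= min_patients
    }


def build_example_records():
    """Synthetic data: 100 patients take both drugs and 10 of them react."""
    records = []
    records += [({"drug_a", "drug_b"}, True)] * 10
    records += [({"drug_a", "drug_b"}, False)] * 90
    # Patients on a single drug form no pair and so do not affect the counts.
    records += [({"drug_a"}, True)] * 5 + [({"drug_a"}, False)] * 995
    records += [({"drug_b"}, True)] * 5 + [({"drug_b"}, False)] * 995
    return records


if __name__ == "__main__":
    rates = pair_reaction_rates(build_example_records())
    for pair, (count, reacted, rate) in rates.items():
        print(pair, count, reacted, f"{rate:.0%}")
```

A real analysis would also need to compare each pair's rate against the rates for the individual drugs and account for the number of pairs examined, since scanning many combinations invites the chance findings discussed earlier.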
In short, data mining yields information that would not otherwise be available, but that information must be properly interpreted to be useful. When the data collected involves individual people, it raises many questions concerning privacy, legality, and ethics.
[1] W. Frawley, G. Piatetsky-Shapiro, C. Matheus: Knowledge Discovery in Databases: An Overview. AI Magazine, Fall 1992, pp. 213-228.
[2] D. Hand, H. Mannila, P. Smyth: Principles of Data Mining. MIT Press, Cambridge, MA, 2001.
Source: adapted by the editor from Wikipedia, the free encyclopedia under a copyleft GNU Free Documentation License (GFDL) from the article "Data mining."
Crosswords: DATA MINING
Specialty definitions using "DATA MINING": SP/2, SP2, SPSS, Inc. (references)
| Subject | Topic | Quote |
| Business | | U.S. networking companies will find substantial opportunities for products and services that can seamlessly integrate clinical data warehousing, data mining and disparate business applications, and provide distributed records between physicians, laboratories, imaging centers, pharmacies and hospitals. (references) |
| | | The best prospective B2B EC software solutions for the Taiwan market are supply chain planning and execution software, vendor management inventory (VMI), logistic and distribution administration/management software, customer relationship management (CRM), online procurement/ordering Software, data mining and warehousing, electronic bill presentment/payment (EBPP) and enterprise application integration (EAI). (references) |
| | | With the emergence of integrated global logistics strategies, Taiwan firms continue to follow the lead of major foreign firms in deploying automated enterprise resource planning (ERP), supply chain management (SCM), customer relationship management (CRM), electronic data interchange over Internet (EOI), sales force automation (SFA), data mining and warehousing, demand chain management (DCM), on-line procurement systems, electronic bill presentment and payment (EBPP), enterprise application integration (EAI), distribution channel management (DCM), maintenance repair operation (MRO) and data analysis. (references) |
| Economic History | Argentina | The best sales prospects for US companies in the computer software industry are: CRM, E-business software (E-commerce, e-procurement, e-government, e-education), Middle-ware applications/solutions, ERP (Enterprise Resources Planning), computer network management, electronic document management, electronic network security solutions, data warehousing, data mining. (references) |
Source: compiled by the editor from ICON Group International, Inc.; see credits.
| Language | Translations for "DATA MINING"; alternative meanings/domain in parentheses. |
| Finnish | tiedonrikastus. (various references) |
| French | data mining, extraction de données. (various references) |
| German | Data Mining, themenbezogene Datensuche, gezielte Datensuche. (various references) |
| Italian | estrapolazione dei dati. (various references) |
| Japanese Kanji | データビット長 (data bit length, data flow, data man, data modelling, data processing, data processor, database, data-file, datalink). (various references) |
| Japanese Katakana | データマイニング. (various references) |
| Pig Latin | ataday iningmay. |
| Spanish | extracción de datos (data leakage, leakage of information, tempest). (various references) |
Scrabble® Enable2K-Verified Anagrams
Words within the letters "a-a-d-g-i-i-m-n-n-t"
-1 letter: animating, mandating.
-2 letters: amanitin, maintain.
-3 letters: antiman, damning, dinting, ignatia, indamin, minding, minting.
-4 letters: aiding, aidman, aiming, amidin, angina, anting, dating, diamin, dining, indign, intima, magian, mantid, mating, mining, naming, niding, taming, tiding, timing, tining, tinman.
-5 letters: adman, admit, again, amain, amiga, amnia, anima, animi, atman, daman, digit, gamin, giant, manat, mania, manna, manta, matin.
Words containing the letters "a-a-d-g-i-i-m-n-n-t"
+1 letter: deaminating.
+2 letters: delaminating.
+3 letters: animadverting.
+4 letters: administrating.
+5 letters: decontaminating, demagnetization.
Source: compiled by the editor from various references; see credits. SCRABBLE® is a registered trademark. All intellectual property rights in and to the game are owned in the U.S.A and Canada by Hasbro Inc., and throughout the rest of the world by J.W. Spear & Sons Limited of Maidenhead, Berkshire, England, a subsidiary of Mattel Inc. Mattel and Spear are not affiliated with Hasbro.