Knowledge Discovery in Databases

Introduction

The first workshop on knowledge discovery in databases was held in Detroit in August 1989 (Gregory Piatetsky-Shapiro, 1991; Usama Fayyad, Gregory Piatetsky-Shapiro and Padhraic Smyth, 1996). The workshop was organised into three sessions:

    • Data-driven discovery
    • Knowledge-based approaches
    • Systems and applications

Gregory Piatetsky-Shapiro (1991), in his summary of the workshop, observed:

"The growth in the amount of available databases far outstrips the growth of corresponding knowledge. This creates both a need and an opportunity for extracting knowledge from databases."

He concluded that "knowledge discovery in databases is an idea whose time has come".

What is Knowledge Discovery in Databases?

William Frawley, Gregory Piatetsky-Shapiro and Christopher Matheus (1992, p.58) proposed that knowledge discovery is:

"the nontrivial extraction of implicit, previously unknown, and potentially useful information from data".

They added:

"A pattern that is interesting (according to a user-imposed interest measure) and certain enough (again according to the user's criteria) is called knowledge" (p.58).

They identified four main characteristics of knowledge discovery in databases:

    • High-level language
    • Accuracy
    • Interesting results (novel, potentially useful, nontrivial)
    • Efficiency
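
The definition of knowledge quoted above (a pattern that is interesting and certain enough by the user's own criteria) can be read as a simple filter over candidate patterns. The following is a minimal sketch under that reading, not a method taken from the paper: the Pattern class, the 0 to 1 scoring scale, the threshold values and the example patterns are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    """A candidate pattern (hypothetical structure, for illustration only)."""
    description: str
    interest: float   # score from a user-imposed interest measure, assumed 0..1
    certainty: float  # how reliably the pattern holds in the data, assumed 0..1


def is_knowledge(pattern: Pattern, min_interest: float, min_certainty: float) -> bool:
    """A pattern counts as knowledge only if it meets BOTH user thresholds."""
    return pattern.interest >= min_interest and pattern.certainty >= min_certainty


# Invented example patterns; the scores are made up, not measured.
candidates = [
    Pattern("customers in region A renew more often", interest=0.8, certainty=0.95),
    Pattern("every order has an order id", interest=0.0, certainty=1.0),
    Pattern("sales rise on rainy Tuesdays", interest=0.9, certainty=0.4),
]

# The thresholds are the user's criteria; 0.5 and 0.9 are arbitrary choices.
knowledge = [p for p in candidates if is_knowledge(p, min_interest=0.5, min_certainty=0.9)]

for p in knowledge:
    print(p.description)
```

In this sketch the second pattern is certain but trivial and the third is interesting but unreliable, so only the first passes both tests.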

Usama Fayyad, Gregory Piatetsky-Shapiro and Padhraic Smyth (1996) distinguish between knowledge discovery in databases (KDD) and data mining. They propose:

"KDD refers to the overall process of discovering useful knowledge from data, and data mining refers to a particular step in this process" (p.39).

They note that KDD involves data preparation, data selection, data cleaning, incorporation of appropriate prior knowledge, and proper interpretation of the results of data mining.
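
The distinction between the overall KDD process and the data mining step within it can be illustrated with a toy pipeline. This is a rough sketch under stated assumptions rather than the authors' own procedure: the function names, the record layout, the co-occurrence counting used as the mining step, and the thresholds standing in for prior knowledge and interpretation are all hypothetical.

```python
from collections import Counter


def select(records, fields):
    """Data selection: keep only the fields relevant to the question."""
    return [{f: r.get(f) for f in fields} for r in records]


def clean(records):
    """Data cleaning: drop records that have a missing value."""
    return [r for r in records if all(v is not None for v in r.values())]


def mine(records, antecedent, consequent):
    """Data mining step: count how often each antecedent value
    co-occurs with each consequent value."""
    pair_counts = Counter((r[antecedent], r[consequent]) for r in records)
    value_counts = Counter(r[antecedent] for r in records)
    return [
        {"if": a, "then": c, "support": n, "confidence": n / value_counts[a]}
        for (a, c), n in pair_counts.items()
    ]


def interpret(rules, min_confidence, min_support):
    """Interpretation: keep only rules that meet the user's thresholds."""
    return [
        r for r in rules
        if r["confidence"] >= min_confidence and r["support"] >= min_support
    ]


def kdd(records, fields, antecedent, consequent, min_confidence=0.8, min_support=2):
    """The overall process; mining is just one step inside it."""
    prepared = clean(select(records, fields))
    return interpret(mine(prepared, antecedent, consequent), min_confidence, min_support)


# Invented records, for illustration only.
records = [
    {"id": 1, "region": "north", "churned": "yes"},
    {"id": 2, "region": "north", "churned": "yes"},
    {"id": 3, "region": "north", "churned": "yes"},
    {"id": 4, "region": "south", "churned": "no"},
    {"id": 5, "region": "south", "churned": "yes"},
    {"id": 6, "region": None, "churned": "no"},
]

rules = kdd(records, ["region", "churned"], "region", "churned")
print(rules)
```

Only the mine function corresponds to data mining in the sense above; selection, cleaning and interpretation surround it, which is the point of the distinction.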

Recommended Reading

William Frawley, Gregory Piatetsky-Shapiro and Christopher Matheus (1992). Knowledge Discovery in Databases: An Overview.

Suggested Reading

Craig Dalton and Jim Thatcher (2014). What does a critical data studies look like, and why do we care? Seven points for a critical approach to ‘big data’. Society and Space Open Site.

Photo Credit

Champions (@sarah_hunter8)