ERCIM News No.25 - April 1996 - CWI

KESO - Data Mining for Professional Statisticians

by Arno Siebes

KESO (Knowledge Extraction for Statistical Offices) is a three-year ESPRIT-IV project from Eurostat. Its goal is to produce a prototype Data Mining System that meets the needs of analysts of statistical datasets. The project is coordinated by CWI and runs on a total budget of 2.5 MECU, of which 1.5 MECU is EU funding.

The public and private providers of statistical data ("Statistical Offices") collect and analyse complex data on many important domains. Their surveys, panels, and banks of time series contain information that is increasingly important for the economic competitiveness of European industries and countries, for example in tackling the unemployment problem. It is of crucial importance for the Statistical Offices and their clients to extract the knowledge contained in these complex datasets. This requires versatile and efficient Data Mining Systems that partially automate analytical processes on statistical databases.

The KESO project is organised around two streams of activities: technology development and prototype assessment. The driving force of the project is a series of studies that the Statistical Offices perform with successive releases of the Data Mining system under development. Their needs and observations feed back into the development stream to enhance subsequent releases.

The feasibility of the Data Mining (DM) approach has been demonstrated with current technology. The main challenge of the project is to develop the required additional functionality and to scale it to the complexity of statistical data sets.

In the first year the project will focus on the well-studied area of simple relational data (micro data in statistical parlance). In the second and third years the focus will shift to the more challenging problems of aggregated data sets and time-series databases.

The partners in the project are:

National Statistical Offices:

Finland
Greece (via FORTH)
The Netherlands

Private Statistical Office:

Infratest Burke (Germany)

IT Company:

Data Distilleries (The Netherlands)

Research Organisations:

CWI
GMD
University of Helsinki

Please contact:
Arno Siebes - CWI
Tel: +31 20 592 4139
E-mail: Arno.Siebes@cwi.nl
