by Arno Siebes
KESO (Knowledge Extraction for Statistical Offices) is a three-year ESPRIT-IV project from Eurostat. Its goal is to produce a prototype Data Mining System that meets the needs of analysts of statistical datasets. The project is coordinated by CWI and has a total budget of 2.5 MECU, of which 1.5 MECU is EU funding.
The public and private providers of statistical data ("Statistical Offices") collect and analyse complex data in many important domains. Their surveys, panels, and banks of time series contain information that is increasingly important for the economic competitiveness of European industries and countries, for example in tackling unemployment. It is crucial for the Statistical Offices and their clients to extract the knowledge contained in these complex datasets. This requires versatile and efficient Data Mining Systems that partially automate analytical processes on statistical databases.
The KESO project is organised around two streams of activity: technology development and prototype assessment. The driving force of the project is a series of studies that the Statistical Offices perform with successive releases of the Data Mining System under development. Their needs and observations feed back into the development stream to improve subsequent releases.
The feasibility of the Data Mining (DM) approach has been demonstrated with current technology. The main challenge of the project is to develop the additional functionality required and to scale it to the complexity of statistical datasets.
In the first year, the project will focus on the well-studied area of simple relational data (micro data in statistical parlance). In the second and third years, the focus will shift to the more challenging problems of aggregated datasets and time-series databases.
The partners in the project are:
National Statistical Offices:
Private Statistical Office: