by Hans Voss, Natalia Andrienko and Gennady Andrienko
Public access to the immense volume of existing geographic and geo-referenced thematic data, and the ability to exploit it, is of significant value for the development of an open and democratic information society and a true global market. The widespread use of such data and of geographical information systems (GIS) will promote general public awareness and further social cohesion. Publicly available geo-data is, however, of little use unless people can easily access and exploit it. CommonGIS, a research project coordinated by the Knowledge Discovery Team of the GMD Institute for Autonomous Intelligent Systems, is developing Web-based tools for access to and analysis of geo-data that can be used by skilled and casual, non-expert users alike.
Geo-data encompass various thematic or statistical data on demography, economy, education, culture, history, etc., which are associated with objects and locations in space. Probably the best way to explore such data is to visualize them on maps. The key idea of CommonGIS (http://commongis.jrc.it) is thus to make geo-data commonly accessible and usable for everyone, from everywhere, by providing a Web-based Geographical Information System (GIS) with knowledge-based functions for the automatic generation of thematic maps. As far as possible, the user should not need to worry about visualization issues but should rather focus on the problem-solving process and the analytical task.
Ordinary GIS are not well suited for this because they suffer from at least one of three intrinsic pitfalls: they tend to be architecturally closed, monolithic, and costly environments; their use requires specific technical skills; they require the user to think more about obtaining visually nice presentations than about the selection of appropriate data and solving problems with their use. The CommonGIS project is now (as of February 2000) in its 16th month, and current achievements include:
In the following we will address these achievements in some more detail. A basic requirement was that CommonGIS should easily connect to existing data sets, which could be stored in large databases or kept in spreadsheets, to name the two extremes. For this purpose, CommonGIS provides a set of converters for various table data formats as well as database connectivity. Internally, all geographic and thematic data are stored in a database (ODBC/JDBC interface). Aside from this base functionality, the actual goal of providing smart support to non-expert users as well can only be met if the application program itself incorporates knowledge: knowledge about the data of the specific application, and knowledge about the principles of cartographic visualization and analysis.
As a basis for the latter we use the knowledge base for cartographic visualization from Descartes (http://borneo.gmd.de/and/java/iris/), which was developed at GMD's AiS institute (see ERCIM News, July 1998). In addition, some ideas are being incorporated from the cartographic visualization system VIZARD from the Fraunhofer Institute FhG-IGD. Regarding knowledge about the application data, the original Descartes system also included some means to describe the semantics of the data. However, the language used for defining characteristics of and relationships between attributes was somewhat ad hoc and undocumented, and its usage was not supported by suitable editors and checkers. So one specific task of CommonGIS was to develop a language that is applicable to any domain of spatially related thematic data, and to provide tools that make it easy for data providers to build their own applications.
As a result, a so-called data characterization scheme (DCS) was developed. In contrast to other work known from the literature, the DCS provides a rich arsenal of concepts, data structures and operators for defining semantic domain models of the given data. The reason for this particular emphasis is that previous work focused mostly on the visual presentation of data on static maps, while in CommonGIS we are dealing with highly interactive maps. In addition, we want to support users in handling more complex analytical tasks. For example, in the context of exploring demographic data, the CommonGIS system should be able to automatically identify, formulate, and support analytical tasks like "Compare gender structure of population in different age groups" or "Look at the distribution of a specific age group (0-14 years, 15-64 years, or older than 64) across the countries". In the latter example, it could also suggest that using relative values (the age group as a percentage of the total population) would probably make more sense than using absolute numbers.
The DCS was defined using UML and was deliberately kept on a conceptual level. It can thus be used as a schema that can be instantiated in different syntactic variants. For use in the CommonGIS software, one such variant was defined in XML, and corresponding parsers were generated. This instantiated schema is called DCL (data characterization language).
The CommonGIS software is developed as an open, object-oriented, distributed system with a client-server architecture. The user interface is realized as Java applets, thus providing comfortable access and interactivity while requiring only a standard (Java-enabled) Web browser. Currently, what the user sees after starting CommonGIS on the Internet looks like what one would expect to see when running a typical GIS on a local PC (see figure 1). In fact, the base technology is the Java-based Lava/Magma GIS from PGS (see http://www.pgs.nl). A salient feature of Lava/Magma is that it performs sophisticated caching of data to optimize performance. This is particularly needed when using larger maps that would take too long to download as a whole. One achievement of CommonGIS was to make the Descartes system run from within Lava/Magma. The user can thus select and define a desired map with ordinary GIS operations, and then select certain thematic variables for visual display on this map. The full interactivity of Descartes remains available, as the systems were redesigned so that Descartes can use the Lava/Magma display methods for presenting the results of interactive manipulations.
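The point about relative versus absolute values can be made concrete with a small calculation. The following Python sketch is only an illustration: the function name and the population figures are hypothetical and are not taken from CommonGIS or from any census data.

```python
def to_percentages(age_groups):
    """Convert absolute counts per age group into percentages of their sum."""
    total = sum(age_groups.values())
    if total == 0:
        # Avoid division by zero for an area without recorded population.
        return {name: 0.0 for name in age_groups}
    return {name: 100.0 * count / total for name, count in age_groups.items()}


# Hypothetical counts for two areas of very different size.
large_area = {"0-14": 200_000, "15-64": 650_000, "65+": 150_000}
small_area = {"0-14": 1_000, "15-64": 6_500, "65+": 2_500}

large_pct = to_percentages(large_area)
small_pct = to_percentages(small_area)

# In absolute numbers the large area dominates every age group, while the
# percentages show that the small area has the higher share of people over 64.
print(large_pct["65+"], small_pct["65+"])
```

In absolute terms the larger area would always stand out on a map simply because more people live there; the percentages make the age structures of the two areas directly comparable.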
A demonstrator of CommonGIS using Portuguese Census Data is available on the Internet. It may be run from either of two web sites, http://commongis.jrc.it/commongis/sw/commongis_first/CommonGIS/ or http://borneo.gmd.de/descartes/CommonGIS/. After the applet is loaded, it connects to the server, retrieves information, and displays a map. The HTTP protocol is used for communication between the client and the server. This allows the system to run on the Internet or in intranets without being blocked by firewalls. If a geographic layer is associated with thematic information, this information can be visualized on a thematic map. For this purpose one pushes the T (Thematic data) button. In response the system will list all available thematic variables. After one or several variables have been selected, CommonGIS will propose one or more suitable visualization methods, chosen on the basis of the available semantic information about the data.
Using the Portuguese example, one may select four variables with population numbers of different age groups. CommonGIS will take into account relationships among the variables: it knows that they are comparable and together make up the total population. On this basis, one of the proposed visualizations is a pie-chart presentation (see figure 2). The sizes of the circles are proportional to the total population in the municipalities of Portugal, and the sizes of the segments show the proportions of the different age groups.
From the sizes of the pie charts one can observe that population numbers in the capital, Lisboa, and in its surroundings are much larger than in other districts. As a consequence, the signs in those other districts are too small for the proportions of the different age groups to be seen. As a remedy, all visualization techniques are supplied with specially designed controls for the interactive manipulation of the display. In the example one may decide not to show signs for the largest districts just by moving a slider, and thereby implicitly enlarge the small signs so that their structure becomes more clearly visible (see figure 3).
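How semantic information about the selected variables might drive such a proposal can be sketched with a few simple rules. The Python example below is an assumption-laden illustration, not the Descartes knowledge base: the `Attribute` fields, the rule set, and the method names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """Hypothetical semantic description of one thematic variable."""
    name: str
    numeric: bool
    group: str = ""              # attributes sharing a group are comparable
    part_of_whole: bool = False  # the group members add up to a meaningful total


def propose_methods(selected):
    """Return illustrative visualization method names for the selected attributes."""
    if not selected:
        return []
    if len(selected) == 1:
        only = selected[0]
        return ["choropleth map"] if only.numeric else ["qualitative colouring"]
    all_numeric = all(a.numeric for a in selected)
    same_group = len({a.group for a in selected}) == 1 and selected[0].group != ""
    if all_numeric and same_group:
        methods = ["bar charts"]
        if all(a.part_of_whole for a in selected):
            # Parts of a whole can be shown as segments of one circle.
            methods.append("pie charts")
        return methods
    # No known relationship between the attributes: show them separately.
    return ["one map per attribute"]


age_groups = [
    Attribute(label, numeric=True, group="population by age", part_of_whole=True)
    for label in ("0-14", "15-24", "25-64", "65+")
]
print(propose_methods(age_groups))
```

The design point is that the rules consult only the declared semantics of the attributes, never their names, so the same rules apply to any data set that has been described in this way.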
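The geometry of such a pie-chart sign can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the article does not say whether circle area or radius is made proportional to the total, so the sketch assumes area (a common cartographic convention), and the function name, parameters, and figures are hypothetical.

```python
import math


def pie_symbol(counts, max_total, max_radius=20.0):
    """Return (radius, segment angles in degrees) for one pie-chart sign.

    Assumption: the circle area is proportional to the total, so the radius
    grows with the square root of total / max_total.
    """
    total = sum(counts.values())
    if total <= 0 or max_total <= 0:
        return 0.0, {name: 0.0 for name in counts}
    radius = max_radius * math.sqrt(total / max_total)
    angles = {name: 360.0 * value / total for name, value in counts.items()}
    return radius, angles


# Hypothetical municipality with a quarter of the largest total population.
radius, angles = pie_symbol({"young": 25, "other": 75}, max_total=400)
print(radius, angles)
```

Each segment angle depends only on the age group's share within its own municipality, while the radius carries the comparison between municipalities.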
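The effect of the slider can be sketched as a rescaling step: signs above the cutoff are dropped, and the remaining ones are scaled relative to the largest value still shown. The Python code below is a hypothetical illustration of that idea, not the CommonGIS implementation; it assumes circle area proportional to the total, and all names and numbers are made up.

```python
import math


def rescale_radii(totals, cutoff, max_radius=20.0):
    """Hide areas whose total exceeds the cutoff and rescale the remaining signs.

    The largest visible total always receives max_radius, so lowering the
    cutoff implicitly enlarges the signs of the smaller areas.
    """
    visible = {name: value for name, value in totals.items() if value <= cutoff}
    if not visible:
        return {}
    largest = max(visible.values())
    if largest <= 0:
        return {name: 0.0 for name in visible}
    return {
        name: max_radius * math.sqrt(value / largest)
        for name, value in visible.items()
    }


totals = {"A": 400, "B": 100, "C": 25}          # hypothetical district totals
all_shown = rescale_radii(totals, cutoff=400)   # slider at its maximum
without_a = rescale_radii(totals, cutoff=100)   # slider moved below district A
print(all_shown["C"], without_a["C"])
```

With the slider at its maximum the smallest district gets a quarter of the maximum radius; once the largest district is excluded, the same district is drawn at half the maximum radius.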
Currently, two pilot applications are running. The above-mentioned one was developed by CNIG (Centro Nacional de Informacao Geografica, Portugal), which is also responsible for the overall usability analysis of CommonGIS. Another application is under development by Dialogis GmbH together with the City of Bonn. Other applications for customers of PGS and InGeoForum (which is affiliated with ZGDV, a sister institution to FhG-IGD, and with the Hessian State Surveying Office) will soon follow. The two industrial partners of the consortium (PGS and Dialogis) are developing and will jointly market a commercial version of the software. They will also incorporate results of CommonGIS in their current commercial products, Lava/Magma and DialoGIS, respectively, where DialoGIS is a commercial version of Descartes. Furthermore, the consortium is working towards the standardization of the data characterization scheme. It is expected that standardization in this field will promote the interoperability of intelligent GIS software, thus broadening the potential market. The Joint Research Center of the European Commission (EC-JRC), Institute ISIS, is co-managing the whole project effort with GMD and, together with its subcontractor GISIG, will be instrumental in the dissemination and standardization process.
Homepage of the project: http://commongis.jrc.it/
Further publications about the project: http://borneo.gmd.de/descartes/
Hans Voss - GMD
Tel: +49 2241 14 2532