Joint Research Centre of the EC to use Semantic Web for New Privacy Agent
by Giles Hogben and Marc Wilikens
The Joint Research Centre of the EC is building an experimental privacy protection agent using Semantic Web technology and the World Wide Web Consortium's P3P protocol. The agent automates the protection of a user's privacy by parsing privacy policies and comparing them with user preference sets.
The Cybersecurity team of the EC's Joint Research Centre (JRC) in Ispra, Italy, has recently completed work on a fully compliant implementation of the W3C's new privacy protocol, P3P, which uses XML files for the automated exchange and interpretation of website privacy policies and for matching them against user preferences.
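As a rough illustration of what parsing a policy and matching it against preferences involves, the following Python sketch reads a heavily simplified P3P-style policy and flags the statements that conflict with a user's preferences. The element names and namespace follow the P3P 1.0 vocabulary, but the sample policy, the function names and the flat, set-based preference model are assumptions made for this example and do not describe the JRC implementation.

```python
import xml.etree.ElementTree as ET

# Namespace of the P3P 1.0 vocabulary.
P3P_NS = "http://www.w3.org/2002/01/P3Pv1"

# Hypothetical, heavily simplified policy. A real P3P policy also carries
# elements such as ENTITY, ACCESS, RETENTION and DATA-GROUP.
SAMPLE_POLICY = f"""
<POLICY xmlns="{P3P_NS}" name="sample">
  <STATEMENT>
    <PURPOSE><current/><admin/></PURPOSE>
    <RECIPIENT><ours/></RECIPIENT>
  </STATEMENT>
  <STATEMENT>
    <PURPOSE><telemarketing/></PURPOSE>
    <RECIPIENT><unrelated/></RECIPIENT>
  </STATEMENT>
</POLICY>
""".strip()


def _child_names(statement, container):
    """Return the local names of the children of one container element."""
    path = f"{{{P3P_NS}}}{container}/*"
    return {child.tag.split("}", 1)[-1] for child in statement.findall(path)}


def extract_statements(policy_xml):
    """Parse a policy and list the purposes and recipients of each statement."""
    root = ET.fromstring(policy_xml)
    return [
        {
            "purposes": _child_names(statement, "PURPOSE"),
            "recipients": _child_names(statement, "RECIPIENT"),
        }
        for statement in root.iter(f"{{{P3P_NS}}}STATEMENT")
    ]


def find_conflicts(statements, refused_purposes, refused_recipients):
    """Return (statement index, offending values) for each conflicting statement."""
    conflicts = []
    for index, statement in enumerate(statements):
        offending = (statement["purposes"] & refused_purposes) | (
            statement["recipients"] & refused_recipients
        )
        if offending:
            conflicts.append((index, sorted(offending)))
    return conflicts


if __name__ == "__main__":
    statements = extract_statements(SAMPLE_POLICY)
    # Assumed preference: refuse telemarketing and any sharing beyond the site.
    print(find_conflicts(statements, {"telemarketing"}, {"unrelated", "public"}))
```

Here the user's preferences are reduced to two sets of refused values. A real agent would need a richer preference language, but the sketch shows the basic parse-then-compare step.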
In the next stage of our research, the JRC will use this base platform to investigate how Semantic Web (SW) technology can improve this system. The project has been divided into a series of distinct stages.
The first stage is already under way and consists of the development and agreement of an ontology for data protection. One of the most crucial improvements that the SW can make to the P3P platform will be to provide an interoperable, machine-interpretable ontology for data protection. This will provide the following benefits:
- a common, interoperable vocabulary, which reduces misinterpretation of basic principles by technologists and legal experts and enhances interoperability between different systems
- a clear understanding of terms, allowing ease of translation between alternative ontologies
- a clear separation of vocabulary and syntax, meaning that the same vocabulary can be plugged into different data protection systems
- a clearly documented development process offering clarity, authority and common agreement on terms by a large group of stakeholders.
An agreed vocabulary should be the basis for any privacy system. Significant problems were found in P3P's first version, mainly arising from inadequacies in its expression of data protection concepts such as those found in regulatory frameworks (e.g. the EU Data Protection Directive and the OECD guidelines).
The JRC is developing a consensus process for capturing the knowledge of domain experts such as data commissioners. This work is being carried out in conjunction with researchers from the University of Aberdeen, whose prototype ontology capture process will be used. The process draws on input from psychologists and on conceptual modelling research, and has been put through several test cases.
Despite incompleteness in certain respects, P3P provides a sound foundation for such an ontology. Although P3P is not expressed in a standardised ontology syntax, it represents, through the W3C processes that have underpinned it, a five-year consensus process for a data protection vocabulary. Given a formally documented consensus process and improvements in the data typing schema and in the purpose and recipient taxonomies, the existing version of P3P provides a very useful starting point.
The second stage will be to ensure that the proposed extensions are backward compatible. For this purpose, an XSLT stylesheet will be created to translate between the existing and extended versions. The concepts of the ontology will also be placed within a clear W3C-style specification.
- attempt to match the conditions of each rule in turn
- when the first rule matches, perform the behaviour specified by that rule
- if no rule fires, this is an error (a catch-all rule must therefore be included).
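The evaluation steps listed above can be sketched in a few lines of Python. This is only an illustration of first-match rule evaluation under stated assumptions: the `Rule` class, the behaviour labels ("block", "warn", "accept") and the dictionary used to describe a policy are all hypothetical and are not taken from the JRC system or from any P3P specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# Hypothetical policy description: field name -> set of declared values.
PolicyFacts = Dict[str, Set[str]]


@dataclass(frozen=True)
class Rule:
    behaviour: str                            # what to do when the rule fires
    condition: Callable[[PolicyFacts], bool]  # test applied to the policy


class NoRuleFiredError(Exception):
    """Raised when no rule in the rule set matches the policy."""


def evaluate(rules: List[Rule], policy: PolicyFacts) -> str:
    """Try each rule in order and return the behaviour of the first match."""
    for rule in rules:
        if rule.condition(policy):
            return rule.behaviour
    # Reaching this point means the rule set lacks a catch-all rule.
    raise NoRuleFiredError("no rule fired; the rule set needs a catch-all rule")


# Illustrative rule set; order matters because the first match wins.
RULES = [
    Rule("block", lambda p: "telemarketing" in p["purposes"]),
    Rule("warn", lambda p: "unrelated" in p["recipients"]),
    Rule("accept", lambda p: True),  # catch-all rule
]

if __name__ == "__main__":
    policy = {"purposes": {"current"}, "recipients": {"unrelated"}}
    print(evaluate(RULES, policy))
```

Because the first matching rule decides the outcome, more specific or more restrictive rules are placed before more general ones, and the catch-all rule comes last.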
At this point, there will be a working system, as in the Figure.
Figure: Semantic-Web-enabled privacy agent.
The fourth stage will be to refine the rule system and policy language using test cases from European law. In the first phase of the JRC's P3P implementation, work was done on expressing the European Data Protection Directives using the P3P vocabulary. This revealed some inadequacies in the vocabulary, particularly in describing whether data is passed to recipients outside the EU and in describing the purposes of data collection. It is hoped that the forthcoming SW version will have sufficient conceptual flexibility to express the European Data Protection Directives. In any case, these will act as a litmus test for the system.
Finally, the research will cover ways of using the SW to make privacy information more accessible to other systems. At present, P3P functions only within the context of HTTP transactions. The use of SW technology, as a standardised knowledge transfer medium, may help other areas of technology to use P3P. Examples might include ubiquitous computing, smart card technologies and IRC chat rooms. These developments put P3P in a good position as SW technology becomes more widespread. As knowledge sharing moves increasingly towards the use of the SW, the existence of an integrated privacy framework will make it easier to associate privacy preferences and policies with data as they move between heterogeneous sources.
More information and a demonstration of our current platform: http://p3p.jrc.it
W3C's P3P Privacy protocol home page: http://www.w3.org/P3P
RDF schema for P3P: http://www.w3.org/TR/p3p-rdfschema/
OWL Web ontology home page: http://www.w3.org/2001/sw/WebOnt/
Giles Hogben, European Commission
Joint Research Centre, Italy
Tel: +39 033 278 9187