ERCIM News No.49, April 2002


Regulating Access to Web-published Data

by Pierangela Samarati

The overall goal of the FASTER project (Flexible Access to Statistics Tables and Electronic Resources) is the development of a flexible and open system for controlled dissemination of statistical information. The system includes two major security components: a Statistical Disclosure component, for the sanitization of sensitive tables, and an Access Control component, allowing enforcement of protection requirements on published data.

Today’s society places great demand on the dissemination and sharing of information. With the development and widespread use of the Internet and the World Wide Web, organizations in the private and public sectors are increasingly required to make their data available to the outside world. A growing amount of data is being collected by statistical agencies and census bureaus for analysis and subsequent distribution to the general public or to requesting organizations (eg, research institutions, government offices). Data producers can release their data directly, as in the case of national statistical institutions, or through the mediation of archive institutions (data publishers) that collect data from various sources for subsequent distribution.

This data distribution process is clearly selective: data cannot just be released to anybody. For instance, certain sensitive data can only be released to authorised individuals and/or for authorised purposes (eg, health data). Some data is subject to time restrictions and can only be released to the general public after a certain period; some data can be released only for non-commercial purposes; other data can only be released on payment. These few examples already give an idea of the variety of protection requirements that may have to be enforced. There is thus the need for a powerful and flexible access control system able to enforce the different requirements that the data producers (or publishers) may want to impose on the data access.

In the context of the FASTER project, we have developed an Access Control System for specifying and enforcing protection requirements on published data, such as survey results or statistical tables that have already undergone a statistical disclosure control process.

The access control component is based on a simple, expressive language for the specification of protection requirements. The approach has the following features:

  • Abstractions/classifications support: this supports access rules based on the typical abstractions used by data producers and publishers, which can define categorizations of users, purposes of use, types of operations, and data objects.
  • Metadata-dependent access rules: the authorization language allows the expression of access rules based on conditions on metadata describing (meta)properties of the stored data and the users, which can be represented through system-maintained profiles. For instance, an access control rule could grant EU citizens access to all census data more than 20 years old, where both the citizenship of the requesters and the age of data are represented as metadata.
  • Dynamic condition support: access to certain data may depend on conditions that can only be evaluated at run-time, possibly via interaction with the user. Examples of such conditions, which must be associated with procedural calls executing the necessary actions, are agreement acceptance (which can be as simple as clicking an 'ok' button on a pop-up window), payment fulfillment, registration, or form filling.
  • Expressiveness: the language is based on flexible rules specifying the accesses to be granted via boolean expressions evaluating properties of the requester (metadata), of the data being accessed, as well as of the context (dynamic conditions). It also allows the expression of two kinds of access rules: authorizations and restrictions. Authorizations correspond to traditional permissions specifying sufficient conditions for an access, whereas restrictions make it possible to express conditions necessary for an access. The combined support of the two kinds of rules provides a natural fit for the types of protection requirements examined in real-world scenarios known to the partners.
  • Declarative: the language also has a simple declarative form, making it easy to use for nonspecialists in the field.
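The interplay between authorizations and restrictions can be illustrated with a minimal sketch. The article does not show the syntax of the FASTER language, so the sketch below uses plain Python predicates instead; the names Request, Rule and is_granted, the metadata keys, and the list of countries are all hypothetical. It also assumes one plausible reading of the two rule kinds: an access is granted when at least one authorization holds (sufficient condition) and every restriction holds (necessary condition). The example encodes the census rule mentioned above, plus an agreement-acceptance restriction as a dynamic condition.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Illustrative subset only; not an authoritative list of EU members.
EU_COUNTRIES = {"DK", "FR", "IE", "IT", "NL"}


@dataclass
class Request:
    user: dict   # requester profile metadata, e.g. {"citizenship": "IT"}
    data: dict   # data metadata, e.g. {"category": "census", "age_years": 25}
    context: dict = field(default_factory=dict)  # outcomes of dynamic conditions


@dataclass
class Rule:
    kind: str                             # "authorization" or "restriction"
    condition: Callable[[Request], bool]  # boolean expression over the request


def is_granted(request: Request, rules: List[Rule]) -> bool:
    """Assumed semantics: some authorization holds AND all restrictions hold."""
    auths = [r for r in rules if r.kind == "authorization"]
    restrictions = [r for r in rules if r.kind == "restriction"]
    some_authorization = any(r.condition(request) for r in auths)
    all_restrictions = all(r.condition(request) for r in restrictions)
    return some_authorization and all_restrictions


RULES = [
    # Authorization: EU citizens may access census data more than 20 years old.
    Rule("authorization", lambda q: q.data.get("category") == "census"
         and q.data.get("age_years", 0) > 20
         and q.user.get("citizenship") in EU_COUNTRIES),
    # Restriction: census data may be accessed only after accepting an agreement
    # (a dynamic condition whose outcome is recorded in the request context).
    Rule("restriction", lambda q: q.data.get("category") != "census"
         or q.context.get("agreement_accepted", False)),
]

old_census = {"category": "census", "age_years": 25}
italian = {"citizenship": "IT"}

# Authorization satisfied and agreement accepted: access is granted.
print(is_granted(Request(italian, old_census, {"agreement_accepted": True}), RULES))
# Authorization satisfied but the restriction is not: access is denied.
print(is_granted(Request(italian, old_census), RULES))
```

In this reading, adding an authorization can only widen access, while adding a restriction can only narrow it, which is why the two kinds of rules complement each other.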

The Access Control System has been implemented and integrated in the FASTER architecture, which has been developed in a collaboration between the Data Archive at Essex University (UK), the Information Technology Dept. of the University of Milan (Italy), the Norwegian Social Science Data Services (Norway), the Dansk Data Arkiv (Denmark), the CentraalBureau voor de Statistiek (Netherlands), the Central Statistics Office (Ireland), the Statistik Sentralbyra (Norway), and the Centre National de la Recherche Scientifique (France). The access control language and component developed at the University of Milan are being adopted by the project partners to express and enforce protection requirements on the data they make available.

The approach used to develop access control for web publishing is now being extended to support credentials and certified statements (instead of requiring them to be stored at the server as metadata) and policy composition. Policy composition refers to the controlled combination of access constraints independently specified by different authorities (eg, data respondent, publisher, producer, and privacy advocates and regulators).

FASTER project:
Security Group at the Information Technology Department, University of Milan:

Please contact:
Pierangela Samarati,
University of Milan, Italy