Semantic Web Emerges as Commercial-Grade Infrastructure for Sharing Data on the Web
by Eric Miller
In February 2004, the World Wide Web Consortium (W3C) announced final approval of two key Semantic Web technologies, the revised Resource Description Framework (RDF) and the Web Ontology Language (OWL). RDF and OWL are Semantic Web standards that provide a framework for asset management, enterprise integration and the sharing and reuse of data on the Web. These standard formats for data sharing span application, enterprise, and community boundaries - all of these different types of 'user' can share the same information, even if they don't share the same software.
This announcement marked the emergence of the Semantic Web as a broad-based, commercial-grade platform for data on the Web. The deployment of these standards in commercial products and services signals the transition of Semantic Web technology from what was largely a research and advanced development project over the last five years, to more practical technology deployed in mass market tools that enables more flexible access to structured data on the Web.
Semantic Web-enabled software using RDF and OWL includes:
- Content creation applications: Authors can connect metadata (subject, creator, location, language, copyright status, or any other terms) with documents, making the new enhanced documents searchable
- Tools for Web site management: Large Web sites can be managed dynamically according to content categories customized for the site managers
- Software that takes advantage of both RDF and OWL: Organizations can integrate enterprise applications, publishing and subscriptions using flexible models
- Cross-application data reuse: RDF and OWL formats are standard, not proprietary, allowing data reuse from diverse sources.
Many specific examples of commercial applications and enterprise scale implementations of these technologies are detailed in the RDF/OWL testimonial page, RDF Implementation and OWL Implementation pages (see Links).
How the Semantic Web Pieces Fit Together - XML, RDF and OWL
The design of the Semantic Web is more characteristic of Web Evolution than Revolution. The Semantic Web is made through incremental changes, by bringing machine-readable descriptions to the data and documents already on the Web. The XML, RDF and OWL standards enable the Web to be a global infrastructure for sharing both documents and data, which makes searching and reusing information easier and more reliable.
W3C's Semantic Web Activity builds on other work, such as that of W3C's XML and URI Activities. Its focus is to develop standards and technologies that use XML for syntax and URIs for naming, and that facilitate the sharing and reuse of data on the Web.
At the foundation, XML provides a set of rules for creating vocabularies that can bring structure to both documents and data on the Web. XML gives clear rules for syntax; XML Schemas then serve as a method for composing XML vocabularies. XML is a powerful, flexible surface syntax for structured documents, but imposes no semantic constraints on the meaning of these documents.
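The point that XML fixes syntax but not meaning can be illustrated with a small sketch. The two documents below are invented for illustration; both are well-formed XML and arguably state the same fact, yet nothing in XML itself says so, and a parser simply sees two unrelated tree shapes.

```python
import xml.etree.ElementTree as ET

# Two hypothetical XML encodings of the same fact: "A. Author wrote Example Title".
doc_a = '<book title="Example Title"><author>A. Author</author></book>'
doc_b = '<author name="A. Author"><wrote>Example Title</wrote></author>'

def shape(xml_text):
    """Return the element names and attribute names in document order."""
    root = ET.fromstring(xml_text)
    return [(element.tag, sorted(element.attrib)) for element in root.iter()]

# Both parse without error, but their structures have nothing in common.
same_structure = shape(doc_a) == shape(doc_b)
```

An application has to be told separately how each layout maps to the underlying statement, which is the gap RDF is meant to fill.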
RDF - the Resource Description Framework - is a standard way for simple descriptions to be made. What XML is for syntax, RDF is for semantics - a clear set of rules for providing simple descriptive information. RDF Schema then provides a way for those descriptions to be combined into a single vocabulary. RDF is integrated into a variety of applications including:
- library catalogs
- world-wide directories
- syndication and aggregation of news, software, and content
- personal collections of music, photos, and events.
In these cases, each uses XML as an interchange syntax and URIs for naming. The RDF specifications provide a powerful framework for supporting the exchange of knowledge on the Web.
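As a rough illustration of what a 'simple description' looks like, an RDF statement can be thought of as a subject, a property and a value, with URIs used for naming. The sketch below models this with plain Python tuples rather than an RDF library; the example.org URIs are hypothetical, while the property names are taken from the Dublin Core element set.

```python
# Illustrative only: RDF statements modeled as (subject, predicate, object) tuples.
DC = "http://purl.org/dc/elements/1.1/"        # Dublin Core element namespace
DOC = "http://example.org/reports/2004-q1"     # hypothetical document URI

triples = {
    (DOC, DC + "title", "Quarterly Report"),
    (DOC, DC + "creator", "http://example.org/staff/alice"),  # hypothetical URI
    (DOC, DC + "language", "en"),
}

def describe(graph, subject):
    """Collect every property of one subject as {predicate: sorted objects}."""
    description = {}
    for s, p, o in graph:
        if s == subject:
            description.setdefault(p, []).append(o)
    return {p: sorted(objects) for p, objects in description.items()}

report = describe(triples, DOC)
```

Because every statement has the same three-part shape, descriptions from a library catalog and a photo collection can sit in the same set without either application knowing about the other.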
RDF is a standard way for simple descriptions to be made; RDF Schema provides a way for those descriptions to be combined into a single vocabulary. What's needed next is a way to develop subject - or domain - specific vocabularies. That is the role of an ontology. An ontology defines the terms used to describe and represent an area of knowledge. Ontologies are used by people, databases, and applications that need to share subject-specific (domain) information - like medicine, tool manufacturing, real estate, automobile repair, financial management, etc. Ontologies include computer-usable definitions of basic concepts in the domain and the relationships among them. They encode knowledge in a domain and also knowledge that spans domains. In this way, they make that knowledge reusable.
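A minimal sketch of 'computer-usable definitions' follows, assuming a made-up real estate vocabulary under a hypothetical example.org namespace. It uses the rdf:type and rdfs:subClassOf terms from RDF and RDF Schema, and a few lines of plain Python stand in for what a real reasoner would do.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/realestate#"   # hypothetical vocabulary namespace

# The vocabulary: relationships among concepts in the domain.
vocabulary = {
    (EX + "Apartment", RDFS_SUBCLASS, EX + "Dwelling"),
    (EX + "Dwelling", RDFS_SUBCLASS, EX + "Property"),
}
# The data: one resource described with that vocabulary.
data = {(EX + "unit42", RDF_TYPE, EX + "Apartment")}

def superclasses(vocab, cls):
    """All classes reachable from cls by following rdfs:subClassOf."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for s, p, o in vocab:
            if s == current and p == RDFS_SUBCLASS and o not in found:
                found.add(o)
                stack.append(o)
    return found

def types_of(graph, vocab, resource):
    """Asserted types plus the types implied by the class hierarchy."""
    asserted = {o for s, p, o in graph if s == resource and p == RDF_TYPE}
    inferred = set(asserted)
    for cls in asserted:
        inferred |= superclasses(vocab, cls)
    return inferred

unit_types = types_of(data, vocabulary, EX + "unit42")
```

The data only says that unit42 is an Apartment; the shared vocabulary lets any application conclude that it is also a Dwelling and a Property.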
OWL - the Web Ontology Language - provides a language for defining structured, Web-based ontologies, delivering richer integration and interoperability of data among descriptive communities. Earlier languages have been used to develop tools and ontologies for specific user communities (particularly in the sciences and in company-specific e-commerce applications), but they were not defined to be compatible with the architecture of the World Wide Web in general, and the Semantic Web in particular.
OWL builds on RDF and RDF Schema to add the following capabilities to ontologies:
- ability to be distributed across many systems
- scalability to Web needs
- compatibility with Web standards for accessibility and internationalization
- openness and extensibility.
OWL also adds more vocabulary for describing properties and classes, including relations between classes (e.g. disjointness), cardinality (e.g. 'exactly one'), equality, richer typing of properties, characteristics of properties (e.g. symmetry), and enumerated classes.
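To give a feel for three of these constructs (symmetry, 'exactly one' cardinality and disjointness), here is a small sketch in plain Python. The family vocabulary and its example.org namespace are hypothetical, and the checks are written as simple closed-world validations, which is a simplification: OWL reasoners work under an open-world assumption and would draw inferences rather than report missing values.

```python
EX = "http://example.org/family#"   # hypothetical namespace

facts = {
    (EX + "alice", EX + "marriedTo", EX + "bob"),
    (EX + "alice", EX + "hasBirthMother", EX + "carol"),
}
symmetric_properties = {EX + "marriedTo"}                   # cf. owl:SymmetricProperty
exactly_one = {EX + "hasBirthMother"}                       # cf. owl:cardinality of 1
disjoint_classes = {(EX + "Person", EX + "Organization")}   # cf. owl:disjointWith

def apply_symmetry(graph, symmetric):
    """Add (o, p, s) for every (s, p, o) whose property is symmetric."""
    return graph | {(o, p, s) for s, p, o in graph if p in symmetric}

def cardinality_violations(graph, subjects, properties):
    """List (subject, property, count) where count is not exactly one."""
    problems = []
    for subject in sorted(subjects):
        for prop in sorted(properties):
            values = {o for s, p, o in graph if s == subject and p == prop}
            if len(values) != 1:
                problems.append((subject, prop, len(values)))
    return problems

def disjointness_violations(types_by_resource, disjoint_pairs):
    """Resources typed with both members of a disjoint class pair."""
    return sorted(
        resource
        for resource, types in types_by_resource.items()
        if any(a in types and b in types for a, b in disjoint_pairs)
    )

expanded = apply_symmetry(facts, symmetric_properties)
```

Declaring marriedTo symmetric means the statement about bob never has to be written down; it follows from the one about alice.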
The W3C Web Ontology Working Group brings together industrial and academic expertise, lending the depth of research and product implementation experience necessary for building a robust ontology language system. OWL is based on the DAML+OIL language, which was developed by an international team funded by the US Defense Advanced Research Projects Agency (DARPA) and the European Commission's Information Society Technologies program. The release of the OWL recommendations represents the maturation of this previous work, shaped by the members of the World Wide Web Consortium.
Future Work - W3C launches phase 2 of Semantic Web Activity
With the recent publication of the revised RDF and the new OWL specifications, we are seeing a growing number of application developers applying these technologies in new and innovative application areas. In March 2004, the W3C Membership approved two new Working Groups, the Semantic Web Best Practices and Deployment Working Group and the RDF Data Access Working Group, to facilitate this development and ease the sharing of data located across distributed collections.
The newly chartered Semantic Web Best Practices and Deployment Working Group is focused on providing hands-on support for developers of Semantic Web applications. This Working Group will help application developers by providing them with 'best practices' in various forms, ranging from engineering guidelines and ontology/vocabulary repositories to educational material and demo applications.
The RDF Data Access Working Group will evaluate the requirements for a query language and network protocol for RDF, and define formal specifications and test cases for supporting such requirements. A recommended query language will reduce redundancy and enhance interoperability, as SQL did for relational databases, and help make it as easy to 'join' data on the Web as it is to merge tables in a local relational database.
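The 'join' idea can be sketched without any particular query language. In the toy example below, which is not the language the Working Group is chartered to define, two independently published sets of statements are merged by plain set union and then matched against a pattern containing variables. The `match` helper, the '?' variable convention and all example.org names are assumptions made for illustration.

```python
EX = "http://example.org/terms#"        # hypothetical vocabulary
BOOK = "http://example.org/book/1"      # hypothetical resources
PERSON = "http://example.org/person/7"

# Two sources that were never designed to work together.
library = {(BOOK, EX + "author", PERSON)}
directory = {(PERSON, EX + "name", "A. Author")}

def match(graph, patterns, binding=None):
    """Yield variable bindings that satisfy every pattern.

    Terms beginning with '?' are variables; all other terms must match exactly.
    """
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or reject if it is already bound elsewhere.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from match(graph, rest, candidate)

merged = library | directory            # merging graphs is just set union
query = [("?book", EX + "author", "?who"), ("?who", EX + "name", "?name")]
results = list(match(merged, query))
```

The shared variable `?who` plays the role of a join key: because both sources name the person with the same URI, the merge needs no schema mapping step.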
With these two new Working Groups chartered through January 2006, the W3C Semantic Web Activity will continue to foster many more developments and, through the Semantic Web Interest Group, investigate additional areas of standardization that will strengthen the Semantic Web.
Eric Miller is the W3C Semantic Web Activity Lead and a Research Scientist at MIT's Computer Science and Artificial Intelligence Laboratory.
Links:
Semantic Web Home Page: http://www.w3.org/2001/sw/
RDF Home Page: http://www.w3.org/RDF/
RDF Core Working Group Home Page: http://www.w3.org/2001/sw/RDFCore/
Web Ontology Working Group Home Page: http://www.w3.org/2001/sw/WebOnt/
ERCIM is the European host of W3C.