ERCIM News No.35 - October 1998
XML and the World-Wide Web Consortium Leverage Action Project
by Brian Matthews
The World-Wide Web is based on some very simple technologies. In particular, the Hypertext Markup Language (HTML) is a simple language for describing documents. However, HTML is severely limited as an information management medium. HTML's mix of structure and presentation makes it hard to reformat data to give different views. Further, the lack of domain-specific data modelling in HTML has made accurate searching for information on the Web difficult and has made it hard to interact with databases. Thus the very features which led to the widespread acceptance of HTML are limiting the utility of the Web itself.
In response, the World-Wide Web Consortium (W3C) has developed the Extensible Markup Language (XML) (http://www.w3.org/XML/). XML is not intended as a replacement for HTML, but rather as a more flexible alternative for the representation of data across the Web. XML is intended to allow new data formats to be defined while maintaining the universality of HTML.
XML is based on the existing Standard Generalised Markup Language (SGML). The key concept brought to XML from SGML is that of a Document Type Definition (DTD). A DTD declares the correct markup structure for a class of XML documents, and individual documents can be validated against it. The logical structure of a class of valid documents is thereby defined, and applications can rely on that structure when manipulating a document.
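As a minimal sketch of this idea, the Python example below embeds a small DTD in a document and then checks the document against the same rules. The glossary vocabulary (`glossary`, `entry`, `term`, `definition`) is hypothetical and is not taken from any of the projects described in this article. Python's standard library parses the DTD declarations but does not validate against them, so the `conforms` function is a hand-written stand-in for what a validating parser would do.

```python
import xml.etree.ElementTree as ET

# Hypothetical glossary vocabulary, invented for illustration only.
# The internal DTD subset declares the allowed structure: a glossary
# holds one or more entries, each with a term followed by a definition.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE glossary [
  <!ELEMENT glossary (entry+)>
  <!ELEMENT entry (term, definition)>
  <!ELEMENT term (#PCDATA)>
  <!ELEMENT definition (#PCDATA)>
]>
<glossary>
  <entry>
    <term>DTD</term>
    <definition>Document Type Definition</definition>
  </entry>
</glossary>
"""


def conforms(root):
    """Hand-written check mirroring the DTD above (not a real DTD validator)."""
    # <!ELEMENT glossary (entry+)>: root is a glossary with at least one child.
    if root.tag != "glossary" or len(root) == 0:
        return False
    for entry in root:
        if entry.tag != "entry":
            return False
        # <!ELEMENT entry (term, definition)>: exactly these two, in order.
        if [child.tag for child in entry] != ["term", "definition"]:
            return False
        # (#PCDATA): term and definition hold text only, no child elements.
        if any(len(child) for child in entry):
            return False
    return True


root = ET.fromstring(DOCUMENT)
print(conforms(root))
```

A validating XML parser would derive these checks from the DTD itself rather than from hand-written code, which is what lets one set of declarations serve every application that handles the document class.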
XML can therefore be used to define new document markup which is closer to the intended use of the document, in a flexible yet universally interpretable way. DTDs can then be given for a wide variety of application domains and data formats.
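To illustrate why markup that follows the intended use of the data is helpful, the following Python sketch takes one small XML document and produces two different views of it, as well as an exact lookup by field. The element names and the functions `lookup`, `as_html` and `as_text` are assumptions made for this example, not part of any W3C specification.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical domain-specific markup: element names describe what the
# data is, and say nothing about how it should be displayed.
DATA = """<glossary>
  <entry><term>XML</term><definition>Extensible Markup Language</definition></entry>
  <entry><term>SGML</term><definition>Standard Generalised Markup Language</definition></entry>
</glossary>"""

root = ET.fromstring(DATA)


def lookup(root, term):
    """Search by field: return the definition whose term matches exactly."""
    for entry in root.findall("entry"):
        if entry.findtext("term") == term:
            return entry.findtext("definition")
    return None


def as_html(root):
    """One view: an HTML definition list."""
    items = "".join(
        "<dt>{}</dt><dd>{}</dd>".format(
            html.escape(entry.findtext("term")),
            html.escape(entry.findtext("definition")),
        )
        for entry in root.findall("entry")
    )
    return "<dl>{}</dl>".format(items)


def as_text(root):
    """Another view of the same data: one plain-text line per entry."""
    return "\n".join(
        "{}: {}".format(entry.findtext("term"), entry.findtext("definition"))
        for entry in root.findall("entry")
    )


print(lookup(root, "SGML"))
print(as_html(root))
print(as_text(root))
```

Because the source document records only structure, both views and the lookup are derived from the same data without any reformatting of the original.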
The W3C-LA project, a collaboration between INRIA, RAL, and the W3C offices at SICS, GMD, CWI, and FORTH, has been exploring the use of XML in several different demonstrators:
- MathML - an XML-based method of representing mathematics across the Web, which has been implemented in the Amaya reference browser at INRIA
- Hyperglossaries - RAL has been collaborating with the Virtual Hyperglossary Group in using XML to provide glossary information for terms within Web documents
- RDF - RAL is producing a demonstration of the use of the W3C's XML-based metadata description language RDF within a workflow application
- SMIL - RAL, in collaboration with CWI, has produced a demonstrator of the vendor-neutral Synchronized Multimedia Integration Language, which uses the XML standard to transmit multimedia across the Web
- Schematic Graphics Markup Language - RAL has submitted a proposal to the W3C to provide an XML-based standard for representing graphical objects using a schematic representation, as opposed to a binary format. This proposal has been prototyped in the Amaya browser at INRIA
- XML Browser - Work is under way on a general XML browser built within Amaya
The common driving force behind these initiatives is the desire to transmit and present new kinds of information across the WWW in a flexible and open way. They also demonstrate the widely differing application domains to which XML can be applied, and its potential to enhance the capacity of the WWW. Further information on W3C and W3C-LA activities can be obtained by contacting the W3C at INRIA, or the W3C offices established at RAL, SICS, GMD, CWI, and FORTH.
Brian Matthews - CLRC
Tel: +44 1235 44 6648