Web Images and Blind Users

ERCIM News No.41 - April 2000

by Zdeněk Mikovec, Martin Klíma, Dušan Pavlica and Pavel Slavík

The Blind Information System (BIS) is a system for interpreting graphical information for blind users, designed and developed at the Czech Technical University. The project is developing an XML-based picture description methodology that characterizes pictures and charts by their structure. It also looks at the objects and the relations among them, and then makes this information accessible to blind users.

User-friendly communication with computers is a very important issue. The development of user interfaces has been at the centre of attention of the software industry in the last few years, and technological progress keeps bringing new methods of communication into everyday use. The past 15 years were characterised by intensive development of graphical user interfaces, which increased the user-friendliness and thus the efficiency of communication with computers. This allowed a wider audience to use computers. Nevertheless, one user group has been seriously affected by the introduction of graphical user interfaces: blind users.

The amount of graphically presented information is steadily increasing, and very often the accompanying text cannot be understood without perceiving the attached image. Blind users therefore need access to graphical information in order to understand what is presented. The solution is to create an alternative textual description that blind users can work with. The easiest way is to create a list of the objects that the picture consists of, which the user can search or perform other operations on. This approach, however, does not give the user any structural information about the relations among the objects in the picture.

The aim of our BIS (Blind Information System) was to create a web tool that would allow the user to browse graphical information. The basic idea is to create an alternative picture description in textual form that would be appended to an optional section of image formats such as GIF or PNG.
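As a rough illustration of storing a description in an optional section of an image file, the sketch below inserts a PNG `tEXt` chunk and reads it back using only the Python standard library. The article does not say which chunk type or keyword BIS uses; the choice of `tEXt`, the keyword `Description`, and all function names here are assumptions made for the example.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (type, data) for every chunk in a PNG byte string."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        chunk_type = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        yield chunk_type, data
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def add_description(png: bytes, description: str,
                    keyword: str = "Description") -> bytes:
    """Return a copy of the PNG with a tEXt chunk inserted before IEND.

    Assumption: the description fits Latin-1, as tEXt requires.
    """
    payload = keyword.encode("latin-1") + b"\x00" + description.encode("latin-1")
    iend = make_chunk(b"IEND", b"")
    if not png.endswith(iend):
        raise ValueError("PNG does not end with an IEND chunk")
    return png[:-len(iend)] + make_chunk(b"tEXt", payload) + iend


def read_descriptions(png: bytes) -> dict:
    """Collect all tEXt chunks as a {keyword: text} dictionary."""
    found = {}
    for chunk_type, data in iter_chunks(png):
        if chunk_type == b"tEXt":
            key, _, value = data.partition(b"\x00")
            found[key.decode("latin-1")] = value.decode("latin-1")
    return found


def tiny_png() -> bytes:
    """Build a 1x1 grayscale PNG so the example is self-contained."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte + one black pixel
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))


tagged = add_description(tiny_png(), "<picture title='demo'/>")
print(read_descriptions(tagged))
```

Because decoders skip text chunks when rendering, an image tagged this way should still display normally in an ordinary viewer, while a dedicated reader can extract the text.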

This approach would allow the picture to be displayed by the standard tools that are an integral part of common browsers such as Netscape or Internet Explorer. With a browser module that lets the user browse the textual description, it is possible to obtain both kinds of information about the picture: its structure and the list of objects it contains.

Besides the ‘pure’ pictorial information, a picture also contains information of a semantic nature. This type of information covers relations like ‘a person talks to another person’, ‘a girl waves to a boy’, etc. Our system allows both types of description to be created and browsed. In both cases a hierarchical description of the picture is created. The appropriate formalism for creating such a description is a grammar that can describe both structures (the structural and the semantic view of the picture).

The description grammar was implemented in XML. The use of XML has several advantages. Because XML is a standard, the system is easily portable to different environments. Another advantage is that XML can be processed in web environments (XML was designed specifically for use on the Internet, and newly developed web applications are generally able to work with XML documents). The figure shows our approach to the picture perception process.

The system that allows the creation of picture descriptions and picture browsing has been successfully implemented and tested (for testing purposes the system is implemented as a stand-alone Java application). The response from our blind volunteers has been very positive: the approach allowed them to orient themselves in pictures relatively easily. It is important to note that the use of the system is limited to pictures that can be characterised by structure, objects and relations among them, since only these can be described by the formalism implemented.
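To make the idea of a hierarchical XML description concrete, here is a minimal sketch of what such a document and its browsing operations might look like. The element and attribute names (`picture`, `object`, `relation`, `subject`, `verb`, `target`) are invented for this illustration and are not the actual BIS grammar, which the article does not reproduce.

```python
import xml.etree.ElementTree as ET

# Hypothetical description: nesting expresses structure,
# <relation> elements express the semantic view.
DESCRIPTION = """
<picture title="Playground">
  <object id="girl" name="girl"/>
  <object id="boy" name="boy"/>
  <object id="house" name="house">
    <object id="door" name="door"/>
    <object id="window" name="window"/>
  </object>
  <relation subject="girl" verb="waves to" target="boy"/>
</picture>
"""


def list_objects(root):
    """Flat list of all object names, in document order."""
    return [obj.get("name") for obj in root.iter("object")]


def outline(node, depth=0):
    """Structural view: one indented line per object, nested by containment."""
    lines = []
    for child in node.findall("object"):
        lines.append("  " * depth + child.get("name"))
        lines.extend(outline(child, depth + 1))
    return lines


def relations(root):
    """Semantic view: each relation rendered as a short sentence."""
    names = {obj.get("id"): obj.get("name") for obj in root.iter("object")}
    return [
        f"{names[rel.get('subject')]} {rel.get('verb')} {names[rel.get('target')]}"
        for rel in root.iter("relation")
    ]


root = ET.fromstring(DESCRIPTION.strip())
print(list_objects(root))
print("\n".join(outline(root)))
print(relations(root))
```

The flat list corresponds to the simple object list discussed earlier, while the outline and the relation sentences carry the structural and semantic information that a plain list lacks. Text output of this kind could then be passed to a screen reader or speech synthesiser.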

Future work will be devoted to developing a plug-in version for Internet browsers (e.g. Netscape) and to extending the approach described into 3D. This means that scenes described in VRML could be both displayed in the normal way and investigated by means of textual queries about the scene's internal structure. In this way blind users will gain access to information on the web that is currently not available to them.

Links:
http://sgi.felk.cvut.cz/cgi-bin/toASCII/Research/bis/

Please contact:
Pavel Slavík - Czech Technical University
Tel: +420 2 2435 7480
E-mail: slavik@felk.cvut.cz