Evaluation in Interactive Safety-Critical Contexts

ERCIM News No.46, July 2001

by Fabio Paternò and Carmen Santoro

Interactive safety-critical applications have specific requirements that traditional evaluation techniques cannot completely capture. Scientists in the Human-Computer Interaction group of CNUCE-CNR have developed a method that performs a systematic, inspection-based analysis of a system prototype and the related task model, aimed at improving both the usability and the safety of an application. The goal is to evaluate what could happen when interactions and behaviours occur differently from what the system design assumes.

Research on model-based design and evaluation of interactive applications aims at identifying models to support design, development, and evaluation of interactive applications. In particular, task models describe activities that have to be performed in order to achieve a user’s goals, where a goal is a desired modification of the state of an application.

Various approaches have been studied to specify tasks. In our work we consider task models that have been represented using the ConcurTaskTrees notation and the associated tool freely available at http://giove.cnuce.cnr.it/ctte.html. In ConcurTaskTrees the activities are described hierarchically, at different abstraction levels, and represented graphically in a tree-like format, using a rich set of operators to describe different temporal relationships (concurrency, interruption, disabling, iteration, option and so on).
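As an illustration of the hierarchical structure just described, the following is a minimal sketch of a task tree in Python. It is not the ConcurTaskTrees notation or the tool's file format: the `Task` and `Operator` names, the rule of one operator per parent task, and all task names in the example tree are assumptions made for this sketch. The actual notation is richer.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Operator(Enum):
    # Temporal relationships named in the text; the real notation has more.
    CONCURRENCY = "concurrency"
    INTERRUPTION = "interruption"
    DISABLING = "disabling"
    ITERATION = "iteration"
    OPTION = "option"


@dataclass
class Task:
    name: str
    # Simplifying assumption: one operator describes how all subtasks relate.
    operator: Optional[Operator] = None
    subtasks: List["Task"] = field(default_factory=list)

    def is_basic(self) -> bool:
        """A basic task is a leaf of the hierarchy."""
        return not self.subtasks


def render(task: Task, indent: int = 0) -> List[str]:
    """Return the tree as indented text lines, one task per line."""
    label = task.name
    if task.operator is not None:
        label += f" [{task.operator.value}]"
    lines = ["  " * indent + label]
    for sub in task.subtasks:
        lines.extend(render(sub, indent + 1))
    return lines


# Hypothetical example tree; the task names are illustrative only.
tree = Task("Control ground traffic", Operator.CONCURRENCY, [
    Task("Monitor traffic", Operator.ITERATION, [
        Task("Check deviation"),
    ]),
    Task("Send message to pilot"),
])

print("\n".join(render(tree)))
```

Each level of indentation in the printed output corresponds to one abstraction level, and the tasks without subtasks are the units on which a task-level analysis would start.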

Task models can also be useful in supporting design and evaluation of interactive safety-critical applications. The main goal of such applications is to control a real-world entity, fulfilling a number of requirements while preventing the entity from reaching hazardous states. Many examples of safety-critical systems exist in real life (air traffic control, railway systems, industrial control systems) and a number of crucial issues arise. One example is when a user action cannot be undone; in this case, designing the user interface so that it copes appropriately with user errors acquires special importance.

The Method
In our work, we consider the system task model: how the design of the system assumes that tasks should be performed. The goal is to identify the possible deviations from this plan. A set of predefined classes of deviations is identified by guidewords, which are words or phrases referring to a specific type of abnormal system behaviour. Interpreting the guidewords in relation to a task allows the analyst to systematically generate ways in which the task could deviate from the expected behaviour, to analyse the impact on the current system, and to generate suggestions on how to improve the current design. The method consists of three steps:

1. Development of the task model of the application considered, in order to identify how the system design requires that tasks be performed.
2. Analysis of deviations related to the basic tasks; the basic tasks are the leaves of the hierarchical task model, that is, tasks that the designer deems should be considered as units.
3. Analysis of deviations in high-level tasks; these tasks allow the designer to identify groups of tasks and consequently to analyse deviations that involve more than one basic task.

It is important that the analysis of deviations be carried out by interdisciplinary groups where such deviations are considered from different perspectives. The analysis follows a bottom-up approach (first basic tasks, and then high-level tasks are considered) allowing designers initially to focus on concrete aspects and then to widen the analysis to more logical steps.
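The bottom-up order can be made concrete with a short sketch. It assumes a task model stored as nested `(name, subtasks)` tuples, which is a representation chosen only for this example, and the task names are hypothetical. Basic tasks come first, followed by high-level tasks, with each high-level task listed after the groups it contains.

```python
from typing import List, Tuple

# Assumed representation for this sketch: (task name, list of subtasks).
TaskNode = Tuple[str, list]


def analysis_order(root: TaskNode) -> List[str]:
    """List basic tasks first, then high-level tasks from the bottom up."""
    basic: List[str] = []
    high_level: List[str] = []

    def visit(node: TaskNode) -> None:
        name, subtasks = node
        for sub in subtasks:
            visit(sub)
        # Post-order: a parent is recorded only after all of its subtasks.
        (high_level if subtasks else basic).append(name)

    visit(root)
    return basic + high_level


# Hypothetical task model; the names are illustrative only.
model: TaskNode = ("Control ground traffic", [
    ("Monitor traffic", [
        ("Check deviation", []),
        ("Read flight label", []),
    ]),
    ("Send message to pilot", []),
])

for position, task in enumerate(analysis_order(model), start=1):
    print(position, task)
```

With this ordering the team has already discussed every basic task by the time it reaches the high-level task that groups them.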

We investigated the deviations associated with the following guidewords:

None: the unit of analysis has not been performed or it has been performed but without any result. This is decomposed into three types of deviation: lack of input, missing task performance, missing result.
Other than: the tasks considered have been performed differently from the intentions specified in the task model. Three sub-cases can be distinguished (less, more or different) and each can refer to the analysis of the input, performance or result of a task.
Ill-timed: the tasks considered have been performed at the wrong time; we distinguish between early and late performance with respect to the planned activity.
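The three guidewords and their sub-cases expand into a fixed checklist that can be applied to every task. The sketch below enumerates it: three cases for None, three by three for Other than, and two for Ill-timed, giving fourteen in total. The function names and the label strings are assumptions of this example, not terminology fixed by the method.

```python
from itertools import product
from typing import List, Tuple

NONE_CASES = ("lack of input", "missing task performance", "missing result")
OTHER_THAN_KINDS = ("less", "more", "different")
ASPECTS = ("input", "performance", "result")
ILL_TIMED_CASES = ("early", "late")


def deviation_classes() -> List[Tuple[str, str]]:
    """Return every (guideword, sub-case) pair described in the text."""
    classes = [("None", case) for case in NONE_CASES]
    classes += [
        ("Other than", f"{kind} {aspect}")
        for kind, aspect in product(OTHER_THAN_KINDS, ASPECTS)
    ]
    classes += [("Ill-timed", case) for case in ILL_TIMED_CASES]
    return classes


def checklist(tasks: List[str]) -> List[Tuple[str, str, str]]:
    """Pair each task with each deviation class to be discussed."""
    return [(task, word, case)
            for task in tasks
            for word, case in deviation_classes()]


print(len(deviation_classes()))
for row in checklist(["Check deviation"])[:3]:
    print(row)
```

A checklist of this kind only prompts the discussion; deciding whether a given deviation is plausible, and what it would cause, remains the analysts' job.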

For each task analysed the following information can be stored in a table: Task; Guideword; Explanation; Causes; Consequences; Protection; Recommendation. The explanation is classified in terms of which phase of the interaction cycle (according to Norman’s model of intention, action, execution, perception, interpretation and evaluation) can generate the problem.

For example, consider the Check deviation task (the user checks whether aircraft are following the assigned path in the airport) and the class of deviation None. The No input case (lack of information about the current state of the aircraft) can have multiple causes (controller distracted, system fault) but the same consequence: the controller does not have an updated view of the air traffic. The No performance case can occur, for example, when controllers do not correctly interpret the displayed information on the traffic, whereas in the No output case the controllers find a deviation but forget about it because they are immediately interrupted by another activity.
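The table row for this example could be recorded as follows. This is a sketch only: the `DeviationRecord` class, its `to_row` method and the separator are assumptions of this example, and the columns follow the table headings listed earlier. Cells that the example above does not specify, such as Protection and Recommendation, are deliberately left empty rather than guessed.

```python
from dataclasses import dataclass, field
from typing import List

COLUMNS = ("task", "guideword", "explanation", "causes",
           "consequences", "protection", "recommendation")


@dataclass
class DeviationRecord:
    task: str
    guideword: str
    explanation: str
    causes: List[str] = field(default_factory=list)
    consequences: str = ""
    protection: str = ""        # not stated in the article's example
    recommendation: str = ""    # not stated in the article's example

    def to_row(self) -> str:
        """Format the record as one table row; empty cells show as '-'."""
        cells = []
        for column in COLUMNS:
            value = getattr(self, column)
            if isinstance(value, list):
                value = ", ".join(value)
            cells.append(value or "-")
        return " | ".join(cells)


records = [
    DeviationRecord(
        task="Check deviation",
        guideword="None / no input",
        explanation="lack of information about the current state of the aircraft",
        causes=["controller distracted", "system fault"],
        consequences="controller does not have an updated view of the air traffic",
    ),
    DeviationRecord(
        task="Check deviation",
        guideword="None / no performance",
        explanation="displayed traffic information is not interpreted correctly",
    ),
    DeviationRecord(
        task="Check deviation",
        guideword="None / no output",
        explanation="a deviation is found but forgotten after an interruption",
    ),
]

for record in records:
    print(record.to_row())
```

The empty cells make visible what the analysis team still has to fill in during the session.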

For high-level tasks, the interpretation of each guideword has to be properly customised depending on the temporal relationships between its subtasks (a paper with details is available on request).

Experience
The method described was tested in a real case study in the European project MEFISTO (http://giove.cnuce.cnr.it/mefisto.html). CNUCE-CNR was the coordinator of this project, which has seven other partners: University of York, University of Toulouse, University of Siena, Alenia Marconi Systems, DERA, CENA and ENAV (the Italian association of air traffic controllers).

We applied the method to a prototype provided by Alenia Marconi Systems for air traffic control in an aerodrome. The main purpose was to support data-link communications handling aircraft movement, where data link is a technology allowing asynchronous exchanges of digital data coded according to a predefined syntax.

Controllers exchange messages with pilots by interacting with graphical user interfaces (see figure) that show the traffic within the airport in real time and use so-called enriched flight labels, which permanently display only essential flight information (standard mode) and show additional information interactively (selected mode).

The user interface of the evaluated prototype.

We carried out the exercise with a multidisciplinary team that included software developers, end users (air traffic controllers in this case) and experts in user interface design. During these sessions, many interesting issues arose that generated a list of suggestions for improving the user interface in order to better support usability and safety.

Our experience has shown the effectiveness of the method despite some social constraints that often occur in software development enterprises (developers tend to defend every decision taken, users tend to digress, time pressure, etc.).

Link:
http://giove.cnuce.cnr.it/

Please contact:
Fabio Paternò or Carmen Santoro — CNUCE-CNR
Tel: +39 050 315 3066
E-mail: {F.Paterno, C.Santoro}@cnuce.cnr.it