ERCIM News No.23 - October 1995 - INESC

Design Quality Metrics for Object-Oriented Software Systems

by Fernando Brito e Abreu

The adoption of the Object-Oriented (OO) paradigm is expected to help produce better and cheaper software. The main structural mechanisms of this paradigm, namely inheritance, encapsulation, information hiding and polymorphism, are the keys to fostering reuse and easing maintenance. However, the language constructs that support those mechanisms can be used more or less intensively, depending mostly on the designer's ability. We can therefore expect products of rather different quality to emerge, as well as different productivity gains. Advances in quality and productivity need to be correlated with the use of those constructs, so that use must be evaluated quantitatively if it is to guide OO design.

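The author proposed the MOOD (Metrics for Object Oriented Design) set, which includes the following metrics: Method Hiding Factor (MHF), Attribute Hiding Factor (AHF), Method Inheritance Factor (MIF), Attribute Inheritance Factor (AIF), Polymorphism Factor (PF) and Coupling Factor (CF).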

Each of those metrics refers to a basic structural mechanism of the object-oriented paradigm: encapsulation (MHF and AHF), inheritance (MIF and AIF), polymorphism (PF) and message-passing (CF). Each is expressed as a quotient. The numerator represents the actual use of one of those mechanisms in a given design. The denominator, acting as a normalizer, represents the hypothetical maximum achievable use of the same mechanism in the same design (i.e. considering the same number of classes and inheritance relations). As a consequence, these metrics are expressed as percentages, ranging from 0% (no use) to 100% (maximum use), and are dimensionless, which avoids the often misleading, subjective or "artificial" units that pervade the metrics literature.

Being formally defined, the MOOD metrics avoid subjectivity of measurement and thus allow replicability. In other words, different people at different times or places will obtain the same values when measuring the same systems. Each MOOD metric quantifies a distinct feature of an OO system. Experimental data analysis has also shown that the metrics are fairly independent of system size, which was one of our goals. Size independence allows inter-project comparison, thus fostering cumulative knowledge.

The MOOD metrics' definitions make no reference to specific language constructs. However, since each language has its own constructs that implement the OO mechanisms in more or less detail, bindings for several OO languages (e.g. C++, Eiffel) have already been defined. This expected language independence will broaden the applicability of the metric set by allowing comparison of heterogeneous system implementations.
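To make the "actual use over maximum possible use" idea concrete, here is a minimal sketch for two of the metrics on a toy design. The article does not give the formulas, so the ratios below are simplified assumptions that only follow the numerator/denominator description above; the `ClassInfo` structure, its field names and the four sample classes are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClassInfo:
    """Toy description of one class in a design (hypothetical structure)."""
    name: str
    defined_methods: int      # methods declared in this class
    inherited_methods: int    # methods inherited and not overridden
    # names of classes this class references other than through inheritance
    uses: set = field(default_factory=set)


def method_inheritance_factor(classes):
    """Assumed MIF: inherited methods over all methods available in the classes."""
    inherited = sum(c.inherited_methods for c in classes)
    available = sum(c.defined_methods + c.inherited_methods for c in classes)
    return inherited / available if available else 0.0


def coupling_factor(classes):
    """Assumed CF: actual class-to-class couplings over the maximum possible."""
    n = len(classes)
    names = {c.name for c in classes}
    # count only references to other classes that belong to the design
    actual = sum(len((c.uses & names) - {c.name}) for c in classes)
    maximum = n * (n - 1)  # every class coupled to every other class
    return actual / maximum if maximum else 0.0


design = [
    ClassInfo("Shape", defined_methods=4, inherited_methods=0),
    ClassInfo("Circle", defined_methods=2, inherited_methods=3, uses={"Point"}),
    ClassInfo("Square", defined_methods=2, inherited_methods=3, uses={"Point"}),
    ClassInfo("Point", defined_methods=3, inherited_methods=0),
]

mif = method_inheritance_factor(design)
cf = coupling_factor(design)
print(f"MIF = {mif:.1%}, CF = {cf:.1%}")
```

Because both numerator and denominator are counts taken from the same design, the results are dimensionless and always fall between 0% and 100%, whatever the size of the system.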

The MOOD metric set makes it possible to express some recommendations for designers. An Electronic Engineering analogy ("the filters' metaphor") was used to represent our design heuristics. Theoretically, a high-pass filter is not expected to affect signal frequencies above a certain value (the cutoff frequency). Below that value, the filter attenuates the signal. By analogy, a high-pass heuristic is one that suggests a lower limit for a given metric. Going below that limit is a hindrance to the resulting software quality. For those who do not like thresholds, the analogy is even better once we realise that "real" filters do not have them. Indeed, their response is not a step but a curve with a steeper slope near the cutoff zone. Resulting software quality characteristics are likewise expected to be strongly attenuated (or increased, depending on the direction) as we reach the cutoff values. The reasoning for a band-pass heuristic is similar, except that there are two cutoff zones (a lower and a higher one).
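The filters' metaphor can be sketched as a smooth scoring function. This is only an illustration: the logistic curve, the `steepness` parameter and the cutoff values used in the example calls are assumptions chosen for the sketch, not thresholds derived by the author.

```python
import math


def attenuation(value, lower=None, upper=None, steepness=40.0):
    """Return a 0..1 'pass' score for a metric value in the range [0, 1].

    lower only -> high-pass heuristic (values below `lower` are penalised)
    upper only -> the mirror case (values above `upper` are penalised)
    both       -> band-pass heuristic with two cutoff zones
    A logistic curve stands in for the smooth response of a real filter.
    """
    score = 1.0
    if lower is not None:
        score *= 1.0 / (1.0 + math.exp(-steepness * (value - lower)))
    if upper is not None:
        score *= 1.0 / (1.0 + math.exp(steepness * (value - upper)))
    return score


# High-pass heuristic with a hypothetical lower cutoff of 20%
print(attenuation(0.05, lower=0.2))   # well below the cutoff: close to 0
print(attenuation(0.50, lower=0.2))   # well above the cutoff: close to 1

# Band-pass heuristic with hypothetical cutoffs at 20% and 60%
print(attenuation(0.40, lower=0.2, upper=0.6))  # inside the band
print(attenuation(0.80, lower=0.2, upper=0.6))  # above the upper cutoff
```

At the cutoff itself the score is exactly one half, and it changes fastest there, which mirrors the "curve with a steeper slope near the cutoff zone" rather than a hard step.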

MOODKIT, a simplified tool for extracting the MOOD metrics from source code, was developed and tested. Version 1.1 supports collection from C++ code. It was built using ANSI C and scripts with standard UNIX commands (awk, grep, find, etc.). It is worth mentioning that the first attempt to collect the MOOD metrics (for instance on the MFC library) was done manually. It took an effort of about two man-weeks (two persons during a full week), and it became clear that the collection process is a repetitive, tedious, time-consuming and expensive task for humans. By using MOODKIT, the effort to do the same job was cut to around 5% of the manual collection effort.

A validation experiment using MOODKIT V1.1 was carried out on a sample consisting of a collection of class libraries written in C++: Microsoft Foundation Classes (MFC) from Microsoft Corporation; GNU glib++ (GNU) from Free Software Foundation / Cygnus Support; ET++ library (ET++) from UBILAB / Union des Banques Suisses (Switzerland); NewMat library (NMAT) from Robert B. Davies (Victoria University - New Zealand); and MotifApp library (MOTIF) from Douglas A. Young (add-on to his book). The sample had almost 600 classes, around 10K methods and 164K LOC. Initial thresholds for triggering the designer's attention were derived in this experiment. For instance, if the Coupling Factor exceeds the upper cutoff region, the designer should be warned (assuming a design tool with embedded metrics capture is in use). The designer would then realise that the design lies outside the boundaries of good practice, and that the consequences are a reduction in encapsulation and potential reuse and an increase in complexity that will limit understandability and maintainability. Besides this outlier identification, the MOOD metrics can also help to decide among alternative design implementations by ranking them.

A beta-test version (V1.2) of the MOODKIT tool has already been released for public domain use. Professor Vic Basili's team at the University of Maryland (USA) is already using it in a validation experiment whose results will be published soon. A completely reengineered version 2 is being designed. Its core (metrics' definition dictionary, metrics' storage, human-machine interface) will be based on a language-independent central repository with storage, retrieval and graphical capabilities (Motif). It will use specific "stubs", based on language parsers, to capture metrics from source code in distinct OO languages. An Eiffel stub is being built, and a Smalltalk one is planned for the coming year.
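The split between language-specific stubs and a language-independent repository could look roughly like the sketch below. It is not the MOODKIT design: every class and method name here is hypothetical, and the "stub" is a deliberately naive line scanner standing in for a real parser.

```python
from abc import ABC, abstractmethod


class LanguageStub(ABC):
    """Hypothetical front end that extracts raw counts from one language."""

    @abstractmethod
    def extract(self, source: str) -> dict:
        """Return language-neutral counts for the given source text."""


class ToyCppStub(LanguageStub):
    """Naive stand-in for a parser-based C++ stub (assumption, not a parser)."""

    def extract(self, source: str) -> dict:
        lines = [ln.strip() for ln in source.splitlines()]
        return {
            "classes": sum(ln.startswith("class ") for ln in lines),
            "loc": sum(1 for ln in lines if ln),  # non-blank lines only
        }


class MetricsRepository:
    """Language-independent store keyed by system name."""

    def __init__(self):
        self._records = {}

    def store(self, system: str, counts: dict) -> None:
        self._records[system] = dict(counts)

    def retrieve(self, system: str) -> dict:
        return dict(self._records[system])


repo = MetricsRepository()
sample = "class A {\n};\n\nclass B : public A {\n};\n"
repo.store("demo", ToyCppStub().extract(sample))
print(repo.retrieve("demo"))
```

Under this arrangement, supporting another language would mean adding one more `LanguageStub` subclass, while the repository and anything built on top of it stay unchanged.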

Object-orientation is not "the" silver bullet but it seems to be the best bullet available today to face the pervasive software crisis. Keeping on the evolution track means we must be able to quantify our improvements. Metrics will help us to achieve this goal.

Please contact:
Fernando Brito e Abreu - INESC
Tel: +351 1 3100226
