In-Depth
Supporting Design by Contract in C++
Most object-oriented (OO) practitioners recognize the value of Design by Contract as a methodology for improving software quality. It is at the heart of the Eiffel language and of the Object Constraint Language (OCL) of UML. However, there are precious few languages that provide such intrinsic support for Design by Contract, and C++ is one language that does not. This article presents two new mechanisms for emulating Design by Contract in C++ that take advantage of language features defined in the recent standard, and for which widespread compiler support is now becoming available. One mechanism has been used in the development of software components in Computational Physics and is based on the Standard Template Library (STL); the other provides Design by Contract support across an inheritance hierarchy in C++ in accordance with the Liskov Substitutability Principle(LSP).1
Click here to download the Appendix to this article.
Contracts document the benefits and obligations of an agreement involving two parties. The document protects both parties. It protects the client by specifying the minimum that should be done and states that the client is entitled to receive a certain result. It protects the supplier by specifying how little is acceptable and that the contractor must not be liable for failing to carry out tasks outside of the specified scope. When things go wrong, the contract indicates where the blame lies.
Design by Contract relies on the programmer making assertions about the behavior of software, both at the level of individual function execution and object state, and also at higher levels. These assertions may be considered to form a contract between the software supplier and the software client.
There are three principal types of assertion in a software contract: postconditions, preconditions, and invariants. Functions must have desirable properties in order to attract clients, and these are documented in the function's postcondition. Functions are also able to state the conditions under which those desirable properties will be delivered; these are documented in the function's precondition. Properties that remain unaltered after function execution (although they may not be satisfied at some intermediate stage) are captured by invariants.
The notion of assertions was developed by Hoare,2,3 Floyd,4 and Dijkstra,5 after initial work (according to Hoare) by Alan Turing. Unfortunately, many programming languages are very limited in the support they provide for assertion checking, and C++ is one of them. However, C++ is a rich and diverse language supporting a multi-paradigm approach to design. The more recent features of the language provide new opportunities to add Design by Contract support. In this article, following a survey of previous suggestions, two techniques for supporting Design by Contract in C++ are outlined.
The first technique deals with the need to observe the Liskov Substitutability Principle when applying Design by Contract across an inheritance hierarchy. Out of necessity, it makes some rudimentary requirements on the declaration of the classes involved. The second technique, while being less general (it is intended for classes whose internal data structure is based on the containers of the STL), makes no requirements on the client. It can be extended to automatically generate (simply by instantiation) an object hierarchy that parallels the data sub-structure of an existing object, and can provide a platform from which to administer tasks such as constraint monitoring, object update, and object display.6
There is no consideration given to the issue of what is to be done once constraint violation is detected: an exception is thrown as a default action, and the opportunity to define what action should be taken is provided. This is not because the issue is insignificant, but because it is beyond the scope of the discussion.
Previous work on supporting Design by Contract in C++
There have been a number of different approaches to supporting Design by Contract in C++: use the existing language constructs to provide a support mechanism; use macros to support the inclusion and monitoring of constraints; effectively extend the syntax of the language by writing a PreProcessor that converts non-C++ constraint expressions into C++ before compilation; and proposing language extensions.
The Percolation pattern7 uses existing language constructs in a straightforward structure that yields the same checking as built-in assertion inheritance. The pattern relies on the redefinitions observing a protocol in which they incorporate calls to base class functions and observe the Liskov Substitutability Principle in doing so. The Liskov Substitutability Principle states the conditions under which the constraints on a class method of a derived class may be redefined. If the derived class is to be considered a subtype of the base class (and an instance of it can therefore be substituted wherever an instance of the base class occurs), the following must apply: preconditions get no stronger, invariants and postconditions get no weaker. The Percolation pattern requires discipline on the part of the programmer that can have implications for maintenance.
A simulation of the Eiffel mechanism using macros has been put forward by Welch and Strong.8 Macros are defined that serve the purpose of the corresponding Eiffel language constructs.
App.9 is an Annotation PreProcessor for C/C++ programs developed in Unix environments. It recognizes assertions that appear as annotations of the source text, written using extended comment indicators. It recognizes three assertion constructs: assume (preconditions), promise (postconditions), return (specifies a constraint on the return value of a function), and a fourth construct, assert, that specifies a constraint on an intermediate state of a function body. It also provides three constructs additional to C/C++ syntax to enhance the expressivity of constraints: all, some, and in provide universal quantification, existential quantification, and reference to the value of a variable prior to function execution respectively.
Gnu's Nana10 is a library that provides support for assertion checking and logging in a space- and time-efficient manner. The aim is to put common good practice into a library that can be reused. It was inspired by App. Assertion checking and logging code can be implemented using a debugger, rather than inline code with a large saving in code space. There is some STL support in that there is a special version of the macros involved that uses the iteration protocol to process containers.
Annotated C++11 was a research project undertaken by Marshall Cline and Doug Lea. A++ was intended to provide designers with a means to express a variety of semantic constraints and properties of C++ classes. It extends the base type system to support several forms each of preconditions, postconditions, assertions, and invariants, along with a richer set of "primitive" types and constructs that are useful for expressing such constraints.
Although exlC++12 is a language extension that bears a lot of similarity to much of the work discussed so far, the work of Porat and Fertig includes a number of new and interesting observations. Some of these are a result of taking the freedom to work outside of the existing language. Subsequent to the agreement of the ISO/ANSI C++ Standard, this decision means exlC++ is unlikely to make further contributions to the debate.
None of the approaches outlined save the user from having to explicitly deal with handling the consequences of inheritance on assertions. The approach outlined in this article overcomes this by using constrained genericity, utilizing new language features such as partial specialization and inner template classes.
Table 1. Characteristics of the different approaches to design by contract. |
|
Percolation (Binder) |
App (Rosenblum) |
A++ (Cline & Lea) |
exlC++ (Porat & Fertig) |
Nana (GNU) |
Macros (Welch & Strong) |
UNDESIRABLE |
|
|
|
|
|
|
Requires explicit coding to support the Liskov Substitutability Principle. |
Yes |
Yes |
|
|
Yes |
Yes |
Clutters the class namespace with additional names for precondition and
postcondition functions. |
Yes |
|
|
|
|
|
Can affect program execution (potentially has side-effects-"letting the
imperative fox into the applicative chicken coop"). |
Yes |
Yes |
Yes |
Yes |
Yes |
Yes |
Does not offer granularity of control over which assertions are executed. |
Yes |
Yes |
|
|
|
|
DEBATABLE |
|
|
|
|
|
|
Recognizes and supports the distinction between legality and coherence. |
No |
No |
|
No |
No |
No |
Groups semantic information |
No |
|
|
|
No |
No |
DESIRABLE |
|
|
|
|
|
|
Implementation readily available. |
|
No |
No |
No |
|
|
Provides postconditions with access to the return value of the function. |
No |
|
|
|
No |
No |
Provides postconditions with access to the state of the object at the
commencement of the function (the old value) without explicit coding. |
No |
|
|
|
No |
No |
Checking of the invariant is carried out in the precondition and
postcondition. |
No |
No |
|
|
No |
No |
Accommodates functions other than regular class member functions
(static members, friends, global functions). |
No |
|
No |
|
|
|
Supports arbitrary numbers of parameters to member functions without
requiring extension. |
No |
|
|
|
No |
No |
Takes advantage of non-strictness in the evaluation of preconditions
and postconditions. |
No |
No |
No |
No |
No |
No |
Table 1 summarizes the characteristics of the different approaches. Consideration of Table 1 permits general observations to be made, such as:
- Automatically providing postconditions with access to the return value of the function (or the old value of a data item) has only been possible with language extensions.
- It is not possible to provide full expressivity in the assertion-checking mechanism without using functions that may have side-effects on program execution ("letting the imperative fox into the applicative chicken coop").
- Automatically checking the invariant in conjunction with precondition and postcondition evaluation only makes sense in the OO schemes (i.e., those that were not originally C-based).
The contents of Table 1 as a whole serve as a wish list for any new attempt to provide support for Design by Contract in C++.
Supporting DBC across an inheritance hierarchy
This section develops generic classes that can be provided in library form, and that minimize the amount of effort required by a programmer to enjoy Design by Contract support across an inheritance hierarchy of arbitrary depth in C++. The effort involved when there is no intrinsic language support for Design by Contract is twofold: first, the same skeleton code structure must be rewritten every time a function needs to be predicated (see Figure 1).
Figure 1. Structure of a predicated function.
Figure 1 shows the execution of the body of the function must be preceded by code that checks the precondition and invariant, and must be followed by code that checks the postcondition and invariant. In addition, the current object value must be stored prior to execution of the body, and this value—along with the return value of the function—must be made available to the postcondition.
Secondly, repetitive effort is required in maintaining the discipline of checking effective rather than local constraints. Thus, for a function precondition, adherence to the LSP can only be assured by defining the effective precondition to be the disjunction of the local preconditions at each level of the hierarchy (see Figure 2).
Figure 2. Precondition disjunction.
This process becomes tedious and potentially error-prone for a large number of methods across various hierarchies. It can be made much more straightforward and shorter by using templates that rely on two ideas:
- The functionality required to provide precondition and postcondition monitoring can be encapsulated into a generic function object;
- A single primary template can be used to generate classes that parallel an inheritance hierarchy of arbitrary depth.
Defining a Function Object
To support assertion checking, the execution of a function needs to be preceded and followed by particular actions, and all three stages are interconnected. This interconnection is most apparent in that values that are available in the first two stages (the value of the object prior to function execution and the return value of the function) are required by the third stage. Hence, it is clear that some encapsulating of functionality is required, and this suggests the use of function objects.
To construct a suitable function object, the standard approach is to compare two or more examples of specific cases of functions that deal with all the stages of precondition and postcondition handling, and to create an abstraction by identifying similarities and differences. The template code captures the common features of predicated methods, while the differences are recognized in the template's type parameters.
Therefore, consider the two code examples in Listing 1. Clearly, the similarity lies in the steps that are performed and in the order in which they take place. The differences are in the types of the objects involved: first, the return type of the function; and second, the function's formal parameters. In addition, the class of which this function is a member constitutes a type parameter, because it determines the type of old. Since the number of formal parameters is variable, the total number of type parameters on which the function depends is variable. There is no mechanism in C++ to handle an arbitrary number of parameters in one template, so different templates will be required for each possible number of function parameters (within reason). The definition of the function object and associated operator() (..) for the case of a function with a single parameter is shown in Listings 2, 3, 4, and 5, along with an example of its use (the definition of the class Assertion is given in Appendix I).
In order to use the function object in the examples shown above, it is necessary to rename the original function. This function is called during the course of the execution of the function object if the precondition has been found to be satisfied; the code remains unaltered. This arrangement can easily be delimited by conditional preprocessing directives so that the checking can be turned off if desired and the original code executed in its original form.
The body of the method Complex::operator/(..) must declare an instance of the function object, passing pointers to the current object and the renamed function to the constructor, and the function object must then be called passing the function parameters. If brevity is paramount, it is possible to combine these two steps by providing additional function templates that take the parameters to the function object constructor and the parameters to the call of the function object. (This can be seen in the sample code in Maley.13)
A convention of naming the object the same as the function except for capitalizing the first letter is suggested (This of course assumes that the convention of using lower case for function names has been adopted in the first place). In addition, names of user-redefined, built-in operators cannot use the operator symbols, hence, the name OperatorDivide.
An alternative to renaming the original function is to define a member body(..) in the function object, and move the code of the body of the existing function to the body of the function body(..). However, referring to the example, since this function would not be a method of class Complex, it must be given access to the Complex object through a pointer, and the code must be adjusted accordingly (i.e., method calls must be prefixed by _object->). It may therefore be necessary to declare the function object class to be a friend of Complex if access to non-public members in the body of the function is required.
There is only one constraint on the instantiating types: the enclosing class must define the function invariant(..).
Specific predicates are defined by the client by providing an encoding of the pre(..), post(..) functions; the compiler automatically selects these specializations over the general definitions given in the primary template that are universally satisfied. These functions are provided with a means of accessing the object and the parameters of the method being called, and the postcondition is provided with the "old" object (the state of the object before function execution) and the return value of the function. The integrity of the conditions themselves may be subjected to improved integrity assurance.14 There is no way to ensure that the checking of constraints is side-effect free. The functions to be specialized are declared const (and therefore so must the specializations provided by the client), but it is the function object that cannot be changed, the actual object is referenced through the _object pointer, which could be const_casted by the client. Not all languages with Design by Contract support a built-in guarantee of freedom from side effects. For example, the Object Constraint Language of UML does, but Eiffel does not.
Using Genericity to Span an Inheritance Hierarchy
A single primary template can be used to generate classes that parallel a single inheritance hierarchy of arbitrary depth; for example, consider the generic class:
template
class DBC : public
DBC { /* .. */ };
It displays constrained genericity, in that the instantiating class
Derived must declare a name
(Base) in order for instantiation to succeed at compile time. If the name
Base is defined to be the name of the class
Derived is derived from, which can be done by using a
typedef, then the template defines a generic class that inherits from that instantiation of itself which is instantiated using the base class of the class that instantiates it.
Clearly, the construction requires a mechanism by which it can be "terminated" in the case of the root class. The root class is not derived from any other class, and the chain of inheritance comes to an end. This can be accomplished by introducing an additional type parameter Terminator into the definition:
template
class DBC : public DBC { /* .. */ };
and then defining a partial specialization of the primary template that will be used in the root case:
template
class DBC { /* .. */ };
The
Terminator parameter need only be included explicitly in one instantiation, where it is used to identify the root of the hierarchy: the default suffices in derived classes.
The generic class DBC<..> will be used as the basis for an inner template class of the function object template. It brings the name of the base class into the scope of the derived class, which facilitates the template definition, eliminates the need for much explicit coding, and shifts the work from the client to the template supplier.
The specific precondition and postcondition functions pre(..) and post(..) used in the definition of operator()(..) are replaced by effectivePre(..) and effectivePost(..), respectively. These functions evaluate the necessary disjunction and conjunction of local preconditions and postconditions, respectively. Likewise for invariants. Inspection of these functions in the templates in Appendix I shows that great care must be taken with scoping at intermediate levels in the inheritance hierarchy. The complete function object for any method may have an arbitrarily long chain of ancestors. In the primary template (denoted here as PT), a call to method pre(..) in the definition of effectivePre(..) might appear, for a method Bar(..) and an intermediate class Foo, to be a call to PT::Bar::pre(..). However, polymorphism dictates that if the class at the top of the hierarchy (the one that has no derivative) is Fum, then it is PT::Bar::pre(..) that is called. Hence, a scoping operator must be introduced into the definition.
The additional template parameter Top is included. Its default value is false, and the top of the hierarchy is indicated by including the value true for this parameter. Identifying the top of the hierarchy is necessary to handle the "old" object efficiently. The "old" object is referenced through a pointer in the base class of the function object, but this points to an object of the most derived type in the class hierarchy. Therefore, at each level in the hierarchy, an "old" object of the right type for that level can be accessed.
For a given method, the template client optionally provides definitions for the methods pre() and post() for each class in the hierarchy. When a function is called for a given object in the hierarchy, the body defined for objects of that class is executed; in addition, the full precondition and postcondition across the whole of the hierarchy from that class to the root is checked. (Obviously, if the precondition is not satisfied then the body may not, in fact, be executed).
In the combined template scheme, it is no longer necessary for the type of the current object to be a template parameter of the function object template (the outer class), because it is provided by the inner class.
In order to support class invariant, method precondition, and method postcondition monitoring across an inheritance hierarchy, it is necessary to provide a typedef in each class in the hierarchy to define the name "Base." This step brings the name of the base class into the scope of the derived class, making it accessible within the template. For all classes except the root of the hierarchy, the name "Base" is defined to be the class from which the class is derived. For the root class, it is defined to be the class itself. This distinction is vital when utilizing partial specialization to identify the root of the hierarchy.
Optionally, the following typedefs can be added to make function names shorter and clearer:
- A global typedef for each predicated method, which instantiates the outer template class with the types of the return value and the parameters of the predicated method.
- A typedef per class, per predicated method that instantiates the inner template class with the types in the inheritance hierarchy.
Distinguishing between methods with identical signatures
If two methods in the same hierarchy have the same signature, clients of the templates must be aware that they must differentiate between them. Consider, for example, a definition:
bool Derived2::MethodA::pre(..) { /* ... */ }
The use of the global
typedef
typedef ::DBC1 MethodA;
and the local
typedef ::MethodA::DBC MethodA;
in class
Derived213 mask the fact that this definition is actually
bool ::DBC1::DBC::pre(..) { /* ... */ }
which would be indistinguishable from the definition of another method in the same hierarchy with the same signature. Therefore, in such a case, it would be necessary to include an integer value in the type parameters of one instantiation of
DBC1<..> in order to make the distinction apparent, for example:
typedef ::DBC1 MethodC;
The
MethodNumber type parameter, which defaults to zero, is provided for this purpose.
Void functions
The current state of compiler technology means different solutions to the problem caused by functions with no return value must be considered, and the appropriate templates for the capabilities of the compiler chosen. In the standard template definitions, the type parameter ReturnType is used to declare the type of the parameter used to pass the return value of the function to the postcondition, in the form const ReturnType&. However, if ReturnType is instantiated to void, this declaration when instantiated is illegal.
There are a number of ways around this. Ideally, a specialization of the templates is written that the compiler can automatically select whenever a function with no return value is declared as a function object. The return value parameter can be eliminated from the declaration of post(..) in this specialization. However, not all compilers will recognize the specialization as valid. The alternatives involve defining the templates in terms of a pointer to the return value instead of to the value itself.
Point of declaration
It is possible to declare the supporting function objects in the class, rather than in the class method (and not declare the class methods at all). This can reduce considerably the amount of housekeeping code that needs to be written. However, it does have drawbacks. First, polymorphic behavior is lost unless wrapper methods are also declared, since function objects cannot be declared virtual. Second, all class constructor definitions must be amended to ensure that the object pointer is passed to each of the declared function objects.
Multiple inheritance
It is possible to extend this scheme to multiple inheritance hierarchies, and the best way to accomplish this is still being investigated. The definition of the name "Base" in each class is no longer useful, since methods no longer share a common heritage. As a result, some of the declarations are lengthier, as function objects must be instantiated with essentially their entire ancestry. In each case, this is a chain of single inheritance, but the chain is not common to all methods. As with the practical limit on the number of formal parameters, the template writer must impose a practical limit on the depth of the inheritance lattice.
Supporting the STL
Many data types are now based on hierarchies of containers, and the containers are provided by the STL.15 In such cases, there need be no requirements on the client for DBC support to be provided. Monitoring constraints on these types can be handled by extending the capabilities of the STL containers. In brief, this is done by creating new containers that inherit from the originals in a new namespace. Support for verifying preconditions and postconditions on the operation of functions is provided by redefining the operations so that in the new operation, the precondition is checked prior to a call to the original operation, and similarly the postcondition is checked afterwards. Invariants can be handled in a straightforward manner, as each generic container definition can include code that calculates the conjunction of the invariants of the elements it contains, together with its own invariant (ordering in a vector, for example).
There are really two separate cases to consider for preconditions and postconditions. First, there is the case of preconditions and postconditions that are inherent to a container operation because of the properties of the container. For example, it is an error to try to erase an element from an empty vector, so there should be a precondition on erasing from a vector that states the iterator parameter lies in the range of iterators of the vector. Second, there are the preconditions and postconditions that are specific to the instantiation of an operation for a particular type. Both can be accommodated.
Each container operation for which a client may wish to define a precondition or postcondition (i.e., those that modify the container; no client is likely to want to stipulate a specific precondition for a query operation such as size()) is redefined within the namespace 'stdpp' to check conditions that can be redefined for specific type instantiations by the client. For example, methods pre_erase(..) and post_erase(..) are declared in the redeclaration of vector (see Listing 6). This leads to the definition of stdpp::vector::erase(iterator) as shown in Listing 7.
The client then has the option of supplying specific definitions of stdpp::vector::pre_erase() const or stdpp::vector::post_erase() const as necessary. Clearly, granularity of control over which checks are employed can be introduced.
One potential issue concerns versions of the STL. There is no recognized canonical STL implementation, and even if there were, there is no guarantee it could not be superseded. The library adaptation described above works with both the HP and the SGI implementations of the STL.
Conclusion
This work has been the result of endeavoring to apply modern software engineering techniques to problems in Computational Physics. Formal specifications of the problems were written in order to state precisely and unambiguously the functionality of the components.
While it was desired to make maximum use of existing code and libraries in the implementation of the component, it was recognized that libraries such as the STL only provided support for the statement of data structure given in the specification, not for the monitoring of the stated constraints on that data. Similarly, the generic algorithms of the STL provide an immediate implementation of some of the operations defined in the specification, but no mechanism for verifying their behavior is that which is required.
It was therefore necessary to provide Design by Contract support for the chosen implementation language, C++, which has been done by taking advantage of new language features that reduce the effort required by the programmer to a minimum. These new features constitute part of the multiparadigm design capability of C++.
The genericity paradigm was employed in tandem with one of the most basic techniques of analysis—identifying similarities and differences, and creating abstractions—to develop templates that capture the common features of predicated methods, while the differences are recognized in the template's type parameters; constraints on which types can instantiate a given template indicate that some of the commonality that has been identified is not embedded in the code of the algorithm, but in the type requirements. This approach results in a significant amount of implementation simply by declaration.
Referring back to the table of desirable and not-so-desirable characteristics of Design by Contract support mechanisms (Table 1), the templates shown provide most of the significant players in the wish list and, with minor adaptations could provide more if required. The one desirable characteristic that cannot be ensured is that checking of constraints is side-effect free.
References
- Liskov, Barbara H. and Jeannette M. Wing. "A Behavioral Notion of Subtyping," ACM Transactions on Programming Languages and Systems, 16(6):1811–1841, Nov. 1994.
- Hoare, C.A.R. "An Axiomatic Basis for Computer Programming," Communications of the ACM, 12(10):576–583, Oct. 1969.
- Hoare, C.A.R. "Proof of Correctness of Data Representations," Acta Informatica, 1(4):271–281, 1972.
- Floyd, Robert F. "Assigning Meanings to Programs," Proceedings of the American Mathematical Society Symposium in Applied Mathematics, Volume XIX, 19–31, 1967.
- Dijkstra, Edsger W. A Discipline of Programming, Prentice Hall, Englewood Cliffs, New Jersey, 1976.
- Maley, David and Ivor Spence. "Emulating Design by Contract in C++," TOOLS 29, IEEE.
- Binder, Robert V. Testing Object-Oriented Systems: Models, Patterns and Tools, Addison–Wesley, Reading, MA, 1999.
- Welch, David and Scott Strong. "An Exception-Based Assertion Mechanism for C++," JOOP, July/Aug 1998.
- Rosenblum, David. "A Practical Approach to Programming with Assertions," IEEE Transactions on Software Engineering, 21(1), Jan. 1995.
- See http://www.gnu.org/software/nana/manual/nana.html.
- Cline, Maurice and Doug Lea. "Using Annotated C++," Proceedings of C++ at Work, Sept. 1990.
- Porat, Sara and Paul Fertig. "Class Assertions in C++," Journal of Object-Oriented Programming, 8(2):30–37, May 1995.
- See www.stmarys-belfast.ac.uk/~d.maley/dbc.cpp.
- Maley, David and Ivor Spence. "But Who Will Guard the Guardians?," TOOLS USA 2000, IEEE.
- Plauger, P. J., Ed., et al. C++ Standard Template Library, Prentice Hall, Englewood Cliffs, NJ, 1995.