Michael Ball is a Distinguished Engineer in the Developer Environments and Tools department of Sun Microsystems Inc., and also the principal representative on J16, the ANSI C++ Committee.
He can be contacted at [email protected].
Stephen Clamage is a Staff Engineer in the Developer Environments and Tools department of Sun Microsystems Inc., and chair of J16, the ANSI C++ Committee. He can be contacted at [email protected].
As we explained in our previous (April) column, last year's advent of the C++ Standard made the "C++ Oracle" somewhat redundant. We decided to let this Oracle, like those of old, disappear quietly into the dusk. But speaking of dusky things, C++ retains some pretty dark corners, which we'd like to illuminate in this new series of columns.
Although the corner of C++ we'll explore in this column isn't particularly dark, many programmers find it confusing. We'll look at ways to cope with dynamic memory management (new and delete) when converting old code to standard-conforming code, or when you must write portable code to work with a variety of compilers. We don't have space to go into all the gory details in this column. In particular, we'll save a discussion of how to write your own allocation and deallocation functions for another time.
First, let's have a quick review to eliminate some common sources of confusion.
MEMORY ALLOCATION REVIEW
You can create an object of type T dynamically by using a new-expression such as:
T* t = new T(args); // example 1
You can also allocate "raw" memory—uninitialized and untyped—by calling a version of the overloadable function operator new, as in:
void* p = operator new(sizeof(T)); // example 2
A new-expression causes a version of operator new to be called, followed by the invocation of the appropriate constructor for the type, along with any supplied arguments.
Under the original C++ rules in the ARM [1], an operator new returned a null pointer if memory could not be allocated. The predefined versions of operator new did not throw any exceptions.
The C++ Standard [2] introduces additional predefined forms of operator new and operator delete and modifies the way they work. One set of functions always reports failure by throwing an exception; it never returns a null pointer. Another set never throws an exception and reports failure by returning a null pointer. The default versions obtained by code as in example 1 throw an exception on failure, changing the behavior of old code. Whether this is good or bad depends on the code you must write and maintain. The C++ Committee agonized over this decision for about two years.
Under the rules in both the ARM [1] and the standard [2], there is one form of new-expression for single objects and another for arrays of objects. You can overload operator new with versions that take additional arguments; these are called placement-new forms. A special form of new-expression invokes those functions. For example:
T* q = new (location) T; // example 3
Under both sets of rules, you can create class member versions that will be invoked when a new-expression creates an instance of that class.
An operator new can fail when it is unable to allocate space, or possibly for additional reasons specific to your own version of the operator. Before reporting failure, it can invoke a new-handler function if one has been installed via the standard function set_new_handler. The new-handler can try to fix the condition that caused the failure, exit via an exception, or end the program.
MEMORY DEALLOCATION REVIEW
When you want to deallocate a heap object, you use a delete-expression such as:
delete t; // allocated in example 1
You can delete raw memory by calling an operator delete directly, as in:
operator delete(p); // allocated in example 2
A delete-expression first invokes the destructor for the object pointed to, then calls an operator delete to deallocate the memory. The memory must have been allocated by a matching operator new, and it must not have been deallocated already. The predefined versions do not throw exceptions.
Although the ARM [1] did not allow "placement" versions of operator delete, the standard [2] does. There is no way to invoke one directly from a delete-expression, but if a placement-new-expression exits via an exception from the constructor, the matching placement version of operator delete will be called automatically.
To get the inverse effect of a placement-new, you first call the destructor explicitly, then call the placement version of operator delete, supplying all its arguments. Example:
q->~T(); // destroy object allocated in example 3
operator delete(q, location);
ARRAY FORMS REVIEW
The focus of this column doesn't allow us to go into a lot of detail about array forms. Let's simply note that to allocate an array, you specify the dimension(s) as in a variable declaration. For example, to allocate an array of N objects of type T, you write:
T* a = new T[N];
In this form of allocation, you cannot specify constructor arguments, so the type must have a constructor that can be invoked without arguments.
To deallocate the array, you must use this form of delete-expression:
delete [] a;
The empty brackets tell the compiler that an array was allocated, which the compiler could not otherwise know when it sees only a pointer. Because the allocation functions for single objects and arrays can be separately overloaded or replaced (hence they might use different memory pools), it is essential to use the correct form of delete-expression—single object or array.
STANDARD ALLOCATION AND DEALLOCATION FUNCTIONS
The standard versions of operator new and operator delete are all declared in the standard header <new>, which looks like Listing 1.
Notice how the operators are in the global namespace and come in various pairs:
- One version of operator new can throw standard exception bad_alloc, and one throws no exception.
- One version of each operator is for single objects, and one is for arrays.
The placement versions also come in pairs, but none throw exceptions. These standard placement versions take a single extra argument: the address of memory that has already been allocated. The placement operator new does not allocate memory, but simply returns the pointer passed in. The placement operator delete does nothing.
You can replace the nonplacement versions of these functions with your own versions, or you can put overriding versions into classes. You must follow some complicated rules if you do so; we'll save them for another column.
EXCEPTIONS
If you do not specify a "nothrow" placement version of a new-expression, the operator new reports failure by throwing an exception. In that case, the remainder of the new-expression is bypassed, and control passes to the exception handler, if any.
If the allocation portion of any new-expression succeeds, but the constructor exits via an exception, ordinarily you would not be able to tell what happened or what sort of cleanup would be required. Consequently, the compiler and runtime system conspire to do the cleanup for you. In this case, any fully constructed parts are destroyed by their destructors, and the operator delete that matches the operator new is called. Control then passes to the exception handler, if any.
As always, if no handler exists for a thrown exception, the standard function terminate is called, which calls the standard function abort by default. You can replace the default version of terminate via standard function set_terminate.
EXCEPTION HANDLING: MATCHING new AND delete
When an operator delete must be called automatically due to an exception being thrown in a new-expression, the compiler determines which one should be called. If the new-expression includes the global scope qualifier, as in ::new T, the compiler looks in the global namespace for the operator delete. Otherwise, if the operator new is a member of class T, the compiler looks first in the scope of T for a matching operator delete, and next in the global namespace if one is not found.
If a placement new-expression is used, the compiler looks for a matching placement version of operator delete. Every operator new has a first parameter of type size_t, and every operator delete has a first parameter of type void*. The match is determined by the parameters following the first. If there is no matching operator delete, none is called for this case of dealing with exceptions.
MODIFYING YOUR CODE
As with many aspects of programming, the complications arise when you need to deal with errors; the straight-line processing is often simple. Suppose your original code did not check for allocation failure and looked like this:
T* t = new T(args);
// use t
You don't necessarily need to change this code. If allocation had failed, you would have tried to use a null pointer, resulting in some sort of program failure. Often the failure point is far removed from the allocation. Compiled with a standard-conforming implementation, this code will result in program termination at the point of the failed allocation, due to the bad_alloc exception not being caught. That result is probably no worse than the original result.
On the other hand, maybe you conscientiously checked for allocation failure, like this:
T* t = new T(args);
if( t == 0 ) {
// deal with failure
}
// use t
Compiled with a standard-conforming compiler, the code will abort on failure, and you will never get to your failure-handling code. You basically have two choices.
- Use the "nothrow" form of the new-expression. This approach is the simplest; the only change is a minor addition to the new-expression itself:
T* t = new (std::nothrow) T(args);
if( t == 0 ) {
// deal with failure
}
// use t
- Rewrite the code to catch exceptions. The change is syntactic, not structural:
T* t = 0;
try {
t = new T(args);
}
catch( const std::bad_alloc& ) {
// deal with failure
}
// use t
What about code that must work with both old and new compilers? We recommend strongly against littering your code with conditional compilation. Such code is very hard to maintain. Even in these simple examples, the result would be unreadable.
At the risk of being branded as heretics, let us say that it might make sense to ignore allocation failure. On some common platforms, it's unlikely that you'll run out of memory, but if it happens, there is nothing you can do about it. The out-of-memory situation renders the system too unstable to attempt any sort of graceful shutdown, or even to display an error message. Allowing the program to abort may be no worse than the attempt at a graceful exit.
Of course, there are situations in which it does make sense to check for a failed allocation. You can then write a function to do your allocation for you. Typically, you don't have a lot of different types being allocated in one program, so you won't have many different functions. In our example, we'd write two versions of one function. For concreteness, let's assume we want to pass two arguments of types T1 and T2 to the T constructor. We would then have pairs of alternative functions, something like this:
// for old compilers
T* create_T(T1 arg1, T2 arg2) throw()
{
return new T(arg1, arg2);
}
// for new compilers
#include <new>
T* create_T(T1 arg1, T2 arg2) throw()
{
return new (std::nothrow) T(arg1, arg2);
}
You can use separate files, one for old-style functions and one for new-style functions, then compile one or the other according to what your compiler supports. If you prefer to use one file with conditional code, you can put #if directives around the entire functions, or just around their bodies.
We would use the function like this:
#include "create.h" // prototypes
...
T* t = create_T(a1, a2);
if( t == 0 ) {
// deal with failure
}
// use t
If the failure-handling code is common to all uses, it can be moved inside the create_T function.
Why not use templates instead of writing a version of this function for each type? The template version of the create function takes a template type parameter T, but it has no function parameter of type T, so T cannot be deduced and must be supplied explicitly at the call. Any compiler that accepts such a template would have the new, not the old, semantics for new-expressions, and the create functions wouldn't be needed! If the compiler has the old semantics, it's unlikely it would support this new template feature.
References
1. Ellis, M. and B. Stroustrup. The Annotated C++ Reference Manual, Addison-Wesley, Reading, MA, 1990.
2. International Organization for Standardization. Programming Languages—C++, ISO/IEC publication 14882:1998, 1998.