The Assignment Operator Revisited

by Richard Gillam
Advisory Software Engineer, Text & International
IBM Center for Java Technology–Silicon Valley

If you think you know it all in the C++ world, it must mean you’re not talking to your colleagues very much. If I had any pretensions to knowing it all when I wrote my assignment-operator article ("The Anatomy of the Assignment Operator," C++ Report, Nov/Dec 1997), they didn’t last long afterwards.

The assignment-operator article drew a huge response, with a lot of people sending me corrections and disagreements of various kinds. The issues have been mounting up, so I thought maybe a follow-on article to discuss the issues would be appropriate.

The big mistake

One I heard about almost instantly from several people (and which I’m really glad I heard about before delivering a talk on this subject at C++ World) was a rather serious mistake. When I first did this article, I had the big "right answer" like this:

TFoo&
TFoo::operator=(const TFoo& that)
{
    if (this != &that) {
        TSuperFoo::operator=(that);
        TBar* bar1 = 0;
        TBar* bar2 = 0;

        try {
            bar1 = new TBar(*that.fBar1);
            bar2 = new TBar(*that.fBar2);
        }
        catch (...) {
            delete bar1;
            delete bar2;
            throw;
        }

        delete fBar1;
        fBar1 = bar1;
        delete fBar2;
        fBar2 = bar2;
    }
    return *this;
}

This was wrong, and it was caught in the review process. The problem here is that if you’re trying to transactionize the assignment, so that either all of it happens or none of it happens, this breaks that. If an exception occurs trying to new up bar1 or bar2, the TFoo part of the object won’t have changed, but the TSuperFoo part will have. The call to TSuperFoo::operator=() can’t go at the top of the function.

As I said, this was caught during the review process. So when the article ran, this example looked like this:

TFoo&
TFoo::operator=(const TFoo& that)
{
    if (this != &that) {
        TBar* bar1 = 0;
        TBar* bar2 = 0;

        try {
            bar1 = new TBar(*that.fBar1);
            bar2 = new TBar(*that.fBar2);
        }
        catch (...) {
            delete bar1;
            delete bar2;
            throw;
        }

        TSuperFoo::operator=(that);
        delete fBar1;
        fBar1 = bar1;
        delete fBar2;
        fBar2 = bar2;
    }
    return *this;
}

Unfortunately, that’s wrong too. The problem is we’re still hosed if TSuperFoo’s assignment operator can also throw an exception, which is a reasonable thing to expect. If we succeed in creating our TBar objects, but TSuperFoo::operator=() fails to create whatever he needs to (presumably he’s also transactionized), the object will correctly be left untouched, but we’ll leak the new TBars we created. So the right answer (he said sheepishly) is this:

TFoo&
TFoo::operator=(const TFoo& that)
{
    if (this != &that) {
        TBar* bar1 = 0;
        TBar* bar2 = 0;

        try {
            bar1 = new TBar(*that.fBar1);
            bar2 = new TBar(*that.fBar2);
            TSuperFoo::operator=(that);
        }
        catch (...) {
            delete bar1;
            delete bar2;
            throw;
        }

        delete fBar1;
        fBar1 = bar1;
        delete fBar2;
        fBar2 = bar2;
    }
    return *this;
}

The call to TSuperFoo::operator=() has to go inside the try. Notice that it goes after we create the new TBars. We want to make sure creating the TBars has succeeded before we call TSuperFoo::operator=() because TSuperFoo::operator=() might succeed, changing the object, and we only want to change the object if we can carry out the whole assignment operation.
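To see the ordering argument in action, here is a minimal runnable sketch of the final version. The bodies of TBar and TSuperFoo, the live-object counter, and the failNextAssign hook are all hypothetical stand-ins invented for illustration; only the shape of TFoo::operator=() comes from the listing above.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for the article's classes, just enough to run.
struct TBar {
    static int live;                      // TBar objects currently alive
    int value;
    explicit TBar(int v) : value(v) { ++live; }
    TBar(const TBar& other) : value(other.value) { ++live; }
    ~TBar() { --live; }
};
int TBar::live = 0;

struct TSuperFoo {
    static bool failNextAssign;           // test hook, not part of the article
    int base;
    explicit TSuperFoo(int b) : base(b) {}
    TSuperFoo& operator=(const TSuperFoo& that) {
        if (failNextAssign) {
            failNextAssign = false;
            throw std::runtime_error("TSuperFoo::operator= failed");
        }
        base = that.base;
        return *this;
    }
};
bool TSuperFoo::failNextAssign = false;

class TFoo : public TSuperFoo {
public:
    TFoo(int b, int v1, int v2)
        : TSuperFoo(b), fBar1(new TBar(v1)), fBar2(new TBar(v2)) {}
    ~TFoo() { delete fBar1; delete fBar2; }
    TFoo& operator=(const TFoo& that);
    int bar1() const { return fBar1->value; }
    int bar2() const { return fBar2->value; }
private:
    TFoo(const TFoo&);                    // not needed here; declared, never defined
    TBar* fBar1;
    TBar* fBar2;
};

TFoo& TFoo::operator=(const TFoo& that) {
    if (this != &that) {
        TBar* bar1 = 0;
        TBar* bar2 = 0;
        try {
            bar1 = new TBar(*that.fBar1);
            bar2 = new TBar(*that.fBar2);
            TSuperFoo::operator=(that);   // last step that can throw
        }
        catch (...) {
            delete bar1;
            delete bar2;
            throw;
        }
        delete fBar1;
        fBar1 = bar1;
        delete fBar2;
        fBar2 = bar2;
    }
    return *this;
}

// Forces the base-class assignment to fail and checks the outcome.
void demoFailedAssignment() {
    TFoo a(1, 10, 20);
    TFoo b(2, 30, 40);
    const int liveBefore = TBar::live;
    TSuperFoo::failNextAssign = true;
    bool threw = false;
    try { a = b; }
    catch (const std::runtime_error&) { threw = true; }
    assert(threw);
    assert(a.base == 1 && a.bar1() == 10 && a.bar2() == 20);   // untouched
    assert(TBar::live == liveBefore);                          // nothing leaked
}
```

Calling demoFailedAssignment() exercises the failure path: the target keeps its old state and the two temporary TBars are cleaned up by the catch block.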

One interesting consequence of this is that you can imagine a class with a fairly deep inheritance chain where every class up the chain has other objects it owns. You’d call an assignment operator low in the chain, it’d create the objects it needs, then it’d call its parent, which would create the objects it needs and call its parent, and so on up the chain. Eventually, all of the new objects would have been created and would be pointed to by temporary variables on the stack. Then, at the root level, the assignment would finally begin to be carried out, with objects being deleted and the object’s data members being changed as each function returned. So the allocations happen in one order and the assignments and deletions happen in reverse order, which feels kind of awkward at first glance, but it gets the job done. It also means that there has to be enough free memory to hold two instances of every subobject, but there really isn’t a safe way around allocating all the extra memory.
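The allocate-up, commit-down ordering can be traced with a small sketch. The three-level hierarchy (Root, Middle, Leaf), the int members, and the trace buffer are all hypothetical; the buffer is a fixed array so that tracing itself can never throw.

```cpp
#include <cassert>
#include <cstring>

const char* gLog[16];   // hypothetical trace buffer; trace() never throws
int gLogSize = 0;
void trace(const char* what) { if (gLogSize < 16) gLog[gLogSize++] = what; }

class Root {
public:
    explicit Root(int a) : fA(new int(a)) {}
    ~Root() { delete fA; }
    Root& operator=(const Root& that) {
        if (this != &that) {
            int* a = new int(*that.fA);   // nothing after this line can throw
            trace("allocate Root");
            delete fA;
            fA = a;
            trace("commit Root");
        }
        return *this;
    }
private:
    Root(const Root&);                    // copying not needed in this sketch
    int* fA;
};

class Middle : public Root {
public:
    Middle(int a, int b) : Root(a), fB(new int(b)) {}
    ~Middle() { delete fB; }
    Middle& operator=(const Middle& that) {
        if (this != &that) {
            int* b = 0;
            try {
                b = new int(*that.fB);
                trace("allocate Middle");
                Root::operator=(that);    // parent allocates, then commits
            }
            catch (...) { delete b; throw; }
            delete fB;
            fB = b;
            trace("commit Middle");
        }
        return *this;
    }
private:
    int* fB;
};

class Leaf : public Middle {
public:
    Leaf(int a, int b, int c) : Middle(a, b), fC(new int(c)) {}
    ~Leaf() { delete fC; }
    Leaf& operator=(const Leaf& that) {
        if (this != &that) {
            int* c = 0;
            try {
                c = new int(*that.fC);
                trace("allocate Leaf");
                Middle::operator=(that);
            }
            catch (...) { delete c; throw; }
            delete fC;
            fC = c;
            trace("commit Leaf");
        }
        return *this;
    }
private:
    int* fC;
};

void demoOrder() {
    Leaf x(1, 2, 3);
    Leaf y(4, 5, 6);
    gLogSize = 0;
    x = y;
    const char* expected[] = { "allocate Leaf", "allocate Middle", "allocate Root",
                               "commit Root", "commit Middle", "commit Leaf" };
    assert(gLogSize == 6);
    for (int i = 0; i < 6; ++i)
        assert(std::strcmp(gLog[i], expected[i]) == 0);
}
```

The allocations run from the most-derived class up to the root, and the commits run from the root back down as each call returns.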

By the way, I’ve also had several people question my assumption that the delete operations won’t throw exceptions. Technically they’re right, but I’d strongly counsel against letting this happen. I think it’s wise to declare "Destructors will not throw exceptions, nor will they allow exceptions thrown by functions they call to propagate out of them" to be a program invariant. The reason for this is that destructors are called in the course of handling exceptions. If exception-handling code can throw more exceptions, it’s extremely difficult, if not downright impossible, for everyone to properly clean up after himself, and extremely difficult for the program to completely recover from the error condition and go on. Therefore, throwing or propagating exceptions from within destructors is not a good idea.
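One way to honor that invariant is to catch everything inside the destructor. The sketch below assumes a hypothetical TConnection class whose close() step can fail, and a global counter standing in for whatever error logging a real program would use.

```cpp
#include <cassert>
#include <stdexcept>

int gCleanupFailures = 0;   // hypothetical counter standing in for real logging

// Hypothetical resource whose release step can fail.
class TConnection {
public:
    explicit TConnection(bool failOnClose) : fFailOnClose(failOnClose) {}
    ~TConnection() {
        try {
            close();
        }
        catch (...) {
            // Record the failure, but never let it propagate out of the destructor.
            ++gCleanupFailures;
        }
    }
    void close() {
        if (fFailOnClose)
            throw std::runtime_error("close failed");
    }
private:
    bool fFailOnClose;
};

void demoDestructorSwallows() {
    gCleanupFailures = 0;
    bool sawOriginal = false;
    try {
        TConnection c(true);
        throw std::logic_error("some other error");   // unwinding runs ~TConnection
    }
    catch (const std::logic_error&) {
        sawOriginal = true;       // the original exception arrives intact...
    }
    assert(sawOriginal);
    assert(gCleanupFailures == 1);   // ...and the cleanup failure was recorded
}
```

Because the destructor contains the failure, the exception already in flight reaches its handler undisturbed.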

The magic three

In my previous article, I singled out C++’s "magic three" functions (the default constructor, copy constructor, and default assignment operator) and said that one should always define them. This has raised a few hackles.

First, several people correctly pointed out that the default constructor is only defined by the compiler when you don’t create any other constructors. This is indeed true; I left this fact out for simplicity. In retrospect, I shouldn’t have.

Several people took exception to my statement that every class should define the "magic three." They were disturbed by the suggestion that every object should have a default constructor. They’re right. There are probably more objects for which it isn’t appropriate to have a default constructor than there are for which it is. Oftentimes, you can’t initialize an object to a meaningful state without some data being supplied from outside, or you can only do it by adding special-case code just to support a default constructor you don’t really need.

Occasionally, you even have a default constructor forced on you. Taligent’s CommonPoint system did this: its serialization facilities required a default constructor to work right, one of the bigger architectural gaffes in that system, in my opinion (of course, now I’ll get angry letters from ex-Taligent people explaining why it had to be that way).

I think what I really meant to say in the original article didn’t come through strongly enough: You should always declare the magic three functions. This way, you make an explicit statement that you are not accepting the default implementations of these functions. If a default constructor isn’t appropriate for your class, don’t write one just for the sake of writing one; declare it private and give it an empty implementation. But be sure you declare it. Same goes for the copy constructor and assignment operator.

A number of people also suggested an improvement to my original advice: "If you don’t want it, declare it private and give it an empty implementation." You actually don’t have to give an unwanted function an implementation at all. You can declare the function private and not define it. The declaration will suppress the compiler-generated version of the function, but not defining it saves you from having to supply dummy code that doesn’t actually do anything and will never get called. Furthermore, while declaring the function private will prevent outside classes from calling it, it won’t prevent the same class from calling it. If you don’t supply an implementation, the class will get a link error if it calls its own unwanted magic functions. This is somewhat nonintuitive to debug, but it’s better than having the compiler silently let the caller get away with calling a function nobody’s supposed to call.
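A minimal sketch of the declare-but-don’t-define idiom follows. TNoCopy is a hypothetical class, and the checks use the C++11 header <type_traits>, which postdates the original article; with a C++11 compiler, writing = delete on the unwanted functions expresses the same intent and turns the link error into a compile error.

```cpp
#include <cassert>
#include <type_traits>

class TNoCopy {
public:
    explicit TNoCopy(int id) : fId(id) {}
    int id() const { return fId; }
private:
    TNoCopy();                            // declared, never defined
    TNoCopy(const TNoCopy&);              // declared, never defined
    TNoCopy& operator=(const TNoCopy&);   // declared, never defined
    int fId;
};

// Outside the class, none of the suppressed functions is usable.
static_assert(!std::is_default_constructible<TNoCopy>::value, "no default ctor");
static_assert(!std::is_copy_constructible<TNoCopy>::value, "no copy ctor");
static_assert(!std::is_copy_assignable<TNoCopy>::value, "no copy assignment");

void demoNoCopy() {
    TNoCopy a(7);
    assert(a.id() == 7);
    // TNoCopy b(a);   // would not compile here: the copy constructor is private
    // Inside a member of TNoCopy the same line would compile but fail to link.
}
```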

I also had people take rather violent exception to my suggestion that one should always define the copy constructor and assignment operator, even when they really do what you want them to do. They pointed out that it’s a lot of wasted boilerplate code, which is ugly and a pain to maintain. Furthermore, it’s possible for the compiler to perform optimizations on the default functions that it might not be able (or willing) to perform on user-written code. Most importantly, if you add or delete members from the class, the default copy constructor and assignment operator pick up the changes automatically. If you define these functions yourself, you have to remember to maintain them when the class definition changes, or you’ll have compiler errors or runtime bugs.

This is all very true, but I’ll stand by my original advice just the same. Boilerplate copy constructors and assignment operators are ugly code and a hassle to maintain, but being in the habit of always writing the copy constructor and assignment operator also puts you in the habit of thinking about just what the correct copy behavior is for all the members of your class. If all the members are integers, this probably isn’t a big deal, but if they’re pointers, it’s a very big deal. Getting into the habit of accepting the defaults without taking the time to think about it can also lead to bugs down the road if you mistakenly accept the default when it doesn’t do the right thing.

And, of course, you have to rely on comments to explain that you know about the default and are failing to define these functions on purpose. I’m always a little uncomfortable with relying on documentation for things like that.

Virtual assignment

Finally, several people, including my own manager here at IBM, disputed my advice to make the assignment operator of a class non-virtual. Let’s take a closer look at this issue.

Consider the following simple example:

X* x;

void setX(const X& newX) {
    *x = newX;
}

This will work right, but only if X is a monomorphic class. But let’s say X is polymorphic. Pretend it has an inheritance hierarchy like this:

  X
 / \
Y   Z

That is…

class X {
    // ...
};

class Y : public X {
    // ...
};

class Z : public X {
    // ...
};

Now, if either x or newX refers to an object of class Y or Z, we’ll slice. Only the members defined in X will get copied. If x is an instance of Y or Z, the members defined by Y or Z won’t get filled in with new values. If newX is an instance of Y or Z, the members defined by Y or Z won’t get copied into x. Bad news.
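The slicing is easy to reproduce. In this sketch the members fA and fB and their accessors are hypothetical; the assignment itself goes through an X pointer and an X reference, so the compiler picks X’s (compiler-generated, non-virtual) assignment operator.

```cpp
#include <cassert>

class X {
public:
    explicit X(int a) : fA(a) {}
    virtual ~X() {}
    int a() const { return fA; }
private:
    int fA;
};

class Y : public X {
public:
    Y(int a, int b) : X(a), fB(b) {}
    int b() const { return fB; }
private:
    int fB;
};

void demoSlicing() {
    Y target(1, 2);
    Y source(10, 20);
    X* x = &target;
    const X& newX = source;
    *x = newX;                  // statically X::operator=, so only fA is copied
    assert(target.a() == 10);   // the X part was assigned
    assert(target.b() == 2);    // the Y part was silently left alone
}
```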

The problem here, of course, is that we’re calling X’s assignment operator even when x isn’t an instance of X. The obvious solution, therefore, would be to make X’s assignment operator virtual. Then the correct assignment operator would be called. If we do this, the assignment operators would look like this:

X& X::operator=(const X& that) {
    // copy X’s members...
    return *this;
}

X& Y::operator=(const X& that) {
    const Y& y = dynamic_cast<const Y&>(that);

    X::operator=(that);
    // copy Y’s members using y
    return *this;
}

X& Z::operator=(const X& that) {
    const Z& z = dynamic_cast<const Z&>(that);

    X::operator=(that);
    // copy Z’s members using z
    return *this;
}

Now, if x and newX are actually both instances of Y, Y’s assignment operator will get called and everybody will work right. Big improvement, right?

Well, consider the situation where x is a Y and newX is a Z. In this case, the dynamic_cast will fail, throwing a bad_cast exception. Now we have a problem.

The bad_cast exception is good, in a way, because it traps the mismatched classes and causes an error, rather than just slicing silently. But now we have an error condition we have to handle.
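Here is a compact runnable sketch of both outcomes, the matching case and the mismatched one. The data members and accessors are hypothetical, and Z is kept as small as possible since it only serves as the wrong-class source.

```cpp
#include <cassert>
#include <typeinfo>

class X {
public:
    explicit X(int a) : fA(a) {}
    virtual ~X() {}
    virtual X& operator=(const X& that) { fA = that.fA; return *this; }
    int a() const { return fA; }
private:
    int fA;
};

class Y : public X {
public:
    Y(int a, int b) : X(a), fB(b) {}
    virtual X& operator=(const X& that) {
        const Y& y = dynamic_cast<const Y&>(that);   // throws std::bad_cast on mismatch
        X::operator=(that);
        fB = y.fB;
        return *this;
    }
    int b() const { return fB; }
private:
    int fB;
};

class Z : public X {
public:
    explicit Z(int a) : X(a) {}
};

void demoMismatch() {
    Y y1(1, 2);
    Y y2(10, 20);
    X* x = &y1;
    const X& sameClass = y2;
    *x = sameClass;                       // dispatches to Y::operator=
    assert(y1.a() == 10 && y1.b() == 20);

    Z z(5);
    const X& otherClass = z;
    bool threw = false;
    try { *x = otherClass; }
    catch (const std::bad_cast&) { threw = true; }
    assert(threw);
    assert(y1.a() == 10 && y1.b() == 20); // the cast comes first, so nothing changed
}
```

Doing the cast before anything is modified is what keeps the failed assignment from leaving y1 half-updated.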

Remember that after an assignment succeeds, the objects on either side of the = are to be computationally equivalent. That is, all of their visible state and their behavior should be the same. This implies that they should be the same class. What you really want is for it to look like x morphed from whatever class it was to the same class newX is. X, Y, and Z’s assignment operators can’t do this; there’s no way to morph an existing object from one class to another (well, there kind of is, but we’ll get to it later). Instead, setX() has to deal with this:

void setX(const X& newX) {
    try {
        *x = newX;
    }
    catch (bad_cast&) {
        X* temp = newX.clone();
        delete x;
        x = temp;
    }
}

Remember clone()? This is the polymorphic copy constructor. If you need polymorphic copy on a group of related classes, you define a virtual function called clone() and every class in the tree overrides it to call his own copy constructor. You can’t just call X’s copy constructor for the same reason you can’t just call X’s assignment operator.
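For readers who haven’t seen it, a minimal sketch of the clone() idiom looks like this. The member fB is hypothetical, and the caller is assumed to own (and delete) the returned object.

```cpp
#include <cassert>
#include <typeinfo>

class X {
public:
    virtual ~X() {}
    virtual X* clone() const { return new X(*this); }   // calls X's copy constructor
};

class Y : public X {
public:
    explicit Y(int b = 0) : fB(b) {}
    virtual X* clone() const { return new Y(*this); }   // calls Y's copy constructor
    int b() const { return fB; }
private:
    int fB;
};

void demoClone() {
    Y original(42);
    const X& asBase = original;      // static type X, dynamic type Y
    X* copy = asBase.clone();
    assert(typeid(*copy) == typeid(Y));          // the copy is a full Y
    assert(static_cast<Y*>(copy)->b() == 42);    // with Y's state intact
    delete copy;
}
```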

Another alternative is that setX() doesn’t handle this condition, but some other class up the inheritance chain will have to, probably by doing the same thing we’re doing here: deleting the old X and creating a new one of the right class. (There might be other meaningful ways of handling the exception, but they’d be more application-specific.)

The other possibility is that nobody handles the exception. We could just declare "assignment operators shall always be called with like classes on either side of the equal sign" as a program invariant. In other words, we declare heterogeneous assignment to be a condition which Should Never Happen.

Violations of program invariants ("Should Never Happen" conditions) are programmer errors; they’re things you’re assuming you’ll never run into at runtime. An exception shouldn’t be thrown for a violated invariant; since you’re not expecting it to happen at runtime, you don’t want to waste time putting in lots of extra code to handle it; the program is just malformed. And if you throw an exception that nobody catches, this simply causes your program to terminate. Quietly. With no error messages.

If your program’s going to terminate, you want it to terminate loudly, proclaiming to the world that Something Went Wrong. The way you do this is with the assert() macro. You pass to assert() an expression you expect to always evaluate to true. If it evaluates to false, it prints an error message that usually contains the text of the offending expression and the line number of the assert, and then the program terminates. (You can also cause asserts to be compiled out in production versions of your program, which will cause them to fail silently instead.)

So then instead of the dynamic cast, you can do a static cast and precede it with an assert:

X& Y::operator=(const X& that) {
    assert(typeid(that) == typeid(*this));

    const Y& y = static_cast<const Y&>(that);
    X::operator=(that);
    // copy Y’s members using y
    return *this;
}

By the way, my original attempt at this was

assert(typeid(that) == typeid(Y));

You don’t want to do it this way, because then when Y::operator=() calls X::operator=(), X::operator=() will choke because that isn’t an instance of X. You’re not concerned that "that" is some particular static type; you’re concerned that "this" and "that" are the same type, whatever that type is.

So anyway, using the assert is one way around the heterogeneous-assignment problem, and it has a lot to recommend it, in situations where you really know that this invariant can hold.

But let’s go back to the previous answer for a minute and assume we’re going to catch the exception and finagle the assignment in setX(). To refresh our memory, setX() now looks like this:

void setX(const X& newX) {
    try {
        *x = newX;
    }
    catch (bad_cast&) {
        X* temp = newX.clone();
        delete x;
        x = temp;
    }
}

Let’s consider our possibilities here, ignoring Z for a moment. If x and newX are both instances of X or both instances of Y, we’re cool. If x is an instance of Y and newX is an instance of X, we’re also cool. Y::operator=() will throw a bad_cast exception, and we’ll catch it, delete x, and new up a fresh new X to assign to x.

But what if x is an instance of X and newX is an instance of Y? In this case, we’ll end up in X’s assignment operator, and the dynamic cast will succeed. Y is a subclass of X, so dynamically casting a reference to a Y to a reference to an X is legal. Every Y is also an X. But because we’re in X’s assignment operator, we’ll only copy over the members of newX that were defined in X. It’s our old friend slicing again.

What we’d have to do to avoid this is manually check for like class in each assignment operator and throw the bad_cast ourselves, rather than relying on dynamic_cast to do it for us.
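A sketch of that explicit check might look like the following. The members are hypothetical; the point is only that every operator, including the root’s, compares the dynamic types itself and throws std::bad_cast when they differ.

```cpp
#include <cassert>
#include <typeinfo>

class X {
public:
    explicit X(int a) : fA(a) {}
    virtual ~X() {}
    virtual X& operator=(const X& that) {
        if (typeid(that) != typeid(*this))
            throw std::bad_cast();        // explicit like-class check
        fA = that.fA;
        return *this;
    }
    int a() const { return fA; }
private:
    int fA;
};

class Y : public X {
public:
    Y(int a, int b) : X(a), fB(b) {}
    virtual X& operator=(const X& that) {
        if (typeid(that) != typeid(*this))
            throw std::bad_cast();
        const Y& y = static_cast<const Y&>(that);   // safe after the check
        X::operator=(that);
        fB = y.fB;
        return *this;
    }
    int b() const { return fB; }
private:
    int fB;
};

void demoExplicitCheck() {
    X target(1);
    Y source(10, 20);
    X* x = &target;
    bool threw = false;
    try { *x = source; }
    catch (const std::bad_cast&) { threw = true; }
    assert(threw);              // X = Y is now rejected instead of slicing
    assert(target.a() == 1);    // and the target is untouched
}
```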

Instead, my original solution to this problem was to avoid using the assignment operator in the first place:

void setX(const X& newX) {
    X* temp = newX.clone();
    delete x;
    x = temp;
}

I still like this. It’s simple and clear, and it works correctly with no extra hassles even when x and newX are instances of different classes. The other solution, with the try/catch blocks, has an advantage when deleting and newing the destination object is expensive and the different-classes case is relatively rare (the try costs nothing in most modern compilers, so you in effect fast-path the case of like classes, but an actual throw can be quite expensive, so you achieve this fast-path effect at the expense of the different-classes case).

If the fast-path option makes sense for your application, I’d suggest avoiding the exception and doing it yourself like this:

void setX(const X& newX) {
    if (typeid(*x) == typeid(newX))
        *x = newX;
    else {
        X* temp = newX.clone();
        delete x;
        x = temp;
    }
}
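Put together with clone() and a virtual assignment operator, the fast-path version can be sketched as below. The members are hypothetical, and the null check on x is an addition of this sketch (the listing above assumes x already points at something).

```cpp
#include <cassert>
#include <typeinfo>

class X {
public:
    explicit X(int a) : fA(a) {}
    virtual ~X() {}
    virtual X* clone() const { return new X(*this); }
    virtual X& operator=(const X& that) {
        assert(typeid(that) == typeid(*this));
        fA = that.fA;
        return *this;
    }
    int a() const { return fA; }
private:
    int fA;
};

class Y : public X {
public:
    Y(int a, int b) : X(a), fB(b) {}
    virtual X* clone() const { return new Y(*this); }
    virtual X& operator=(const X& that) {
        assert(typeid(that) == typeid(*this));
        const Y& y = static_cast<const Y&>(that);
        X::operator=(that);
        fB = y.fB;
        return *this;
    }
    int b() const { return fB; }
private:
    int fB;
};

X* x = 0;   // the owned object, as in the setX() listings

void setX(const X& newX) {
    if (x != 0 && typeid(*x) == typeid(newX))
        *x = newX;                 // same class: assign in place
    else {
        X* temp = newX.clone();    // different class (or nothing yet): replace
        delete x;
        x = temp;
    }
}

void demoSetX() {
    setX(Y(1, 2));
    X* first = x;
    setX(Y(10, 20));                           // like classes: same object, new state
    assert(x == first);
    assert(static_cast<Y*>(x)->b() == 20);
    setX(X(5));                                // unlike classes: object replaced
    assert(typeid(*x) == typeid(X));
    assert(x->a() == 5);
    delete x;
    x = 0;
}
```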

Now if you avoid using the assignment operator in situations where slicing may be a problem, we’re still left with the question of whether it makes more sense to make the assignment operator virtual or non-virtual. I’m tending now to come down on the side of making the assignment operator virtual with an assert to check for the different-classes condition (since there’s no way to handle that in the assignment operator itself and therefore the calling function already has to be aware of the possibility of polymorphism and handle it).

However, there’s another problem here. I remembered Taligent’s coding guidelines discouraging virtual assignment operators, so I went back to see why it recommended that. I wish I had done that before. It turns out Taligent’s guidelines weren’t hard and fast on the subject. Instead they point out that defining

virtual X& Y::operator=(const X& that);

won’t keep the compiler from defining

Y& Y::operator=(const Y& that);

In other words, an override of an inherited assignment operator doesn’t suppress the compiler-generated default assignment operator. You’d still have to do that manually by declaring it private and not giving it an implementation.

And actually, this won’t even work because C++’s overload resolution rules will cause the suppressed version to win in some types of call. For instance, consider a class like this:

class Y : public X {
    public:
        virtual X& operator=(const X& that);
        // other method definitions…
    private:
        Y& operator=(const Y& that);
};

Now consider this code snippet:

Y someY(/*arguments*/);
// do something with someY
Y someOtherY(/*arguments*/);
someY = someOtherY;

Since both someY and someOtherY are instances of Y, the overload resolution rules will declare the nonvirtual version of operator=() to be the "winner," instead of the inherited virtual operator=(). Since the nonvirtual operator=() is private, you’ll get an access error at compile time.
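The overload-resolution outcome can be checked without triggering the compile error, using the C++11 trait std::is_assignable (which postdates the original article). The member fA and the constructor arguments are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

class X {
public:
    explicit X(int a) : fA(a) {}
    virtual ~X() {}
    virtual X& operator=(const X& that) { fA = that.fA; return *this; }
    int a() const { return fA; }
private:
    int fA;
};

class Y : public X {
public:
    explicit Y(int a) : X(a) {}
    virtual X& operator=(const X& that) { return X::operator=(that); }
private:
    Y& operator=(const Y& that);   // suppressed: declared private, never defined
};

// Assigning from something typed as X picks the public virtual operator...
static_assert(std::is_assignable<Y&, const X&>::value, "Y = X compiles");
// ...but assigning Y to Y picks the private operator, so it is rejected.
static_assert(!std::is_assignable<Y&, const Y&>::value, "Y = Y does not compile");

void demoOverload() {
    Y someY(1);
    Y someOtherY(2);
    // someY = someOtherY;                       // error: Y::operator=(const Y&) is private
    someY = static_cast<const X&>(someOtherY);   // fine: calls the virtual operator
    assert(someY.a() == 2);
}
```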

Instead, you’d have to define the default assignment operator to call the virtual one. In every class that inherits the virtual one. Of course, this means defining it non-virtual. To see why, imagine if Y in the above example had a subclass called Z. If Y’s operator=() was virtual, Z would have to override it, then it would have to override X’s operator=(), and then it would have to replace its own default assignment operator. Cutting any corners here risks creating situations where the "winning" function, according to the overload-resolution rules, is a function that is not accessible or isn’t implemented. Clearly, this gets ridiculous quickly as the inheritance hierarchy gets deeper.

One side effect in either case is that you have to define an override of the virtual operator=() even when you don’t strictly need one; otherwise, the "default" one will hide the virtual one.

So there you go. A truly foolproof method of handling polymorphism in assignment operators involves declaring both a virtual and a non-virtual assignment operator in every class (except the root class of each inheritance hierarchy), with the non-virtual calling the virtual and the virtual asserting that both objects involved are the same class. Any time a calling function couldn’t guarantee the invariant would hold, it would have to avoid using the assignment operator and manually delete the object referenced by the target variable and new up a new one of the proper type.
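As a rough sketch of what that looks like in practice, here is a three-level hierarchy in which every non-root class declares both operators. The data members are hypothetical; the structure (non-virtual forwards to virtual, virtual asserts like classes) follows the description above.

```cpp
#include <cassert>
#include <typeinfo>

class X {
public:
    explicit X(int a) : fA(a) {}
    virtual ~X() {}
    // Root class: its ordinary assignment operator is the virtual one.
    virtual X& operator=(const X& that) {
        assert(typeid(that) == typeid(*this));
        fA = that.fA;
        return *this;
    }
    int a() const { return fA; }
private:
    int fA;
};

class Y : public X {
public:
    Y(int a, int b) : X(a), fB(b) {}
    // Override of the inherited virtual operator: does the real work.
    virtual X& operator=(const X& that) {
        assert(typeid(that) == typeid(*this));
        const Y& y = static_cast<const Y&>(that);
        X::operator=(that);
        fB = y.fB;
        return *this;
    }
    // Non-virtual replacement for the compiler-generated operator: forwards.
    Y& operator=(const Y& that) {
        operator=(static_cast<const X&>(that));   // virtual call
        return *this;
    }
    int b() const { return fB; }
private:
    int fB;
};

class Z : public Y {
public:
    Z(int a, int b, int c) : Y(a, b), fC(c) {}
    virtual X& operator=(const X& that) {
        assert(typeid(that) == typeid(*this));
        const Z& z = static_cast<const Z&>(that);
        Y::operator=(that);                       // Y's override, called non-virtually
        fC = z.fC;
        return *this;
    }
    Z& operator=(const Z& that) {
        operator=(static_cast<const X&>(that));   // virtual call
        return *this;
    }
    int c() const { return fC; }
private:
    int fC;
};

void demoBothOperators() {
    Z z1(1, 2, 3);
    Z z2(4, 5, 6);
    z1 = z2;                          // non-virtual Z operator, then Z's override
    assert(z1.a() == 4 && z1.b() == 5 && z1.c() == 6);

    Z z3(7, 8, 9);
    Y& asY = z1;
    asY = static_cast<const Y&>(z3);  // Y's non-virtual operator still reaches Z's override
    assert(z1.a() == 7 && z1.b() == 8 && z1.c() == 9);
}
```

The second assignment is the interesting one: statically it is a Y-to-Y assignment, but because the non-virtual operator forwards to the virtual one, the Z members are copied too.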

Beautiful, huh?

Other ways of morphing

Before I wrap this up, one more thing: I alluded earlier to the idea that there are ways of making an object look like it’s morphed from one class to another. There are two ways to do this, neither of which is really all that much of a winner.

One option is not to change the class of the object on the left-hand side. It’s perfectly reasonable to define assignment operators that take different types on the left and right-hand side. The operator in this case performs some kind of meaningful conversion of the incoming data as part of the assignment process. The result isn’t really a copy, but it may produce completely appropriate results. This solution is definitely the right way to go for some classes in some applications, but it’s not a general solution. Be sure to consider whether it’s appropriate for your classes before going to all the trouble above.

The other option is to fake inheritance using containment. In this case, the objects on the left and right-hand sides of the equal sign are the same class, but they behave like members of different classes because they own objects of different classes. The simplest version of this idea is a smart pointer that knows about polymorphism for a certain group of classes and does the right thing. All you’re really doing here is encapsulating in this object’s assignment operator the delete/new code you’d otherwise have to put in client code, but hiding junk like this in a smart-pointer class is very often a useful and effective way to go. (This is the essence of the State pattern, by the way.)
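A bare-bones sketch of such a smart pointer follows. TXHandle is a hypothetical name, and the sketch assumes the pointed-to classes provide clone() as described earlier.

```cpp
#include <cassert>
#include <typeinfo>

class X {
public:
    virtual ~X() {}
    virtual X* clone() const { return new X(*this); }
};

class Y : public X {
public:
    virtual X* clone() const { return new Y(*this); }
};

class Z : public X {
public:
    virtual X* clone() const { return new Z(*this); }
};

// Hypothetical owning handle: always the same class from the outside, but
// it behaves like whatever class of object it currently owns.
class TXHandle {
public:
    explicit TXHandle(const X& x) : fX(x.clone()) {}
    TXHandle(const TXHandle& that) : fX(that.fX->clone()) {}
    ~TXHandle() { delete fX; }
    TXHandle& operator=(const TXHandle& that) {
        if (this != &that) {
            X* temp = that.fX->clone();   // may throw; nothing changed yet
            delete fX;
            fX = temp;
        }
        return *this;
    }
    const X& get() const { return *fX; }
private:
    X* fX;
};

void demoHandle() {
    Y y;
    Z z;
    TXHandle a(y);
    TXHandle b(z);
    assert(typeid(a.get()) == typeid(Y));
    a = b;                                    // a appears to morph from Y to Z
    assert(typeid(a.get()) == typeid(Z));
    assert(typeid(b.get()) == typeid(Z));     // the source is unchanged
}
```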

Conclusion

I don’t know about you, but there’s something really scary to me about a language where copying state from one object to another is this complicated. By now, I suspect at least a dozen or two programmers have contributed something new to this discussion. If it takes this many programmers to write a simple assignment operator, think how complicated writing code that actually does something meaningful must be!

The devil truly is in the details, especially in C++ programming.

I’m sure there are still other issues, both with the original article and this one, that I’ve missed. I’ve learned a lot about this, and I’m interested in continuing to learn. Keep those cards and letters coming!