Not A Grammar: C++ Assignment Operators Considered Optional

Summary

Conventional C++11 wisdom states that if a class needs a custom copy-constructor, it probably needs a custom copy-assignment-operator as well. The same goes for the move-constructor and move-assignment-operator. This is widely known as the rule-of-3/5/0. The rule is onerous, as you are effectively required to implement the same logic twice.

As it turns out, any class with a noexcept move-constructor is assignable in terms of its constructors alone, via a function I dub auto_assign<T>. The impact is not limited to simplifying class design (though that cannot be overstated). Reference-members and const-members of a class can only be set by class-constructors, which precludes such classes from full use in container-types like std::vector. auto_assign-aware containers and libraries will afford a vast new degree of flexibility, with literally no down-side.

Why do we need this?

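Quite simply, there are things that just don't work in C++ today. For example, you cannot fully use this class in a std::vector: any operation that assigns elements, such as erase() or insert() in the middle, fails to compile.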

struct ConstructOnly
{
    int& value;

    ConstructOnly(int& value_)
        : value(value_)
    {}
};

Seriously, you have to change that int& value to a pointer int* pValue just to make it copy-assignable and move-assignable. Why? Because reference member-variables can only be bound by the constructor. You'll have similar grief with const members.
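To make the complaint concrete, here is a small sketch using the standard type traits. The helper name sum_after_append is mine, purely for illustration. It shows that the constructors are all there, that both assignment operators are missing, and that std::vector copes only as long as it never has to assign an element.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

struct ConstructOnly
{
    int& value;

    ConstructOnly(int& value_)
        : value(value_)
    {}
};

// Constructors are fine: the reference is bound when the object is created.
static_assert(std::is_copy_constructible<ConstructOnly>::value, "copy-constructible");
static_assert(std::is_nothrow_move_constructible<ConstructOnly>::value, "noexcept move-constructible");

// The reference member makes both implicit assignment operators deleted.
static_assert(!std::is_copy_assignable<ConstructOnly>::value, "not copy-assignable");
static_assert(!std::is_move_assignable<ConstructOnly>::value, "not move-assignable");

// Hypothetical helper: appends two elements and sums what they refer to.
inline int sum_after_append()
{
    int a = 1, b = 2;
    std::vector<ConstructOnly> v;
    v.push_back(ConstructOnly(a));  // OK: appending only needs constructors
    v.emplace_back(b);              // OK: constructs in place from int&
    // v.erase(v.begin());          // would not compile: erase shifts elements with operator=
    assert(v.size() == 2);
    return v[0].value + v[1].value;
}
```

So the class is not banned from std::vector outright. It is the operations that shuffle existing elements around that are off the table.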

And of course, who doesn't like doing half the work? We can centralize all our copying and moving logic into constructors, rather than worrying about how to factor the code between constructors and operator=.

The Basic Idea

The basic idea behind auto_assign is that C++ allows us to destruct, then construct, any object in place. Let's look at it for a concrete Foo, and copy-construction, to simplify matters.

void Usage()
{
    Foo lhs;                // assume succeeded
    Foo rhs = ...;          // assume succeeded
    foo_assign(lhs, rhs);
}                           // <-- lhs.~Foo() invoked by compiler

We can naively define foo_assign as follows:

Foo& foo_assign(Foo& lhs, const Foo& rhs)
{
    lhs.~Foo();             // <-- explicit destructor call
    new (&lhs) Foo(rhs);    // <-- placement-new into lhs, with copy-constructor
    return lhs;
}

Let's call this version "destruct-then-construct". The big problem here is that foo_assign is not exception-safe ... unless Foo's copy-constructor is noexcept. If Foo's copy-constructor throws an exception, then we won't have a valid lhs object at the closing brace of Usage(), resulting in two destructor calls in a row on the same object. And that is undefined behavior.

noexcept move-constructors come to the rescue! We can rewrite foo_assign to be both correct and exception-safe:

Foo& foo_assign(Foo& lhs, const Foo& rhs)
{
    Foo tmp(rhs);           // <-- copy-construct into temporary [may throw]
    lhs.~Foo();             // <-- explicit destructor call
    new (&lhs) Foo(std::move(tmp));  // <-- placement-new into lhs, with noexcept move-constructor
    return lhs;
}

Let's call this version "copy-destruct-move", which is a flavor of the well-known "copy-and-swap idiom". It is slightly less efficient than destruct-then-construct, since an extra constructor and destructor are invoked, so we should only use it when the copy-constructor is noexcept(false).
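Here is a self-contained sketch of copy-destruct-move in action. The Foo below is a made-up type of mine (a string plus a flag that makes its copy-constructor throw on demand), and demo_copy_destruct_move is just an illustrative driver. The point is to show that a failed copy leaves lhs untouched.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical Foo: copying can be made to throw, moving never throws.
struct Foo
{
    std::string name;
    bool throw_on_copy;

    explicit Foo(std::string n, bool t = false)
        : name(std::move(n)), throw_on_copy(t)
    {}

    Foo(const Foo& other)
        : name(other.name), throw_on_copy(other.throw_on_copy)
    {
        if (other.throw_on_copy) throw std::runtime_error("copy failed");
    }

    Foo(Foo&& other) noexcept
        : name(std::move(other.name)), throw_on_copy(other.throw_on_copy)
    {}

    Foo& operator=(const Foo&) = delete;   // construct-only, on purpose
};

Foo& foo_assign(Foo& lhs, const Foo& rhs)
{
    Foo tmp(rhs);                            // may throw; lhs is untouched so far
    lhs.~Foo();                              // explicit destructor call
    return *new (&lhs) Foo(std::move(tmp));  // noexcept, so lhs is always rebuilt
}

// Illustrative driver: one successful assignment, one that throws.
inline bool demo_copy_destruct_move()
{
    Foo lhs("old");
    Foo good("new");
    foo_assign(lhs, good);
    bool assigned = (lhs.name == "new");

    Foo bad("bad", true);
    bool kept = false;
    try {
        foo_assign(lhs, bad);                // the copy into tmp throws
    } catch (const std::runtime_error&) {
        kept = (lhs.name == "new");          // lhs still holds its previous value
    }
    return assigned && kept;
}
```

A nice side effect of copying first: foo_assign(lhs, lhs) is harmless here, because the temporary is made before lhs is destroyed.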

Definition of auto_assign<T>

The general purpose auto_assign<T> must handle the following cases:

1. Copy, where type T is assignable via operator=.
2. Move, where type T is assignable via operator=.
3. Copy, where type T is not assignable via operator=.
4. Move, where type T is not assignable via operator=.

We want to handle these cases as follows, keeping "the basic idea" in mind:

1. lhs = rhs; // use operator if it exists
2. lhs = std::move(rhs); // use operator if it exists
3. If the copy-constructor is noexcept, destruct-then-construct. Otherwise, use copy-destruct-move.
4. Destruct-then-construct.
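To give a feel for the dispatch, here is a rough C++11 sketch of the four cases using std::enable_if. It is not the real implementation: the namespace sketch, the demo types, the self-assignment guards and the choice to return T& are all my assumptions for illustration, and I have left out the noexcept specifiers entirely.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {

// Case 1: copy, operator= exists.
template <class T>
typename std::enable_if<std::is_copy_assignable<T>::value, T&>::type
auto_assign(T& lhs, const T& rhs) { lhs = rhs; return lhs; }

// Case 2: move, operator= exists.
template <class T>
typename std::enable_if<std::is_move_assignable<T>::value, T&>::type
auto_assign(T& lhs, T&& rhs) { lhs = std::move(rhs); return lhs; }

// Case 3, noexcept copy-constructor: destruct-then-construct.
template <class T>
typename std::enable_if<!std::is_copy_assignable<T>::value &&
                        std::is_nothrow_copy_constructible<T>::value, T&>::type
auto_assign(T& lhs, const T& rhs)
{
    if (std::addressof(lhs) == std::addressof(rhs)) return lhs;  // guard: my addition
    lhs.~T();
    return *::new (static_cast<void*>(std::addressof(lhs))) T(rhs);
}

// Case 3, copy-constructor may throw: copy-destruct-move.
template <class T>
typename std::enable_if<!std::is_copy_assignable<T>::value &&
                        !std::is_nothrow_copy_constructible<T>::value, T&>::type
auto_assign(T& lhs, const T& rhs)
{
    static_assert(std::is_nothrow_move_constructible<T>::value,
                  "needs a noexcept move-constructor");
    T tmp(rhs);                                                   // may throw
    lhs.~T();
    return *::new (static_cast<void*>(std::addressof(lhs))) T(std::move(tmp));
}

// Case 4: move, no operator=: destruct-then-construct with the move-constructor.
template <class T>
typename std::enable_if<!std::is_move_assignable<T>::value, T&>::type
auto_assign(T& lhs, T&& rhs)
{
    static_assert(std::is_nothrow_move_constructible<T>::value,
                  "needs a noexcept move-constructor");
    if (std::addressof(lhs) == std::addressof(rhs)) return lhs;  // guard: my addition
    lhs.~T();
    return *::new (static_cast<void*>(std::addressof(lhs))) T(std::move(rhs));
}

} // namespace sketch

// Demo types (hypothetical). RefHolder has a noexcept copy-constructor,
// NoAssign has a copy-constructor that may throw (std::string allocates).
struct RefHolder { int& value; RefHolder(int& v) : value(v) {} };

struct NoAssign
{
    std::string s;
    explicit NoAssign(std::string s_) : s(std::move(s_)) {}
    NoAssign(const NoAssign&) = default;
    NoAssign(NoAssign&&) noexcept = default;
    NoAssign& operator=(const NoAssign&) = delete;
};

inline bool demo_auto_assign()
{
    int a = 1, b = 2;
    sketch::auto_assign(a, b);                                // case 1, a == 2
    int c = 0;
    sketch::auto_assign(c, 7);                                // case 2, c == 7

    int i = 10, j = 20;
    RefHolder p(i), q(j);
    RefHolder& rp = sketch::auto_assign(p, q);                // case 3, noexcept copy

    NoAssign x("x"), y("y"), z("z");
    NoAssign& rx = sketch::auto_assign(x, y);                 // case 3, copy may throw
    NoAssign& ry = sketch::auto_assign(y, std::move(z));      // case 4
    return a == 2 && c == 7 && &rp.value == &j && rx.s == "y" && ry.s == "z";
}
```

The move overloads take T&& next to a T& parameter, so passing an lvalue as rhs makes deduction fail for them and the call falls through to the copy overloads, which is the behavior we want.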

A full working version is available as part of my toy project here. Getting the noexcept specifiers and template deduction right was tricky, but ultimately the code is pretty straightforward.

auto_assign<T>-aware swap()

This one's easy -- just replace "=" calls with auto_assign(). As a bonus, we'll also delegate to a member-swap function if it exists.

template <class T>
void swap(T& lhs, T& rhs, typename std::enable_if<!has_swap<T>::value>::type* = nullptr)
{
    T tmp(std::move(lhs));
    auto_assign(lhs, std::move(rhs));
    auto_assign(rhs, std::move(tmp));
}

template <class T>
void swap(T& lhs, T& rhs, typename std::enable_if<has_swap<T>::value>::type* = nullptr)
{
    lhs.swap(rhs);
}
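If you want to try the swap above in isolation, here is a self-contained sketch. The has_swap trait and the move-only auto_assign below are my own minimal stand-ins, assumed for illustration and not the versions from the project, and everything lives in a namespace sketch to stay out of the way of std::swap.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>

namespace sketch {

// Assumed stand-in for has_swap<T>: true when t.swap(t) compiles.
template <class T>
struct has_swap
{
private:
    template <class U>
    static auto test(int)
        -> decltype(std::declval<U&>().swap(std::declval<U&>()), std::true_type());
    template <class>
    static std::false_type test(...);
public:
    static const bool value = decltype(test<T>(0))::value;
};

// Assumed minimal auto_assign: move case only, always destruct-then-construct.
template <class T>
T& auto_assign(T& lhs, T&& rhs)
{
    static_assert(std::is_nothrow_move_constructible<T>::value,
                  "needs a noexcept move-constructor");
    if (std::addressof(lhs) == std::addressof(rhs)) return lhs;
    lhs.~T();
    return *::new (static_cast<void*>(std::addressof(lhs))) T(std::move(rhs));
}

template <class T>
void swap(T& lhs, T& rhs, typename std::enable_if<!has_swap<T>::value>::type* = nullptr)
{
    T tmp(std::move(lhs));
    auto_assign(lhs, std::move(rhs));
    auto_assign(rhs, std::move(tmp));
}

template <class T>
void swap(T& lhs, T& rhs, typename std::enable_if<has_swap<T>::value>::type* = nullptr)
{
    lhs.swap(rhs);
}

} // namespace sketch

struct ConstructOnly
{
    int& value;
    ConstructOnly(int& value_) : value(value_) {}
};

// Hypothetical type with a member swap, to exercise the second overload.
struct WithSwap
{
    int v;
    void swap(WithSwap& other) { int t = v; v = other.v; other.v = t; }
};

static_assert(!sketch::has_swap<ConstructOnly>::value, "no member swap");
static_assert(sketch::has_swap<WithSwap>::value, "member swap detected");

inline bool demo_swap()
{
    int x = 1, y = 2;
    ConstructOnly a(x), b(y);
    sketch::swap(a, b);          // rebinds the references via the constructors
    // Note: reading a and b by name after in-place reconstruction is guaranteed
    // by the standard from C++20 on for types with reference or const members;
    // earlier standards wanted std::launder or the pointer from placement-new.

    WithSwap p{10}, q{20};
    sketch::swap(p, q);          // delegates to WithSwap::swap
    return &a.value == &y && &b.value == &x && p.v == 20 && q.v == 10;
}
```

After the call, a refers to y and b refers to x, which is something operator= could never have done for a reference member.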

auto_assign<T>-aware containers

The real power will come when containers support construct-only types. This is no trivial exercise. I hope to have some of these working "soon".

When is it safe to use auto_assign<T>?

It's safe to use in any context where you would legitimately use operator=. For example, any time you have a fully-formed object that you wish to copy or move. If you have a virtual operator= on your class, it'll work just fine too.

As an example of what not to do: Do not use auto_assign to implement your own operator=. When implementing operator=, follow conventional wisdom.

In an auto_assign<T>-aware world, should I ever implement operator= again?

IMO, the answer for most classes is "no, you do not need a custom operator=, but you should spend your effort on writing noexcept move-constructors instead".

operator= ends up becoming a mere optimization, to be applied where needed. Performance-wise:

auto_assign-moves are equal in the number of implied destructor and constructor calls.
auto_assign-copies invoke one additional move-constructor and trivial-destructor.
BUT the compiler may have certain in-built optimizations for operator= that it cannot apply with destruct-then-construct sequences. This one's hard to quantify.

Concluding Remarks

Without wide-ranging library support, it's too early to start switching all your code over. But, if this catches on with boost and/or STL, we'll be living in a brave new world of C++.

Tuesday, January 12, 2016