This time we will discuss virtual inheritance in C++ and find out why one should be very careful using it.
Initialization of Virtual Base Classes
First let's find out how classes are allocated in memory without virtual inheritance. Have a look at this code fragment:
    class Base { ... };
    class X : public Base { ... };
    class Y : public Base { ... };
    class XY : public X, public Y { ... };
It's pretty clear: the members of the non-virtual base class Base are laid out like ordinary data members of the derived class. As a result, the XY object contains two independent Base sub-objects, one inside X and one inside Y. The diagram below illustrates this:
Figure 1. Multiple non-virtual inheritance.
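The duplication can be observed directly. The following is a minimal sketch, assuming a Base with a single int member (the original class bodies are elided); the helper name hasTwoBaseSubobjects is made up for illustration.

```cpp
#include <cassert>

struct Base { int value = 0; };   // assumed contents, the article elides them
struct X : Base {};
struct Y : Base {};
struct XY : X, Y {};

// Returns true when the Base reached through X and the Base reached
// through Y are two distinct sub-objects holding independent values.
bool hasTwoBaseSubobjects() {
    XY obj;
    Base *viaX = static_cast<X *>(&obj);   // Base inside the X part
    Base *viaY = static_cast<Y *>(&obj);   // Base inside the Y part
    viaX->value = 1;
    viaY->value = 2;
    return viaX != viaY && viaX->value == 1 && viaY->value == 2;
}
```

Note that a direct conversion such as `Base *p = &obj;` would not compile here, because the compiler cannot tell which of the two Base sub-objects is meant.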
With virtual inheritance, the object of a virtual base class is included in the object of a derived class only once. Figure 2 shows the structure of the XY object for the code fragment below.
    class Base { ... };
    class X : public virtual Base { ... };
    class Y : public virtual Base { ... };
    class XY : public X, public Y { ... };
Figure 2. Multiple virtual inheritance.
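A small sketch of the same check with virtual inheritance, again assuming a Base with one int member; hasSingleBaseSubobject is a hypothetical helper name.

```cpp
#include <cassert>

struct Base { int value = 0; };   // assumed contents, the article elides them
struct X : virtual Base {};
struct Y : virtual Base {};
struct XY : X, Y {};

// Returns true when every path from XY to Base ends at the same sub-object.
bool hasSingleBaseSubobject() {
    XY obj;
    Base *viaX = static_cast<X *>(&obj);
    Base *viaY = static_cast<Y *>(&obj);
    Base *direct = &obj;          // unambiguous now: there is only one Base
    viaX->value = 42;             // a write through one path...
    return viaX == viaY && viaY == direct && obj.value == 42;  // ...is seen by all
}
```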
Memory for the shared Base sub-object is most likely to be allocated at the end of the XY object. The exact layout depends on the compiler. For example, the classes X and Y may store pointers to the shared Base object. But as far as I understand, this practice is no longer common: the location of the shared sub-object is usually found through an offset or through information stored in the virtual function table.
The "most derived" class XY alone knows where exactly a sub-object of the virtual base class Base is to be allocated. That's why it is the most derived class which is responsible for initializing all the sub-objects of virtual base classes.
XY constructors initialize the Base sub-object and pointers to it in X and Y. After that, all the remaining members of the classes X, Y and XY are initialized.
Once the XY constructor has initialized the Base sub-object, the X and Y constructors are not allowed to re-initialize it. How exactly this is enforced depends on the compiler. For example, it can pass a special additional argument to the X and Y constructors to tell them not to initialize Base.
Now for the most interesting part, which causes much confusion and a lot of mistakes. Have a look at the following constructors:
    X::X(int A) : Base(A) {}
    Y::Y(int A) : Base(A) {}
    XY::XY() : X(3), Y(6) {}
What number will the base class's constructor take as an argument - 3 or 6? Neither!
The XY constructor does initialize the virtual sub-object Base, but it does so implicitly: since XY does not mention Base in its initializer list, the default constructor of Base is called.
When the XY constructor then calls the X and Y constructors, they do not re-initialize Base. Their Base(A) initializers are simply ignored, which is why the Base constructor never receives either argument.
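The effect is easy to reproduce. The sketch below assumes a Base that stores its constructor argument and uses -1 in the default constructor as a marker; the XYFixed class is a hypothetical variant showing one way to get the intended value, by having the most derived class name Base in its own initializer list.

```cpp
#include <cassert>

struct Base {
    int value;
    Base() : value(-1) {}                 // marker: "no argument was passed"
    explicit Base(int a) : value(a) {}
};

struct X : virtual Base {
    explicit X(int a) : Base(a) {}        // used only when X is the most derived class
};

struct Y : virtual Base {
    explicit Y(int a) : Base(a) {}        // used only when Y is the most derived class
};

// As in the article: Base is initialized implicitly by its default constructor.
struct XY : X, Y {
    XY() : X(3), Y(6) {}
};

// Hypothetical fix: the most derived class initializes the virtual base itself.
struct XYFixed : X, Y {
    XYFixed() : Base(3), X(3), Y(6) {}
};
```

With these definitions, a default-constructed XY ends up with value equal to -1, an XYFixed with 3, and a standalone X(5) with 5, because in the last case X itself is the most derived class.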
The trouble with virtual base classes doesn't end here. Besides constructors, there are also assignment operators. The standard leaves it unspecified whether a compiler-generated assignment operator assigns a virtual base class sub-object once or several times. So you simply don't know how many times the Base object will be copied.
If you implement your own assignment operator, make sure you have prevented multiple copying of the Base object. The following code fragment is incorrect:
    XY &XY::operator =(const XY &src)
    {
      if (this != &src)
      {
        X::operator =(src);   // copies Base
        Y::operator =(src);   // copies Base again
        ....
      }
      return *this;
    }
This code copies the Base object twice. To avoid that, we should add special functions to the X and Y classes that copy only their own members and leave the Base members alone. The contents of Base are then copied exactly once, in the XY assignment operator itself. This is the fixed code:
    XY &XY::operator =(const XY &src)
    {
      if (this != &src)
      {
        Base::operator =(src);    // the shared Base is copied once
        X::PartialAssign(src);    // X's own members only
        Y::PartialAssign(src);    // Y's own members only
        ....
      }
      return *this;
    }
This code will work well, but it still doesn't look nice and clear. That's the reason why programmers are recommended to avoid multiple virtual inheritance.
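To make the single copy visible, here is a self-contained sketch of the PartialAssign approach. The data members and the assignCount counter are assumptions added purely for demonstration; the article does not show the class bodies.

```cpp
#include <cassert>

struct Base {
    int baseData = 0;
    int assignCount = 0;          // demonstration only: counts Base assignments
    Base &operator=(const Base &src) {
        baseData = src.baseData;
        ++assignCount;
        return *this;
    }
};

struct X : virtual Base {
    int xData = 0;
    // Copies X's own members only and does not touch Base.
    void PartialAssign(const X &src) { xData = src.xData; }
    X &operator=(const X &src) {
        Base::operator=(src);
        PartialAssign(src);
        return *this;
    }
};

struct Y : virtual Base {
    int yData = 0;
    // Copies Y's own members only and does not touch Base.
    void PartialAssign(const Y &src) { yData = src.yData; }
    Y &operator=(const Y &src) {
        Base::operator=(src);
        PartialAssign(src);
        return *this;
    }
};

struct XY : X, Y {
    int xyData = 0;
    XY &operator=(const XY &src) {
        if (this != &src) {
            Base::operator=(src);     // the shared Base is copied exactly once
            X::PartialAssign(src);
            Y::PartialAssign(src);
            xyData = src.xyData;
        }
        return *this;
    }
};
```

After assigning one XY to another, assignCount in the target is 1. Calling X::operator= and Y::operator= instead would raise it to 2.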
Virtual Base Classes and Type Conversion
Because of the specifics of how virtual base classes are allocated in memory, you can't perform type conversions like this one:
    Base *b = Get();
    XY *q = static_cast<XY *>(b); // Compilation error
    XY *w = (XY *)(b);            // Compilation error
A persistent programmer, though, will achieve that by employing the operator reinterpret_cast:
    XY *e = reinterpret_cast<XY *>(b);
However, the result will hardly be of any use. The address of the beginning of the Base object will be interpreted as a beginning of the XY object, which is quite a different thing. See Figure 3 for details.
The only correct way to perform this conversion is to use the dynamic_cast operator, which in turn requires Base to be polymorphic, that is, to have at least one virtual function. But using dynamic_cast too often is a code smell.
Figure 3. Type conversion.
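A minimal sketch of the dynamic_cast route follows. It assumes Base has a virtual destructor so that it is polymorphic; the helper toXY and the tag member are hypothetical names used only for illustration.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() = default;    // makes Base polymorphic, as dynamic_cast requires
};
struct X : virtual Base {};
struct Y : virtual Base {};
struct XY : X, Y { int tag = 7; };

// Returns the XY object that b belongs to, or nullptr if b is not part of an XY.
XY *toXY(Base *b) {
    return dynamic_cast<XY *>(b);
}
```

The cast is resolved at run time, so it finds the real start of the XY object regardless of where the compiler placed the Base sub-object, and it yields nullptr when the pointer refers to the Base of some other class, such as a standalone X.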
Should We Abandon Virtual Inheritance?
I agree with the many authors who say that virtual inheritance should be avoided wherever possible, as should ordinary multiple inheritance.
Virtual inheritance causes troubles with object initialization and copying. Since it is the "most derived" class which is responsible for these operations, it has to be familiar with all the intimate details of the structure of base classes. Due to this, a more complex dependency appears between the classes, which complicates the project structure and forces you to make some additional revisions in all those classes during refactoring. All this leads to new bugs and makes the code less readable.
Troubles with type conversions may also be a source of bugs. You can partly solve them with the dynamic_cast operator, but it is comparatively slow, and if you have to use it often, your project's architecture is probably poor. A project structure can almost always be implemented without multiple inheritance. After all, many other languages have no such exotic features, and that doesn't prevent programmers from developing large and complex projects in them.
We cannot insist on total refusal of virtual inheritance: it may be useful and convenient at times. But always think twice before making a heap of complex classes. Growing a forest of small classes with shallow hierarchy is better than handling a few huge trees. For example, multiple inheritance can be in most cases replaced by object composition.
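As a rough illustration of the composition alternative, the sketch below rebuilds XY from member objects instead of base classes. XPart, YPart, and their member functions are hypothetical names, and the behaviour is invented just to have something to call.

```cpp
#include <cassert>

struct Base {
    int value;
    explicit Base(int a) : value(a) {}
};

// The former X and Y become plain helper classes that work on a Base
// handed to them, instead of inheriting from it.
struct XPart {
    int doubled(const Base &b) const { return b.value * 2; }
};
struct YPart {
    int negated(const Base &b) const { return -b.value; }
};

// XY owns exactly one Base and one of each part; no inheritance is involved.
class XY {
public:
    explicit XY(int a) : base_(a) {}
    int doubled() const { return x_.doubled(base_); }
    int negated() const { return y_.negated(base_); }
private:
    Base base_;
    XPart x_;
    YPart y_;
};
```

Here the single Base is initialized in one obvious place, and the compiler-generated copy operations copy each member exactly once.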
Good Sides of Multiple Inheritance
OK, we now understand and agree with the criticism of multiple virtual inheritance and multiple inheritance as such. But are there cases when it can be safe and convenient to use?
Yes, I can name at least one: mix-ins. If you don't know what they are, see the book "Enough Rope to Shoot Yourself in the Foot" [3].
A mix-in class doesn't contain any data. All its functions are usually pure virtual. It has no constructor, and even if it has one, the constructor doesn't do anything. This means that no trouble will occur when creating or copying objects of these classes.
If a base class is a mix-in class, assignment is harmless. Even if its sub-object is assigned many times, it doesn't matter: there is no data to copy, so the compiler is likely to remove these assignments altogether.
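A short sketch of the idea, with hypothetical mix-in interfaces Printable and Comparable and a hypothetical class Number that combines them:

```cpp
#include <cassert>
#include <string>

// Mix-in interfaces: no data members, only pure virtual functions.
struct Printable {
    virtual ~Printable() = default;
    virtual std::string print() const = 0;
};
struct Comparable {
    virtual ~Comparable() = default;
    virtual bool equals(int other) const = 0;
};

// All state lives in the derived class, so the mix-in bases have
// nothing to initialize and nothing to copy.
class Number : public Printable, public Comparable {
public:
    explicit Number(int v) : value_(v) {}
    std::string print() const override { return std::to_string(value_); }
    bool equals(int other) const override { return value_ == other; }
private:
    int value_;
};
```

Assigning one Number to another with the compiler-generated operator copies only value_, so the question of how many times a base is copied does not arise.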
References
1. Stephen C. Dewhurst. "C++ Gotchas: Avoiding Common Problems in Coding and Design". Addison-Wesley Professional. 352 pages; illustrations. ISBN-13: 978-0321125187. (See gotchas 45 and 53.)
2. Wikipedia. Object composition.
3. Allen I. Holub. "Enough Rope to Shoot Yourself in the Foot". (You can easily find it on the Internet. Start reading at section 101 and further on.)