An In-Depth Look at the Memory Layout of C++ Diamond Inheritance

2010-04-30 16:39

Author: winterTTr (please credit the source when reposting)

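I have to admit that the way C++ lays out objects in memory really is something of a mystery.

What is more, the C++ standard places almost no requirements on object layout; each vendor is free to implement it in its own way.

That made me all the more eager to get to the bottom of it. After nearly two days of tinkering I still have no deep understanding, so questions about the concrete implementation and the reasoning behind it are best left to the book "Inside the C++ Object Model". Still, I did at least come away with an intuitive picture of what a real C++ object looks like.

First, we need to pick a few environments to sample.

Given the limits of my own machine, I only sample the following two typical environments, both on Windows.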

1. GCC3.4.5 for MinGW

Here is the code:

#include <iostream>
using namespace std;

class Base
{
public:
    Base():m_1(1){}
    virtual void f() { cout << "    Base::f" << endl; }
    virtual void g1() { cout << "   Base::g1" << endl; }
private:
    int  m_1;
};

class LeftChild: virtual public Base
{
public:
    LeftChild():m_2(2){}
    virtual void f() { cout << "   LeftChild::f" << endl; }
    virtual void g2() { cout << "   LeftChild::g2" << endl; }
private:
    int  m_2;
};

class RightChild: virtual public Base
{
public:
    RightChild():m_3(3){}
    virtual void f() { cout << "   RightChild::f" << endl; }
    virtual void g3() { cout << "   RightChild::g3" << endl; }
private:
    int  m_3;
};

class GrandChild: public LeftChild , public RightChild
{
public:
    GrandChild():m_4(4){}
    virtual void f() { cout << "   GrandChild::f" << endl; }
    virtual void g4() { cout << "   GrandChild::g4" << endl; }
private:
    int  m_4;
};

int main()
{
    GrandChild o;
    // View the object as an array of pointer-sized words. This relies on a
    // 32-bit build (sizeof(int) == sizeof(void*)) and on compiler-specific layout.
    int ** address = reinterpret_cast< int **> ( &o );
    // Dump 18 words of the vtable area, starting 3 words before the slot
    // that the first vptr points to.
    for ( int i = 0 ; i < 18 ; ++i )
    {
        cout << address[0] + i - 3 << " : " << address[0][i-3] << endl;
    }
    typedef void (*FunPtr) ();
    // Dump the object itself, one word per line.
    cout << "object size:" << sizeof( o ) << endl;
    for( unsigned i = 0 ; i < sizeof( o )/sizeof(int) ; ++i )
    {
        cout << i << " : " << address[i] << endl;
    }
    // Call the vtable slots as plain functions. This only works because
    // none of the functions touches `this`.
    // Word 0: first vptr (shared by GrandChild and LeftChild).
    cout << "vptr :" << address[0] << endl;
    reinterpret_cast<FunPtr>( address[0][0] )();
    reinterpret_cast<FunPtr>( address[0][1] )();
    reinterpret_cast<FunPtr>( address[0][2] )();
    cout << "    " << address[0][3] << endl;
    cout << "    " << address[0][4] << endl;
    cout << "    " << address[0][5] << endl;
    // Word 2: vptr of the RightChild subobject.
    cout << "vptr :" << address[2] << endl;
    reinterpret_cast<FunPtr>( address[2][0] )();
    reinterpret_cast<FunPtr>( address[2][1] )();
    cout << "    " << address[2][2] << endl;
    // Word 5: vptr of the virtual Base subobject.
    cout << "vptr :" << address[5] << endl;
    reinterpret_cast<FunPtr>( address[5][0] )();
    reinterpret_cast<FunPtr>( address[5][1] )();
    cout << "    " << address[5][2] << endl;
    // Converting to a base pointer may change the address.
    cout << "GrandChild Pointer   :" << &o << endl;
    cout << "As LeftChild Pointer :" << static_cast< LeftChild * >( &o ) << endl;
    cout << "As RightChild Pointer:" << static_cast< RightChild * >( &o ) << endl;
    return 0;
}

Output:

0x4462e0 : 20
0x4462e4 : 0
0x4462e8 : 4478240
0x4462ec : 4287312
0x4462f0 : 4287816
0x4462f4 : 4287356
0x4462f8 : 12
0x4462fc : -8
0x446300 : 4478240
0x446304 : 4462784
0x446308 : 4287564
0x44630c : 0
0x446310 : -20
0x446314 : -20
0x446318 : 4478240
0x44631c : 4462860
0x446320 : 4287704
0x446324 : 0
object size:28
0 : 0x4462ec
1 : 0x2
2 : 0x446304
3 : 0x3
4 : 0x4
5 : 0x44631c
6 : 0x1
vptr :0x4462ec
   GrandChild::f
   LeftChild::g2
   GrandChild::g4
    12
    -8
    4478240
vptr :0x446304
   GrandChild::f
   RightChild::g3
    0
vptr :0x44631c
   GrandChild::f
   Base::g1
    0
GrandChild Pointer   :0x22ff30
As LeftChild Pointer :0x22ff30
As RightChild Pointer:0x22ff38

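Analysis of the results (partly the author's own unverified inference):

I first printed 18 integer-sized words, and it may not be obvious what they are.

They are the table in memory that describes the GrandChild type. It holds information about the type's definition and, importantly, the addresses of the virtual functions.

Let me go through it.

The entries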

0x4462e8 : 4478240

0x446300 : 4478240

0x446318 : 4478240

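each identify the type of the object. The addresses differ but the content is the same, so this value is the type identifier of the class (the author's inference; in the Itanium C++ ABI vtable layout that GCC follows, this slot holds a pointer to the class's std::type_info object).

So why are there three of them? Presumably because there are three base classes (Base, LeftChild and RightChild), each with its own vptr. When the object is converted to one of its base types (which is really just an offset applied to the pointer, as the test code at the end of the example shows), the type recorded internally stays the same no matter whether the object is used through a Base, LeftChild or RightChild pointer (or reference): it is always the identifier of GrandChild. So memory does carry explicit class information. (I suspect dynamic_cast uses exactly this for its validity check, though that is purely a guess.)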
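
The claim that the recorded type stays GrandChild through every base pointer can be checked without looking at raw memory. The following is a minimal sketch using only standard typeid and dynamic_cast; the simplified hierarchy and the helper names dynamicTypeIsGrandChild and allPointersReachCompleteObject are made up for illustration and are not part of the original test program.

```cpp
#include <cassert>
#include <typeinfo>

// Simplified stand-ins for the article's classes (illustrative only).
struct Base { virtual ~Base() {} int m_1 = 1; };
struct LeftChild : virtual Base { int m_2 = 2; };
struct RightChild : virtual Base { int m_3 = 3; };
struct GrandChild : LeftChild, RightChild { int m_4 = 4; };

// True if the dynamic type seen through every base pointer is GrandChild.
bool dynamicTypeIsGrandChild(GrandChild& o) {
    Base* b = &o;
    LeftChild* l = &o;
    RightChild* r = &o;
    return typeid(*b) == typeid(GrandChild)
        && typeid(*l) == typeid(GrandChild)
        && typeid(*r) == typeid(GrandChild);
}

// True if every base pointer leads back to the start of the complete object.
// dynamic_cast<void*> yields a pointer to the most-derived object.
bool allPointersReachCompleteObject(GrandChild& o) {
    Base* b = &o;
    RightChild* r = &o;
    void* top = &o;
    return dynamic_cast<void*>(b) == top && dynamic_cast<void*>(r) == top;
}
```

Calling both helpers on a local GrandChild should return true on any conforming compiler, which is consistent with the inference that the run-time type information is shared by all three subobjects.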

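Each type identifier is preceded by a few numbers whose purpose I could not work out at first.

Judging by the assembly code, they are memory offsets. In the Itanium C++ ABI vtable layout that GCC follows, the slot directly before the type identifier is the offset-to-top, and the slots before that hold virtual-base (and, in the virtual base's own table, vcall) offsets. The dump fits this reading: 0, -8 and -20 are the distances from the vptrs at bytes 0, 8 and 20 of the object back to its start, while 20 and 12 are the distances from the first two vptrs to the Base subobject at byte 20. If anyone knows more about this, please let me know; I would be grateful.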

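The type identifiers are also followed by a number of values that look like addresses. They are indeed addresses: the addresses of the virtual functions.

I then printed the memory of the object itself. It consists of each base class's virtual table pointer followed by the corresponding member variables.

I also force-called these addresses through a function pointer, which confirms that they are virtual functions. Why memory is arranged in exactly this way, though, I cannot explain with my current knowledge.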

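One more point: in the gcc output, a single pointer gives access to both the virtual function table and the type information. This pointer points at the first virtual function address, and the class information is reached through negative offsets from it, so one table is effectively split into two parts.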
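
The negative-offset lookup can be sketched as follows. This is compiler-specific and formally outside the standard: it assumes an Itanium C++ ABI compiler (GCC or Clang on most platforms), where the slot at vptr[-1] holds the std::type_info pointer and vptr[-2] holds the offset-to-top. The helper names readVptr, typeInfoSlot and offsetToTopSlot and the simplified hierarchy are invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <typeinfo>

// Simplified stand-ins for the article's classes (illustrative only).
struct Base { virtual ~Base() {} int m_1 = 1; };
struct LeftChild : virtual Base { int m_2 = 2; };
struct RightChild : virtual Base { int m_3 = 3; };
struct GrandChild : LeftChild, RightChild { int m_4 = 4; };

// Copy the vptr stored at the start of a polymorphic subobject.
// ASSUMPTION: the vptr is the first word of the subobject (Itanium C++ ABI).
const char* readVptr(const void* subobject) {
    const char* vptr;
    std::memcpy(&vptr, subobject, sizeof vptr);
    return vptr;
}

// ASSUMPTION: vptr[-1] is the type_info pointer of the most-derived class.
const std::type_info* typeInfoSlot(const void* subobject) {
    const std::type_info* ti;
    std::memcpy(&ti, readVptr(subobject) - sizeof(void*), sizeof ti);
    return ti;
}

// ASSUMPTION: vptr[-2] is the offset from this subobject to the object's start.
std::ptrdiff_t offsetToTopSlot(const void* subobject) {
    std::ptrdiff_t off;
    std::memcpy(&off, readVptr(subobject) - 2 * sizeof(void*), sizeof off);
    return off;
}
```

On such a compiler, typeInfoSlot should return &typeid(GrandChild) for the GrandChild, RightChild and Base subobjects alike, and offsetToTopSlot should return zero or a negative byte count, matching the 0, -8 and -20 seen in the 32-bit dump. On MSVC the layout differs and this sketch does not apply.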

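To summarize:

For an object with diamond inheritance, memory holds a surprisingly rich amount of information: a detailed description of the class, plus the virtual table that we always wondered about but could never see. Given how much the C++ compiler does behind the scenes, it is easier to understand why mastering C++ is so hard.

Before moving to the next example, I want to point out one important difference. The GCC (g++) result uses a single kind of virtual pointer, which is completely different from what cl produces. As the next example shows, an object compiled by cl (the Microsoft compiler) keeps the virtual-base information and the virtual functions in two separate tables, a virtual base table and a virtual function table, each reached through its own pointer. As a result, objects compiled by cl are also somewhat larger than those compiled by gcc. See the analysis below.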
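
To compare the two compilers on your own toolchain, the position of each base subobject can be measured with ordinary pointer conversions instead of raw memory dumps. This is a minimal sketch; the helper subobjectOffset and the simplified hierarchy are hypothetical names introduced here, and the actual numbers depend on the compiler and on pointer size.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-ins for the article's classes (illustrative only).
struct Base { virtual ~Base() {} int m_1 = 1; };
struct LeftChild : virtual Base { int m_2 = 2; };
struct RightChild : virtual Base { int m_3 = 3; };
struct GrandChild : LeftChild, RightChild { int m_4 = 4; };

// Byte distance from the start of the complete object to a base subobject.
template <class BaseT>
std::ptrdiff_t subobjectOffset(GrandChild& o) {
    BaseT* sub = &o;  // derived-to-base conversion adjusts the address
    return reinterpret_cast<char*>(sub) - reinterpret_cast<char*>(&o);
}
```

Printing subobjectOffset<LeftChild>(o), subobjectOffset<RightChild>(o), subobjectOffset<Base>(o) and sizeof(GrandChild) under each compiler gives a quick side-by-side view of the layouts. In the 32-bit runs shown in this article the sizes are 28 bytes for gcc and 40 bytes for cl.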

2. VS2010 Express

The code:

#include <iostream>
using namespace std;

class Base
{
public:
    Base():m_1(1){}
    virtual void f() { cout << "    Base::f" << endl; }
    virtual void g1() { cout << "   Base::g1" << endl; }
private:
    int  m_1;
};

class LeftChild: virtual public Base
{
public:
    LeftChild():m_2(2){}
    virtual void f() { cout << "   LeftChild::f" << endl; }
    virtual void g2() { cout << "   LeftChild::g2" << endl; }
private:
    int  m_2;
};

class RightChild: virtual public Base
{
public:
    RightChild():m_3(3){}
    virtual void f() { cout << "   RightChild::f" << endl; }
    virtual void g3() { cout << "   RightChild::g3" << endl; }
private:
    int  m_3;
};

class GrandChild: public LeftChild , public RightChild
{
public:
    GrandChild():m_4(4){}
    virtual void f() { cout << "   GrandChild::f" << endl; }
    virtual void g4() { cout << "   GrandChild::g4" << endl; }
private:
    int  m_4;
};

int main()
{
    GrandChild o;
    // View the object as an array of pointer-sized words. This relies on a
    // 32-bit build (sizeof(int) == sizeof(void*)) and on compiler-specific layout.
    int ** address = reinterpret_cast< int **> ( &o );
    // Dump the object itself, one word per line.
    cout << "object size:" << sizeof( o ) << endl;
    for( unsigned i = 0 ; i < sizeof( o )/sizeof(int) ; ++i )
    {
        cout << i << " : " << address[i] << endl;
    }
    typedef void (*FuncPtr)();
    // Word 0: LeftChild {vfptr}; call its two slots as plain functions.
    reinterpret_cast<FuncPtr>( address[0][0] )();
    reinterpret_cast<FuncPtr>( address[0][1] )();
    // Word 1: LeftChild {vbptr}; print three words of the table it points to.
    cout << address[1][0] << endl;
    cout << address[1][1] << endl;
    cout << address[1][2] << endl;
    // Word 2: m_2.
    cout << address[2] << endl;
    // Word 3: RightChild {vfptr}.
    reinterpret_cast<FuncPtr>( address[3][0] )();
    // Word 4: RightChild {vbptr}.
    cout << address[4][0] << endl;
    cout << address[4][1] << endl;
    cout << address[4][2] << endl;
    // Words 5, 6, 7: m_3, m_4 and the vtordisp field.
    cout << address[5] << endl;
    cout << address[6] << endl;
    cout << address[7] << endl;
    // Word 8: Base {vfptr}.
    reinterpret_cast<FuncPtr>( address[8][0] )();
    reinterpret_cast<FuncPtr>( address[8][1] )();
    // Word 9: m_1.
    cout << address[9] << endl;
    cin.get();
    return 0;
}

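It is also worth pointing out:

cl has a special compiler option that prints the memory layout of a class, which is a great help for this analysis.

The option is /d1reportSingleClassLayout followed by the class name; for this example that is /d1reportSingleClassLayoutGrandChild

From the build output I picked out the class layout information we need.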

1> main.cpp
1> class GrandChild    size(40):
1>     +---
1>     | +--- (base class LeftChild)
1>  0  | | {vfptr}
1>  4  | | {vbptr}
1>  8  | | m_2
1>     | +---
1>     | +--- (base class RightChild)
1> 12  | | {vfptr}
1> 16  | | {vbptr}
1> 20  | | m_3
1>     | +---
1> 24  | m_4
1>     +---
1> 28  | (vtordisp for vbase Base)
1>     +--- (virtual base Base)
1> 32  | {vfptr}
1> 36  | m_1
1>     +---

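Leaving the run-time output aside for a moment, the compiler report alone confirms what was pointed out above: cl keeps the virtual base table and the virtual function table behind separate pointers. Here is the output: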

object size:40
0 : 00417864
1 : 0041787C
2 : 00000002
3 : 00417858
4 : 00417870
5 : 00000003
6 : 00000004
7 : 00000000
8 : 00417848
9 : 00000001
   LeftChild::g2
   GrandChild::g4
-4
28
0
00000002
   RightChild::g3
-4
16
0
00000003
00000004
00000000
   GrandChild::f
   Base::g1
00000001

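Looking at the results, I first took the leading entry of each virtual base table (-4) for a type identifier like the one in the gcc case, but it reads more naturally as an offset: each {vbptr} sits 4 bytes after the {vfptr} that starts its subobject, so -4 leads from the vbptr back to the start of that subobject. The entry after it is the offset from the vbptr to the virtual base: 4 + 28 = 32 for LeftChild and 16 + 16 = 32 for RightChild, and 32 is exactly where the layout report places Base. The third value printed (0) may already lie past the end of the table. All of this is inferred from the numbers rather than taken from documentation.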
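
The offset arithmetic can be written down as a small model. This is only one reading of the printed values, checked against the layout report; the struct VbPtrInfo, the two helper functions and the constants kLeft and kRight are hypothetical names, and the numbers are copied from the report and the program output above.

```cpp
#include <cassert>

// One {vbptr} of GrandChild together with the two table entries it leads to.
struct VbPtrInfo {
    int vbptrOffset;    // offset of {vbptr} inside GrandChild (layout report)
    int toSubobject;    // first table entry: vbptr -> start of its subobject
    int toVirtualBase;  // second table entry: vbptr -> virtual base Base
};

// Offset of the subobject that owns this vbptr, relative to GrandChild.
constexpr int subobjectStart(VbPtrInfo v) { return v.vbptrOffset + v.toSubobject; }

// Offset of the virtual base, relative to GrandChild.
constexpr int virtualBaseStart(VbPtrInfo v) { return v.vbptrOffset + v.toVirtualBase; }

constexpr VbPtrInfo kLeft  = {4, -4, 28};   // LeftChild: vbptr at 4
constexpr VbPtrInfo kRight = {16, -4, 16};  // RightChild: vbptr at 16

// Both tables must agree on where Base lives (offset 32 in the report).
static_assert(virtualBaseStart(kLeft) == virtualBaseStart(kRight),
              "both vbtables should locate the same Base subobject");
```

With these values, subobjectStart gives 0 and 12, the offsets at which the report places LeftChild and RightChild, and virtualBaseStart gives 32 for both, the offset of Base.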

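As for the virtual function tables, the code above has demonstrated them, so that part is solid.

I also noticed another difference:

gcc writes the overriding f into every table, while cl overrides it only in the virtual base's table and does not add an entry to each of the other tables.

My inference is that, with cl, every access to f needs one extra lookup through the virtual base pointer. Does that cost extra run time?

That is all I can say for now.

We have seen the virtual base table and the virtual function table. Things that used to be intangible are now in plain view, which is exciting.

At the same time, this raises more guesses about the compilers' design decisions and more questions about what particular words in memory mean. There is still a lot to learn about C++.

Consider this a starting point. I hope everyone digs further into how C++ really works, so as to make better use of the tool.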
