C++ 隐式转换引发的 Bug

2019.11.22

今天遇到一个 Segmentation Fault，最简化复现的代码如下：

#include <functional>
#include <iostream>

struct Op {
  const std::function<int(void)> &func;
  Op(const std::function<int(void)> &func) : func(func) {}
};

int main() {
  int a = 1;

  auto func = [&] {
    printf("%d\n", a);
    return 0;
  };

  Op op(func);
  op.func();
}

点击此处查看在线执行结果。GCC 全系会挂掉，Clang 开启 -O2 后会挂掉。如果改成下面这样，是 OK 的：

#include <functional>
#include <iostream>

struct Op {
  const std::function<int(void)> &func;
  Op(const std::function<int(void)> &func) : func(func) { this->func(); }
};

int main() {
  int a = 1;

  auto func = [&] {
    printf("%d\n", a);
    return 0;
  };

  Op op(func);
}

使用 const std::function<int(void)> & 是避免对象复制，但恰恰是这样的操作引发了这个极其隐藏的 Bug。有兴趣可以再看看代码，想想是哪里的问题，下面将公布答案。

Bug 分析

C++ 中的 Lambda 表达式并非是 std::function 类型，它的类型是独一无二的，可以通过 typeid 打印查看：

#include <iostream>
#include <typeinfo>

template <typename T>
void print_type(T obj) {
  std::cout << typeid(obj).name() << std::endl;
}

int main() {
  print_type([]() {});  // Z4mainEUlvE_
  print_type([]() {});  // Z4mainEUlvE0_
  return 0;
}

点击此处查看在线执行结果。所以构造 Op 对象时，参数 const std::function<int(void)> &func 会引发 Lambda 表达式执行一次隐式转换，而 func 成员也成为了这个临时转换对象的引用。当 Op op(func) 执行完成后，临时对象也会消亡，也就导致了最后在 Lambda 表达式里参数捕获到非法值、引发 Segmentation Fault。

修改的话也很简单，从一开始就让它转换到 std::function 对象即可：

#include <functional>
#include <iostream>

struct Op {
  const std::function<int(void)> &func;
  Op(const std::function<int(void)> &func) : func(func) {}
};

int main() {
  int a = 1;

  std::function<int(void)> func = [&] {
    printf("%d\n", a);
    return 0;
  };

  Op op(func);
  op.func();
}

隐式转换再踩坑

如下，结论就是别玩火。

#include <cassert>
#include <iostream>
#include <memory>

struct A {};
struct B : public A {};

void func(std::unique_ptr<A> &&a) {}

int main() {
  auto a = std::make_unique<A>();
  func(std::move(a));
  assert(a != nullptr);

  auto b = std::make_unique<B>();
  func(std::move(b));  // 这里有隐式转换
  assert(b == nullptr);
}

再来一个

#include <iostream>
#include <string>

void Func(bool) { puts("A"); }

void Func(const std::string &) { puts("B"); }

int main() {
  Func("");  // 输出 A
}

One More

构造了一个内存池，使用 Thread Local 缓存加快相同类型的分配速度，在分配内存后会调用对应的构造函数，类似：

template <typename... Args>
static std::unique_ptr<T> pop(Args &&...args) {
  void* item = cache().pop();
  new (item) T(std::forward<Args>(args)...);
  return std::unique_ptr<T>(reinterpret_cast<T*>(item));
}

当 T 类型为 std::array<char, N> 或者 std::aligned_storage<N>::type 时，N 较大的时候会有严重的性能问题。经排查，是默认构造函数的原因。这两种类型均为 POD，调用带括号的构造函数 new T() 时会使用 0 进行初始化。这里最终将 T 类型改为下方的 Memory 后修复。

class Memory {
  Memory() {}  // 显式地声明构造函数，不执行 POD 类型的初始化
  std::array<char, N> data;
};

One more time, One more chance

在历史代码中偶遇：

const std::string &a = "hello world";

第一直觉是后面的字符串会转成一个临时的 std::string 对象，前面 a 引用了一个临时对象，这行过后 a 非法。但历史代码至今没炸过，十分好奇然后搜索了下，得到如下规范：

The temporary object to which the reference is bound or the temporary object that is the complete object of a subobject to which the reference is bound persists for the lifetime of the reference if the glvalue to which the reference is bound was obtained through one of the following:

a temporary materialization conversion ([conv.rval]) ...

C++，不愧是你！

类似的情况还有：

std::string &&a = std::string("hello world");  // valid
std::string &&a = std::move(std::string("hello world"));  // invalid

std::string("hello world") 会产生一个临时对象，第一行使用一个右值引用指向它，同时也延长了该临时对象的生命周期；第二行显式地使用 std::move 将其转为临时的右值引用，但该引用在本行结束后生命周期就结束了，对应的临时对象也析构了。