Fantastic pointers and how to std::launder them
C++ has a wide variety of memory-management options, offering many different levels of abstraction. You can manage memory manually using new and delete, which spare you from computing how much memory to allocate. You can use smart pointers, which apply RAII principles to release memory automatically. From C++11 until its removal in C++23, the standard even included minimal support for garbage collection.
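As a rough illustration of the first two options, the sketch below contrasts manual new/delete with a smart pointer. The Widget type and both function names are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <memory>

struct Widget {  // hypothetical type, for illustration only
    int value;
};

// Manual management: every `new` needs exactly one matching `delete`.
int manual_lifetime() {
    Widget* w = new Widget{42};  // the compiler works out sizeof(Widget) for us
    int result = w->value;
    delete w;                    // forgetting this line leaks the allocation
    return result;
}

// RAII: the unique_ptr's destructor releases the memory automatically.
int raii_lifetime() {
    auto w = std::make_unique<Widget>(Widget{42});
    return w->value;             // memory is freed when `w` goes out of scope
}
```

Both functions return the same value; the difference is only in who is responsible for releasing the memory.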
These high-level abstractions have made memory management and its associated bugs easier to work with. Modern C++ compilers have been fine-tuned to generate efficient low-level code from these abstractions, and optimizers aid in this step by making assumptions about the operations programmers are allowed to write. However, as a general-purpose systems programming language, C++ must give the programmer access to all levels of abstraction. This includes low levels that allow the user to write programs that violate assumptions the compiler makes. Here, we consider how std::launder acts as a back door when the compiler doesn't know how to handle certain uses of placement new.
First, a brief overview of placement new and transparent replaceability. The familiar new expression has the form new <type> <initializer>. For example:
struct Foo {
    int bar;
    int baz;
};
// Allocate the memory and initialize the object
Foo* a = new Foo{1, 2};
This syntax both allocates memory and initializes an object in it with the supplied arguments. To decouple the memory allocation from the initialization, a different syntax called placement new exists:
new (address_to_store_memory_at) <type> <initializer>
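A minimal sketch of that decoupling, assuming the Foo struct from above: storage is obtained first, the object is constructed in it afterwards, and the two are torn down in separate steps as well. The function name placement_new_demo is made up for this example.

```cpp
#include <cassert>
#include <new>  // declares placement new

struct Foo {
    int bar;
    int baz;
};

int placement_new_demo() {
    // Step 1: obtain raw storage. No Foo object exists yet.
    void* storage = ::operator new(sizeof(Foo));

    // Step 2: construct a Foo in that storage and keep the returned pointer.
    Foo* foo = new (storage) Foo{1, 2};
    int sum = foo->bar + foo->baz;

    // Step 3: end the object's lifetime, then release the storage separately.
    foo->~Foo();
    ::operator delete(storage);
    return sum;
}
```

Note that a plain delete expression would be wrong here, because the storage and the object were created by two separate steps.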
Cppreference provides an example of placement new inside a copy assignment operator:
struct C {
    int i;
    void f();
    const C& operator=(const C&);
};
const C& C::operator=(const C& other)
{
    if (this != &other) {
        this->~C();          // lifetime of *this ends
        new (this) C(other); // new object of type C created
        f();                 // well-defined
    }
    return *this;
}
C c1;
C c2;
c1 = c2; // well-defined
c1.f();  // well-defined; c1 refers to a new object of type C
We've reused the same memory that was allocated for c1 instead of allocating new memory. The many reasons for separating allocation from initialization are mostly outside the scope of this article, but two simple ones are:
- It's faster to reuse pre-allocated memory than it is to allocate new memory
- When writing code for an embedded system that has memory-mapped hardware, one needs to reuse the same fixed address
The operations in the Cppreference example were well-defined because the original object c1 was transparently replaceable by the new object constructed in its storage (a copy of c2).
According to Cppreference:
If a new object is created at the address that was occupied by another object, then all pointers, references, and the name of the original object will automatically refer to the new object and, once the lifetime of the new object begins, can be used to manipulate the new object, but only if the original object is transparently replaceable by the new object.
Object x is transparently replaceable by object y if:
- the storage for y exactly overlays the storage location which x occupied
- y is of the same type as x (ignoring the top-level cv-qualifiers)
- x is not a complete const object
- neither x nor y is a base class subobject, or a member subobject declared with [[no_unique_address]]
- either
  - x and y are both complete objects, or
  - x and y are direct subobjects of objects ox and oy respectively, and ox is transparently replaceable by oy.
Some of these terms are outside this article's scope, but the most important non-obvious one is the subobject. You can think of a base class subobject as the part of a derived object that holds its base class: when a class derives from another, memory for the base must be reserved inside every object of the derived class.
These requirements suggest that transparent replaceability is rather strict. The example was also a tad contrived: why go through all the trouble of writing a special copy assignment operator when C& c2_ref = c2 works just as well?
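To make the conditions listed above concrete, here is a small sketch of a case where they all hold. The Point type and the function name are hypothetical; the point is only that the old name and the old pointer stay usable without any extra ceremony.

```cpp
#include <cassert>
#include <new>

struct Point {  // hypothetical type, for illustration only
    int x;
    int y;
};

int replace_in_place() {
    Point p{1, 2};       // a complete, non-const object
    Point* before = &p;  // pointer obtained before the replacement

    // Same type, exactly the same storage, not const, and neither a base
    // class subobject nor a member subobject: the old Point is transparently
    // replaceable by the new one.
    new (&p) Point{10, 20};

    // The name `p` and the old pointer both refer to the new object,
    // so std::launder is not needed here.
    return p.x + before->y;
}
```

Because Point has a trivial destructor, reusing its storage without an explicit destructor call is fine in this sketch.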
If we break the rules of transparent replaceability, we can reuse memory in more general ways. Recall our Foo struct:
struct Foo {
    int bar;
    int baz;
};
Suppose we wanted to reuse the memory of anything with the same size and alignment as a Foo. We can do that by using an unsigned char[] or std::byte[] buffer:
struct Foo {
    int bar;
    int baz;
};
// Allocate enough memory to hold a Foo with the proper alignment
alignas(Foo) unsigned char buf[sizeof(Foo)];
// alignas(Foo) std::byte buf[sizeof(Foo)];
Foo* foo_ptr = new(&buf) Foo{1, 2}; // Construct a `Foo` object in the pre-allocated
                                    // storage at the address of `buf`; the returned
                                    // pointer to the new object is saved in foo_ptr.
Note: the alignas specifier ensures that the buffer satisfies the alignment requirement of a Foo, i.e. that it starts at an address where a Foo is allowed to live.
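The effect of alignas can be checked at run time. The sketch below is only an illustration (the function name is made up, and it assumes std::uintptr_t is available, which holds on mainstream platforms): it tests that the buffer's address is a multiple of alignof(Foo).

```cpp
#include <cassert>
#include <cstdint>

struct Foo {
    int bar;
    int baz;
};

bool buffer_is_aligned_for_foo() {
    // Without alignas, an unsigned char array only guarantees alignment 1.
    alignas(Foo) unsigned char buf[sizeof(Foo)];

    // Convert the address of the first byte to an integer and test it
    // against Foo's alignment requirement.
    auto address = reinterpret_cast<std::uintptr_t>(&buf[0]);
    return address % alignof(Foo) == 0;
}
```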
Pre-allocating a chunk of bytes is more generic than the first example, but it comes at the cost of some correctness. Suppose we tried to access that memory through the buffer itself with reinterpret_cast<Foo *>(&buf)->bar. This is actually undefined behavior: &buf has type unsigned char (*)[sizeof(Foo)], a pointer to the byte array, and the cast changes only the pointer's type, so the result still doesn't point to a Foo object. An unsigned char[] is not transparently replaceable by a Foo because the two have different types. Even though &buf and foo_ptr hold the same address, we cannot safely use them interchangeably. To solve the problem of not satisfying transparent replaceability, we must use std::launder.
std::launder has an esoteric definition on Cppreference:
template <class T>
[[nodiscard]] constexpr T* launder(T* p) noexcept; // This form since C++20; std::launder itself since C++17
"Provenance fence with respect to p. Returns a pointer to the same memory that p points to, but where the referent object is assumed to have a distinct lifetime and dynamic type. Formally, given
- the pointer p represents the address A of a byte in memory
- an object x is located at the address A
- x is within its lifetime
- the type of x is the same as T, ignoring cv-qualifiers at every level
- every byte that would be reachable through the result is reachable through p (bytes are reachable through a pointer that points to an object y if those bytes are within the storage of an object z that is pointer-interconvertible with y, or within the immediately enclosing array of which z is an element).
Then std::launder(p) returns a value of type T* that points to the object x. Otherwise, the behavior is undefined. The program is ill-formed if T is a function type or (possibly cv-qualified) void."
How does this arcane definition apply in this example?
- &buf points to an address A
- an object (let's call it foo) is located at A
- this object is within its lifetime (i.e. storage has been obtained for it and it has been initialized)
- the type of foo is Foo
- every byte reachable through the returned pointer is also reachable through &buf
In order to safely reach the memory through &buf, we must wrap the cast in a call to launder:
std::launder(reinterpret_cast<Foo *>(&buf))->bar;
This informs the compiler that we can access the object through that pointer: the result of std::launder is treated as a freshly obtained pointer to the object living at that address (similar to the pointer returned by a normal call to new). The full example is below:
#include <new>
struct Foo {
    int bar;
    int baz;
};
int main() {
    // Stack-allocate enough memory to hold a Foo with the proper alignment
    alignas(Foo) unsigned char buf[sizeof(Foo)];
    Foo* foo_ptr = new(&buf) Foo{0, 1}; // Construct a `Foo` object in the pre-allocated
                                        // storage at the address of `buf`; the returned
                                        // pointer to the new object is saved in foo_ptr.
    foo_ptr->bar = 2; // Ok, normal access
    // reinterpret_cast<Foo *>(&buf)->bar = 3; // Undefined behavior
    std::launder(reinterpret_cast<Foo *>(&buf))->bar = 4; // Ok, treated as a pointer to a fresh object
}
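In practice, this pattern tends to be hidden inside a small wrapper. The sketch below is one possible shape, not a definitive implementation: Slot, emplace, get and destroy are hypothetical names, and the caller is assumed to call emplace before get and to call destroy exactly once. Only the byte buffer is stored, so every access has to go through std::launder.

```cpp
#include <cassert>
#include <new>
#include <utility>

// Hypothetical helper: storage for at most one T, constructed on demand.
template <class T>
class Slot {
public:
    template <class... Args>
    void emplace(Args&&... args) {
        // The pointer returned by placement new is deliberately discarded.
        ::new (static_cast<void*>(storage_)) T(std::forward<Args>(args)...);
    }

    // The buffer's own pointer does not point to a T, so launder it.
    T& get() { return *std::launder(reinterpret_cast<T*>(storage_)); }

    void destroy() { get().~T(); }

private:
    alignas(T) unsigned char storage_[sizeof(T)];
};

struct Foo {
    int bar;
    int baz;
};

int slot_demo() {
    Slot<Foo> slot;
    slot.emplace(Foo{0, 1});
    slot.get().bar = 4;  // access through the laundered pointer
    int result = slot.get().bar + slot.get().baz;
    slot.destroy();
    return result;
}
```

The design trades a little ceremony in get() for not having to store an extra pointer next to the buffer.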
You're likely wondering, "why not use foo_ptr? We already called placement new!" There may be scenarios where we call placement new without saving its return value. Consider another example from Cppreference:
#include <cassert>
#include <new>
struct Base {
    virtual int transmogrify();
};
struct Derived : Base {
    int transmogrify() override {
        new(this) Base;
        return 2;
    }
};
int Base::transmogrify() {
    new(this) Derived;
    return 1;
}
static_assert(sizeof(Derived) == sizeof(Base));
int main() {
    // The new object failed to be transparently replaceable because
    // it is a base subobject but the old object is a complete object.
    Base base;
    int n = base.transmogrify();
    // int m = base.transmogrify(); // Undefined Behavior
    int m = std::launder(&base)->transmogrify(); // OK
    assert(m + n == 3);
}
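Here is another case where transparent replaceability fails: we replace the complete object base, of type Base, with an object of a different type, Derived. The only Base at that address is now the base class subobject of the new Derived, and a base class subobject cannot take part in transparent replacement.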
The first call to transmogrify changes the type of the object living in base's storage from Base to Derived. However, the compiler still views base as a Base object and is allowed to assume that the second call resolves to Base::transmogrify. It assumes that the name base and the actual type of the object in its storage agree; they no longer do, so the plain call is undefined behavior. Once again, a band-aid solution here is to use std::launder to tell the compiler "trust me, there really is a valid, freshly-made object at this address." Since std::launder doesn't modify its argument, the object must be accessed through its return value, either directly as above or by storing it in a variable. This avoids the problem that not storing the result of placement new caused.
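The variant below stores the laundered pointer in a variable instead of using it directly. It reuses the Base and Derived types from the Cppreference example; the function call_twice and the variable fresh are names made up for this sketch.

```cpp
#include <cassert>
#include <new>

struct Base {
    virtual int transmogrify();
};

struct Derived : Base {
    int transmogrify() override {
        new (this) Base;  // replace *this with a Base
        return 2;
    }
};

int Base::transmogrify() {
    new (this) Derived;   // replace *this with a Derived
    return 1;
}

static_assert(sizeof(Derived) == sizeof(Base));

int call_twice() {
    Base base;
    int n = base.transmogrify();        // the storage now holds a Derived

    // std::launder leaves `&base` untouched; only the returned pointer
    // may be used to reach the new object.
    Base* fresh = std::launder(&base);
    int m = fresh->transmogrify();      // dispatches to Derived::transmogrify

    return n * 10 + m;
}
```

After the second call the storage holds a Base again, so fresh should not be reused without laundering once more.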
What's the solution here? Unless we absolutely must use placement new, it's likely a better option to let each variable own its own memory and/or to use higher-level memory-management options like smart pointers. In cases where we must use placement new, a good way to avoid std::launder is to save the pointer that placement new returns, since without it we will eventually need to call std::launder. Although std::launder is a niche tool, it becomes necessary when the compiler cannot reason about the lifetime of the objects occupying a piece of memory.
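One way to follow that advice is sketched below, under the assumption that an extra pointer per object is affordable. FooHolder and holder_demo are hypothetical names. The holder keeps the pointer returned by placement new next to the buffer, so no access ever needs std::launder.

```cpp
#include <cassert>
#include <new>

struct Foo {
    int bar;
    int baz;
};

// Hypothetical holder that remembers the pointer returned by placement new.
class FooHolder {
public:
    FooHolder(int bar, int baz) : ptr_(new (buf_) Foo{bar, baz}) {}

    // Copying would leave ptr_ pointing into another holder's buffer.
    FooHolder(const FooHolder&) = delete;
    FooHolder& operator=(const FooHolder&) = delete;

    ~FooHolder() { ptr_->~Foo(); }

    Foo& get() { return *ptr_; }  // no std::launder needed

private:
    alignas(Foo) unsigned char buf_[sizeof(Foo)];
    Foo* ptr_;  // declared after buf_, so buf_ exists when ptr_ is initialized
};

int holder_demo() {
    FooHolder holder{0, 1};
    holder.get().bar = 2;
    return holder.get().bar + holder.get().baz;
}
```

The cost is one extra pointer per holder, which is the trade-off against laundering the buffer's address on every access.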
Resources used
- https://en.cppreference.com/w/cpp/utility/launder
- https://en.cppreference.com/w/cpp/language/new
- https://en.cppreference.com/w/cpp/types/byte
- https://en.cppreference.com/w/cpp/language/lifetime
- https://en.cppreference.com/w/cpp/language/object