C++: Rvalue References

Paul M Watt


Jun 22, 2015

CPOL


Rvalue references were introduced with C++11, and they are used to implement move semantics and perfect-forwarding. Both of these techniques are ways to eliminate copies of data parameters for efficiency. There is much confusion around this new feature that uses the && syntax, because its meaning often depends on the context in which it is used. It is important to understand the subtleties around rvalue references in order to use them effectively. This entry will teach you how to use the rvalue reference with plenty of live demonstrations.

Move it!

When I first learned of move semantics, I expected that this feature would be more or less automatic, much like the copy constructor. As it turns out, there are common programming practices that will actually hinder the compiler's ability to generate and use move operations. The concept of move semantics and perfect-forwarding are very simple. However, without understanding a few of the nuances of rvalue references, these idioms will seem fickle when you try to put them to use.

It is important to have a basic understanding of the fundamental components of C++ that have shaped how this new feature was added to the language, and why the explicit steps are required. Therefore, let's start with some background information and vocabulary, then work our way to the main topic.

Lvalue and Rvalue

Every expression has both a type and a value category. We are concerned with the differences between the value categories as we try to understand rvalue references. Specifically, we are interested in the lvalue and rvalue categories. These terms are derived from the arguments on each side of the assignment operator: 'L' for left, to which values are assigned, and 'R' for right, which contains the value to be assigned. However, this is only a simplification of their definition.

Another way to look at these terms is how they manifest in the final program. Lvalues are expressions that identify non-temporary objects. Essentially, they have addressable storage for loading and storing data. An rvalue is an expression that refers to a temporary object, or a value that is not associated with any object.

An lvalue is not necessarily modifiable. A good example is a constant expression qualified with the const keyword. After its initialization, the expression has storage that can be addressed, but the value cannot be modified. Therefore, lvalues are further distinguished by modifiable lvalues and non-modifiable lvalues.

Here is a list of items that are lvalue expressions:

Non-modifiable:
- String literals
- Constant expressions
Modifiable:
- The name of a variable
- Function calls that return lvalue references
- Pre-increment and pre-decrement operators
- Dereference and assignments
- Expressions cast to lvalue reference type

Here is a list of items that are rvalue expressions:

Literal values: true, 27ul, 3.14 (except string literals)
Function call expressions that do not return a reference
Expressions composed from arithmetic, relational, logical and bit-wise operators
The post-fix increment and decrement operators
Cast expression to any type other than a reference type
Lambda expressions
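
The two lists above can be checked by the compiler. The sketch below is only an illustration: IS_LVALUE is a hypothetical helper macro, and global, get_ref and get_value are made-up names. It relies on the fact that decltype((expr)) yields an lvalue reference type exactly when expr is an lvalue expression.

```cpp
#include <type_traits>

// Hypothetical helper: true when expr is an lvalue expression.
// The double parentheses make decltype inspect the expression
// rather than the declared type of a name.
#define IS_LVALUE(expr) (std::is_lvalue_reference<decltype((expr))>::value)

int  global = 0;
int& get_ref()   { return global; }  // returns an lvalue reference
int  get_value() { return global; }  // returns by value

// Lvalue expressions
static_assert(IS_LVALUE(global),                   "name of a variable");
static_assert(IS_LVALUE(get_ref()),                "call returning an lvalue reference");
static_assert(IS_LVALUE(++global),                 "pre-increment");
static_assert(IS_LVALUE("text"),                   "string literal");
static_assert(IS_LVALUE(static_cast<int&>(global)), "cast to lvalue reference type");

// Rvalue expressions
static_assert(!IS_LVALUE(27ul),                    "literal value");
static_assert(!IS_LVALUE(get_value()),             "call returning by value");
static_assert(!IS_LVALUE(global + 1),              "arithmetic expression");
static_assert(!IS_LVALUE(global++),                "post-increment");
static_assert(!IS_LVALUE(static_cast<long>(global)), "cast to a non-reference type");
```

None of the operands are evaluated, so global is never modified; the checks happen entirely at compile time.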

Does it have a name?

There is a simple way that can help you determine if you are dealing with an lvalue or an rvalue.

Can you refer to the expression by name?

A value that can be referenced by name is an lvalue. This is not an absolute, but it is a good rule of thumb to help you generally reason about your data values. An example of an exception is a member-function. Also, this does not cover all expressions that are considered lvalues. Examples of lvalue expressions that do not have names are string literals and function call expressions that return an lvalue reference.

xvalues, prvalues, glvalues...

In this cursory overview of value categories, I have left out some of the exceptions to the rules and the sub-categories of lvalues and rvalues. These other categories capture the remaining situations. However, going even deeper into the nuances digresses from the original topic, and will only add more confusion. Therefore I will simply leave you with the knowledge that these other categories exist, and a reference where you can learn more about them: Value categories at cppreference.com (listed in the References section).

& (lvalue reference)

An lvalue reference is what we generally call a reference. It is also important to note that it is a type. This is in contrast to value categories, which I described in the previous section. Here is a brief review of the concepts associated with references:

A reference is an alias to an object or a function that already exists
A reference must be initialized when it is defined
It cannot be re-seated (reassigned) after it is created
It is not legal to create arrays, pointers or references to references (except with templates)

The most common use for an lvalue reference is to pass parameters by-reference in function calls.

void LogMessage(std::string const &msg)
{
  // msg is an alias for the input parameter at the call site.
  // Therefore, a copy of the string is avoided.
}

I prefer to use references over pointers, except when there is a possibility to receive an empty pointer. The logic becomes much simpler when writing safe production-quality code. The need to verify pointer input parameters is eliminated. In some cases, after I verify a pointer parameter, I will dereference it and assign it to a reference. A similar situation is when I perform some type of cast on a pointer I usually dereference and assign it to a reference of the new type.

C++

// I know what you're thinking...
// I interface with a lot of C and legacy C++
int process_state(
  const SystemInputs *p_inputs,
  void* p_context
)
{
  if ( !p_inputs
    || !p_context)
  {
    return k_error_invalid_parameter;
  }
 
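  // p_inputs points to const data, so the alias must be const as well.
  const SystemInputs& input = *p_inputs;
  // static_cast is sufficient to convert from void*.
  SystemState&        state = *static_cast< SystemState* >(p_context);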
 
  // ...
}

If a function returns an lvalue reference, then the function call expression will be an lvalue expression. This use of references is how the at and operator[] member-functions of std::vector are implemented.

C++

// Where reference is an alias for T&
reference operator[]( size_type pos )
{
  // Return the requested element
  // from the heap-allocated data array
  return p_data[pos];
}

Dangling References

Although references do make code easier to work with and reason about, they are not perfect. Similar to a pointer, the possibility still exists that the object used to initialize a reference is destroyed before the reference is. This leaves you with a dangling reference, which puts your code in undefined behavior territory.

The stack is one of the safest places to create a reference. That is with the assumption that the new reference will go out of scope before or at the same time as the object used to initialize the reference.

This is the reason why you do not return a reference to a locally created variable from a function. Either your object was created on the stack and will be destroyed when the function returns, or your object was dynamically allocated, in which case the caller has no obvious way to free the memory when done.

C++

std::string& FormatError(int errCode)
{
  std::string errorText;
  // Populate the string with the proper error message.
 
  return errorText;
  // errorText is now destroyed.
  // The caller receives a dangling reference.
}
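
One common way out of the dangling reference is to return the string by value. The sketch below assumes a hypothetical FormatErrorByValue with made-up message text; it is not the author's function, only an illustration of the alternative.

```cpp
#include <cassert>
#include <string>

// Hypothetical by-value variant of FormatError. The message text is
// invented for this sketch.
std::string FormatErrorByValue(int errCode)
{
  std::string errorText = "Error code: " + std::to_string(errCode);
  return errorText;  // Moved or elided into the caller's object.
}

void demo_format_error()
{
  // The caller owns its own string, so nothing dangles.
  std::string message = FormatErrorByValue(42);
  assert(message == "Error code: 42");

  // A reference to const may also bind to the returned temporary;
  // the temporary then lives as long as the reference does.
  const std::string& alias = FormatErrorByValue(7);
  assert(alias == "Error code: 7");
}
```

With move semantics available, returning a local object by value is usually cheap, which removes most of the temptation to return a reference in the first place.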

&& (rvalue reference)

Prior to C++11, it was not possible to bind a non-const reference to an rvalue; only a reference to const could bind to a temporary. C++11 introduces the && syntax, which allows a reference to be bound to an rvalue expression and still modify the object it refers to. An rvalue reference is a type.

Remember that one type of rvalue is an expression that refers to a temporary object. An rvalue reference can bind to such a temporary and, like a reference to const, extends its lifetime. The most compelling place to apply rvalue references is object construction and assignment. This allows compilers to replace expensive copy operations with less expensive moves. The formal name given to this feature is move semantics. Another exciting use is applied to template function parameters, in which the technique known as perfect-forwarding is used.

In overload resolution for function calls, an rvalue reference parameter is given precedence over an lvalue reference to const when the argument is an rvalue.
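
The overload preference can be observed with a pair of functions that differ only in reference type. This is a minimal sketch; Bound, which and make_text are hypothetical names introduced for the illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical tag that reports which overload was selected.
enum class Bound { LvalueRef, RvalueRef };

Bound which(const std::string&) { return Bound::LvalueRef; }
Bound which(std::string&&)      { return Bound::RvalueRef; }

std::string make_text() { return "temporary"; }

void demo_overloads()
{
  std::string named("named");

  // A named object is an lvalue: only the const& overload is viable.
  assert(which(named) == Bound::LvalueRef);

  // A temporary could bind to either parameter; the && overload wins.
  assert(which(make_text()) == Bound::RvalueRef);

  // std::move turns the lvalue into an rvalue expression.
  // Nothing is moved here; the reference is merely bound.
  assert(which(std::move(named)) == Bound::RvalueRef);
}
```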

Move Semantics

Allows you to control the semantics of moving your user-defined types. It is actually possible to accomplish this with classic C++. However, you would have to forego the copy constructor. With the rvalue reference, it is now possible to provide both a move constructor and a copy constructor within the same object.

Perfect-Forwarding

Makes it possible to create function templates that are able to pass their arguments to other functions in a way that allows the target function to receive the exact same objects.

[Intermission]

I presented that long and detail-oriented introduction up front so you would have context with most of the details to understand why this movement isn't always automatic. Also, hopefully I have presented the details in a memorable order to help you remember the proper actions required for each situation. We will continue to introduce details gradually, and I will summarize with a set of rules to lead you in a successful direction.

Reference Collapsing

Reference collapsing is part of the type deduction rules used for function templates. The rules are applied based upon the context of the function call. The type of argument passed to the specific instantiation is considered when determining the type for the final function call. This is necessary to protect against unintentional errors from occurring where lvalues and rvalues are concerned.

I mentioned earlier in the section regarding references that it was not legal to create a reference to a reference, with the exception of templates. It's time to demonstrate what I mean:

C++

int   value      = 0;     // OK: Fundamental type
int&  ref        = value; // OK: Reference to type
int& &ref_to_ref = ref;   // Error: Reference to reference not allowed
 
// Now we have rvalue references
int&& rvalue_bad = ref;   // Error: Rvalue reference cannot bind to lvalue
                          // Remember, if it has a name, it is an lvalue
int&& rvalue_ref = 100;   // OK: A literal value is an rvalue

Templates follow a set of type-deduction rules to determine what type should be assigned to each parameterized value of the template. Scott Meyers provides a very thorough description of these rules in Item 1 of "Effective Modern C++". Suffice to say, the important rules to note are:

If an argument is a reference, the reference is not considered during type deduction
Lvalue arguments are given special consideration in certain circumstances (this is where reference collapsing applies)

The rules of reference collapsing

The rules are actually very simple. They produce the same output as the AND truth table, where an lvalue reference, &, is 0 and an rvalue reference, &&, is 1. I think it is subtly fitting, given the other meaning of the && operator. This should make it easier to remember the rules as well.

Reference Collapsing Truth Table (& := 0, && := 1)

First    Second   Result
&  (0)   &  (0)   &  (0)
&  (0)   && (1)   &  (0)
&& (1)   &  (0)   &  (0)
&& (1)   && (1)   && (1)
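
The collapsing rules can be verified at compile time. In this sketch, AddLref and AddRref are hypothetical alias templates that append & or && to whatever type they are given; alias templates are one of the contexts in which a reference to a reference may be formed and is then collapsed.

```cpp
#include <type_traits>

// Hypothetical helpers: append & or && to T. If T is already a
// reference, the compiler applies the reference collapsing rules.
template <typename T> using AddLref = T&;
template <typename T> using AddRref = T&&;

static_assert(std::is_same<AddLref<int&>,  int&>::value,  "&  then &  collapses to &");
static_assert(std::is_same<AddRref<int&>,  int&>::value,  "&  then && collapses to &");
static_assert(std::is_same<AddLref<int&&>, int&>::value,  "&& then &  collapses to &");
static_assert(std::is_same<AddRref<int&&>, int&&>::value, "&& then && collapses to &&");

// With a non-reference type nothing collapses.
static_assert(std::is_same<AddRref<int>, int&&>::value, "plain T becomes T&&");
```

Only the combination of two rvalue references produces an rvalue reference, which matches the AND truth table.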

New Rules for compiler generated functions

Hopefully you are well aware that the compiler may generate four special member-functions for a class as needed with classic C++. If not, it's never too late to learn. The four functions are:

Default Constructor (If no other constructor has been defined)
Destructor
Copy Constructor
(Copy) Assignment Operator

Two additional functions have been added to this set to properly manage the new concept of move operations.

Move Constructor
Move Assignment Operator

The default behavior of a generated move function is similar to its copy-based counterpart: a move operation is performed for each member of the object. However, the compiler is much more conservative about automatically generating these new functions when compared to the others. The primary reason is the notion that if the default move behavior is not sufficient and you elect to implement your own, then the default copy behavior most likely will not be sufficient either. Therefore it will not automatically generate the copy-based functions when you implement either of the move functions.

Furthermore, if you implement only one of the move operations, it will not automatically implement the other move operation, for the same reason. In fact, no compiler-generated move operations will be created if the user-defined type declares its own destructor or copy operation. When move operations are not defined, the copy operations will be used instead.
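
The fallback from move to copy is silent, so it helps to see it happen. In the sketch below, Tracker, Plain and WithDestructor are hypothetical types: Tracker records how it was constructed, and the two wrappers differ only in whether they declare a destructor.

```cpp
#include <cassert>
#include <utility>

// Hypothetical member type that remembers how it was constructed.
struct Tracker
{
  enum Origin { Fresh, Copied, Moved };
  Origin origin = Fresh;

  Tracker() = default;
  Tracker(const Tracker&) : origin(Copied) {}
  Tracker(Tracker&&) noexcept : origin(Moved) {}
};

// No special member functions declared:
// the compiler generates a move constructor.
struct Plain
{
  Tracker member;
};

// A user-provided destructor suppresses the implicit move
// constructor; the implicit copy constructor is used instead.
struct WithDestructor
{
  Tracker member;
  ~WithDestructor() {}
};

void demo_generated_moves()
{
  Plain a;
  Plain b(std::move(a));
  assert(b.member.origin == Tracker::Moved);

  WithDestructor c;
  WithDestructor d(std::move(c));  // compiles, but copies
  assert(d.member.origin == Tracker::Copied);
}
```

Note that the second case compiles without any diagnostic. The request to move simply binds to the copy constructor's const reference parameter.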

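If you are in the habit, or even feel the compulsion, to always define a destructor, even an empty one, you may want to change that behavior. Similar to how you can delete the compiler-generated defaults, you can also explicitly request them. The syntax is the same as delete, except you use the default keyword. Be aware that a destructor declared with = default still counts as user-declared, so it suppresses the implicit move operations just as an empty destructor body does. If the class needs a declared destructor, request the move and copy operations with = default as well.

C++

class UserType
{
public:
  UserType() = default;

  // A user-declared destructor, even a defaulted one,
  // suppresses the implicit move operations.
  ~UserType() = default;

  // Request the compiler-generated versions explicitly.
  UserType(UserType&&) = default;
  UserType& operator=(UserType&&) = default;
  UserType(const UserType&) = default;
  UserType& operator=(const UserType&) = default;
};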

Specifying default will also allow you to continue to use the compiler's copy operations even when you implement your own move operations. If you would like to read a full account of the rules and reasoning for changes, refer to Item 17 in "Effective Modern C++".

std::move("it!");

std::move is a new function that has been added to the Standard Library; it can be found in the <utility> header. std::move does not add any actual executable code to your program because it is implemented as a single cast operation. Yet this function is very important because it serves two purposes:

Explicitly communicates your intentions to move an object
Provides the hint to (actually, enables) the compiler to apply move semantics

Here is the implementation of std::move:

C++

template< class T >
typename std::remove_reference<T>::type&& move(T&& t)
{
  return static_cast<typename std::remove_reference<T>::type&&>(t);
}

This function is a convenient wrapper around a cast that will unconditionally convert its argument, including a named rvalue reference, to an rvalue expression when passing it to other functions. This makes the argument capable of participating in move operations. std::move is the explicit nudge that you supply to the compiler when you want to perform a move assignment rather than a copy assignment.

It is necessary to use std::move inside of a move constructor, because all of your values in the rhs object that you will move from are lvalues. As I mentioned, std::move unconditionally converts these lvalues into rvalue references. This is the only way the compiler would be able to differentiate between a move assignment and a copy assignment in this context.

In general, the only operations that are safe to perform on an object after it has been supplied to std::move are to destroy it or to assign a new value to it, because a moved-from object is left in a valid but unspecified state. Therefore, it is best to only use std::move on the last use of its input for the current scope.
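
A short sketch can make both points concrete: std::move by itself changes nothing, and a moved-from object should only be reassigned or destroyed. The function name demo_move_is_a_cast is hypothetical. std::unique_ptr is used because it is one of the few types whose moved-from state is specified (it becomes null).

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

void demo_move_is_a_cast()
{
  int number = 7;

  // std::move only produces an rvalue reference to its argument.
  static_assert(std::is_same<decltype(std::move(number)), int&&>::value,
                "std::move yields an rvalue reference");
  int&& ref = std::move(number);
  assert(number == 7);  // nothing was moved; no code ran
  assert(ref == 7);     // ref is just another name for number

  // The actual move is done by the move constructor that the
  // rvalue expression selects.
  std::unique_ptr<int> source(new int(42));
  std::unique_ptr<int> target = std::move(source);
  assert(*target == 42);
  assert(source == nullptr);  // specified for unique_ptr only

  // Assigning a new value to a moved-from object is always fine.
  source.reset(new int(1));
  assert(*source == 1);
}
```

For most other types the moved-from state is unspecified, so code should not depend on what the source object contains after the move.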

Extremely Important!

If your class is a derived class, and implements a move operation, it is very important that you use std::move on the parameters that you pass to the base class. Otherwise the copy operations will be called in the base implementation.

Why?

Because the input parameters to your move operation are lvalue expressions. Maybe you are objecting with "no they're not! They are rvalue references!" The parameters are rvalue references, however, your arguments have been given a name for you to refer to. That makes them lvalue expressions, which refer to rvalue references.

The bottom line is that calling a base class implementation requires the same attention that is required for all other move operations in this context. Just because you happen to be in a move operation for your derived class, does not mean that the compiler can tell that it needs to call the same move operation for the base class. In fact, you may not want it to call the move operation. This now allows you to choose which version is called.
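
The effect of forgetting std::move in a derived class can be shown with a base class that records which constructor ran. Base, ForgetsMove and UsesMove are hypothetical types written only for this sketch.

```cpp
#include <cassert>
#include <utility>

// Hypothetical base class that remembers how it was constructed.
struct Base
{
  enum Origin { Fresh, Copied, Moved };
  Origin origin = Fresh;

  Base() = default;
  Base(const Base&) : origin(Copied) {}
  Base(Base&&) noexcept : origin(Moved) {}
};

struct ForgetsMove : Base
{
  ForgetsMove() = default;
  // rhs has a name, so it is an lvalue expression here:
  // the Base copy constructor is selected.
  ForgetsMove(ForgetsMove&& rhs) : Base(rhs) {}
};

struct UsesMove : Base
{
  UsesMove() = default;
  // std::move restores the rvalue category:
  // the Base move constructor is selected.
  UsesMove(UsesMove&& rhs) noexcept : Base(std::move(rhs)) {}
};

void demo_base_class_move()
{
  ForgetsMove a;
  ForgetsMove b(std::move(a));
  assert(b.origin == Base::Copied);

  UsesMove c;
  UsesMove d(std::move(c));
  assert(d.origin == Base::Moved);
}
```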

Move operation implementations

We now have enough knowledge to be able to constructively apply the principles of move semantics. Let's apply them to implement a move constructor for an object.

The functions below implement the move operations for a class called ComplexData that is derived from a base class called BasicData.

Derived Move Constructor

ComplexData(ComplexData&& rhs)
  : BasicData(std::move(rhs))
  , complex_info(std::move(rhs.complex_info))
{
  // ComplexData data members are moved in the initializer list,
  // so they are not default-constructed and then reassigned.
}

Derived Move Assignment Operator

C++

ComplexData& operator=(ComplexData&& rhs)
{
  BasicData::operator=(std::move(rhs));
 
  // Move operations on ComplexData data members
  complex_info = std::move(rhs.complex_info);
 
  return *this;
}

Observe move operations

Number

The class used in this demonstration is called Number. It implements each of the special member-functions of the class. It also provides a way to set and get the value of the object. This example lets you observe when move operations are performed versus copy operations.

The implementation is very simple, holding only a single data member, int m_value. I do not call std::move inside the move operations because it is not necessary for a fundamental type. Similarly, if we had allocated pointers and were moving them between two objects, we would copy the pointer to the destination class, and set the source class pointer to nullptr. I will set the number to -1 in this version to differentiate an invalid state from 0.

Number move assignment operator

C++

Number& operator=(Number&& rhs)
{
  cout << "Move  Assignment Operator\n";
  m_value = rhs.m_value;
  rhs.m_value = -1;
  return *this;
}
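
Only the move assignment operator of Number is shown above. The following is a guess at what the complete class might look like, reconstructed from the description and from the messages in the output further down. The get member, the stream operator and demo_number are assumptions made for this sketch, not the author's code.

```cpp
#include <cassert>
#include <iostream>
#include <utility>

// Hypothetical reconstruction of the Number demonstration class.
class Number
{
public:
  Number(int value) : m_value(value)
  { std::cout << "Value Constructor\n"; }

  Number(const Number& rhs) : m_value(rhs.m_value)
  { std::cout << "Copy  Constructor\n"; }

  Number(Number&& rhs) : m_value(rhs.m_value)
  {
    std::cout << "Move  Constructor\n";
    rhs.m_value = -1;  // mark the source as moved-from
  }

  Number& operator=(int value)
  {
    std::cout << "Value Assignment Operator\n";
    m_value = value;
    return *this;
  }

  Number& operator=(const Number& rhs)
  {
    std::cout << "Copy  Assignment Operator\n";
    m_value = rhs.m_value;
    return *this;
  }

  Number& operator=(Number&& rhs)
  {
    std::cout << "Move  Assignment Operator\n";
    m_value = rhs.m_value;
    rhs.m_value = -1;  // mark the source as moved-from
    return *this;
  }

  int get() const { return m_value; }

private:
  int m_value;
};

// Assumed helper so that a Number can be streamed.
std::ostream& operator<<(std::ostream& os, const Number& n)
{
  return os << n.get();
}

void demo_number()
{
  Number value(100);
  Number moved(std::move(value));
  assert(moved.get() == 100);
  assert(value.get() == -1);  // the sentinel set by the move constructor
}
```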

Move Example

C++

int main(int argc, char* argv[])
{
  std::cout << "Construct three Numbers:\n";
  Number value(100);
  Number copied(value);
  Number moved(std::move(value));
 
  std::cout << "\nvalue:  " << value
            << "\ncopied: " << copied
            << "\nmoved:  " << moved;
  std::cout << "\n\nCopy and move:\n";
  value  = 202;
  moved = std::move(value);
  copied = value;
  std::cout << "\nvalue:  " << value
            << "\nmoved:  " << moved
            << "\ncopied: " << copied;
}

Output

Construct three Numbers:
Value Constructor
Copy  Constructor
Move  Constructor

value:  -1
copied: 100
moved:  100

Copy and move:
Value Assignment Operator
Move  Assignment Operator
Copy  Assignment Operator

value:  -1
moved:  202
copied: -1

std::forward<perfect>("it!");

As the title of this section implies, there is a new function in the Standard Library called std::forward. However, it is important to understand why it exists, because it is designed to be used in a special situation. The situation is when you have a type-deduced function argument that you would like to move, also called forward, as the return value, or to another sub-routine.

In classic C++, the way to move function arguments through function calls is by using call-by-reference. This works, but it is inconvenient because you must make a decision on trade-offs. You either choose to give up flexibility on the type of arguments that can be used to call your function, or you must provide overloads to expand the range of argument types that can be used with your function call.

Can be called by lvalues, however, rvalues are excluded:

C++

template< typename T >
T& process(T& param);
 
int val = 5;
process(val);       // OK: lvalue
process(10);        // Error: Initial value to ref of
                    //        non-const must be lvalue
process(val + val); // Error: Initial value to ref of
                    //        non-const must be lvalue

Now supports rvalues, however, move semantics are no longer possible:

C++

template< typename T >
T const& process(T const& param);
 
int val = 5;
process(val);       // OK: lvalue
process(10);        // OK: Temporary object is constructed
process(val + val); // OK: Temporary object is constructed
 
                    // However, none of these instances
                    // can participate in move operations.

Furthermore, if the two solutions above are combined with overloads and the function in question takes multiple arguments, the number of overloads required to cover every combination of argument kinds grows exponentially (two choices per parameter, so 2^n overloads for n parameters), which does not scale. With the addition of the rvalue reference in Modern C++, it is possible to compact this solution back into a single function, and rvalues will remain as viable candidates for move operations.

Where does std::forward apply in this situation?

Since lvalue expressions are already capable of passing through function calls in this situation, we actually want to avoid applying move semantics on these arguments because we could cause unexpected side-effects, such as moving local parameters.

But, in order to make rvalues capable of using the move operations, we need to indicate this to the compiler with something like std::move. std::forward provides a conditional cast: it casts its argument to an rvalue reference only if the argument was originally an rvalue. Once again the rules of reference collapsing are used to build this construct, except in a slightly different way.

Implementation of std::forward:

C++

template< class T >
T&& forward( typename std::remove_reference<T>::type& t )
{
  return static_cast<T&&>(t);
}

What is a forwarding/universal reference?

A forwarding reference is a function parameter declared as T&&, where T is a template parameter deduced for that call. Although it is spelled like an rvalue reference, it can bind to both lvalues and rvalues. Forwarding reference is the name that I have read in some standards proposal documents, and universal reference is the name that Scott Meyers uses in "Effective Modern C++".

It is important to identify this type when the situation occurs, because the type deduction rules for this particular reference allow the type to become an lvalue reference if an lvalue expression was used to initialize the template parameter. Remember the important type-deduction rules I pointed out above? References are not usually considered as part of type-deduction. This is the one exception.

If an rvalue expression is used to initialize the template parameter, then the type becomes an rvalue reference, which will make it qualify for move operations. Therefore, std::forward should be used to inform the compiler that you want a move operation to be performed on this type if an rvalue was used to initialize the parameter.
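
The deduction rule and the effect of std::forward can be seen together in a small sketch. Received, sink, relay, relay_without_forward and deduced_as_lvalue_ref are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical tag that reports which overload received the argument.
enum class Received { Lvalue, Rvalue };

Received sink(const std::string&) { return Received::Lvalue; }
Received sink(std::string&&)      { return Received::Rvalue; }

// T&& with a deduced T is a forwarding reference.
template <typename T>
Received relay(T&& param)
{
  // Casts to an rvalue only if the caller passed an rvalue.
  return sink(std::forward<T>(param));
}

template <typename T>
Received relay_without_forward(T&& param)
{
  // param has a name, so it is always an lvalue expression here.
  return sink(param);
}

// Reports whether T was deduced as an lvalue reference.
template <typename T>
bool deduced_as_lvalue_ref(T&&)
{
  return std::is_lvalue_reference<T>::value;
}

void demo_forwarding()
{
  std::string text("abc");

  assert(deduced_as_lvalue_ref(text));                 // T = std::string&
  assert(!deduced_as_lvalue_ref(std::string("tmp")));  // T = std::string

  assert(relay(text) == Received::Lvalue);
  assert(relay(std::string("tmp")) == Received::Rvalue);

  // Without std::forward the rvalue-ness is lost on the way.
  assert(relay_without_forward(std::string("tmp")) == Received::Lvalue);
}
```

A single relay template replaces the pair of overloads that classic C++ would need for each parameter.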

Observe forward vs. move

This next program allows you to observe the side-effects that could occur by using std::move when std::forward is most likely what was intended. This program is adapted from Item 25 in "Effective Modern C++". I have expanded the program to provide two methods to set the name of the test class, Item.

Item class

The class has a single value called name. name is set equal to "no name" in the default constructor. The name can be set in two different ways:

Item::fwd_name:
Sets the name value with an input rvalue reference string, which is forwarded to the internal storage of name in the class.

C++
```
template< typename T >
  void Item::fwd_name(T&& n)
  {
    m_name = std::forward<T>(n);
  }
```
Item::move_name:
Sets the name value with an input rvalue reference string, which is moved to the internal storage of name in the class.

C++
```
template< typename T >
  void Item::move_name(T&& n)
  {
    m_name = std::move(n);
  }
```

Notice that both of these functions are function templates. This is the differentiating factor that makes std::forward necessary: the argument being moved is type-deduced.

Forward Example

C++

int main(int argc, char* argv[])
{
  std::cout << "Forward 'only' moves rvalues:\n\n";
  string fwd("Forward Text");
 
  Item fwd_item;
  fwd_item.fwd_name(fwd);
  cout << "fwd_name:   " << fwd_item.name() << "\n";
  cout << "fwd(local): " << fwd << "\n";
 
  std::cout << "\nMove 'always' moves:\n\n";
  string mv("Move Text");
 
  Item move_item;
  move_item.move_name(mv);
  cout << "move_name:  " << move_item.name() << "\n";
  cout << "mv(local):  " << mv << "\n";
 
  return 0;
}

Output

Forward 'only' moves rvalues:

fwd_name:   Forward Text
fwd(local): Forward Text

Move 'always' moves:

move_name:  Move Text
mv(local):  no name

Move semantics and exceptions

A new keyword has been added to Modern C++ regarding exception handling: noexcept. The compiler can use this information provided by the programmer to enable certain optimizations for non-throwing functions. One potential optimization is to not generate stack unwinding logic for the functions that specify noexcept.

It is possible to test if a function specifies noexcept by using the noexcept operator. This may be necessary if you can conditionally provide an optimization, but only if the functions you want to call are specified as non-throwing. The move operations in std::vector and std::swap are two good examples of functions that are conditionally optimized based on a non-throwing specification.

Finally, there is a function called std::move_if_noexcept that will conditionally obtain an rvalue reference if the move constructor of the type to be moved does not throw. Refer to Item 14 in "Effective Modern C++" for a thorough description of noexcept.
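
The interaction between noexcept and std::vector can be observed by counting copies and moves during a reallocation. In this sketch, Element, Safe, Risky, force_reallocation and demo_reallocation are hypothetical names; Element is parameterized on whether its move constructor is declared noexcept.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical element type that counts its copies and moves.
template <bool NoThrow>
struct Element
{
  static int copies;
  static int moves;

  Element() = default;
  Element(const Element&) { ++copies; }
  Element(Element&&) noexcept(NoThrow) { ++moves; }
};
template <bool NoThrow> int Element<NoThrow>::copies = 0;
template <bool NoThrow> int Element<NoThrow>::moves  = 0;

using Safe  = Element<true>;   // move constructor is noexcept
using Risky = Element<false>;  // move constructor may throw

static_assert(std::is_nothrow_move_constructible<Safe>::value,   "Safe moves cannot throw");
static_assert(!std::is_nothrow_move_constructible<Risky>::value, "Risky moves may throw");

// std::move_if_noexcept yields an rvalue only when moving is safe.
static_assert(std::is_same<decltype(std::move_if_noexcept(std::declval<Safe&>())),
                           Safe&&>::value, "moved");
static_assert(std::is_same<decltype(std::move_if_noexcept(std::declval<Risky&>())),
                           const Risky&>::value, "copied");

// Stores one element, then forces the vector to reallocate.
template <typename T>
void force_reallocation()
{
  std::vector<T> items;
  items.emplace_back();
  items.reserve(items.capacity() + 1);
}

void demo_reallocation()
{
  force_reallocation<Safe>();
  assert(Safe::moves == 1 && Safe::copies == 0);

  force_reallocation<Risky>();
  assert(Risky::copies == 1 && Risky::moves == 0);
}
```

The vector copies Risky elements because a throwing move in the middle of a reallocation could leave both the old and the new storage damaged, which would break its exception safety guarantee.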

What you need to know

This is the section that I promised earlier in the essay: the most important concepts that you need to take away and know. Not remember, know, if you are going to employ move semantics and perfect-forwarding when you practice Modern C++.

Remember the difference between a value category and a type:
Lvalue and rvalue are value categories that classify expressions. Lvalue reference and rvalue reference are both types; because the names are so similar, it is easy to confuse the two.
Rvalues are the only expression types valid for move operations:
std::move and std::forward explicitly attempt to convert arguments to rvalue references. This is performed by using the rvalue reference (type) in an rvalue expression, which is eligible for move operations.
If it has a name, then it is an lvalue (expression):
This is true even if the type of the lvalue is an rvalue reference. Refer to the previous guideline for the ramifications.
Use std::move to explicitly request a move operation:
std::move unconditionally makes the argument eligible for a move operation. I say request, because a move is not always possible, and the compiler may still elect to use a copy operation. After you use a value as an input to std::move, it is only valid to call the argument's destructor or to assign a new value to it.
Use std::forward to move type-deduced arguments in templates:
std::forward conditionally casts its argument to an rvalue reference and is only required to be used in this context. This is important because an lvalue that is inadvertently moved could have unintended and unsafe side-effects, such as moving local values.
There are additional rules for the special compiler generated functions for a class:
The move constructor and move assignment operator now can be generated automatically by the compiler if your class does not implement them. However, there are even stricter rules that dictate when these functions can be generated. Primarily, if you implement or delete any of the following functions, the compiler will not generate any unimplemented move operations:
- Destructor
- Copy Constructor
- Copy Assignment Operator
- Move Constructor
- Move Assignment Operator
Strive to create exception free move operations and specify noexcept for them:
Your move operations are more likely to be selected by the compiler if you can provide an exception-free move operation. noexcept tells the compiler that your function will not throw exceptions. Therefore, it can optimize the generated code further and not worry about maintaining code to unwind the stack.

Summary

Rvalue references were added to Modern C++ to help solve the problem of eliminating unnecessary temporary copies of expensive objects when possible. The addition of this reference type has made two new idioms possible with the language: move semantics and perfect-forwarding.

Many restrictions had to be put in place on when the compiler can safely perform these operations automatically. Moreover, it is important to use the functions added to the Standard Library because they express intent and provide the extra hints the compiler needs to utilize these operations safely.

Here is one last piece of advice as you look for locations to use move semantics. Do not try to outwit the compiler because it is already capable of some amazing optimizations such as the Return Value Optimization (RVO) for local objects returned by value. Practice the knowledge in the previous section and your C++ programs will keep that svelte contour envied by all other languages.

References

C++ Rvalue References Explained by Thomas Becker, http://thbecker.net/articles/rvalue_references/section_01.html, March 2013.

A Brief Introduction to Rvalue References by Howard E. Hinnant, Bjarne Stroustrup, and Bronek Kozicki, http://www.artima.com/cppsource/rvalue.html, March 10, 2008.

Value Categories at CppReference.com, http://en.cppreference.com/w/cpp/language/value_category, June 2015
