Introduction to TMPP
Templates can be used to perform compile-time calculations on data that is known, of course, at compile time. In some cases, this can be used to optimize the eventual run-time of a program. This technique is called template metaprogramming (TMP). Template metaprograms are often hard to comprehend, and I must warn you that this may well apply to many of the code-blocks in this article. I'm not even sure how any of the code presented here can be used in a practical application, but I wrote this to find something out for myself: "Is it possible to define a higher-level template metaprogramming meta-language?" It turns out that to a certain extent, this is possible and more importantly, that I managed to do it. Below is presented an example that illustrates my results, and if you are wondering how I did that, you might consider reading on!
A possible program could look like this:
#include <iostream>
#include "tmpp_function.h"
using namespace std;
template <int X, int Y>
using Foo = Function< ArgList<X, Y>,
Assign< Var<'x'>, Int<4> >, Assign< Var<'y'>, Int<5> >, Divide< Var<'x'>, Arg<0> >, Multiply< Var<'y'>, Arg<1> >,
Return< Add<Var<'x'>, Var<'y'>> > >;
int main()
{
Foo<2, 3>::Evaluate(); cout << Foo<2, 3>::evaluate << '\n'; }
What we've done now is program a little function or subroutine that is evaluated compile-time, using TMP. Therefore, I call this project/technique Template Metaprogramming Programming. It allows even inexperienced programmers to perform compile-time computations without doing any TMP on the surface.
The output (the resulting value of Var<'x'>
) can be presented in a compiler error, just to be able to get output from a program without even running it. On my own system (g++-4.7), it looks like this:
In file included from tmpp.cc:2:0:
tmpp_function.h: In instantiation of ‘struct Function<ArgList<2, 3>,
Assign<Var<120>, Int<4> >, Assign<Var<121>, Int<5> >,
Divide<Var<120>, Arg<0> >, Multiply<Var<121>, Arg<1> >,
Return<Add<Var<120>, Var<121> > > >::Evaluate’:
tmpp.cc:17:25: required from here
tmpp_function.h:85:9: error: static assertion failed:
The output of your TMPP program will be shown below in Output<...>:
tmpp_function.h:87:26: error: ‘Function<Body>::Evaluate::error’ has incomplete type
In file included from tmpp_operators.h:4:0,
from tmpp_function.h:4,
from tmpp.cc:2:
tmppfwd.h:43:8: error: declaration of ‘struct Output<17>’
Of course, the result can also be retrieved at runtime as I did on the second line of main
. In this case, it is sent to the standard output, showing 17
.
Using the Code
The code in this introduction makes use of a C++11 feature known as template typedefs. This allows you to use the function body multiple times, giving it a nice descriptive name (not Foo
!), while passing different (compile-time) parameters each time you use it. It actually looks like a real function now!
Formatting Conventions
I have not really researched if there exist any formatting conventions when it comes to templates and in particular template metaprogramming (TMP), or what happens to these conventions when you start to nest lots of templates within one another. When it comes to whitespace, I just did what seemed most convenient and readable at the time, not conforming to any convention at all. Regarding variable and type-names, I used the same convention I always use: capitalized CamelCase
for types, uncapitalized snake_case
for variables. One exception to this might be that I also used the CamelCase
for all template non-type parameters (which could be considered 'variables'), to emphasize that they are, in fact, template parameters. Another choice was whether to use enum
values for type-specific data (a crucial part in TMP) or static int const
values. For no apparent reason, other than the fact that this is just the way I was introduced to TMP myself, I have chosen to use enum
s.
Background
For those not very familiar with TMP, I will list some techniques I used to realize my 'language'. These examples will also illustrate some ideas behind TMP.
Value to Type Conversion
Templates were designed to write generic code that can be used with many different types. However, templates can also take value parameters, or template non-type parameters. This can be used to convert a value to a type:
template <int Value>
struct Int
{
enum { value = Value };
};
The value can be stored in the struct
within an enum
, to be retrieved later if necessary. We have now realized multiple different types, depending on the value of the int
. This can be used to specialize class-templates, based on the integral value, for example.
Type Equality
To test if two types are equal, we use partial specialization, and let the compiler figure out which one to instantiate, based on the two types. When the types happen to be equal, it will pick the specialization for which the enum
-value is true
:
template <typename, typename>
struct Equal
{
enum { value = false };
};
template <typename T>
struct Equal<T, T>
{
enum { value = true };
};
Conditional typedef
It is not uncommon to feel the need to do something like this:
typedef (condition ? Type1 : Type2) MyType;
This of course is not possible, but we can accomplish the same with some TMP using again partial specialization. As we will see, most TMP techniques rely on (partial) specialization of class-templates:
template <bool Condition, typename TrueType, typename FalseType>
struct ConditionalType
{
typedef TrueType Type;
};
template <typename TrueType, typename FalseType>
struct ConditionalType<false, TrueType, FalseType>
{
typedef FalseType Type;
};
Variadic Templates, Dealing with the Pack
Often, when dealing with variadic templates, you need to know how many types are contained in the pack, or you need the Nth type. The code below shows how this can be achieved:
template <typename ... Pack>
struct Variadic
{
template <typename N, typename First, typename ... Rest>
struct Get_
{
enum { value = Get_<N - 1, Rest ...>::value };
};
template <typename First, typename ... Rest>
struct Get_<0, First, Rest ...>
{
enum { value = First };
};
template <typename N>
struct Get
{
static_assert(N < number_of_arguments, "Index exceeds number of arguments");
enum { value = Get_<N, Pack ...>::value };
};
enum { pack_size = sizeof ... (Pack) };
};
We can now use Get<2>::Type
to get the 3rd type in the pack (index starts at 0). To obtain the number of elements contained in the pack, I used the new C++11 syntax (correct me if I'm wrong on this being new and C++11) of the sizeof
operator.
The Internals
Now let's get to business. I will start explaining the code top-down, starting at Function
which is the type you actually see when using the code as shown in the introduction. Along the way, we will see many different types and techniques which I will try to explain as well as I possibly can. The only thing left for me to do afterwards is pray that anyone actually bothers to read it...
Function
The Function
class is what the end-user will actually use, and should provide the means to define the input-parameters, a body, and some way to evaluate the body based on its input. The declaration therefore simply looks like this:
template <typename ... Body>
struct Function;
As you can see, the argument-list (ArgList
) is contained within the body, which it doesn't have to be. Actually it would probably be even more convenient to separate it, but this is just how it grew in the project, so no big deal. However, we must now manually assert that the body starts with an ArgList
, and ends with a Return
statement. To do so, we define the following enumeration for convenience:
enum
{
number_of_statements = sizeof ... (Body),
return_index = number_of_statements - 1
};
and use the method described above on how to fetch the Nth type from a variadic typelist. In this case, we rename the Get
struct to GetStatement
, as each type in our body (except the first one) will be some sort of meta-statement. To make sure that the body meets the requirements regarding a VarList
and Return
statement, we use static_assert
:
enum
{
arglist_check = IsArgList<typename GetStatement<0>::Type>::value,
return_check = IsReturnStatement<typename GetStatement<return_index>::Type>::value
};
static_assert(arglist_check, "First statement in body should be an argument-list");
static_assert(return_check, "Last statement in body should be a return statement");
Here, IsArgList
and IsReturnStatement
are defined like this:
template <typename T>
struct IsReturnStatement
{
enum { value = static_cast<OperatorID>(T::id) == RETURN };
};
template <typename T>
struct IsArgList
{
enum { value = static_cast<OperatorID>(T::id) == ARGLIST };
};
Again, I had to apply some trick in order to make this work. As you will soon see, each meta-operator (like Add
, Return
, etc.) has a field called id
. We can use this field to find out what kind of base-type the template has. In this context, a base-type has nothing to do with inheritance and polymorphism. I'm not sure if this is accepted as a term, but I will use it to indicate the template-template type of a type. For example, the base-type of Add<Var<'x'>, Int<5>>
is Add
. This will later be used to specialize based on the base-type instead of the complete type. In the implementation of IsReturnStatement
and IsArgList
it is used in a somewhat simpler, but convenient way.
So far, we have only established a means to assert that the body has the correct structure. Now we need to parse the statements and come up with a way to calculate the final value. This is the heart of the entire project and will be explained below.
Parsing the body
Parsing the array of statements boils down to merging all the statements into one, big, final, statement (which is of course a type). The idea might be best explained through example. Let's assume we have these two statements:
Add<Var<'x'>, Int<1>>, Add<Var<'y'>, Var<'x'>>
These two statements can be condensed into one:
Add<Var<'y'>, Add<Var<'x'>, Int<1>>>
where we have just substituted Var<'x'>
in the second statement with the entire first statement.
This was easy, but what if we need to replace the Var<'x'>
in the above result again? It is nested one level deeper this time, and you might already feel what will have to happen next: recursion. The entire process for the program from the introduction is as follows:
Assign< Var<'x'>, Int<4> >, Assign< Var<'y'>, Int<5> >, Divide< Var<'x'>, Arg<0> >, Multiply< Var<'y'>, Arg<1> >, Return< Add<Var<'x'>, Var<'y'>> >
Add<Var<'x'>, Var<'y'>>
Add<Var<'x'>, Multiply< Var<'y'>, Arg<1> >>
Add<Divide<Var<'x'>, Arg<0>>, Multiply< Var<'y'>, Arg<1>>>
Add<Divide<Var<'x'>, Arg<0>>, Multiply< Int<5>, Arg<1>>>
Add<Divide<Int<4>, Arg<0>>, Multiply< Int<5>, Arg<1>>>
Add<Divide<Int<4>, Int<2>>, Multiply< Int<5>, Int<3>>>
This entire process is left to be handled by Substitute
which is 'called' (actually instantiated, but I think it's nicer to talk about template instantiation as if they were functions that are being called) by the Function
's Parse
member:
template <int Index, typename Statement>
struct Parse
{
typedef typename GetStatement<Index>::Type NextStatement;
typedef typename Substitute<Statement, NextStatement>::Result Result_;
typedef typename Parse<Index - 1, Result_>::Result Result;
};
template <typename Statement>
struct Parse<0, Statement>
{
typedef typename SubstituteArgs<typename Statement::ReturnType,
typename GetStatement<0>::Type>::Result Result;
};
typedef typename Parse<return_index - 1,
typename GetStatement<return_index>::Type>::Result ResultingType;
The first int
parameter (Index
) to Parse
is the index of the next statement to be parsed, whereas the second parameter Statement
is the current statement. To parse the body, we have to start at the final statement (the Return
statement) and move up all the way to the statement with index 0 (which was the ArgList
, remember?), hence the specialization for Index == 0
. The final resulting type is 'stored' in ResultingType
.
Three things happen in Parse
:
- The next statement is fetched,
- The next statement is substituted in the current one, and
Parse
is recursively called on, again, the next statement.
This continues all the way up to the statement at index 0, where we enter the specialization. Here, instead of calling Parse
again, we only need to process the arguments in SubstituteArgs
. This is because at this stage, the references to arguments in the body are still unresolved. SubstituteArgs
will substitute all types of the form Arg<...>
with their corresponding value in the ArgList
. After this, the recursion is terminated and the result can be fetched from Parse<...,...>::Result
.
The final elements that need to be added are the means to retrieve a result. For runtime purposes, I just add another enum
-value:
enum { evaluate = ResultingType::value };
For a compile-time error that displays the value, I declare a struct
-template Output
without defining it:
template <int>
struct Output;
Now, in another struct
called Evaluate
, I try to instantiate an object of the type Output<evaluate>
, which can't be done, resulting in a compiler error (the one from the introduction).
struct Evaluate
{
static_assert(!arglist_check && !return_check,
"The output of your TMPP program will be shown below in Output<...>:");
Output<evaluate> output;
};
Substitute
The Substitute
struct
is where the real magic happens, and it does so in a 3-stage process. In order to make the substitution work as illustrated in the example above, we need to specialize on the base-type. After all, when the target statement is Add<... , ...>
, the resulting type must also be of base-type Add
, even though its template parameters might be completely different after substitution. To extract the basetype, we first inspect its id
field:
template <typename Current, typename Next>
struct Substitute
{
typedef typename Substitute2<static_cast<OperatorID>(Current::id), Current, Next>::Result Result;
};
This id
is then passed on to the second stage: Substitute2
, which is specialized for each possible id
.
template <OperatorID ID, typename Current, typename Next>
struct Substitute2
{
typedef void Result;
};
template <typename Current, typename Next>
struct Substitute2<VAR, Current, Next>
{
typedef Next Result;
};
template <typename Current, typename Next>
struct Substitute2<ARG, Current, Next>
{
typedef Next Result;
};
template <typename Current, typename Next>
struct Substitute2<ASSIGN, Current, Next>
{
typedef typename Substitute3<Assign, Current, Next>::Result Result;
};
template <typename Current, typename Next>
struct Substitute2<ADD, Current, Next>
{
typedef typename Substitute3<Add, Current, Next>::Result Result;
};
template <typename Current, typename Next>
struct Substitute2<RETURN, Current, Next>
{
typedef typename Substitute<typename Current::ReturnType, Next>::Result Result;
};
Most of these specializations call the third stage, which is Substitute3
. This accepts a template-template parameter which is used in the final method of substitution. Two exceptions are those for Var
and Arg
, at which the recursion ends. When the statement-tree is parsed all the way to its final leaves. The thing that has to be substituted, Next
is defined as the Result
in both cases. Another exception is the specialization for Return
, in which case Substitute
is called for the ReturnType
to get rid of the encapsulating Return
.
Now let's have a look at what happens in the third and final stage, Substitute3
. But before we do, let's first think about what it has to do... Looking at the example above a little more closely, we find that the thing that is being substituted (Next
) has to match somehow with one or both parameters the target statement. In our case, the upper statement Add<Var<'x'>, Int<1>>
needs to replace only the right-hand-side (RHS) of the lower statement Add<Var<'y'>, Var<'x'>>
, because they match in the variable being modified: x
. To provide a method to check which parts match and thus have to be substituted, we make sure that every operator has a name
field. Moreover, each operator should provide information on what to actually substitute. For most operators it is the operator itself, but for the assignment-operator it is only its RHS operand. This information is 'stored' in the ReturnType
definition:
template <int Name>
struct Var
{
static_assert(Name < 0xf0, "Numeric representation of variable-name must be smaller than 0xf0");
typedef Var<Name> LHS;
typedef Var<Name> RHS;
typedef Var<Name> ReturnType;
enum
{
name = Name,
value = 1,
id = VAR
};
};
template <int Val>
struct Int
{
typedef Int<Val> LHS;
typedef Int<Val> RHS;
typedef Int<Val> ReturnType;
enum
{
name = 0xff, value = Val,
id = INT
};
};
template <int Index>
struct Arg
{
typedef Arg<Index> LHS;
typedef Arg<Index> RHS;
typedef Arg<Index> ReturnType;
enum
{
name = 0xf0 + Index,
value = 1,
id = ARG
};
};
template <typename T>
struct Return
{
typedef T ReturnType;
enum
{
name = T::name,
id = RETURN,
value = T::value
};
};
template <typename L, typename R>
struct Assign
{
typedef L LHS;
typedef R RHS;
typedef R ReturnType;
enum
{
name = L::name,
id = ASSIGN,
value = R::value
};
};
template <typename L, typename R>
struct Add
{
typedef L LHS;
typedef R RHS;
typedef Add<L, R> ReturnType;
enum
{
name = L::name,
id = ADD,
value = L::value + R::value
};
};
This way, Add
&co will have the same nametag as the most nested variable in their left-hand-side (LHS). Matching names can now be detected and it should now be clear which of the operands to substitute, if any. However, we are only supposed to substitute single variables like Var<'x'>
, not entire statements like Add<... , ...>
. To avoid this from happening, I introduce a new term to the equation: atomics. We will call a type atomic when it can't be subdivided anymore. For example, Add<Var<'x'>, Int<1>>
is not atomic, whereas Var<'x'>
and Int<1>
are. Now, to determine whether a type is atomic, I'll use the IsAtomic
struct
:
template <typename T>
struct IsAtomic
{
enum { value = false };
};
template <>
template <int Value>
struct IsAtomic<Int<Value>>
{
enum { value = true };
};
template <>
template <int Name>
struct IsAtomic<Var<Name>>
{
enum { value = true };
};
template <>
template <int Index>
struct IsAtomic<Arg<Index>>
{
enum { value = true };
};
We now have the tools available to define Substitute3
:
template <template <typename ...> class Operator, typename Current, typename Next>
struct Substitute3
{
enum
{
next_name = static_cast<int>(Next::name),
lhs_name = static_cast<int>(Current::LHS::name),
rhs_name = static_cast<int>(Current::RHS::name),
lhs_name_match = (lhs_name != 0xff) && (lhs_name == next_name),
rhs_name_match = (rhs_name != 0xff) && (rhs_name == next_name),
lhs_done = static_cast<bool>(IsAtomic<typename Current::LHS>::value),
rhs_done = static_cast<bool>(IsAtomic<typename Current::RHS>::value)
};
typedef typename ConditionalType <
lhs_done && rhs_done,
typename ConditionalType <
lhs_name_match && rhs_name_match, Operator<typename Next::ReturnType, typename Next::ReturnType>,
typename ConditionalType <
lhs_name_match && !rhs_name_match, Operator<typename Next::ReturnType, typename Current::RHS>,
typename ConditionalType <
!lhs_name_match && rhs_name_match, Operator<typename Current::LHS, typename Next::ReturnType>,
Current >::Type
>::Type
>::Type,
typename ConditionalType <
lhs_done && !rhs_done,
typename ConditionalType <
lhs_name_match,
Operator<typename Next::ReturnType,
typename Substitute<typename Current::RHS, Next>::Result>,
Operator<typename Current::LHS,
typename Substitute<typename Current::RHS, Next>::Result>
>::Type,
typename ConditionalType <
!lhs_done && rhs_done,
typename ConditionalType <
rhs_name_match,
Operator<typename Substitute
<typename Current::LHS, Next>::Result, typename Next::ReturnType>,
Operator<typename Substitute
<typename Current::LHS, Next>::Result, typename Current::RHS>
>::Type,
Operator< typename Substitute<typename Current::LHS, Next>::Result,
typename Substitute<typename Current::RHS, Next>::Result >
>::Type
>::Type
>::Type Result;
};
It's getting more interesting now, right? Let's start at the top and work our way down...
The enumeration will perform some checks as to which names match the one of the replacement type (Next
) by extracting all names, and comparing them. When the name is that of an anonymous value, like Int<4>
, it will never match the names. Next, we determine if we are done with the recursion by checking if the operands are atomic yet. If so, we can directly replace the atomic operand with the replacement Next::ReturnType
. If not, we pass the operand recursively to Substitute
, as is done in the second part of Substitute3
.
The second part should be read as a large if-else
ladder, where I made use of the ConditionalType
struct
. Written down in non-C++ pseudo-code, it looks something like this (highlights match those in the real code):
if (lhs_done && rhs_done)
if (lhs_name_match && rhs_name_match)
Result = Operator(Next::ReturnType, Next::ReturnType);
else
if (lhs_name_match && !rhs_name_match)
Result = Operator(Next::ReturnType, Current::RHS);
else
if (!lhs_name_match && rhs_name_match)
Result = Operator(Current::LHS, Next::ReturnType);
else
Result = Current;
else
if (lhs_done && !rhs_done)
if (lhs_name_match)
Result = Operator(Next::ReturnType, Substitute(Current::RHS, Next);
else
Result = Operator(Current::LHS, Substitute(Current::RHS, Next);
else
if (!lhs_done && rhs_done)
if (rhs_name_match)
Result = Operator(Substitute(Current::LHS, Next), Next::ReturnType);
else
Result = Operator(Substitute(Current::LHS, Next), Current::RHS);
else
Result = Operator(Substitute(Current::LHS, Next), Substitute(Current::RHS, Next));
If both operands are done, i.e. atomic (upper ladder), the recursion ends here and the Return
type has been determined by substituting Next::ReturnType
for the operands with matching names. If either of the operands is not yet done (lower ladder), it has to be determined which of the operands has to be passed into the recursion. This should be the operand for which the name matches that of Next
. The final else
clause is reached when none of the operands is atomic yet, in which case both have to be passed into the recursion.
SubstituteArgs
The SubstituteArgs
was already briefly mentioned in the description of Parse
as being the third step of the parsing procedure. Its job is to 'iterate' over all arguments in the ArgList
and replace every reference to an argument in the resulting body (which has already gone through the whole process of substitution) with the corresponding Int
value. Of course, iteration is not possible so we have to do it through recursion again, requiring yet another level of indirection: SubstituteArgs2
:
template <typename Body, typename ArgList>
struct SubstituteArgs
{
typedef typename SubstituteArgs2<ArgList::number_of_arguments - 1, Body, ArgList>::Result Result;
};
template <int Index, typename Body, typename ArgList>
struct SubstituteArgs2
{
typedef Int<ArgList::template Get<Index>::value> Replacement;
typedef typename Substitute<Body, Assign<Arg<Index>, Replacement>>::Result Result_;
typedef typename SubstituteArgs2<Index - 1, Result_, ArgList>::Result Result;
};
template <typename Body, typename ArgList>
struct SubstituteArgs2<0, Body, ArgList>
{
typedef Int<ArgList::template Get<0>::value> Replacement;
typedef typename Substitute<Body, Assign<Arg<0>, Replacement>>::Result Result;
};
SubstituteArgs2
is specialized for Index == 0
, meaning that we are dealing with the final argument. The non-specialized version creates an Int
with the proper int
value, retrieved from the ArgList
(which won't be discussed explicitly, as its implementation is almost identical to the example in Variadic templates, dealing with the pack) through its Get
member. It then calls Substitute
to perform the actual substitution and calls itself recursively until all arguments have been substituted.
The End
We have now ourselves recursed through the entire tree of structures used in this little project. The
Parse
member can now properly do its work and a single
ResultingType
is constructed from the body of separate statements. The only thing left to do is to compute a value, but this is almost trivial after all the hard work we've already been through. Each of the operators has a
value
-field that computes its value based on the
value
fields of its operands. The recursion (yes, again) ends at the atomic
Int
s which define their value as the actual integer value with which they were instantiated. It can occur that there are still
Var
s present in the
ResultingType
(when they are never assigned to). Therefore I have chosen for a default value of 1 for unassigned variables. This value cannot be 0, for this would cause problems when trying to instantiate for example
Divide<Var<'x'>, Var<'y'>>
, resulting in a compiler error telling you that you try to divide by 0 (same for
Modulo
).
That's it, we're done. If you have questions, I'm happy to respond. Also, does anyone know of a practical application where this could be used?
Thanks for reading!
History
- 7 Sept. 2013: Initial draft.
- 9 Sept. 2013: Fixed download link, moved article to C/C++ Language -> Templates.
- 9 Sept. 2013: Fixed typo, removed superfluous specialization from Parse.
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.