Slightly Less Costly but Much More Usable C++ Reflection with Singular Inheritance Rule

9 Jun 2019, CPOL, 9 min read
Template code for basic reflection functionality (e.g., dynamic casting, instance type comparison, class name) with a C#-like singular inheritance rule (a.k.a. the one-base-multiple-interface rule)

Introduction

Reflection is a mechanism that lets programmers use class type information dynamically at runtime. Notable examples of reflection include dynamic typecasting and class name inference, both of which standard C++ already provides (dynamic_cast and typeid().name(), respectively).

However, the current C++ reflection features are either costly performance-wise or unfriendly to humans. Dynamic casting, for example, is often avoided by programmers who seek higher performance, and typeid().name() returns an implementation-defined string, which on GCC-compatible compilers is a mangled name that ordinary programmers can barely read. GCC-compatible toolchains do provide a way to demangle such a string (abi::__cxa_demangle), but it is far from handy for everyday use.
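To make the usability complaint concrete, here is a minimal sketch of what demangling involves on GCC or Clang. It assumes the Itanium C++ ABI header <cxxabi.h> is available (it is not part of standard C++); Namesp::Widget and demangled_name() are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>   // GCC/Clang specific, not standard C++
#include <memory>
#include <string>
#include <typeinfo>

namespace Namesp {
struct Widget { virtual ~Widget() = default; };  // hypothetical example class
}

// Returns a human-readable type name, falling back to the raw
// typeid().name() string when demangling fails.
std::string demangled_name(const std::type_info& ti) {
  int status = 0;
  // __cxa_demangle returns a malloc()-allocated buffer that the caller must free.
  std::unique_ptr<char, void (*)(void*)> buf(
      abi::__cxa_demangle(ti.name(), nullptr, nullptr, &status), std::free);
  return (status == 0 && buf) ? std::string(buf.get()) : std::string(ti.name());
}
```

With such a helper, demangled_name(typeid(Namesp::Widget)) yields a readable name, whereas typeid(Namesp::Widget).name() alone returns something like N6Namesp6WidgetE on these compilers. The manual buffer management and the extra header are the inconvenience the paragraph above refers to.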

This article presents a code template that implements the basic reflection features at a slightly lower performance cost and with much better usability. The template employs a C#-like singular inheritance rule to implement dynamic casting without the standard RTTI feature, and uses the de facto standard __PRETTY_FUNCTION__ macro to provide the name information of classes. In a dynamic casting microbenchmark, this code template performs about 30% better than the standard dynamic_cast, while providing the fundamental reflection features in a usable way.

Using the Code

To use this template, you need to do two things:

  • Include object.h at the top of your source code.
  • Annotate your custom class with using reflection.

Class Declaration

First, annotate each class that should use the reflection features with using reflection.

C++
class Base {
  using reflection;
  /* ... */
};

Second, inherit (or extend) a base class using the extend() macro. The derived classes also need to be annotated with using reflection.

C++
class Derived1 : extend(Base) {
  using reflection;
  /* ... */
};

Finally, implement an interface class (pure abstract class) using the implement() macro.

C++
class Derived1_1 : extend(Derived1), implement(IBooable) {
  using reflection;
  /* ... */
};

Below are some important points when declaring reflection-enabled classes:

  • It is highly recommended to place using reflection right below the class head (e.g., right below the line class Base { in the first code snippet).
  • The access specifier in effect below the using reflection annotation is private by default.
  • This code template assumes the singular inheritance rule: any class can extend only one base class but can implement multiple interfaces.

Accessing Base

Since a given class has only one base class, there is no need to write the name of the base class whenever accessing it. Instead, this code template provides the keyword base for all base class accesses.

  • A constructor in a derived class can initialize the base class using the base keyword instead of the base class' name.
    C++
    class Derived1 : extend(Base) {
      using reflection;
    public:                            //< Members are private by default after the annotation.
      Derived1(int x) {}
      /* ... */
    };
    
    class Derived1_1 : extend(Derived1), implement(IBooable) {
      using reflection;
    public:
      Derived1_1() : base(10) {}       //< Use 'base' instead of 'Derived1'.
      /* ... */
    };
  • When you have to access the base class' function, again use the keyword base to access the base.
    C++
    int Derived1_1::foo() {
      base::foo();                     //< Use 'base' instead of the base's name.
      return 0;
    }
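One way such a base keyword could be provided is through a type alias injected by the wrapper class that sits on the inheritance edge. The sketch below is an assumption about the mechanism, not the template's actual code; Extended, Animal, Dog, and Puppy are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the wrapper inserted by extend().
template <typename PARENT>
class Extended : public PARENT {
public:
  using base = Extended;  // derived classes find this alias by the name 'base'
  template <typename... ARGS>
  Extended(ARGS... args) : PARENT(args...) {}
};

class Animal {
public:
  explicit Animal(int legs) : legs_(legs) {}
  virtual ~Animal() = default;
  virtual std::string describe() const {
    return "animal(" + std::to_string(legs_) + ")";
  }
private:
  int legs_;
};

class Dog : public Extended<Animal> {
public:
  Dog() : base(4) {}  // initializes the wrapper, which forwards to Animal(4)
  std::string describe() const override { return base::describe() + ">dog"; }
};

class Puppy : public Extended<Dog> {
public:
  // Here 'base' resolves to Extended<Dog>, which hides the alias inherited by Dog.
  std::string describe() const override { return base::describe() + ">puppy"; }
};
```

Because each wrapper declares its own base alias, name lookup always finds the alias of the nearest edge, so base refers to the directly extended class at every level of the hierarchy.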

Dynamic Typecasting

This code template adopts the widely used dynamic casting interface of the LLVM project, which consists of three fundamental functions: dyn_cast<>, cast<>, and isa<>.

  • dyn_cast<> is the counterpart of the standard dynamic_cast<> operator. The only difference is that you pass the class name without the asterisk. For example, the operation dynamic_cast<SomeClassName *> is equivalent to dyn_cast<SomeClassName>. Note that there is no asterisk after the class name in the template bracket.
  • cast<> is almost the same as dyn_cast<>, but it crashes the program when the given instance cannot be cast to the specified type. This saves some typing when you need to assert the type of an instance.
  • isa<> returns true if the instance is castable to the specified type, rather than returning a cast pointer. This is handy if you only need to check the type.

Below is a short usage example of these three casting interfaces.

C++
int main() {
  Base *base = new Derived1_1();

  Derived1_1 *derived = dyn_cast<Derived1_1>(base);   //< derived now points to the same object as base.

  cast<Derived1>(base);        //< Nothing happens, since a Derived1_1 instance
                               //  is also Derived1 in the inheritance tree.

  std::cout << isa<Derived1_1>(base) << std::endl;    //< prints '1' (true).

  return 0;
}
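For readers unfamiliar with the LLVM-style interface, the following sketch shows how the three functions can be layered on a single virtual type test. It is a simplified illustration under stated assumptions: the class IDs are assigned by hand and the virtual function is named is(), whereas the template derives IDs from the declaration site and uses __is__(). Shape, Circle, and Square are hypothetical classes.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

using clshash_t = std::uint64_t;

struct Shape {
  static constexpr clshash_t CLASSID = 1;  // hand-assigned ID (assumption)
  virtual ~Shape() = default;
  virtual bool is(clshash_t id) const { return id == CLASSID; }
};

struct Circle : Shape {
  static constexpr clshash_t CLASSID = 2;
  bool is(clshash_t id) const override { return id == CLASSID || Shape::is(id); }
};

struct Square : Shape {
  static constexpr clshash_t CLASSID = 3;
  bool is(clshash_t id) const override { return id == CLASSID || Shape::is(id); }
};

// isa<>: only answers whether the instance is castable to TO.
template <typename TO, typename FROM>
bool isa(const FROM* p) { return p != nullptr && p->is(TO::CLASSID); }

// dyn_cast<>: returns nullptr when the cast is not possible.
template <typename TO, typename FROM>
TO* dyn_cast(FROM* p) { return isa<TO>(p) ? static_cast<TO*>(p) : nullptr; }

// cast<>: aborts the program when the cast is not possible.
template <typename TO, typename FROM>
TO* cast(FROM* p) {
  if (!isa<TO>(p)) std::abort();
  return static_cast<TO*>(p);
}
```

The only virtual call is the type test; the pointer conversion itself is a static_cast, which is valid in this sketch because the hierarchy uses plain (non-virtual) single inheritance.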

Class Name Inference

A reflection-enabled class provides a getType() member function and a public static member Type. Both yield the same Type structure, which holds the class name as well as the fully-qualified name in string form. A Type structure has two members:

  • clsname is the name of the class type itself.
  • fullname is a fully-qualified class name, containing the namespace names it is nested in.

Below is a simple usage example.

C++
int main() {
  Namesp::Base *base = new Namesp::Derived1_1();     //< Assume they are in Namesp.
  std::cout << base->getType().clsname << std::endl;  //< prints 'Derived1_1'.
  std::cout << base->getType().fullname << std::endl; //< prints 'Namesp::Derived1_1'.
  std::cout << Namesp::Base::Type.clsname << '\n';   //< prints 'Base'.
  return 0;
}

Dynamic Type Comparison

The Type structure not only holds the class name information; two Type structures can also be compared with each other to test whether they represent the same class type. Below is a simple example.

C++
int main() {
  Base *base1 = new Derived1();
  Base *base2 = new Derived2();   //< Let's assume there's Derived2, extending Base.

  std::cout << (base1->getType() == base2->getType()) << std::endl;  // prints '0' (false).
  std::cout << (base1->getType() == Derived1::Type) << std::endl;    // prints '1' (true).

  return 0;
}

How It Works

Dynamic Typecasting

This code template implements the singular inheritance rule. That is, any class can extend only one base class but, at the same time, it can implement multiple interfaces. To implement this rule, this code template adopts an intermediate class __EObject__ that is conceptually inserted on every edge of the class inheritance tree.

C++
template <clshash_t HASH> bool (*__gtest__)(clshash_t) = nullptr;

template <typename PARENT, filehash_t DFILE, linenum_t DLINE>
class __EObject__ : public PARENT {
public:
  template <typename... ARGS>
  __EObject__(ARGS... args) : PARENT(args...) 
  { __gtest__<HASH(DFILE, DLINE)> = PARENT::__match__; }
};

#define extend(PARENT) public __EObject__<PARENT, HASH(__FILE__), __LINE__>

__EObject__ accepts three template parameters, so each inheritance edge instantiates it as a unique type. PARENT is the type of the base class, and DFILE/DLINE come from the __FILE__/__LINE__ macros, respectively. (Strictly speaking, DFILE is the hash of the __FILE__ string.) These macros expand to the name of the source file and the current line number in it, so together they identify the declaration site. This information is used later when deciding whether one class can be cast to another.
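The idea of keying a template on the declaration site can be sketched as follows. The article does not show how HASH is defined, so this example substitutes a constexpr FNV-1a hash; fnv1a(), site_hash(), SITE_ID(), and slot are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

using hash_t = std::uint64_t;

// Compile-time FNV-1a string hash (an assumed stand-in for HASH).
constexpr hash_t fnv1a(const char* s) {
  hash_t h = 0xcbf29ce484222325ull;
  for (; *s != '\0'; ++s) {
    h ^= static_cast<unsigned char>(*s);
    h *= 0x100000001b3ull;
  }
  return h;
}

// Combines the file hash with the line number into one site identifier.
constexpr hash_t site_hash(const char* file, hash_t line) {
  return fnv1a(file) ^ (line * 0x9E3779B97F4A7C15ull);
}

// Expands to the identifier of the line on which it is written.
#define SITE_ID() site_hash(__FILE__, __LINE__)

// One independent variable exists per distinct SITE value.
template <hash_t SITE> int slot = 0;

constexpr hash_t site_a = SITE_ID();
constexpr hash_t site_b = SITE_ID();  // a different line, hence a different ID
```

Since the two constants are computed on different lines, they differ, and slot<site_a> and slot<site_b> are separate variables. Note that any two uses on the same line of the same file would share an ID, which is a limitation inherent in this scheme.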

One tricky detail here is that every __EObject__ constructor sets the corresponding global pointer template __gtest__ to the __match__() function of the PARENT class. The __match__() function is the entry point of the type matching mechanism, and this entry point becomes externally available only once an __EObject__ instance for the corresponding inheritance edge has been constructed.

C++
template <clshash_t HASH>
static std::map<clshash_t, bool> __idecl__ = std::map<clshash_t, bool>();

template <typename IFACE, filehash_t DFILE, linenum_t DLINE>
class __IObject__ : public IFACE {
public:
  __IObject__()
  { __idecl__<HASH(DFILE, DLINE)>[IFACE::CLASSID] = true; }
};

#define implement(PARENT) public __IObject__<PARENT, HASH(__FILE__), __LINE__>

__IObject__ is similar to __EObject__. However, since a class can implement multiple interfaces (unlike class extension), __IObject__ records in a global map whether there is an inheritance edge between this class and the IFACE interface.

NOTE: Inheritance between the interfaces is not considered here.
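A reduced version of this bookkeeping is sketched below. It assumes hand-picked IDs instead of hashes of __FILE__ and __LINE__, and all names (implemented_by, Implements, IPrintable, ISerializable, Document, site_implements) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

using clshash_t = std::uint64_t;

// One map per declaration site, analogous to __idecl__.
template <clshash_t SITE>
std::map<clshash_t, bool> implemented_by;

struct IPrintable {
  static constexpr clshash_t CLASSID = 100;  // hand-assigned ID (assumption)
  virtual ~IPrintable() = default;
};

struct ISerializable {
  static constexpr clshash_t CLASSID = 200;
  virtual ~ISerializable() = default;
};

// Wrapper on the class-to-interface edge; its constructor records the edge.
template <typename IFACE, clshash_t SITE>
struct Implements : IFACE {
  Implements() {
    const clshash_t id = IFACE::CLASSID;
    implemented_by<SITE>[id] = true;
  }
};

constexpr clshash_t DOC_SITE = 1;  // stands in for HASH(__FILE__, __LINE__)

struct Document : Implements<IPrintable, DOC_SITE>,
                  Implements<ISerializable, DOC_SITE> {};

// Looks up an edge without inserting a new map entry.
template <clshash_t SITE>
bool site_implements(clshash_t iface) {
  auto it = implemented_by<SITE>.find(iface);
  return it != implemented_by<SITE>.end() && it->second;
}
```

In this sketch the edge is recorded by a constructor, so the map stays empty until the first Document instance has been created.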

When a dynamic casting function is called, it invokes the primary interface function __is__(). __is__() is defined in the reflection macro, which is expanded in place of the using reflection annotation.

C++
#define reflection
  ...

  static constexpr const filehash_t FILEHASH = HASH(__FILE__);
  static constexpr const linenum_t LINENUM = __LINE__;
  static constexpr const clshash_t CLASSID = HASH(FILEHASH, LINENUM);

  ...

  template <size_t N>
  static constexpr bool __impl_inner__(clshash_t ihash) {
    if constexpr (N > 6) return false;
    else if (__idecl__<HASH(FILEHASH, LINENUM - N)>[ihash]) return true;
    else return __impl_inner__<N + 1>(ihash);
  }

  static bool __impl__(uint64_t ihash) 
  { return __impl_inner__<0>(ihash); }

  template <size_t N>
  static constexpr bool __test_inner__(clshash_t hash) {
    if constexpr (N > 6) return false;
    else if (__gtest__<HASH(FILEHASH, LINENUM - N)>)
      return __gtest__<HASH(FILEHASH, LINENUM - N)>(hash);
    else return __test_inner__<N + 1>(hash);
  }

  static bool __test__(clshash_t hash)
  { return __test_inner__<0>(hash); }

  static bool __match__(clshash_t hash) {
    if (CLASSID == hash || __impl__(hash)) return true;
    else return __test__(hash);
  }

  virtual bool __is__(clshash_t hash)
  { return __match__(hash); }

  ...

(Line continuation characters omitted.)

__is__() just hands control over to __match__(), which performs the actual type comparison. __match__() must answer three questions here:

  • Does this class' type match with the target type?
  • Does this class implement the target type (presumably an interface type)?
  • Is one of the base classes the target type?

__match__() answers them as follows. First, it compares its own class ID (CLASSID) with the target type's hash. Next, it checks whether this class implements the target type by looking up the entry in the __idecl__ map. If this class implements the target type, the map must hold a true entry for the edge between this class and the target type. This lookup is done by __impl_inner__(), which __match__() calls through __impl__().

A complication is that the LINENUM of this class is not equal to DLINE, the __IObject__ template parameter: the value of __LINE__ differs slightly between the line where LINENUM is declared (the using reflection annotation) and the line where __IObject__ appears as a base interface (the implement() macro). Even worse, the exact distance between the two lines cannot be determined. To resolve this, the template first assumes that the using reflection annotation is located close to the implement() macro, for instance one or two lines below it. It then probes several entries whose IDs are computed from the line numbers just above the annotation (up to 6 lines above in this implementation). As long as programmers follow the requirement and place using reflection right below the class head, this lookup can decide whether there is an __IObject__ between this class and the target interface.
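The probing strategy can be isolated into a small sketch. Here a variable template keyed on a plain line number replaces the hashed ID, and declared_at and found_nearby are hypothetical names; the window of 6 lines mirrors the N > 6 bound in the listing above.

```cpp
#include <cassert>

// True if something was registered at exactly this line (assumed registry).
template <int LINE> bool declared_at = false;

// Probes LINE, LINE - 1, ..., LINE - 6 and reports whether any was registered.
template <int LINE, int N = 0>
bool found_nearby() {
  if constexpr (N > 6) {
    return false;                          // window exhausted, stop recursing
  } else if (declared_at<LINE - N>) {
    return true;                           // hit within the window
  } else {
    return found_nearby<LINE, N + 1>();    // try one line further up
  }
}
```

The if constexpr branch discards the recursive call once N exceeds the window, which is what terminates the template recursion. A consequence of this design is that a registration more than 6 lines above the annotation is never found, and unrelated registrations that happen to fall inside the window are probed as well.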

If neither this class nor any of its interfaces is the target type, __match__() continues the process in the base class' __match__() function. One challenge here is that, in C++ syntax, a class can have multiple base classes, each with its own __match__(), whereas the __match__() function we want to call is the one in the class that has been extended. To work around this problem, it looks up the __gtest__ global function pointer template and checks whether a function pointer has been registered for the inheritance edge between this class and its base class. Since __gtest__ has the same problem (LINENUM differs slightly from DLINE), it looks up the function pointer with the same strategy as __impl_inner__(). This task is delegated to __test_inner__().

This process continues up the hierarchy until no __match__() function can be found in __gtest__ for a higher class.
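The upward walk through registered function pointers can be sketched in isolation as follows. Edge IDs are hand-picked here and the line-proximity search is left out; parent_match, match(), is(), Base, Mid, and Leaf are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

using clshash_t = std::uint64_t;
using match_fn = bool (*)(clshash_t);

// One pointer per inheritance edge, analogous to __gtest__.
template <clshash_t EDGE> match_fn parent_match = nullptr;

struct Base {
  static constexpr clshash_t CLASSID = 1;
  static bool match(clshash_t h) { return h == CLASSID; }  // root: nothing above
  virtual ~Base() = default;
  virtual bool is(clshash_t h) const { return match(h); }
};

constexpr clshash_t MID_EDGE = 10;   // stands in for HASH(file, line)
struct Mid : Base {
  static constexpr clshash_t CLASSID = 2;
  Mid() { parent_match<MID_EDGE> = &Base::match; }  // register the edge
  static bool match(clshash_t h) {
    if (h == CLASSID) return true;
    match_fn up = parent_match<MID_EDGE>;
    return up != nullptr && up(h);                  // continue in the parent
  }
  bool is(clshash_t h) const override { return match(h); }
};

constexpr clshash_t LEAF_EDGE = 11;
struct Leaf : Mid {
  static constexpr clshash_t CLASSID = 3;
  Leaf() { parent_match<LEAF_EDGE> = &Mid::match; }
  static bool match(clshash_t h) {
    if (h == CLASSID) return true;
    match_fn up = parent_match<LEAF_EDGE>;
    return up != nullptr && up(h);
  }
  bool is(clshash_t h) const override { return match(h); }
};
```

The virtual is() selects the most derived match(), and each match() then follows its own edge pointer upward until the root, where no further pointer exists.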

Class Name Inference

This code template relies on two main features: the __PRETTY_FUNCTION__ macro provided by GCC-compatible compilers, and constant expression initialization using lambda functions, available since C++17.

C++
#define reflection
  ...

  static constexpr std::array<char, 1024> _fullname = 
  [](){ 
    std::array<char, 1024> fullname{0};
    const size_t fullname_len = 
      Object::strrstr_c<2>(__PRETTY_FUNCTION__);
    for (size_t i = 0; i < fullname_len - 5; i++)
      fullname[i] = __PRETTY_FUNCTION__[5 + i];
    fullname[fullname_len + 1] = '\0';
    return fullname;
  }();

  static constexpr std::array<char, 1024> _clsname = 
  [](){ 
    std::array<char, 1024> clsname{0};
    const size_t func_scope_pos =
      Object::strrstr_c<2>(__PRETTY_FUNCTION__);
    const size_t cls_scope_pos = 
      Object::strrstr_c<3>(__PRETTY_FUNCTION__);
    if (cls_scope_pos == -1)
      return _fullname;
    else { 
      size_t len = func_scope_pos - cls_scope_pos - 2;
      for (size_t i = 0; i < len; i++)
        clsname[i] = __PRETTY_FUNCTION__[i + cls_scope_pos + 2];
      clsname[len + 1] = '\0';
      return clsname;
    }
  }();
  
  static inline const std::string FULLNAME = [](){
    std::string name = std::string(_fullname.begin(), _fullname.end());
    name.resize(name.find('\0'));
    return name;
  }();

  static inline const std::string CLASSNAME = [](){
    std::string name = std::string(_clsname.begin(), _clsname.end());
    name.resize(name.find('\0'));
    return name;
  }();

  ...

C++ does not provide a direct way to obtain the name of a class (other than typeid()). Instead, there is a long-standing, de facto standard predefined identifier called __PRETTY_FUNCTION__ (loosely referred to as a macro in this article) that holds the fully-qualified signature of the enclosing function. Since it is fully qualified, the signature also contains the name of the class the function belongs to, as well as the namespaces enclosing it. Parsing this string properly yields the names we want.

In fact, it has long been used as an informal way to obtain class name information, but most such uses parse the string at runtime, even though both the string itself and the parsed result are compile-time constants. To avoid this extra runtime overhead, this code template uses the constant expression feature of C++. More specifically, it parses __PRETTY_FUNCTION__ in constant expressions, so that the compiler performs the parsing ahead of runtime.

The problem is that, historically, C++ did not provide a way to initialize a static data member inside the class (other than members of const integral types). This is problematic because it implies that the using reflection annotation alone is not enough to prepare the class name information, and programmers would need to annotate something else, somewhere outside of the class declaration.

Fortunately, this problem can be avoided in C++17 with a lambda initializer. Since C++17, a static constexpr class member can be initialized with an immediately invoked lambda function inside the class declaration. This allows us to place the parsing code in the declaration itself, so no extra annotation is needed.
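The lambda initializer technique can be tried out independently of __PRETTY_FUNCTION__, whose exact format differs between compilers. The sketch below therefore parses a fixed string literal instead; Widget, QUALIFIED, and SHORT_NAME are hypothetical names, and the code assumes C++17.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

struct Widget {
  // Stand-in for the text that would be taken from __PRETTY_FUNCTION__.
  static constexpr const char QUALIFIED[] = "Namesp::Widget";

  // Initialized inside the class by an immediately invoked constexpr lambda:
  // copies everything after the last "::" into a fixed-size buffer.
  static constexpr std::array<char, 64> SHORT_NAME = []() {
    std::array<char, 64> out{};            // zero-filled, so always terminated
    std::size_t len = 0;
    while (QUALIFIED[len] != '\0') ++len;
    std::size_t start = 0;
    for (std::size_t i = 0; i + 1 < len; ++i)
      if (QUALIFIED[i] == ':' && QUALIFIED[i + 1] == ':') start = i + 2;
    for (std::size_t i = start; i < len; ++i) out[i - start] = QUALIFIED[i];
    return out;
  }();
};
```

Because SHORT_NAME is constexpr, the result can be checked with static_assert, which confirms that the parsing happens at compile time rather than at program start.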

Dynamic Type Comparison

C++
struct Type {
  std::string fullname;
  std::string clsname;

  bool operator==(const Type& o) const
  { return (this == &o); }
};

#define reflection
  ...

  static inline const Object::Type Type = []()
  { return (Object::Type){ FULLNAME, CLASSNAME }; }();

  virtual const Object::Type& getType()
  { return Type; }

  ...

Dynamically comparing class types is far more straightforward to implement than the other two features. The data member Type is unique for each class type, as it is a static member. By simply comparing the addresses of two Type objects, one can decide at runtime whether two instances have the same type.
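The following self-contained sketch reproduces the address-based comparison with hand-written classes instead of the reflection macro. TypeInfo is a hypothetical stand-in for Object::Type, and the class names are illustrative.

```cpp
#include <cassert>
#include <string>

struct TypeInfo {
  std::string name;
  // Identity comparison: equal only if both sides are the same object.
  bool operator==(const TypeInfo& o) const { return this == &o; }
};

struct Base {
  static inline const TypeInfo Type{"Base"};
  virtual ~Base() = default;
  virtual const TypeInfo& getType() const { return Type; }
};

struct Derived1 : Base {
  static inline const TypeInfo Type{"Derived1"};
  const TypeInfo& getType() const override { return Type; }
};

struct Derived2 : Base {
  static inline const TypeInfo Type{"Derived2"};
  const TypeInfo& getType() const override { return Type; }
};
```

One caveat follows directly from the identity comparison: a copy of a Type object never compares equal to the original, so getType() results should be held by reference rather than by value.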

Performance Comparison

Image 1

In a microbenchmark that performs dyn_cast<> and dynamic_cast<> repeatedly, dyn_cast<> outperforms dynamic_cast<> by around 30%. (More details will be uploaded.)

Disclaimer

Currently, dynamic type comparison checks whether two instances have exactly the same type. This differs from the is operator found in C# and related languages, which also treats an instance of a derived class as an instance of each of its base classes. This feature is not implemented yet, but it should be added soon, as it does not pose any fundamental challenge.

History

  • 2019.06.07: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Student, Seoul National University
Korea (Republic of)
PhD student in system security. Currently interested in fuzzing. Can't WINE be more user-friendly? :P
