Click here to Skip to main content
15,868,141 members
Articles / Programming Languages / C++

An Introduction to the Range-v3 Library

Rate me:
Please Sign up or sign in to vote.
5.00/5 (4 votes)
17 Aug 2020CPOL7 min read 10.5K   79   3  
Ranges is coming to C++, and the Range-v3 library was the basis for the proposal to add range support to the C++ standard library
The Range-v3 library was the basis for the proposal to add range support to the C++ standard library. The library is built on top of the iterators and algorithms already provided by the standard template library and it makes them composable – which is pretty much a big thing.

Introduction

Ranges is a significant addition to the standard library, with the potential to really change how we process data in C++.

The range-v3 library, https://github.com/ericniebler/range-v3, is mainly written by Eric Niebler, and it was the basis for the proposal to add range support to the C++ standard library. The library is built on top of the iterators and algorithms already provided by the standard template library and it makes them composable – which is pretty much a big thing.

It is useful to think of a range as a pair of iterators, basically something that provides begin() and end() functions. Ranges are classified according to the capabilities of their iterators: We can read from an input range, write to an output range, or both depending on the capabilities of the iterators.

The key benefit of grouping two iterators together, is that we can return a complete range as the result of a function, passing it directly to another function without creating local variables to hold the temporary results.

Views, Actions and Algorithms

Algorithms, Views, and Actions are the three pillars of the range-v3 library.

The algorithms are the same as those provided by the standard template library, with additional overloads that take ranges.

A view is a composable adaptation of a range that is evaluated lazily as the view is iterated, and in this article I will try to explain what this actually means.

An action is an eager application of an algorithm to a container that changes the contents the container in-place and returns it for further processing.

Before looking at ranges, we need something simple to work with:

C++
enum class Category { First, Second, Third };

struct A
{
    Category category;
    int value;

    constexpr A( int v ) noexcept
        : category( static_cast<Category>(v%3) ),
          value( v )
    { }

    constexpr int Value( ) const noexcept
    {
        return value;
    }
};

const char* to_string( Category category )
{
    return category == Category::First ?
        "First" : category == Category::Second ? "Second" : "Third";
}

std::ostream& operator << ( std::ostream& stream, const A& a )
{
    stream << to_string( a.category ) << ' ' << a.value << ' ';
    return stream;
}

and finally, some code that uses range-v3:

C++
using namespace ranges;
void Example001a()
{
    auto items = views::iota( 0, 6 )
        | views::transform( []( int i ) { return A( i ); } )
        | to<std::vector>( )
        | actions::sort( less{}, &A::category );

    // output: [First 0 ,First 3 ,Second 1 ,Second 4 ,Third 2 ,Third 5 ]       
    std::cout << views::all( items ) << std::endl;
}

Quite simple, but it illustrates some of the main benefits of ranges.

Convenience

Without ranges, the above can be implemented like this:

C++
void Example001a_( )
{
    std::vector<int> ints( 6 );
    std::iota( ints.begin( ), ints.end( ), 0 );
    std::vector<A> items;
    std::transform( ints.begin( ), ints.end( ),
                std::back_inserter( items ),
                []( auto i ) { return A( i ); } );

    std::sort( items.begin( ), items.end( ),
                []( auto& first, auto& second ) 
                { return first.category < second.category; } );

    bool isFirst = true;
    std::for_each( items.begin( ), items.end( ), [&isFirst]( auto& item )
    {
        std::cout << (isFirst ? '[' : ',');
        std::cout << item;
        isFirst = false;
    } );

    std::cout << ']' << std::endl;
}

which is a bit more work, and a bit harder to read due to the clutter caused by passing around the iterators.

Composability

The ranges library promotes composability by allowing us to chain together a sequence of operations using the ‘|’ operator:

C++
auto items = views::iota( 0, 6 )
    | views::transform( []( int i ) { return A( i ); } )
    | to<std::vector>( )
    | actions::sort( less{}, &A::category );

This is called a pipeline, and in our little example, ‘items’ is created using a pipeline with four steps:

  1. views::iota( 0, 6 ) is a bounded value factory producing values from the range [0,6).
  2. views::transform( []( int i ) { return A( i ); } ) creates A objects using the output of step 1.
  3. to<std::vector>( ) creates a vector, which is required for the last step of the pipeline since actions changes the contents of the container in-place.
  4. actions::sort( less{}, &A::category ) sorts the vector according to the category member of A

The last step also demonstrates another feature of the ranges library called projections, which allows us to specify which member ranges::less should operate on, removing the need to implement the predicate, or a custom compare function on A – which is rather inflexible by comparison.

Flexibility

The ranges library is just as flexible as the familiar iterator based stl library, allowing for easy integration with POC data types:

C++
void Example001b( )
{
    int values[] = {0,1,2,3,4,5};

    auto items = values
        | views::transform( []( int i ) { return A( i ); } )
        | to<std::vector>( )
        | actions::sort( less{}, &A::category );

    std::cout << views::all( items ) << std::endl;
}

or:

C++
void Example001c( )
{
    int ints[] = { 0,1,2,3,4,5 };
    int* first = std::begin(ints);
    int* last = std::end( ints );

    auto items = span( first, last )
        | views::transform( []( int i ) { return A( i ); } )
        | to<std::vector>( )
        | actions::sort( less{}, &A::category );

    std::cout << views::all( items ) << std::endl;
}

Containers

All containers that have a stl compliant implementation of begin() and end() are valid ranges, so chances are you already have a lot of code that can work with ranges.

Views

Views are ranges that usually operate on data provided by other ranges, transforming the data according to some view specific algorithm. Views should not own any data beyond what is required to implement their algorithm. The algorithm implemented by a view should be applied when data is requested from the view and not when it is created, which is why we say that views are evaluated lazily.

This is more easily explained by implementing a simple toy view that calculates the square of its input.

A view usually consists of a view implementation, an adaptor implementation, and an instance of the adaptor implementation.

First, we create a simple meta function that will be used to determine the type of the iterator of the underlying range:

C++
template<typename T>
using IteratorBase = decltype( std::begin( std::declval<T&>( ) ) );

Then we create our view implementation:

C++
template <typename Rng>
class SquareView : public ranges::view_base
{
    ...
};

We derive our view implementation from ranges::view_base. ranges::view_base is an empty base class, its only purpose is to provide a mechanism that allows other pieces of code to identify derived classes as view implementations.

C++
template <typename Rng>
class SquareView : public ranges::view_base
{
private:
    using RangeType = ranges::views::all_t<Rng>;
    RangeType range_;
    ...
};

SquareView has a single data member, RangeType range_, where RangeType is an alias for ranges::views::all_t<Rng> which adapts itself to the type of range specified by Rng.

If Rng is a view, we get the type of that view; or if Rng is a container, we get a ranges::ref_view<> referencing the underlying container; or we get a ranges::subrange<>. This is done to ensure that containers are not copied, and to minimize the work required to copy or move an instance of the view.

Since we want to keep things simple, we derive IteratorType from the iterator implementation of the underlying range and reuse the pre- and post-increment implementations from the base class:

C++
template <typename Rng>
class SquareView : public ranges::view_base
{
    ...

    class IteratorType : public IteratorBase<Rng>
    {
    public:
        using Base = IteratorBase<Rng>;
        using value_type = typename std::iterator_traits<Base>::value_type;

        IteratorType( ) { }
        IteratorType( const Base& other )  : Base( other ) { }

        IteratorType& operator++( )
        {
            ++static_cast<Base&>( *this );
            return *this;
        }
        IteratorType operator++( int )
        {
            return static_cast<Base&>( *this )++;
        }
        // Where the magic happens
        value_type operator*( ) const
        {
            value_type value = *static_cast<Base>( *this );
            return value * value;
        }
    };
public:
    ...
};

The implementation of the dereference operator is where things get interesting:

  1. We retrieve the ‘input’ value from the underlying range.
  2. We calculate the square of the retrieved value and return the result.

This is how a view gets evaluated lazily.

The rest of SquareView is easy to implement:

C++
template <typename Rng>
class SquareView : public ranges::view_base
{
    ...
public:
    using iterator = IteratorType;
    SquareView( )
    { }

    SquareView( Rng&& range )
        : range_( ranges::views::all( static_cast<Rng&&>( range ) ) )
    { }

    iterator begin( ) const { return ranges::begin( range_ ); }

    auto end( ) const { return ranges::end( range_ ); }
};

Next, we need to implement an adaptor for our view:

C++
// Deduction guideline needed to help the
// compiler make the right choice for us
template <typename Rng>
SquareView( Rng&& )->SquareView<Rng>;

struct SquareFn
{
    template <typename Rng>
    auto operator()( Rng&& range ) const
    {
        return SquareView( std::forward<Rng>( range ) );
    }

    template <typename Rng>
    friend auto operator|( Rng&& range, const SquareFn& )
    {
        return SquareView( std::forward<Rng>( range ) );
    }
};

The first operator implemented by the adaptor lets us treat our view as a function:

C++
auto view = Views::Square( inputValues );

while the second operator implementation allows us to use the ‘|’ pipe notation for composition:

C++
auto view = inputValues | Views::Square;

and finally, an instance of the adaptor:

C++
namespace Views
{
    constexpr SquareFn Square;
}

As demonstrated below, Views::Square is the user facing interface to our view:

C++
using namespace ranges;

void Example001( )
{
    std::vector<int> inputValues{ 1,2,3,4,5,6,7 };

    auto view = inputValues
        | Views::Square
        | views::drop( 2 );

    int sum = 0;
    for ( auto it = view.begin( ); it != view.end( ); ++it )
    {
        sum += *it;
    }

    std::cout << "Sum: " << sum << std::endl;
}

int main()
{
    Example001( );
}

It is only when we are iterating over the view that the data from inputValues is accessed.

C++
auto view = inputValues
        | Views::Square
        | views::drop( 2 );

does not access any data, it only specifies how the data will be accessed. We are actually composing a type:

C++
ranges::drop_view<SquareView<std::vector<int,std::allocator<int>> &>>

for the view variable that gets instantiated and operates on a vector of integers allowing the compiler to generate highly optimized code for our view.

At this point, I hope, A view is a composable adaptation of a range that is evaluated lazily as the view is iterated makes a lot more sense.

Did We Really Have to Go Through All of This to Create a View?

No, when we need to implement something like SquareView for the range-v3 library, it would be easier, and much better, to implement this using the ranges::view_adaptor template:

C++
template<class Rng>
class SquareView : public ranges::view_adaptor<SquareView<Rng>, Rng>
{
    friend ranges::range_access;
    class adaptor : public ranges::adaptor_base
    {
    public:
        adaptor( ) = default;

        auto read( ranges::iterator_t<Rng> it ) const -> decltype( (*it) * ( *it ) )
        {
            auto value = *it;
            return value * value;
        }
    };

    adaptor begin_adaptor( ) const { return { }; }
    adaptor end_adaptor( ) const { return { }; }
public:
    SquareView( ) = default;
    SquareView( Rng&& rng )
        : SquareView::view_adaptor{ std::forward<Rng>( rng ) }
    {}
};

ranges::adaptor_base provides all the mechanics required to deal with the various kinds of iterators that a real view implementation should be able to handle, including pointers – which our toy implementation cannot handle due to our simplistic iterator implementation.

This implementation of SquareView is an input range when it is wrapping an input range and a forward range when it is wrapping a forward range, and so on; allowing the compiler to choose algorithm implementations based on the capabilities of the iterators.

Normally, when implementing something like this, we would just use views::transform:

C++
auto result = inputValues
    | views::transform( []( auto value ) { return value * value; } )
    | views::drop( 2 );

Performance

The range-v3 library performs very well. Consider the following:

C++
constexpr size_t ValueCount = 1000000ULL;
constexpr size_t IterationCount = 10000ULL;

size_t RangeTest( )
{
    using namespace ranges;
    size_t result = 0;

    for ( size_t i = 0; i < IterationCount; i++ )
    {
        auto values = views::iota( 0ULL, ValueCount )
            | views::transform( []( auto value ) { return value * value; } );

        for ( auto value : values )
        {
            result += value;
        }
    }
    return result;
}

And here is an implementation of something similar using classic stl:

C++
size_t VectorTest( )
{
    std::vector<size_t> values( ValueCount );
    size_t result = 0;
    for ( size_t i = 0; i < IterationCount; i++ )
    {
        std::iota( values.begin( ), values.end( ), 0 );
        std::transform( values.begin( ), values.end( ), values.begin( ), 
            []( auto value ) { return value * value; } );
        for ( auto value : values )
        {
            result += value;
        }
    }
    return result;
}

The vector is allocated at the start of the function, and then reused for each iteration - so this does not cause much overhead.

And, finally, an implementation that performs the calculations directly:

C++
size_t DirectTest( )
{
    size_t result = 0;
    for ( size_t i = 0; i < IterationCount; i++ )
    {
        for ( size_t j = 0; j < ValueCount; j++ )
        {
            result += j * j;
        }
    }
    return result;
}

With the help of a little function to time the execution of the three tests:

C++
template <typename F>
void PerformanceOf(const char* testName, F testFunction )
{
    using Seconds = std::chrono::duration<double, std::chrono::seconds::period>;

    auto start = std::chrono::high_resolution_clock::now( );
    auto value = testFunction( );
    auto end = std::chrono::high_resolution_clock::now( );
    auto duration = Seconds( end - start );
    printf( "%s value: %llu, time: %f seconds\n", testName, value, duration.count( ) );
}   

calling:

C++
void Performance001( )
{
    PerformanceOf( "Range", RangeTest );
    PerformanceOf( "Vector", VectorTest );
    PerformanceOf( "Direct", DirectTest );
}

produce the following output:

C++
Range value: 12914400067280709120, time: 3.573238 seconds
Vector value: 12914400067280709120, time: 8.660025 seconds
Direct value: 12914400067280709120, time: 3.102976 seconds

It is not surprising that DirectTest( ) outperforms the other implementations, but I think it is impressive that the gap between DirectTest( ) and RangeTest( ) is not far greater. While far from an exhaustive performance test of the range-v3 library, it proves that working through iterators has little overhead.

At this point, it is also interesting to see how our view implementations perform.

Rewriting RangeTest( ) to use the ranges::view_adaptor based implementation of SquareView:

C++
size_t RangeTest( )
{
    using namespace ranges;
    size_t result = 0;

    for ( size_t i = 0; i < IterationCount; i++ )
    {
        auto values = views::iota( 0ULL, ValueCount )
            | Views::Square;

        for ( auto value : values )
        {
            result += value;
        }
    }
    return result;
}

produce the following output:

C++
Range value: 12914400067280709120, time: 3.652167 seconds

And running it with our initial toy implementation produce:

C++
Range value: 12914400067280709120, time: 3.623685 seconds

C++ 20 Ranges: Coming Soon to a Compiler Near You

C++ 20 ranges will be made available with Visual Studio 16.8.1 – until then, and probably even afterwards, the range-v3 library will be an exceptionally good alternative that has already been used by thousands of C++ programmers across the world.

The range-v3 is a beautifully crafted library and considering how much it relies on template meta programming, it is surprisingly readable.

History

  • 18th August, 2020 - Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Architect Sea Surveillance AS
Norway Norway
Chief Architect - Sea Surveillance AS.

Specializing in integrated operations and high performance computing solutions.

I’ve been fooling around with computers since the early eighties, I’ve even done work on CP/M and MP/M.

Wrote my first “real” program on a BBC micro model B based on a series in a magazine at that time. It was fun and I got hooked on this thing called programming ...

A few Highlights:

  • High performance application server development
  • Model Driven Architecture and Code generators
  • Real-Time Distributed Solutions
  • C, C++, C#, Java, TSQL, PL/SQL, Delphi, ActionScript, Perl, Rexx
  • Microsoft SQL Server, Oracle RDBMS, IBM DB2, PostGreSQL
  • AMQP, Apache qpid, RabbitMQ, Microsoft Message Queuing, IBM WebSphereMQ, Oracle TuxidoMQ
  • Oracle WebLogic, IBM WebSphere
  • Corba, COM, DCE, WCF
  • AspenTech InfoPlus.21(IP21), OsiSoft PI


More information about what I do for a living can be found at: harlinn.com or LinkedIn

You can contact me at espen@harlinn.no

Comments and Discussions

 
-- There are no messages in this forum --