This article discusses double-ended vector, which allows to "push front" as well as "push back", in contrast to std::vector, where only the "push back" is available. It looks at Boost devector. It shows an area where the implementation can be improved (insertion inside the vector), provides an alternative implementation, called DeVector. In this article, the performance of both implementations is compared with std::deque, which demonstrates that double-ended vector is usually faster than std::deque. The source code is provided that can be compiled in any C++17 compiler.
Double-ended vector (devector) has been discussed a few times. Some implementations are done by Lars Greger Nordland Hagen , Valasiadis Fotios  and there is also a Boost implementation .
The double-ended vector idea is rather simple: if in a vector the data starts with the beginning of the pre-allocated space, in a double-ended vector the data is positioned relatively in the middle, which means that for a while new items can be pushed at both ends without re-allocating space or moving the elements.
In Figure 1, the
std::vector memory reservation is shown.
Figure 1. std::vector memory reservation. The data space always starts with the start of the reserved memory, the size of which is called capacity. As the data grows, the capacity should increase.
If the data needs more space than the capacity, the new space with increased capacity should be allocated. The newly allocated space is usually greater than required (by a factor between 1.5 or 2.0). That allows not to allocate memory too often, as the data increases. The issue with the vector is that it is easy to append data to the end of a vector (
push_back operation), but not to the beginning.
If data is positioned roughly in the middle of the reserved space, so that it can grow in both directions, we end up with double-ended vector or
devector, which is shown in Figure 2.
Figure 2. Devector memory reservation. The data can grow in both directions, so that push_front as well as push_back can be easily performed.
The headroom is used for
push_front and the tail capacity for
push_back; if there is not enough room for either of them, usually a new memory slot can be reserved, which can be, for instance, three times the size of the required data, which will be positioned in the middle of the reserved slot.
2. Special Consideration: Balanced Insertion
One issue that should be considered is insertion. When an element is inserted into a
vector, there is only one option -- to move the rest of the elements forward, by increasing their addresses in memory. In
there are two options:
- to move the rest of elements forward;
- to move the elements in front backwards.
Which option should be chosen? it depends on the position of the element. If it is closer to the end of the data, then option 1 (forward). If it is closer to the beginning, then option 2 (backwards). If it is in the middle, either option will do. In Figure 3, the basic idea of balanced insertion is illustrated.
Figure 3. Balanced Insertion into a devector.
Is this important? Yes, it is. Imagine if a devector is build using random insertion of elements. By using balanced insertion, we will cut the time by half! If a
devector used to create a
flat map, which contains "key-value" pairs as elements and binary search is used to find the right element. By using balanced
devector, the insertion time into a
flat map can be cut by half in comparison with time used by
Unfortunately, in the Boost implementation (
Boost devector), the insertion is not balanced and the benchmarks show it. I have included my own implementation, which I call
DeVector, where both insertion and deletion are balanced.
3.1. General Observation
At first, I tested devector in Visual C++ 2022 and saw major advantage in using
devector. Then I tested in GCC 11.2 with C++17 and the results were not so pronounced. I am presenting results for both compilers.
3.2. System Used
The code was executed on a PC with Intel i7-4790, 3.6GHz processor, 16GB RAM.
When compiled in Visual C++ 2022, the following options were used:
In GCC 11.2, the following command line was used:
g++ -O3 -std=c++17 -I"/boost_1_80_0" -m64 DeVectorTests.cpp -o DeVectorTests.exe
3.3. Random Insertion
3.3.1. Results for Visual C++ 2022
In Figure 4a, the random insertion results are shown.
Figure 4a. Visual C++ 2022. Random insert of double floating-point values (microseconds per element). The data is in logarithmic scale.
It shows that random insertion into a
Boost devector is about twice as slow as into a
DeVector, which uses balanced insertion. At the same time, insertion into a
deque is on average 7 times slower than into a
DeVector, which increases to 14 times for 5 million elements. What it means in practice for creating a structure of 5 million elements:
std::deque, it takes 8 hours;
Boost devector, it takes 1 hour and 5 minutes;
DeVector, it takes 34 minutes.
For 100,000 elements:
std::deque, it takes 3 seconds;
std::vector it takes about 0.7s;
Boost devector, it takes about 0.85s;
DeVector, it takes 0.5s.
For random insertion, the advantage double-ended vector over
std::deque is obvious.
3.3.2. Results for GCC 11.2 C++
In Figure 4b, the random insertion results are shown.
Figure 4b. GCC 11.2 C++. Random insert of double floating-point values (microseconds per element).
There is some diffrence, but it is not as dramatic as in Visual C++: for the number of elements 5 million and more,
deque takes over twice as much time as
3.4. Random Access
3.4.1. Results for Visual C++ 2022
The random access results are shown in Figure 5a.
Figure 5a. Visual C++ 2022. Random access of double float-point values (nanoseconds per element).
You can see that there is no much difference in the results of up to 100,000 elements, but than the access to the deque elements slows down in contrast to all other structures.
3.4.2. Results for GCC 11.2 C++
The random access results are shown in Figure 5b.
Figure 5b. GCC 11.2 C++. Random access of double float-point values (nanoseconds per element).
There is not much diffrence. In practice, it takes 38 seconds to access 109 elements in a
deque, and it takes 32 seconds to access the same number of elements in a
3.5. Insertion at the Beginning
3.5.1. Results for Visual C++ 2022
In all the structures, except for
vector, this opertation is called
insert with index 0 is used. The results are shown in Figure 6a.
Figure 6a. Visual C++ 2022. Insertion at the beginning (nanoseconds/element).
deque, it takes about 27 nanosecond per element; for double-ended vector about 10-12 ns/element; for
vector, it is very inefficient. As a result, it takes over 2.5 hours to insert 5 million elements at the front of a vector, as opposed to a fraction of a second for all the other structures.
3.5.2. Results for GCC 11.2 C++
The results for
push_front are shown in Figure 6b.
Figure 6b. GCC 11.2 C++. Results for push_front (nanoseconds per element).
There is no major difference. But you can see that
deque is faster. But still we are taking about fractions of a second: for pushing to the front 10 million elements it takes 0.05s for a
deque and about 0.097s for a
devector. One billion elements take 4.4 seconds for
deque and about 7.7 for both devectors.
3.6. The push_back Operation
3.6.1. Results for Visual C++ 2022
There are nothing special about
push_back: it is a very quick operation and even for one million elements takes a fraction of a second. The results are shown in Figure 7a.
Figure 7a. push_back (nanoseconds/element).
The performance of
deque is about 2.5 times slower than that of
3.6.2. GCC 11.2 C++
The results are shown in Figure 7b.
Figure 7b. The push_back operation (nanoseconds/element).
deque is slightly faster. In general the results are similar to
The results clearly demonstrate that
deque is slower than double-ended vector, particularly in random construction with one-million elements or more. In random access,
deque is slower as well. In Visual C++
deque is particularly inefficient. I was suprised to see how the difference
between Visual C++ and GCC implementation:
deque implementation is much better in the latter, where push operations are rather efficient.
Perhaps, C++ libraries should provide
devector as one of the standard structures, like
std::list. The only advantage of using
deque instead of
devector may be systems with small memory space, where it is difficult to allocate a lump memory as fragmented allocation could give some advantage.
In practice, for storing 10000 floating-point elements,
deque is good enough. With more elements, special attention should be paid to the performance and
devector should be preferred, where random insertion and access are more efficient.
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.