Click here to Skip to main content
15,867,308 members
Articles / Programming Languages / C++
Tip/Trick

vectormap: A C++ Map Class with Order-added Iteration

Rate me:
Please Sign up or sign in to vote.
4.11/5 (2 votes)
15 Feb 2022CPOL2 min read 9.1K   85   2   10
Use vectormap when you want the fast lookup of unordered_map, and the order-added iteration of vector
unordered_map provides great speed, but what if you want to maintain the order items were added to the map? vectormap combines a vector of pairs and an unordered_map to save the day.

Introduction

New programmers might assume that when they add multiple key-value pairs to a map data structure that the order items were added would be the order you get iterating over the data structure.

Not so!

The C++ Standard Library provides two map data structures, one named, get this, map, the other, unordered_map.

The map class is implemented using a red-black tree, a binary tree that balances itself when items are added to or removed. When you iterate over a map, you get the sort order of the keys. So if you add C, A, B, or any other order, when you iterate, you will always get A, B, C. If that ordering is what you want, you've got it for free...almost. Searching for an item takes O(log(n)) time, so you pay a price for that ordering by keys.

The unordered_map class is implemented as a hash table. As the name implies, there is no ordering for iteration provided at all. Instead, you get sort of a random order, as the items added scatter across the buckets of the hash table. The benefit of unordered_map over map is that searching for items takes O(1).

So what's to be done if you want fast searches and order-added iteration? Read on!

Enter vectormap

C++
#pragma once

#include <stdexcept>
#include <unordered_map>
#include <vector>

namespace mscript
{
    /// <summary>
    /// A vectormap combines a map with its fast key->value lookups
    /// with a vector to preserve the order that key-value pairs were added
    /// </summary>
    /// <typeparam name="K">Key type of the map</typeparam>
    /// <typeparam name="V">Value type of the map</typeparam>
    template <typename K, typename V>
    class vectormap
    {
    public:
        /// <summary>
        /// Access the vector of pairs directly for index access or iteration
        /// </summary>
        const std::vector<std::pair<K, V>>& vec() const
        {
            return m_vec;
        }

        /// <summary>
        /// What is the value for a given key?
        /// NOTE: By returning the value by, well, value
        ///       this function can stay const
        ///       and users of the class can use pointers
        ///       if they want side effects down the line.
        /// </summary>
        V get(const K& key) const
        {
            const auto& it = m_map.find(key);
            if (it == m_map.end())
                throw std::runtime_error("key not found");
            return m_vec[it->second].second;
        }

        /// <summary>
        /// What is the value at a given index?
        /// NOTE: By returning the value by, well, value
        ///       this function can stay const
        ///       and users of the class can use pointers
        ///       if they want side effects down the line.
        /// </summary>
        V getAt(size_t index) const
        {
            if (index >= m_vec.size())
                throw std::runtime_error("index out of range");
            return m_vec[index].second;
        }

        /// <summary>
        /// How many key-value pairs are in this?
        /// </summary>
        size_t size() const
        {
            return m_vec.size();
        }

        /// <summary>
        /// Does this map contain this key?
        /// </summary>
        bool contains(const K& key) const
        {
            return m_map.find(key) != m_map.end();
        }

        /// <summary>
        /// Associate a value with a key
        /// </summary>
        void set(const K& key, const V& val)
        {
            auto it = m_map.find(key);
            if (it == m_map.end())
            {
                m_map.insert({ key, m_vec.size() });
                m_vec.push_back({ key, val });
            }
            else
                m_vec[it->second].second = val;
        }

        /// <summary>
        /// Get a value, checking if it exists first
        /// </summary>
        /// <param name="key">Key value to look up</param>
        /// <param name="val">Value to populate</param>
        /// <returns>true if a value exists for the key, false otherwise</returns>
        bool tryGet(const K& key, V& val) const
        {
            const auto& it = m_map.find(key);
            if (it == m_map.end())
                return false;

            val = m_vec[it->second].second;
            return true;
        }

        /// <summary>
        /// Get a list of the keys of this map
        /// </summary>
        std::vector<K> keys() const
        {
            std::vector<K> retVal;
            retVal.reserve(m_vec.size());
            for (size_t k = 0; k < m_vec.size(); ++k)
                retVal.push_back(m_vec[k].first);
            return retVal;
        }

        /// <summary>
        /// Get a list of the values of this map
        /// </summary>
        std::vector<V> values() const
        {
            std::vector<V> retVal;
            retVal.reserve(m_vec.size());
            for (size_t v = 0; v < m_vec.size(); ++v)
                retVal.push_back(m_vec[v].second);
            return retVal;
        }

    private:
        std::unordered_map<K, size_t> m_map;
        std::vector<std::pair<K, V>> m_vec;
    };
}

A pretty straightforward unordered_map + vector wrapper class. Just use the vec() function to walk the items in the order in which they were added, and use contains() and get/tryGet() for fast item lookups.

Update February 15, 2022: Per Paulo Zemek's feedback about set() making a poor bargain with the programmer, I took his advice and stashed the vector's indexes into the map's value entries.  Now when you update the map, you know which vector element to update.  The updated class passed good unit tests and the calling application's unit and smoke tests all pass, too, with no modificiation to application or tests. Good times.

 

Update February 18, 2020: Upon further reflection, having the map values in both the unordered_map and the vector was duplicating the memory usage.  Now the map values are just indexes into the vector which has the values.

Conclusion

If you ever need a data structure with fast searching and order-added iteration, look no further than vectormap. Enjoy!

History

  • 15th February, 2022: Initial version and first revision
  • 18th February, 2022: Second revision

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer
United States United States
Michael Balloni is a manager of software development at a cybersecurity software and services provider.

Check out https://www.michaelballoni.com for all the programming fun he's done over the years.

He has been developing software since 1994, back when Mosaic was the web browser of choice. IE 4.0 changed the world, and Michael rode that wave for five years at a .com that was a cloud storage system before the term "cloud" meant anything. He moved on to a medical imaging gig for seven years, working up and down the architecture of a million-lines-code C++ system.

Michael has been at his current cybersecurity gig since then, making his way into management. He still loves to code, so he sneaks in as much as he can at work and at home.

Comments and Discussions

 
QuestionThe set shows the problem with this approach. Pin
Paulo Zemek15-Feb-22 11:39
mvaPaulo Zemek15-Feb-22 11:39 
AnswerRe: The set shows the problem with this approach. Pin
Michael Sydney Balloni15-Feb-22 14:16
professionalMichael Sydney Balloni15-Feb-22 14:16 
GeneralRe: The set shows the problem with this approach. Pin
Paulo Zemek15-Feb-22 14:23
mvaPaulo Zemek15-Feb-22 14:23 
GeneralRe: The set shows the problem with this approach. Pin
Michael Sydney Balloni15-Feb-22 16:50
professionalMichael Sydney Balloni15-Feb-22 16:50 
GeneralRe: The set shows the problem with this approach. Pin
Paulo Zemek16-Feb-22 8:31
mvaPaulo Zemek16-Feb-22 8:31 
GeneralRe: The set shows the problem with this approach. Pin
Michael Sydney Balloni16-Feb-22 15:27
professionalMichael Sydney Balloni16-Feb-22 15:27 
GeneralRe: The set shows the problem with this approach. Pin
Paulo Zemek16-Feb-22 20:59
mvaPaulo Zemek16-Feb-22 20:59 
I am really glad to see that my articles actually go to a point that you are interested in working on or just studying.

I honestly never did new comparisons between unordered_map and my dictionary implementation, although I know for sure that the main points remain the same... .NET (and this implementation) is a "single array" for all items, where C++ implementation is a new vector (or similar) per bucket.

Aside from that, the documentation of .NET dictionary says the order of inserts is not kept, although it is when there are no Remove calls. That also didn't change on all the implementations I was able to test.

Anyways, talking about object pool and the like, I can tell that C++ unordered_map with its calls to new[] delete[] and similar will not beat the pool speed... but the point with unordered_map is that it, by default, doesn't use any pool, but lets you specify the allocator (which by default doesn't use any kind of pooling, but you can always replace the allocator).

So, assuming a good allocator, it can be as good, or even better, than the one used in my Dictionary. The thing is, most users will never use a different allocator. Also, because the allocator needs to work on top of buckets, not over the entire array, there's a real risk that "such a strategy" will end up wasting too much memory.

The way the .NET Dictionary (and especially this one) works, is that we will just put items into the main array, without ever having to allocate new items or arrays. When there is a particular conflict we will use the pool... and the pool is both thread-unsafe (which is bad if we have many threads, but faster when used by a single thread) and knows objects have the exact same size, which usually defeats the performance of most variable size allocators by the simple fact that we just increase a pointer (when we still have memory available), or we reuse a "deleted" pointer, which is actually put into a linked list of deleted pointers, using that same memory to avoid new allocations... so, when an item is removed, we just mark that memory as "available" without a new allocation or deallocation (object constructor and destructor still run, but real malloc/free calls don't).

That's not something that unordered_map or any other container will do by default. And, again, if you use a memory manager/allocator, it might work but becomes a part of the user responsibility... with my approach, it comes "by default" and, considering how my Dictionary allocates memory, is possibly one of the best approaches anyways (which will not be true for unordered_map with its many inner-vectors).

So it goes to the point that you will possibly need to compare the differences... and see to which direction to go, be it implementation wise, be it interface wise... I know I used a non-C++ standard here... but at the time I wrote the article almost nobody used the STL one because every other implementation tried to be simpler... now that STL got a good performance people use it more... I still don't know if that's a good thing or if it just made the real problem live longer.

AnswerRe: The set shows the problem with this approach. Pin
ElectronProgrammer17-Feb-22 5:02
ElectronProgrammer17-Feb-22 5:02 
GeneralRe: The set shows the problem with this approach. Pin
Michael Sydney Balloni17-Feb-22 7:26
professionalMichael Sydney Balloni17-Feb-22 7:26 
GeneralRe: The set shows the problem with this approach. Pin
Michael Sydney Balloni18-Feb-22 6:16
professionalMichael Sydney Balloni18-Feb-22 6:16 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.