Using C++ Move Semantics to Manage Pointers to Externally Allocated Memory

Bruno van Dooren

Nov 4, 2022

MIT

The Win32 subsystem often returns pointers to objects that need to be deallocated by the caller. In this article, I show a way to do this reliably and in an exception-safe manner.

Download source - 7.2 KB

Introduction

I've been doing a lot of security related programming lately. The Windows security related APIs are fairly easy to use, but they are fundamentally C style interfaces. On the surface, you'd think that since C++ is a superset of C, this is not an issue.

The problem is that, many times, the API hands back its result through a pointer and expects you to free it using LocalFree. Take, for example, the ConvertSidToStringSid API:

BOOL ConvertSidToStringSidW( [in] PSID Sid, [out] LPWSTR *StringSid );

It returns StringSid as a pointer which needs to be freed by the caller. But consider the following:

    wstring w32_ConvertSidToStringSid(PSID pSid)
    {
        PWCHAR sidString;

        //Get a human readable version of the SID
        if (!ConvertSidToStringSid(pSid, &sidString)) {
            throw ExWin32Error();
        }
        
        wstring retval = sidString;
        LocalFree(sidString);

        return retval;
    }

This is a possible implementation of a wrapper function. I maintain a large collection of API wrappers like this for two reasons. The first is that dealing with strings is just soooooo much easier when using std::wstring instead of PWCHAR datatypes. The second is that your overall code is so much cleaner and more readable when using exceptions and RAII. And this is where the conflict between the C style API and C++ becomes obvious.

If you look at the code, you'll notice that we initialize a wstring variable using the returned pointer, after which we free the pointer. If, for some reason, the wstring constructor throws an exception, sidString is never freed and there will be a memory leak.

Sure, the chances of the wstring constructor throwing an exception are virtually nonexistent. And in this case, we could probably figure out an easy way around it using a stack based array and a memory copy. But the principle of the problem still exists. And it gets a lot worse if the API returns a pointer that we have to use as a parameter in other function calls. In that scenario, we simply cannot predict when an exception will happen. On top of that, the resulting code would be a spaghetti of nested if statements.
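To make the leak concrete, here is a small portable sketch. The names api_alloc, api_free, use_buffer and Guard are hypothetical stand-ins, not Win32 calls, and the "allocation" only bumps a counter so the deliberate leak costs no real memory. It contrasts manual cleanup with a guard object whose destructor runs during stack unwinding.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for an API whose results the caller must free.
// They only count, so the deliberate leak below costs no real memory.
static int g_live = 0;
static char g_slot[16];

void* api_alloc() { ++g_live; return g_slot; }
void api_free(void* p) { if (p) --g_live; }

// Any work done while the buffer is held may throw.
void use_buffer(void*, bool fail) {
    if (fail) throw std::runtime_error("failed while holding the buffer");
}

// Manual cleanup: api_free is skipped when use_buffer throws.
void manual(bool fail) {
    void* p = api_alloc();
    use_buffer(p, fail);
    api_free(p);
}

// Guard object: its destructor also runs during stack unwinding.
struct Guard {
    void* p;
    explicit Guard(void* q) : p(q) {}
    ~Guard() { api_free(p); }
    Guard(const Guard&) = delete;
    Guard& operator=(const Guard&) = delete;
};

void guarded(bool fail) {
    Guard g(api_alloc());
    use_buffer(g.p, fail);
}

// Number of buffers still outstanding after f(true) has thrown.
int leaked_after_throw(void (*f)(bool)) {
    const int before = g_live;
    try { f(true); } catch (const std::runtime_error&) {}
    return g_live - before;
}

void demo() {
    assert(leaked_after_throw(manual) == 1);   // the throw skipped api_free
    assert(leaked_after_throw(guarded) == 0);  // the guard released the buffer
}
```

The guard here is deliberately minimal. It cannot be moved or handed to a caller, which is exactly the gap a proper smart pointer fills.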

What we REALLY want is a way to deal with these pointers that

is guaranteed to free the pointer,
is exception safe, and
allows easy passing of the pointer to callers / subroutines.

In short, we need to come up with a smart pointer implementation.

Reusing unique_ptr?

I am a big proponent of re-use. If I -can- reasonably use a class that has seen years of use and fine tuning, I certainly will. For COM, I use CComPtr. For Variant structures, I use CComVariant, etc.

My first thought was to use unique_ptr. It does almost everything we want. The implementation I made as a test is pretty simple:

using deleter = void(*)(void *);
void deleterfunc(void* ptr) { if (ptr) LocalFree(ptr); }

template<typename T>
struct CLocalAllocPtr : public std::unique_ptr < T, deleter>
{
public:
    CLocalAllocPtr(T* t) : std::unique_ptr < T, deleter>(t, deleterfunc) {}
};

That's it. Because the deleter type is a plain function pointer, the unique_ptr constructor has to be given the function that will eventually clean up the pointer. Our CLocalAllocPtr is 100% a unique_ptr with a different cleanup function because the memory our pointer is pointing to was allocated in a way that requires LocalFree instead of other heap management functions.
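As a side note, the deleter does not have to be a function pointer. The following sketch is a variation, not the code from the download, and uses a stateless function object instead. To keep it portable, platform_free is a hypothetical stand-in for LocalFree built on std::free; the names PlatformFree, OwnedPtr and FnOwnedPtr are also made up for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

static int g_frees = 0;  // counts calls to the stand-in free function

// Stand-in for LocalFree so the sketch builds on any platform.
void platform_free(void* p) { std::free(p); ++g_frees; }

// Stateless deleter type: the unique_ptr needs no function pointer argument.
struct PlatformFree {
    void operator()(void* p) const noexcept { platform_free(p); }
};

template <typename T>
using OwnedPtr = std::unique_ptr<T, PlatformFree>;

// The function-pointer flavour, kept only for a size comparison.
template <typename T>
using FnOwnedPtr = std::unique_ptr<T, void (*)(void*)>;

// Returns how many buffers were freed when the inner scope ended.
int frees_after_scope() {
    const int before = g_frees;
    {
        OwnedPtr<wchar_t> p(static_cast<wchar_t*>(std::malloc(42)));
        OwnedPtr<wchar_t> empty;  // default construction works without a deleter argument
    }
    return g_frees - before;  // only the non-empty pointer invokes the deleter
}
```

With a deleter type like this, no derived class is needed at all, and on mainstream standard libraries the smart pointer is typically no larger than a raw pointer. Whether that matters is a design choice; the function-pointer version above behaves the same way at run time.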

We can use it like this:

void foo(LPWSTR* arg) {
    *arg = (LPWSTR)LocalAlloc(LPTR,42);
}

int main()
{
    WCHAR* rawPtr = NULL;
    foo(&rawPtr);
    CLocalAllocPtr <WCHAR> smartPtr(rawPtr);
    return 0;
}

Assume foo here is an API call which is outside of our control. foo will allocate the memory and return the pointer. Since we are responsible for it, we pass control of the pointer to CLocalAllocPtr which will manage the lifetime and make sure LocalFree is executed when smartPtr goes out of scope.

unique_ptr implements move semantics so we can also do the following:

void foo(LPWSTR* arg) {
    *arg = (LPWSTR)LocalAlloc(LPTR,42);
}

CLocalAllocPtr <WCHAR> Bar() {
    WCHAR* rawPtr = NULL;
    foo(&rawPtr);
    return CLocalAllocPtr <WCHAR> (rawPtr);
}

int main()
{
    CLocalAllocPtr <WCHAR> smartPtr2 = Bar();
    return 0;
}

We can transfer ownership of the pointer to callers and subroutines. On the surface, it does everything we need.

Accessing the Raw Pointer

One could argue that there are many times when you need to supply the pointer value directly to an API call. The unique_ptr class provides a get() method.

void Baz(WCHAR* arg) { }

int main()
{
    CLocalAllocPtr <WCHAR> smartPtr2 = Bar();
    Baz(smartPtr2.get());

    return 0;
}

Honestly, I don't like that approach. Yes, I know it's the 'C++' way to do things but I want an automatic conversion. That too is but a small addition to our CLocalAllocPtr class.

template<typename T>
struct CLocalAllocPtr : public std::unique_ptr < T, deleter>
{
public:
    CLocalAllocPtr(T* t) : std::unique_ptr < T, deleter>(t, deleterfunc) {}

    operator T* () {
        return this->get();
    }
};

void Baz(WCHAR* arg) { }

int main()
{
    CLocalAllocPtr <WCHAR> smartPtr2 = Bar();
    Baz(smartPtr2); //automatic conversion to pointer
    return 0;
}

With the addition of a simple conversion operator, we can actually use the smart pointer just like we would use a regular pointer. That's it, case closed, job well done!

...

Except there is one little detail that spoils the fun. If I'm honest, the above implementation is solid, and builds upon unique_ptr which is great from a design point of view. However, it still relies on the programmer to IMMEDIATELY wrap the raw pointer into a smart pointer. For a simple example like ours, this is trivial. But if you're dealing with many pointers, you can still create problems if you fail to immediately do it. Plus, from a cleanliness point of view, it is one extra step which I want to eliminate.

What I REALLY want is behavior like CComPtr's address-of operator (operator &), which allows me to do things like this:

        CComPtr<IADs> rootDse = NULL;
        hr = ADsOpenObject(L"LDAP://rootDSE",
            NULL,
            NULL,
            ADS_AUTHENTICATION_ENUM::ADS_SECURE_AUTHENTICATION, // Use Secure 
                                                                // Authentication
            IID_IADs,
            (void**)&rootDse);

A COM smart pointer allows its address to be taken to obtain a pointer to its internal pointer. This means that when the call to ADsOpenObject finishes, the smart pointer is initialized. There's no need to add an extra step. Sadly, this is not possible with a unique_ptr. Its entire premise is that it is unique and solely responsible for managing the lifetime. And in order to make that guarantee, it keeps that member private. This means that even in our derived class, we cannot access it.
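For comparison, here is a sketch of how far a unique_ptr based design can get: the extra step cannot be removed, but it can be hidden inside a helper. Everything here is hypothetical and portable: fake_api stands in for a C-style API with an out-parameter, FreeDeleter uses std::free in place of LocalFree, and receive is a made-up helper name.

```cpp
#include <cassert>
#include <cstdlib>
#include <cwchar>
#include <memory>

struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};

template <typename T>
using OwnedPtr = std::unique_ptr<T, FreeDeleter>;

// Hypothetical C-style API: allocates a string and returns it via an out-parameter.
bool fake_api(wchar_t** out) {
    const wchar_t text[] = L"S-1-5-18";
    *out = static_cast<wchar_t*>(std::malloc(sizeof(text)));
    if (*out == nullptr) return false;
    std::wmemcpy(*out, text, sizeof(text) / sizeof(text[0]));
    return true;
}

// Calls fill(&raw) and wraps the result before any other code can run.
template <typename T, typename Fill>
OwnedPtr<T> receive(Fill&& fill) {
    T* raw = nullptr;
    fill(&raw);
    return OwnedPtr<T>(raw);
}

bool demo() {
    OwnedPtr<wchar_t> sid = receive<wchar_t>([](wchar_t** out) { fake_api(out); });
    return sid && std::wcscmp(sid.get(), L"S-1-5-18") == 0;
}
```

This keeps the raw pointer out of the calling code, but every call site now needs a lambda, which is arguably clumsier than simply writing &smartPtr.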

As they say: this implementation is close, but no cigar. We'll have to get back to the drawing board and roll our own if we want to combine certain unique_ptr behavior with an address-of operator.

Implementing CLocalAllocPtr from Scratch

Thankfully, what we want is fairly limited in scope, so there is no need to implement unique_ptr all over again. Let's start with the constructor / destructor.

    template<typename T>
    struct CLocalAllocPtr
    {
        T Ptr = NULL;

        void Release() {
            if (Ptr) {
                LocalFree(Ptr);
                Ptr = NULL;
            }
        }

        ~CLocalAllocPtr() {
            Release();
        }

        CLocalAllocPtr() {
            Ptr = NULL;
        }

        CLocalAllocPtr(T ptr) {
            Ptr = ptr;
        }

        CLocalAllocPtr(CLocalAllocPtr&& other) noexcept {
            if (&(other.Ptr) != &(this->Ptr)) {
                Ptr = other.Ptr;
                other.Ptr = NULL;
            }
        }    
    };

We have three constructors. The default one just initializes an empty smart pointer. The one that takes a raw pointer assumes ownership of the pointer. And then, there is a move constructor. The move constructor is used whenever the object is initialized with an rvalue. When that happens, it assumes ownership of the contained pointer and clears out the pointer from the rvalue to avoid double destruction.

There is no copy constructor because in our scenario, that would not make sense. The point of having this class is to manage the lifecycle of pointers that have been allocated by another party. We cannot copy or duplicate that behavior, nor do we want to. Should we want to have another instance, then the right approach is to ask the other party to allocate one for us.
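The "moves yes, copies no" contract can also be written down and checked at compile time. The sketch below is a portable stand-in for the class above (std::free replaces LocalFree, and the name OwnedPtr is hypothetical). It spells out the deleted copy operations instead of leaving them implicit, and uses type traits to verify the result.

```cpp
#include <cassert>
#include <cstdlib>
#include <type_traits>
#include <utility>

// Portable stand-in for the class above: std::free replaces LocalFree.
template <typename T>
struct OwnedPtr {
    T Ptr = nullptr;

    OwnedPtr() = default;
    explicit OwnedPtr(T p) : Ptr(p) {}
    ~OwnedPtr() { std::free(Ptr); }  // std::free(nullptr) is a no-op

    OwnedPtr(OwnedPtr&& other) noexcept : Ptr(other.Ptr) { other.Ptr = nullptr; }

    // Spelled out rather than left implicit, so the intent is visible.
    OwnedPtr(const OwnedPtr&) = delete;
    OwnedPtr& operator=(const OwnedPtr&) = delete;
};

static_assert(!std::is_copy_constructible<OwnedPtr<char*>>::value, "no copies");
static_assert(std::is_nothrow_move_constructible<OwnedPtr<char*>>::value, "moves are allowed");

bool move_transfers_ownership() {
    OwnedPtr<char*> a(static_cast<char*>(std::malloc(8)));
    char* raw = a.Ptr;
    OwnedPtr<char*> b(std::move(a));
    return a.Ptr == nullptr && b.Ptr == raw;  // a is empty, b owns the buffer
}
```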

Next to the constructors, we also have the assignment operators.

        CLocalAllocPtr& operator = (CLocalAllocPtr&& other) noexcept {
            if (&(other.Ptr) != &(this->Ptr)) {
                Release();
                Ptr = other.Ptr;
                other.Ptr = NULL;
            }
            return *this;
        }

        CLocalAllocPtr& operator = (T t) noexcept {
            Release();
            Ptr = t;
            return *this;
        }

In both cases, we take ownership of the raw pointer, and in both cases, we need to anticipate that if the instance already contains another pointer, it needs to be released.

In a move constructor / assignment, we need to check for self assignment. This is typically done with a comparison like if (&other != this). In this case, that is not an option because (shown in next section) I overload the & operator in order to be able to use the class as a smart pointer. However, that doesn't really matter because the point of the check is to determine if the objects are one and the same. For that purpose, we can also compare the addresses of the 'Ptr' variables in the objects. After all, the Ptr values are local to the object, so if they have different address locations, the objects are different too.
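An alternative to comparing the member addresses is std::addressof from the memory header, which returns the real address of an object even when operator & is overloaded. The sketch below is a simplified, portable stand-in (the Owner type is hypothetical and uses std::free rather than LocalFree) and shows the self-assignment check written that way.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>   // std::addressof
#include <utility>

struct Owner {
    void* Ptr = nullptr;

    Owner() = default;
    explicit Owner(void* p) : Ptr(p) {}
    ~Owner() { std::free(Ptr); }
    Owner(const Owner&) = delete;
    Owner& operator=(const Owner&) = delete;

    // Overloaded address-of: &owner yields the address of the stored pointer.
    void** operator&() { return &Ptr; }

    Owner& operator=(Owner&& other) noexcept {
        // std::addressof bypasses the overload and yields the object's own address.
        if (std::addressof(other) != this) {
            std::free(Ptr);
            Ptr = other.Ptr;
            other.Ptr = nullptr;
        }
        return *this;
    }
};

bool self_move_keeps_pointer() {
    Owner o(std::malloc(8));
    void* raw = o.Ptr;
    Owner& alias = o;
    o = std::move(alias);  // self-move through an alias: must not free the buffer
    return o.Ptr == raw;
}
```

Both checks answer the same question, so which one to use is mostly a matter of taste.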

Accessing the Raw Pointer

With the lifecycle management of the pointer out of the way, we can implement the code for accessing the pointer.

        //Get a reference to the pointer value
        T* operator &() {
            return &Ptr;
        }

        //Cast to the pointertype
        operator T () {
            return Ptr;
        }

        //access members of Ptr
        T operator -> () {
            return Ptr;
        }

The address-of operator (&) is used when we want to give a subroutine direct access to the contained pointer, similar to the behaviour of a COM smart pointer. The conversion operator allows for implicit conversion to the raw pointer value. That is used often when passing the pointer to a subroutine.
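A portable sketch of how these operators are meant to be used together follows. The two fake_ functions are hypothetical stand-ins for a Win32 style API pair, and OwnedPtr mirrors the operators above with std::free in place of LocalFree.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cwchar>

// Portable stand-in: same operators as above, std::free instead of LocalFree.
template <typename T>
struct OwnedPtr {
    T Ptr = nullptr;

    OwnedPtr() = default;
    ~OwnedPtr() { std::free(Ptr); }
    OwnedPtr(const OwnedPtr&) = delete;
    OwnedPtr& operator=(const OwnedPtr&) = delete;

    T* operator&() { return &Ptr; }  // hands the internal slot to an out-parameter
    operator T() { return Ptr; }     // implicit conversion to the raw pointer
};

// Hypothetical API: allocates a string and returns it via an out-parameter.
bool fake_get_name(wchar_t** out) {
    const wchar_t text[] = L"SYSTEM";
    *out = static_cast<wchar_t*>(std::malloc(sizeof(text)));
    if (*out == nullptr) return false;
    std::wmemcpy(*out, text, sizeof(text) / sizeof(text[0]));
    return true;
}

// Hypothetical API: consumes a raw pointer.
std::size_t fake_length(const wchar_t* s) { return std::wcslen(s); }

std::size_t demo_length() {
    OwnedPtr<wchar_t*> name;
    if (!fake_get_name(&name)) return 0;  // operator &: the API writes straight into name.Ptr
    return fake_length(name);             // operator T: converts to wchar_t*
}   // name goes out of scope here and the buffer is freed
```

One caveat that follows from the code as written: operator & does not release a pointer that is already held, so reusing a non-empty instance as an out-parameter would overwrite and leak the old value.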

Maybe you've noticed that our first implementation had T as the type pointed to by the pointer, whereas this implementation has T as the 'pointer to something' type. This is intentional. It would have been possible to implement CLocalAllocPtr like unique_ptr and take the target type as the template argument instead of the pointer type. Functionally, it would work perfectly. The problem lies with the automatic cast to the raw pointer.

Let's go back to our use case and consider this function.

BOOL ConvertSidToStringSidW( [in] PSID Sid, [out] LPWSTR *StringSid );

Suppose we implement CLocalAllocPtr in a way that takes T to be whatever the pointer is pointing to. If we would want to use it like that and call that API, it would look like this:

CLocalAllocPtr<SID> pSid;
CLocalAllocPtr<WCHAR> outStr;

//... the SID comes from somewhere ...

ConvertSidToStringSidW( pSid, &outStr );

And this would work. I had both implementations side by side for comparison. In the end, I chose the implementation that made the most sense: the one that takes pointer types. This way, you wrap a PSID and use the smart pointer exactly like a PSID. You wrap an LPWSTR and use it like an LPWSTR.

The alternative implementation wraps a SID type and uses it like a PSID. It wraps a WCHAR type and uses it like an LPWSTR. It's functionally equivalent but it looks weird and out of place.

The Compiler Avoids a Pitfall

As I was testing my code, I was wondering about a potential pitfall that could lead to double deletion.

CLocalAllocPtr<PSID> pSid1;
CLocalAllocPtr<PSID> pSid2;

//... the SIDs come from somewhere ...

pSid1 = pSid2;  //????

There is an implicit conversion to the raw pointer available, and there is an assignment operator which takes a raw pointer. If the compiler used them automatically as a best fit to get around the fact that we have no copy constructor or copy assignment, we'd be in a lot of trouble, because this could lead to a situation where we have two smart pointers each thinking they own the same pointer.

As it turns out, the compiler correctly refuses to compile this, with the following message:

error C2280: 'CLocalAllocPtr<PSID> &CLocalAllocPtr<PSID>::operator =
(const CLocalAllocPtr<PSID> &)': attempting to reference a deleted function
 message : compiler has generated 'CLocalAllocPtr<PSID>::operator =' here
 message : 'CLocalAllocPtr<PSID> &CLocalAllocPtr<PSID>::operator =
(const CLocalAllocPtr<PSID> &)': function was implicitly deleted 
because 'CLocalAllocPtr<PSID>' has a user-defined move constructor

When you declare a move constructor or move assignment operator in a class, the compiler still implicitly declares the copy constructor and copy assignment operator, but defines them as deleted. This is a logical thing to do because if you implement move semantics, it's a safe bet that you do not want automatic copying. If you want copying, you need to implement it explicitly.

The net result is that because the deleted declarations still take part in overload resolution, when we try to compile pSid1 = pSid2 the compiler will choose the copy assignment over the conversion to pointer and the pointer assignment, because it is the better match: it needs no user-defined conversion. Selecting a deleted function is a compilation error, which informs you that something is going on that you may want to reconsider.
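The same behaviour can be checked without provoking a compiler error by asking the type traits. The sketch below is a portable stand-in with the same set of operations (the name OwnedPtr is hypothetical and std::free replaces LocalFree). It verifies that assignment from an lvalue is rejected while an explicit std::move is accepted.

```cpp
#include <cassert>
#include <cstdlib>
#include <type_traits>
#include <utility>

template <typename T>
struct OwnedPtr {
    T Ptr = nullptr;

    OwnedPtr() = default;
    explicit OwnedPtr(T p) : Ptr(p) {}
    ~OwnedPtr() { std::free(Ptr); }

    OwnedPtr(OwnedPtr&& o) noexcept : Ptr(o.Ptr) { o.Ptr = nullptr; }
    OwnedPtr& operator=(OwnedPtr&& o) noexcept {
        if (&o.Ptr != &Ptr) { std::free(Ptr); Ptr = o.Ptr; o.Ptr = nullptr; }
        return *this;
    }
    OwnedPtr& operator=(T p) noexcept { std::free(Ptr); Ptr = p; return *this; }
    operator T() { return Ptr; }
};

using P = OwnedPtr<char*>;

// a = b with two lvalues selects the deleted copy assignment, so it is rejected...
static_assert(!std::is_assignable<P&, P&>::value, "lvalue assignment must not compile");
// ...while an explicit move is accepted.
static_assert(std::is_assignable<P&, P&&>::value, "move assignment is available");

bool move_assign_transfers() {
    P a(static_cast<char*>(std::malloc(8)));
    P b;
    char* raw = a.Ptr;
    b = std::move(a);  // the intended way to hand over ownership
    return a.Ptr == nullptr && b.Ptr == raw;
}
```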

It is still possible to do this:

CLocalAllocPtr<PSID> pSid1;
CLocalAllocPtr<PSID> pSid2;

//... the SIDs come from somewhere ...

pSid1 = (PSID)pSid2;  //????

That will force the compiler to choose the route of casting to raw pointer and then using the assignment operator. The results will be catastrophic but in fairness, if you shoot yourself in the foot like this, at least you knowingly pulled the trigger and can blame only yourself.

Conclusion

Using the CLocalAllocPtr class, you can safely receive raw pointers and use them in your code without worrying about memory leaks or other problems that stem from passing around raw pointers. Feel free to use it if you're dealing with the Win32 API in this manner.

Personally, I prefer my own implementation for its convenience. However, from a software management point of view, I can see why others would prefer the implementation which reuses unique_ptr. I've included that version in the source code download as well.

I've also included the implementation which uses the target type instead of the pointer type for reference.

Everything is licensed under the MIT license, so have fun with it.

History

10th November, 2022: Article replaced malloc with LocalAlloc
7th November, 2022: Updated code and article after user riki_p pointed out a typo
4th November, 2022: Initial release