Click here to Skip to main content
15,891,629 members
Articles / Programming Languages / C++

32-bit pointers in a 64-bit world

Rate me:
Please Sign up or sign in to vote.
3.18/5 (6 votes)
24 Aug 2008CPOL3 min read 73.9K   204   17   15
64 bit pointers are wasteful if you don't need to access TBs of RAM

Introduction

64 bit is wonderful - we are no longer limited to just 2GB (per Win32 process) - you can now address a full 8TB of memory (per Win64 process)... which is great if your application really needs 8TB of memory.

If you don't need the full address space, 64-bit can be wasteful: pointers take twice the memory, which may actually result in a slower application, especially when using a pointer-laden data structure such as a binary tree (larger structures take more cache space).

A simple scheme is presented that allows using 32-bit "pointers" to access up to 64GB of memory on a 64-bit machine.

Encoding 64-bit pointers in 32-bits

The main idea of sptr is data alignment - modern processors feel much more comfortable when accessing "aligned" addresses: accessing a char at address 0xC0001 means more work than accessing an int at 0xC0000.

malloc() (which does the heavy lifting for new) only returns aligned addresses (see http://msdn.microsoft.com/en-us/library/ycsb6wwf.aspx) - a pointer returned by malloc() will always be aligned to a 16-byte boundary on Visual Studio - the 4 lowest bits will always be 0.

Therefore, at least for pointers we allocate on the heap, we can just "delete" the 4 lowest bits (by shifting right) without losing information.

If all memory addresses were allocated from the "bottom" (the highest bits are also 0), we could just encode a pointer as follows:

C++
#include <stddef.h>  // for uintptr_t
typedef unsigned __int32 uint32_t;

uint32_t encode(void* p)
{
  uintptr_t i = reinterpret_cast<uintptr_t>(p) >> 4;
  return static_cast<uint32_t>(i);
}

(uintptr_t is an integral data type that is guaranteed to be the same size as a pointer.)

This will encode any (16-byte aligned) address in the range 0x00 0000 0000 - 0x10 0000 0000 into a 32-bit integer.

However, the Operating System is free to map physical memory into other virtual memory segments - for example, mapping kernel memory "at the top", so it is preferable to keep such a map ourselves:

C++
#include <boost/thread/mutex.hpp>
#include <boost/thread/locks.hpp>

class sptr_base
{   
protected:
  static const uint32_t ALIGNMENT_BITS = 4;
  static const uint32_t ALIGNMENT = (1<<ALIGNMENT_BITS);
  static const uintptr_t ALIGNMENT_MASK = ALIGNMENT - 1;

protected:
  static uintptr_t     _seg_map[ALIGNMENT];
  static uintptr_t     _segs;
  static boost::mutex  _m;
  inline static uintptr_t ptr2seg(uintptr_t p)
  {
    p &= ~0xFFFFFFFFULL; // Keep only high part
    uintptr_t s = _segs;
    uintptr_t i = 0;
    for (; i < s; ++i)
      if (_seg_map[i] == p)
    return i;

    // Not found - now we do it the "right" way (mutex and all)
    boost::lock_guard<boost::mutex> lock(_m);
    for (i = 0; i < s; ++i)
      if (_seg_map[i] == p)
        return i;

    i = _segs++;
    if (_segs > ALIGNMENT)
      throw bad_segment("Segment out of range");

    _seg_map[i] = p;
    return i;
  }
};

ptr2seg() maps the high bits of a pointer to one of a few segments. The code refrains from using a mutex when reading the map, based on the atomicity of integer assignment.

The actual pointer encode/decode is then as follows:

C++
#include <boost/pool/detail/singleton.hpp>
typedef boost::details::pool::singleton_default<segment_mapper> the_segment_mapper;

uint32_t encode(const_native_pointer_type ptr)
{
  uintptr_t p = reinterpret_cast<uintptr_t>(ptr);
  if ((p & ALIGNMENT_MASK) != 0)
    throw bad_alignment("Pointer is not aligned");
  return (uint32_t)(ptr2seg(p) + p);
}

inline native_pointer_type decode(uint32_t e)
{
   uintptr_t el = e;
   uintptr_t ptr = (_seg_map[el & ALIGNMENT_MASK] + el) & ~ALIGNMENT_MASK;
   return reinterpret_cast<native_pointer_type>(ptr);
}

sptr

sptr gives a pointer-like interface to the "compressed" pointer. It allows accessing our "pointer" similar to a real pointer:

C++
sptr<int> p = new int;
*p = 5;
delete p;

// Conversions from/to pointers and pointer arithmetics work
int* q = new int;
p = q;
p++;
q = p;

Limitations

sptr has several limitations (which may make it inappropriate in many places).

First, sptr can't access the whole 64-bit address space (duh!). It is appropriate only for applications requiring just a bit more than the 32-bit address space (server machines are usually limited to 16GB-32GB of memory; Desktops usually can't deal with more than 8GB).

Second, although an address allocated on the heap will be aligned properly, pointer arithmetic may complicate things:

C++
sptr<char> buf = new char[100];

will work just fine as new returns an aligned address.

However, something like:

C++
for (sptr<char> p = buf; p != buf + 100; ++p)
   *p = '0';

will not work as p will now point to an unaligned address (the sptr implementation will assert at compile time).

However, in most cases, a standard pointer can be used:

C++
for (char* p = buf; p != buf + 100; ++p)
   *p = '0';

Third, mix sptr and multiple-inheritence with caution.

For example, given:

C++
struct A { char i; };
struct B { int j; };
struct C : public A, public B { };

This may or may not work (depending on structure alignment):

C++
sptr<C> c = new C;
sptr<B> b = c;

This is because b does not point to the same address as c (this is not the place for a discussion on multiple-inheritance, but you can try the following code to convince yourself):

C++
C* c = new C;
B* b = c; 
std::cout << ((uintptr_t)b - (uintptr_t)c) << std::endl;

There is no problem accessing the object via a standard pointer:

C++
sptr<C> c = new C;
B* b = c;

Performance

A small benchmark is provided with the code, implementing a simple linked-list which is repeatedly traversed.

The list's node is:

C++
struct node
{
   int val;
   node_ptr prev;
   node_ptr next;
};

The size of the node is therefore 20 bytes with standard pointers vs. 12 bytes with sptr.

However, this comes at a tradeoff - the extra bit shuffling impacts performance - the standard pointer version will perform 1.5x faster on the linked-list benchmark.

Accessing the map accounts for much of this performance hit. If it could be guaranteed that all addresses are allocated from the region 0x00 0000 0000 - 0x10 0000 0000, we could discard the map, significantly improving performance.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
United States United States
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
QuestionGood solution, would add just1 improvement Pin
JCems16-Jan-15 7:04
JCems16-Jan-15 7:04 
GeneralUse with caution Pin
tobywf2-Sep-08 20:18
tobywf2-Sep-08 20:18 
GeneralRe: Use with caution Pin
Stefan_Lang3-Sep-08 4:00
Stefan_Lang3-Sep-08 4:00 
AnswerRe: Use with caution Pin
cppnow10-Oct-08 4:04
cppnow10-Oct-08 4:04 
GeneralRe: Use with caution Pin
Stefan_Lang10-Oct-08 4:14
Stefan_Lang10-Oct-08 4:14 
GeneralRe: Use with caution Pin
cppnow10-Oct-08 6:16
cppnow10-Oct-08 6:16 
GeneralRe: Use with caution Pin
Stefan_Lang10-Oct-08 6:30
Stefan_Lang10-Oct-08 6:30 
GeneralRe: Use with caution Pin
cppnow10-Oct-08 6:58
cppnow10-Oct-08 6:58 
GeneralRe: Use with caution Pin
Stefan_Lang12-Oct-08 22:52
Stefan_Lang12-Oct-08 22:52 
GeneralRe: Use with caution Pin
cppnow13-Oct-08 5:20
cppnow13-Oct-08 5:20 
Stefan63 wrote:
Erm, pools don't reallocate memory. Not at all! They just add new blocks of memory to the available pool, or release blocks that are no longer used. So you won't ever have to worry about pointer issues. Basically what you have is a list of memory blocks with a wrapper that simulates continuous memory.


I know what pools are - and as I said in an earlier post - standard memory allocators implement pools for small objects.
What I meant was is that if you want to use index-based pointers (not pools), you are essentially stuck with an array you need to reallocate.


Stefan63 wrote:
2. optimizing memory allocation to seamlessly fill memory block sizes.

The latter aspect is exactly what the above article appears to focus on, this is what gave me the idea to just go for pools instead.


Pools are great when you need to allocate lots of objects of the same size - but once you extend the size of the pool by allocated another block from the heap, you can't use indexes as pointers.


Stefan63 wrote:
Regarding your example - if you have large size data blocks in each node then your gain in memory for using 32 bit pointers will be negligible.


Sure.

Stefan63 wrote:
If you have small data blocks however, then you'll run into another problem, which is the performance of allocating (and possibly deallocating) billions of minute chunks of memory one at a time. Try allocating, say, one million blocks of 32 byte each using new. Get yourself a coffee while your program is busy. Now do the same with one billion blocks... In such a case you absolutely will need something like pools!


Also true - which is why general purpose memory allocators do implement pools for allocating small objects (off topic: if you know the size of the allocated block, you can also avoid the overhead of a header at for the memory block - that's what the standard STL allocator does - at least in the GNU implementation).

However, regardless of the memory allocator you use - you still are left with a pointer that in some circumstances may bloat your app. Of course - if your applications doesn't use pointer-heavy structures and/or the actual elements are large compared to the pointers, the suggested solution doesn't apply.
QuestionBenchmarks? Pin
Leo Davidson25-Aug-08 13:10
Leo Davidson25-Aug-08 13:10 
AnswerRe: Benchmarks? Pin
cppnow26-Aug-08 4:20
cppnow26-Aug-08 4:20 
GeneralError Pin
steveb25-Aug-08 9:24
mvesteveb25-Aug-08 9:24 
GeneralRe: Error Pin
JeanLuc_25-Aug-08 10:30
JeanLuc_25-Aug-08 10:30 
GeneralRe: Error Pin
cppnow25-Aug-08 10:49
cppnow25-Aug-08 10:49 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.