Click here to Skip to main content
15,891,708 members
Articles / Programming Languages / C++

32-bit pointers in a 64-bit world

Rate me:
Please Sign up or sign in to vote.
3.18/5 (6 votes)
24 Aug 2008CPOL3 min read 73.9K   204   17   15
64 bit pointers are wasteful if you don't need to access TBs of RAM

Introduction

64 bit is wonderful - we are no longer limited to just 2GB (per Win32 process) - you can now address a full 8TB of memory (per Win64 process)... which is great if your application really needs 8TB of memory.

If you don't need the full address space, 64-bit can be wasteful: pointers take twice the memory, which may actually result in a slower application, especially when using a pointer-laden data structure such as a binary tree (larger structures take more cache space).

A simple scheme is presented that allows using 32-bit "pointers" to access up to 64GB of memory on a 64-bit machine.

Encoding 64-bit pointers in 32-bits

The main idea of sptr is data alignment - modern processors feel much more comfortable when accessing "aligned" addresses: accessing a char at address 0xC0001 means more work than accessing an int at 0xC0000.

malloc() (which does the heavy lifting for new) only returns aligned addresses (see http://msdn.microsoft.com/en-us/library/ycsb6wwf.aspx) - a pointer returned by malloc() will always be aligned to a 16-byte boundary on Visual Studio - the 4 lowest bits will always be 0.

Therefore, at least for pointers we allocate on the heap, we can just "delete" the 4 lowest bits (by shifting right) without losing information.

If all memory addresses were allocated from the "bottom" (the highest bits are also 0), we could just encode a pointer as follows:

C++
#include <stddef.h>  // for uintptr_t
typedef unsigned __int32 uint32_t;

uint32_t encode(void* p)
{
  uintptr_t i = reinterpret_cast<uintptr_t>(p) >> 4;
  return static_cast<uint32_t>(i);
}

(uintptr_t is an integral data type that is guaranteed to be the same size as a pointer.)

This will encode any (16-byte aligned) address in the range 0x00 0000 0000 - 0x10 0000 0000 into a 32-bit integer.

However, the Operating System is free to map physical memory into other virtual memory segments - for example, mapping kernel memory "at the top", so it is preferable to keep such a map ourselves:

C++
#include <boost/thread/mutex.hpp>
#include <boost/thread/locks.hpp>

class sptr_base
{   
protected:
  static const uint32_t ALIGNMENT_BITS = 4;
  static const uint32_t ALIGNMENT = (1<<ALIGNMENT_BITS);
  static const uintptr_t ALIGNMENT_MASK = ALIGNMENT - 1;

protected:
  static uintptr_t     _seg_map[ALIGNMENT];
  static uintptr_t     _segs;
  static boost::mutex  _m;
  inline static uintptr_t ptr2seg(uintptr_t p)
  {
    p &= ~0xFFFFFFFFULL; // Keep only high part
    uintptr_t s = _segs;
    uintptr_t i = 0;
    for (; i < s; ++i)
      if (_seg_map[i] == p)
    return i;

    // Not found - now we do it the "right" way (mutex and all)
    boost::lock_guard<boost::mutex> lock(_m);
    for (i = 0; i < s; ++i)
      if (_seg_map[i] == p)
        return i;

    i = _segs++;
    if (_segs > ALIGNMENT)
      throw bad_segment("Segment out of range");

    _seg_map[i] = p;
    return i;
  }
};

ptr2seg() maps the high bits of a pointer to one of a few segments. The code refrains from using a mutex when reading the map, based on the atomicity of integer assignment.

The actual pointer encode/decode is then as follows:

C++
#include <boost/pool/detail/singleton.hpp>
typedef boost::details::pool::singleton_default<segment_mapper> the_segment_mapper;

uint32_t encode(const_native_pointer_type ptr)
{
  uintptr_t p = reinterpret_cast<uintptr_t>(ptr);
  if ((p & ALIGNMENT_MASK) != 0)
    throw bad_alignment("Pointer is not aligned");
  return (uint32_t)(ptr2seg(p) + p);
}

inline native_pointer_type decode(uint32_t e)
{
   uintptr_t el = e;
   uintptr_t ptr = (_seg_map[el & ALIGNMENT_MASK] + el) & ~ALIGNMENT_MASK;
   return reinterpret_cast<native_pointer_type>(ptr);
}

sptr

sptr gives a pointer-like interface to the "compressed" pointer. It allows accessing our "pointer" similar to a real pointer:

C++
sptr<int> p = new int;
*p = 5;
delete p;

// Conversions from/to pointers and pointer arithmetics work
int* q = new int;
p = q;
p++;
q = p;

Limitations

sptr has several limitations (which may make it inappropriate in many places).

First, sptr can't access the whole 64-bit address space (duh!). It is appropriate only for applications requiring just a bit more than the 32-bit address space (server machines are usually limited to 16GB-32GB of memory; Desktops usually can't deal with more than 8GB).

Second, although an address allocated on the heap will be aligned properly, pointer arithmetic may complicate things:

C++
sptr<char> buf = new char[100];

will work just fine as new returns an aligned address.

However, something like:

C++
for (sptr<char> p = buf; p != buf + 100; ++p)
   *p = '0';

will not work as p will now point to an unaligned address (the sptr implementation will assert at compile time).

However, in most cases, a standard pointer can be used:

C++
for (char* p = buf; p != buf + 100; ++p)
   *p = '0';

Third, mix sptr and multiple-inheritence with caution.

For example, given:

C++
struct A { char i; };
struct B { int j; };
struct C : public A, public B { };

This may or may not work (depending on structure alignment):

C++
sptr<C> c = new C;
sptr<B> b = c;

This is because b does not point to the same address as c (this is not the place for a discussion on multiple-inheritance, but you can try the following code to convince yourself):

C++
C* c = new C;
B* b = c; 
std::cout << ((uintptr_t)b - (uintptr_t)c) << std::endl;

There is no problem accessing the object via a standard pointer:

C++
sptr<C> c = new C;
B* b = c;

Performance

A small benchmark is provided with the code, implementing a simple linked-list which is repeatedly traversed.

The list's node is:

C++
struct node
{
   int val;
   node_ptr prev;
   node_ptr next;
};

The size of the node is therefore 20 bytes with standard pointers vs. 12 bytes with sptr.

However, this comes at a tradeoff - the extra bit shuffling impacts performance - the standard pointer version will perform 1.5x faster on the linked-list benchmark.

Accessing the map accounts for much of this performance hit. If it could be guaranteed that all addresses are allocated from the region 0x00 0000 0000 - 0x10 0000 0000, we could discard the map, significantly improving performance.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
United States United States
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
QuestionGood solution, would add just1 improvement Pin
JCems16-Jan-15 7:04
JCems16-Jan-15 7:04 
GeneralUse with caution Pin
tobywf2-Sep-08 20:18
tobywf2-Sep-08 20:18 
GeneralRe: Use with caution Pin
Stefan_Lang3-Sep-08 4:00
Stefan_Lang3-Sep-08 4:00 
AnswerRe: Use with caution Pin
cppnow10-Oct-08 4:04
cppnow10-Oct-08 4:04 
GeneralRe: Use with caution Pin
Stefan_Lang10-Oct-08 4:14
Stefan_Lang10-Oct-08 4:14 
GeneralRe: Use with caution Pin
cppnow10-Oct-08 6:16
cppnow10-Oct-08 6:16 
GeneralRe: Use with caution Pin
Stefan_Lang10-Oct-08 6:30
Stefan_Lang10-Oct-08 6:30 
GeneralRe: Use with caution Pin
cppnow10-Oct-08 6:58
cppnow10-Oct-08 6:58 
GeneralRe: Use with caution Pin
Stefan_Lang12-Oct-08 22:52
Stefan_Lang12-Oct-08 22:52 
cppnow wrote:
From my (limited) experience, boost pool doesn't scale that well.


That's what I said Wink | ;) IIRC the pool implementation is meant to allocate memory blocks at ever increasing sizes unto a certain limit - however, the limitation part appears to be broken and to my knowledge was never fixed so far. Someone else on the boost list should fix it, as the original implementer seems to have dropped maintenance.

To me it appeared quite easy to modify it so it doesn't indefinitely double the allocated blocks' sizes. Unfortunately I never had the time to properly test that modification although it appeared to work quite well for the tests I ran. I did it for only one type of pool only (fast_pool) and my knowledge isn't quite at the level of the orignal author, so I'm reluctant to jump in that gap.

cppnow wrote:
Otherwise, you may find yourself reallocating a huge array (not to mention loose pointers that may be pointing to it).


Erm, pools don't reallocate memory. Not at all! They just add new blocks of memory to the available pool, or release blocks that are no longer used. So you won't ever have to worry about pointer issues. Basically what you have is a list of memory blocks with a wrapper that simulates continuous memory.

Effectively a pool is a memory manager that specializes in allocating chunks of memory of a given size, instead of just byte size. The specialization consists of two parts:

1. minimizing the effort of allocating (and releasing) huge numbers of objects and

2. optimizing memory allocation to seamlessly fill memory block sizes.

The latter aspect is exactly what the above article appears to focus on, this is what gave me the idea to just go for pools instead.

Regarding your example - if you have large size data blocks in each node then your gain in memory for using 32 bit pointers will be negligible. If you have small data blocks however, then you'll run into another problem, which is the performance of allocating (and possibly deallocating) billions of minute chunks of memory one at a time. Try allocating, say, one million blocks of 32 byte each using new. Get yourself a coffee while your program is busy. Now do the same with one billion blocks... In such a case you absolutely will need something like pools!
GeneralRe: Use with caution Pin
cppnow13-Oct-08 5:20
cppnow13-Oct-08 5:20 
QuestionBenchmarks? Pin
Leo Davidson25-Aug-08 13:10
Leo Davidson25-Aug-08 13:10 
AnswerRe: Benchmarks? Pin
cppnow26-Aug-08 4:20
cppnow26-Aug-08 4:20 
GeneralError Pin
steveb25-Aug-08 9:24
mvesteveb25-Aug-08 9:24 
GeneralRe: Error Pin
JeanLuc_25-Aug-08 10:30
JeanLuc_25-Aug-08 10:30 
GeneralRe: Error Pin
cppnow25-Aug-08 10:49
cppnow25-Aug-08 10:49 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.