Click here to Skip to main content
15,867,594 members
Please Sign up or sign in to vote.
3.00/5 (2 votes)
See more:
Given a thirty-two bit integer value as a parameter, write a function or method to return the number of "1" bits in the number.

So if You get passed 42, it should return 3 - 42 decimal is 2A hex, or 101010 binary - hence 3 "1" bits.

Special care should be taken to make it as efficient as possible: loops are a poor idea!

What I have tried:

Setting a Challenge for the first time ever!
The Lounge[^]
Posted
Updated 22-Mar-23 23:20pm
v4
Comments
Richard Deeming 17-Feb-17 11:29am    
SPOILER ALERT! :)

That's going to be trivial in Java[^] and C++[^]! :)

I'm guessing some variation of the Standford method[^] will work best for other languages.
OriginalGriff 17-Feb-17 11:47am    
Maybe trivial - it depends on who wrote the functions, and what method they used. See solution one, which *looks* efficient until you think about what is going on behind the scenes.
Richard Deeming 17-Feb-17 11:49am    
I believe the C++ one just calls the popcnt instruction on supported processors, which is about as efficient as you can get. :)
Jochen Arndt 17-Feb-17 12:05pm    
I was still testing my code while you wrote that :)

And if POPCNT is not supported there is a (probably) well known page about bit twiddling hacks
Richard Deeming 17-Feb-17 12:07pm    
I can neither confirm nor deny that the well-known page exists, nor that there's a link to it in the spoiler text above. :)

unsigned bitcnt(unsigned v){
unsigned r= v & 0x55555555u;
r= r + ((r^v) >> 1);
unsigned s = r & 0x33333333u;
s =s + ((s^r) >>2);
r= s & 0x0F0F0F0F;
r= r + ((r^s)>>4);
r *= 0x01010101;
return r >> 24;
}
 
Share this answer
 
Comments
Richard MacCutchan 23-Mar-23 6:26am    
Very clever piece of code. I saw something similar almost 50 years ago, which converted a binary number to a printable string.
I'm not sure it this is correct or how efficient it is.
But I was thinking you could hijack the enum functionality.

C#
	public int GetEnumCnt(int Number){
		Enum32 EnumNumber = (Enum32) Number;
		int EnumCnt = 0;
		
		foreach (Enum32 EnumItem in Enum.GetValues(typeof(Enum32))){
			if (EnumNumber.HasFlag(EnumItem)){
//			if ((EnumNumber & EnumItem) != 0){
				EnumCnt++;
			}
		}
		
		return EnumCnt;
	}

	public enum Enum32 {
		//_00 = 0,	//0
		_01 = 1 << 0,	//1
		_02 = 1 << 1,	//2
		_03 = 1 << 2,	//4
		_04 = 1 << 3,	//8
		_05 = 1 << 4,	//16
		_06 = 1 << 5,	//32
		_07 = 1 << 6,	//64
		_08 = 1 << 7,	//128
		_09 = 1 << 8,	//256
		_10 = 1 << 9,	//512
		_11 = 1 << 10,	
		_12 = 1 << 11,	
		_13 = 1 << 12,	
		_14 = 1 << 13,	
		_15 = 1 << 14,	
		_16 = 1 << 15,	
		_17 = 1 << 16,	
		_18 = 1 << 17,	
		_19 = 1 << 18,	
		_20 = 1 << 19,	
		_21 = 1 << 20,	
		_22 = 1 << 21,	
		_23 = 1 << 22,	
		_24 = 1 << 23,	
		_25 = 1 << 24,	
		_26 = 1 << 25,	
		_27 = 1 << 26,	
		_28 = 1 << 27,	
		_29 = 1 << 28,	
		_30 = 1 << 29,	
		_31 = 1 << 30,	
		_32 = 1 << 31	
	}
 
Share this answer
 
Comments
Richard Deeming 23-Feb-17 15:43pm    
Not only does that have a loop, which was specifically mentioned in the question; but Enum.GetValues is much less efficient than a simple for loop! :)
jaket-cp 24-Feb-17 4:18am    
Yep a bit naughty that, but it was similar to solution 10 by ppolymorphe so I thought it would be okay :) - well I think it is/was similar in any case.
I should have mentioned it was ppolymorphe's solution which made me think of the Enum.
Just wanted to get it out there - to see if was a viable way of doing it :)
Obviously, it will be difficult to beat the processor instruction popcnt.
In high level languages, there is mainly 2 options:
- Counting bits 1 by 1, if size of data double, it takes twice the time.
C++
int BitsCnt(unsigned long int Bits) {
    int Cnt=0;
    do {
        Cnt += Bits && 1;
        Bits = Bits >> 1;
    }
    while (Bits != 0)
    return Cnt;
}

Variant can unroll the loop.
Variant can also take advantage of processor flags.
Subvariant: With a lookup table, one can treat in chunks bigger than 1 bit. problem with lookup tables is that it tend to trash processor cache.

- Counting bits using the parallel programming of the poor, if size of data double, it takes only 1 more step.
C++
int BitsCnt(unsigned long int Bits) {
    Bits= (Bits && 0x55555555) + ((Bits >> 1) && 0x55555555);
    Bits= (Bits && 0x33333333) + ((Bits >> 2) && 0x33333333);
    Bits= (Bits && 0x0f0f0f0f) + ((Bits >> 4) && 0x0f0f0f0f);
    Bits= (Bits && 0x00ff00ff) + ((Bits >> 8) && 0x00ff00ff);
    Bits= Bits + (Bits >>16);
    return Bits&& 0xff;
}
 
Share this answer
 
v4
Comments
Graeme_Grant 20-Feb-17 21:55pm    
Very popular method on the interwebs ... lookup table will be quicker.
Patrice T 20-Feb-17 22:12pm    
My solution is 2 methods I like.
Not meant to be exhaustive.
What do you think about this :
VB
Function BitSum(intVar As Integer) As Integer

    If intVar = 0 Then Return 0

    Dim count As Integer = 0
    If intVar < 0 Then
        intVar -= Integer.MinValue
        count += 1
    End If
    count += (intVar Mod 2) + BitSum(intVar \ 2)

    Return count

End Function
 
Share this answer
 
Comments
OriginalGriff 20-Feb-17 12:15pm    
Recursion is a "hidden loop" ...
Richard Deeming 20-Feb-17 12:43pm    
And one with a significantly higher overhead than a non-hidden loop! :)
Ralf Meier 20-Feb-17 13:19pm    
Of course ... if you have a purpose which originally requires a loop and you try to realize it without a loop you will allways get a higher overhead ... I'm sorry ...
Ralf Meier 20-Feb-17 13:17pm    
OK ... if Recursion is also a loop then I don't know another way to solve this ... :(
OriginalGriff 20-Feb-17 14:04pm    
That's what coding challenges are about - finding alternatives, and clever algorithms rather than brute forcing the solution.
Doing some calculations within a bash shell script it crosses to my mind that this challenge can be also implemented with shells that support numeric calculations:

[EDIT]
There have been complaints that I was cheating using well known code.
So I decided to provide a solution of my own which is not the best. It uses a loop with max. 8 iterations. If loops are unwanted, it can be unrolled using 8 lines of code. But with small values like 42 the loop performs better.
[/EDIT]

Bash
#!/bin/bash

# There must be one command line argument
if [ $# -ne 1 ]; then
    echo "Usage: $0 <unsigned integer value>"
    exit 1
fi

# Digits only; optionally as hex
if [[ ! ($1 =~ ^[0-9]+$ || $1 =~ ^0[xX][0-9a-fA-F]) ]]; then
    echo "'$1' is not an unsigned integer"
    exit 1
fi

# Calculate number of bits set
# Algorithm from https://graphics.stanford.edu/~seander/bithacks.html
#let "c=$1 - (($1 >> 1) & 0x55555555)"
#let "c=($c & 0x33333333) + (($c >> 2) & 0x33333333)"
#let "c=(($c + ($c >> 4) & 0xF0F0F0F) * 0x1010101) >> 24"

# A new version of my own.
# Calculates the bit count for each nibble (4-bits) until
#  no more bits are set.
c=0
let "v=$1"
while [ $v -ne 0 ]; do
    let "c=c + ((0xe994 >> (($v & 7) << 1)) & 3) + (($v >> 3) & 1)"
    let "v=v >> 4"
done

echo "Value $1 has $c bits set"
 
Share this answer
 
v2
Comments
Richard Deeming 20-Feb-17 12:44pm    
Well, we could all have copied the magic-voodo-bit-twiddling code from Stanford, but I thought that would be cheating! :)
OriginalGriff 20-Feb-17 14:02pm    
I've been half expecting *someone" to copy this code:
https://www.codeproject.com/Tips/365002/A-Csharp-implementation-of-AI-MEMO-Item-Co
:laugh:
Jochen Arndt 21-Feb-17 3:25am    
It would be impudent to use a solution written by the one who setup the challenge :)

It is a very good one!
I did not know it before because knowing the Stanford site I had no reason to search for other implementations.
OriginalGriff 21-Feb-17 4:31am    
I love the way someone has sat down and worked out the math for it and a little bulb has lit up above his head! :laugh:
I used to wrote PDP assembler, in another life...
Jochen Arndt 21-Feb-17 3:21am    
But that's what a developer does:
Use existing algorithms instead of reinventing the wheel.

And there must be at least one that just does it :)
A C one (I am aware it is very similar to Richard's CountSetBits4).
uint8_t bits(uint32_t w)
{
  const uint8_t m[] =
  {
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8,
  };
  
  uint8_t * p = (uint8_t *) & w;
  
  uint8_t b = m[*p++];
  b += m[*p++];
  b += m[*p++];
  b += m[*p++];
  b += m[*p++];
  return b;
}
 
Share this answer
 
Comments
Graeme_Grant 20-Feb-17 16:50pm    
You realise that Richard used my idea of a lookup table?
CPallini 20-Feb-17 16:57pm    
No, I didn't realize. Is it important?
Graeme_Grant 20-Feb-17 16:59pm    
lmao... I find it flattering...
CPallini 23-Feb-17 5:27am    
Well, as matter of fact, I end up independently with such a code. On posting it I recognized at a glance, the similarity with Richard's one. I didn't investigate further.
u8_t NumSetBits(u32_t arg)
{
	u8_t num_set_bits = 0;
	u8_t count;
	for(count=0; count<32; count++)
	{
		if((1U << count) & arg)
			num_set_bits++;
	}
	return num_set_bits;
}

int main(void)
{
	u32_t arg = 0xffff;
	u8_t ones;

	ones = 	NumSetBits(arg);
	printf("arg(%x) has (%d) set ones\n", arg, ones);
	return 0;
}
 
Share this answer
 
Comments
Maciej Los 20-Feb-17 8:06am    
Seems, you haven't read requirements: "Special care should be taken to make it as efficient as possible: loops are a poor idea!"
Since copying the magic-voodo-bit-twiddling code would be cheating, here's the obvious and slightly more understandable C# solution:
C#
public static int CountSetBits(int value)
{
    int result = 0;
    for (int i = 0; i < 8; i++)
    {
        byte nibble = (byte)(value & 0xF);
        value >>= 4;

        switch (nibble)
        {
            case 1:
            case 2:
            case 4:
            case 8:
            {
                result += 1;
                break;
            }
            case 3:
            case 5:
            case 6:
            case 9:
            case 10:
            case 12:
            {
                result += 2;
                break;
            }
            case 7:
            case 11:
            case 13:
            case 14:
            {
                result += 3;
                break;
            }
            case 15:
            {
                result += 4;
                break;
            }
        }
    }

    return result;
}

Benchmarking via BenchmarkDotNet[^]:
C#
public static int CountSetBitsSlow(int value)
{
    int result = 0;
    for (int i = 0, bit = 1; i < 32; i++, bit <<= 1)
    {
        if ((value & bit) != 0)
        {
            result++;
        }
    }

    return result;
}

...

public class CountBitsBenchmark
{
    private readonly int _value;

    public CountBitsBenchmark()
    {
        _value = new Random().Next();
    }

    [Benchmark]
    public int CountSetBits()
    {
        return Int32BitCounter.CountSetBits(_value);
    }

    [Benchmark]
    public int CountSetBitsSlow()
    {
        return Int32BitCounter.CountSetBitsSlow(_value);
    }
}

Results:
           Method |       Mean |    StdDev |
----------------- |----------- |---------- |
     CountSetBits | 11.2227 ns | 0.0482 ns |
 CountSetBitsSlow | 26.1762 ns | 0.2472 ns |



EDIT: A small improvement from unrolling the loop, although it does make the code significantly longer:
C#
[StructLayout(LayoutKind.Explicit)]
private struct Int32Bytes
{
    [FieldOffset(0)]
    public int Int32;

    [FieldOffset(0)]
    public readonly byte Byte0;

    [FieldOffset(1)]
    public readonly byte Byte1;

    [FieldOffset(2)]
    public readonly byte Byte2;

    [FieldOffset(3)]
    public readonly byte Byte3;
}

public static int CountSetBits2(int value)
{
    int result = 0;
    var bytes = new Int32Bytes { Int32 = value };

    switch (bytes.Byte0 & 0xF)
    {
        case 1:
        case 2:
        case 4:
        case 8:
        {
            result += 1;
            break;
        }
        case 3:
        case 5:
        case 6:
        case 9:
        case 10:
        case 12:
        {
            result += 2;
            break;
        }
        case 7:
        case 11:
        case 13:
        case 14:
        {
            result += 3;
            break;
        }
        case 15:
        {
            result += 4;
            break;
        }
    }

    switch (bytes.Byte0 >> 4)
    {
        case 1:
        case 2:
        case 4:
        case 8:
        {
            result += 1;
            break;
        }
        case 3:
        case 5:
        case 6:
        case 9:
        case 10:
        case 12:
        {
            result += 2;
            break;
        }
        case 7:
        case 11:
        case 13:
        case 14:
        {
            result += 3;
            break;
        }
        case 15:
        {
            result += 4;
            break;
        }
    }

    switch (bytes.Byte1 & 0xF)
    {
        case 1:
        case 2:
        case 4:
        case 8:
        {
            result += 1;
            break;
        }
        case 3:
        case 5:
        case 6:
        case 9:
        case 10:
        case 12:
        {
            result += 2;
            break;
        }
        case 7:
        case 11:
        case 13:
        case 14:
        {
            result += 3;
            break;
        }
        case 15:
        {
            result += 4;
            break;
        }
    }

    switch (bytes.Byte1 >> 4)
    {
        case 1:
        case 2:
        case 4:
        case 8:
        {
            result += 1;
            break;
        }
        case 3:
        case 5:
        case 6:
        case 9:
        case 10:
        case 12:
        {
            result += 2;
            break;
        }
        case 7:
        case 11:
        case 13:
        case 14:
        {
            result += 3;
            break;
        }
        case 15:
        {
            result += 4;
            break;
        }
    }

    switch (bytes.Byte2 & 0xF)
    {
        case 1:
        case 2:
        case 4:
        case 8:
        {
            result += 1;
            break;
        }
        case 3:
        case 5:
        case 6:
        case 9:
        case 10:
        case 12:
        {
            result += 2;
            break;
        }
        case 7:
        case 11:
        case 13:
        case 14:
        {
            result += 3;
            break;
        }
        case 15:
        {
            result += 4;
            break;
        }
    }

    switch (bytes.Byte2 >> 4)
    {
        case 1:
        case 2:
        case 4:
        case 8:
        {
            result += 1;
            break;
        }
        case 3:
        case 5:
        case 6:
        case 9:
        case 10:
        case 12:
        {
            result += 2;
            break;
        }
        case 7:
        case 11:
        case 13:
        case 14:
        {
            result += 3;
            break;
        }
        case 15:
        {
            result += 4;
            break;
        }
    }

    switch (bytes.Byte3 & 0xF)
    {
        case 1:
        case 2:
        case 4:
        case 8:
        {
            result += 1;
            break;
        }
        case 3:
        case 5:
        case 6:
        case 9:
        case 10:
        case 12:
        {
            result += 2;
            break;
        }
        case 7:
        case 11:
        case 13:
        case 14:
        {
            result += 3;
            break;
        }
        case 15:
        {
            result += 4;
            break;
        }
    }

    switch (bytes.Byte3 >> 4)
    {
        case 1:
        case 2:
        case 4:
        case 8:
        {
            result += 1;
            break;
        }
        case 3:
        case 5:
        case 6:
        case 9:
        case 10:
        case 12:
        {
            result += 2;
            break;
        }
        case 7:
        case 11:
        case 13:
        case 14:
        {
            result += 3;
            break;
        }
        case 15:
        {
            result += 4;
            break;
        }
    }

    return result;
}

Results:
           Method |       Mean |    StdDev |
----------------- |----------- |---------- |
     CountSetBits | 11.1346 ns | 0.1539 ns |
    CountSetBits2 |  8.2397 ns | 0.1248 ns |
 CountSetBitsSlow | 19.5581 ns | 0.1278 ns |



Edit 2:
With a switch statement for each byte, rather than each nibble, the speed increases again. But so does the length of the code.
C#
public static int CountSetBits3(int value)
{
    int result = 0;
    var bytes = new Int32Bytes { Int32 = value };

    switch (bytes.Byte0)
    {
        case 1:
        case 2:
        case 4:
        case 8:
        case 16:
        case 32:
        case 64:
        case 128:
        {
            result += 1;
            break;
        }
        case 3:
        case 5:
        case 6:
        case 9:
        case 10:
        case 12:
        case 17:
        case 18:
        case 20:
        case 24:
        case 33:
        case 34:
        case 36:
        case 40:
        case 48:
        case 65:
        case 66:
        case 68:
        case 72:
        case 80:
        case 96:
        case 129:
        case 130:
        case 132:
        case 136:
        case 144:
        case 160:
        case 192:
        {
            result += 2;
            break;
        }
        case 7:
        case 11:
        case 13:
        case 14:
        case 19:
        case 21:
        case 22:
        case 25:
        case 26:
        case 28:
        case 35:
        case 37:
        case 38:
        case 41:
        case 42:
        case 44:
        case 49:
        case 50:
        case 52:
        case 56:
        case 67:
        case 69:
        case 70:
        case 73:
        case 74:
        case 76:
        case 81:
        case 82:
        case 84:
        case 88:
        case 97:
        case 98:
        case 100:
        case 104:
        case 112:
        case 131:
        case 133:
        case 134:
        case 137:
        case 138:
        case 140:
        case 145:
        case 146:
        case 148:
        case 152:
        case 161:
        case 162:
        case 164:
        case 168:
        case 176:
        case 193:
        case 194:
        case 196:
        case 200:
        case 208:
        case 224:
        {
            result += 3;
            break;
        }
        case 15:
        case 23:
        case 27:
        case 29:
        case 30:
        case 39:
        case 43:
        case 45:
        case 46:
        case 51:
        case 53:
        case 54:
        case 57:
        case 58:
        case 60:
        case 71:
        case 75:
        case 77:
        case 78:
        case 83:
        case 85:
        case 86:
        case 89:
        case 90:
        case 92:
        case 99:
        case 101:
        case 102:
        case 105:
        case 106:
        case 108:
        case 113:
        case 114:
        case 116:
        case 120:
        case 135:
        case 139:
        case 141:
        case 142:
        case 147:
        case 149:
        case 150:
        case 153:
        case 154:
        case 156:
        case 163:
        case 165:
        case 166:
        case 169:
        case 170:
        case 172:
        case 177:
        case 178:
        case 180:
        case 184:
        case 195:
        case 197:
        case 198:
        case 201:
        case 202:
        case 204:
        case 209:
        case 210:
        case 212:
        case 216:
        case 225:
        case 226:
        case 228:
        case 232:
        case 240:
        {
            result += 4;
            break;
        }
        case 31:
        case 47:
        case 55:
        case 59:
        case 61:
        case 62:
        case 79:
        case 87:
        case 91:
        case 93:
        case 94:
        case 103:
        case 107:
        case 109:
        case 110:
        case 115:
        case 117:
        case 118:
        case 121:
        case 122:
        case 124:
        case 143:
        case 151:
        case 155:
        case 157:
        case 158:
        case 167:
        case 171:
        case 173:
        case 174:
        case 179:
        case 181:
        case 182:
        case 185:
        case 186:
        case 188:
        case 199:
        case 203:
        case 205:
        case 206:
        case 211:
        case 213:
        case 214:
        case 217:
        case 218:
        case 220:
        case 227:
        case 229:
        case 230:
        case 233:
        case 234:
        case 236:
        case 241:
        case 242:
        case 244:
        case 248:
        {
            result += 5;
            break;
        }
        case 63:
        case 95:
        case 111:
        case 119:
        case 123:
        case 125:
        case 126:
        case 159:
        case 175:
        case 183:
        case 187:
        case 189:
        case 190:
        case 207:
        case 215:
        case 219:
        case 221:
        case 222:
        case 231:
        case 235:
        case 237:
        case 238:
        case 243:
        case 245:
        case 246:
        case 249:
        case 250:
        case 252:
        {
            result += 6;
            break;
        }
        case 127:
        case 191:
        case 223:
        case 239:
        case 247:
        case 251:
        case 253:
        case 254:
        {
            result += 7;
            break;
        }
        case 255:
        {
            result += 8;
            break;
        }
    }

    /* 
    *Snip*
    ~870 lines of code! 
    Just repeat the above "switch" for each byte.
    */

    return result;
}

           Method |       Mean |    StdDev |
----------------- |----------- |---------- |
     CountSetBits |  9.1868 ns | 0.0255 ns |
    CountSetBits2 |  8.1354 ns | 0.0107 ns |
    CountSetBits3 |  4.3636 ns | 0.0087 ns |
 CountSetBitsSlow | 23.5334 ns | 0.1123 ns |



Edit 3:
Final attempt, using an array to cache the results for a byte:
C#
private static readonly byte[] BitCache = 
{
    0, 1, 1, 2, 1, 2, 2, 3,
    1, 2, 2, 3, 2, 3, 3, 4,
    1, 2, 2, 3, 2, 3, 3, 4,
    2, 3, 3, 4, 3, 4, 4, 5,
    1, 2, 2, 3, 2, 3, 3, 4,
    2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5,
    3, 4, 4, 5, 4, 5, 5, 6,
    1, 2, 2, 3, 2, 3, 3, 4,
    2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5,
    3, 4, 4, 5, 4, 5, 5, 6,
    2, 3, 3, 4, 3, 4, 4, 5,
    3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6,
    4, 5, 5, 6, 5, 6, 6, 7,
    1, 2, 2, 3, 2, 3, 3, 4,
    2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5,
    3, 4, 4, 5, 4, 5, 5, 6,
    2, 3, 3, 4, 3, 4, 4, 5,
    3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6,
    4, 5, 5, 6, 5, 6, 6, 7,
    2, 3, 3, 4, 3, 4, 4, 5,
    3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6,
    4, 5, 5, 6, 5, 6, 6, 7,
    3, 4, 4, 5, 4, 5, 5, 6,
    4, 5, 5, 6, 5, 6, 6, 7,
    4, 5, 5, 6, 5, 6, 6, 7,
    5, 6, 6, 7, 6, 7, 7, 8, 
};

public static int CountSetBits4(int value)
{
    var bytes = new Int32Bytes { Int32 = value };
    return BitCache[bytes.Byte0] + BitCache[bytes.Byte1] + BitCache[bytes.Byte2] + BitCache[bytes.Byte3];
}

This seems to be slightly faster than method 3 (Edit 2), with much smaller code:
           Method |       Mean |    StdDev |
----------------- |----------- |---------- |
     CountSetBits | 10.6005 ns | 0.0578 ns |
    CountSetBits2 |  8.6753 ns | 0.1706 ns |
    CountSetBits3 |  4.3420 ns | 0.0383 ns |
    CountSetBits4 |  3.1724 ns | 0.0032 ns |
 CountSetBitsSlow | 21.4448 ns | 0.3445 ns |
 
Share this answer
 
v5
Comments
Graeme_Grant 17-Feb-17 13:44pm    
How does it compare with the lookup table version using FastBitCount that I posted?
Richard Deeming 17-Feb-17 14:01pm    
Unsurprisingly, once it's set up, yours is a lot faster:
           Method |       Mean |    StdDev |
----------------- |----------- |---------- |
     CountSetBits |  8.4809 ns | 0.0045 ns |
    CountSetBits2 |  8.1834 ns | 0.0290 ns |
 CountSetBitsSlow | 24.2760 ns | 0.1701 ns |
   CountSetBitsGG |  1.2261 ns | 0.0051 ns |


However, it takes longer to set up, and uses an extra 64KB of memory, because you're pre-calculating and caching the result for every 16-bit number. :)
Graeme_Grant 17-Feb-17 14:07pm    
Thanks for doing that. Yes, there is a table to set up, I could have posted the table in code rather than calculate it. But I guess the core function meets the "efficient as possible" criteria. :)
Graeme_Grant 17-Feb-17 15:00pm    
I see that you took my idea and did a variation of what I was talking about... though a smaller lookup table is slower... ...
Richard Deeming 17-Feb-17 15:07pm    
And I see you've "borrowed" my idea of pre-computing the lookup table. :)

(That's probably why this page is now taking so long to load!)

It's a trade-off between memory consumption and speed. Yours is using 256 times the memory to roughly double the speed. It's up to the customer to decide which is more important.

But of course, neither version is going to beat the magic bit-twiddling option I linked to in the comments on the question, or the low-level popcnt option available in C++ and assembly. :)
A speed boost for solution 1... Requires priming the lookup table first, then calling method FastBitCount.
C#
using System;

namespace FastBitCount
{
    class Program
    {
        static void Main(string[] args)
        {
            InitLookup();

            Console.WriteLine($"42 = {FastBitCount(42)}");
            Console.ReadKey();
        }

        static int FastBitCount(int num)
            => FastBitCount((uint)num);

        static int FastBitCount(uint num)
            => lookup[num & 0xFFFF] + lookup[num >> 16];

        static byte[] lookup = new byte[65536];
        static void InitLookup()
        {
            for (int i = 0; i < lookup.Length; i++)
                lookup[i] = GetOneBits(i);
        }

        static byte GetOneBits(int num)
            => (byte)Convert.ToString(num, 2)
                            .Split(new[] { '0' }, 
                                StringSplitOptions.RemoveEmptyEntries)
                            .Length;
    }
}

UPDATE: To speed up the startup, I've removed the calculation of the lookup table and hard coded it.
C#
using System;

namespace FastBitCount
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine($"42 = {FastBitCount(42)}");
            Console.ReadKey();
        }

        static int FastBitCount(int num)
            => FastBitCount((uint)num);

        static int FastBitCount(uint num)
            => lookup[num & 0xFFFF] + lookup[num >> 16];

        static byte[] lookup = new byte[0x10000] {
    0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,
    1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,
    1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,
    2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,
    //trimmed
    11,12,12,13,12,13,13,14,12,13,13,14,13,14,14,15,12,13,13,14,13,14,14,15,13,14,14,15,14,15,15,16
            };
    }
}
You can download the project[^] to try it out.
 
Share this answer
 
v5
Comments
OriginalGriff 17-Feb-17 15:17pm    
Could you do us all a favor and move your table to a file and post a link? That's a damn slow page load you are causing, and that's unfair on those with slow or per-meg payment connections.
Graeme_Grant 17-Feb-17 15:21pm    
already in the process of doing it...
Patrice T 20-Feb-17 23:22pm    
If the lookup is what I think, I would have expected it to begin with 0, 1, 1, 2, 1, 2, 2, 3
Graeme_Grant 21-Feb-17 0:50am    
Yes ... must have copied the wrong table when creating the post... fixed.
Patrice T 21-Feb-17 3:18am    
Indeed, better that way.
Quick and dirty for Visual C++ with x86 CPUs (Intel Nehalem and AMD Barcelona or later):
#include <iostream>
#include <stdlib.h>
#include <intrin.h> 

using namespace std;

int _tmain(int argc, _TCHAR* argv[])
{
	if (2 != argc)
	{
		cout << "Usage: Bitcount <32-bit value>" << std::endl;
		return 1;
	}

	TCHAR *stop;
	unsigned long u = _tcstoul(argv[1], &stop, 0);
	if (0 != *stop || u > _UI32_MAX)
	{
		cout << "Usage: Bitcount <32-bit value>" << std::endl;
		return 1;
	}

	cout << "The value " << u << " has " << __popcnt(u) << " bits set" << endl;

	return 0;
}

The __popcnt() function is an intrinsic function that calls the x86 POPCNT instruction.

[EDIT]
With GCC use __builtin_popcount() instead and compile with -mpopcnt.
[/EDIT]
 
Share this answer
 
v2
C#
public int GetOneBits(int number) {
    string hexNum = Convert.ToString(number, 2);
    string[] ones = hexNum.Split('0');
    return ones.Length;
}

public void Test() {
    int ones = GetOneBits(423);
    Console.WriteLine("Number of one bits = " + ones.ToString());
    
    // answer = 4;
}
 
Share this answer
 
v3
Comments
OriginalGriff 17-Feb-17 11:46am    
That's nasty inefficient, and just "hides" the loops: there are two loops, and up to 34 heap allocations in the .NET methods!
Ramza360 17-Feb-17 12:06pm    
yup sure is
Ramza360 17-Feb-17 12:07pm    
oh lol you wanted it efficient. Haha
Dave Kreskowiak 17-Feb-17 13:26pm    
An important part of the challenge is reading and understanding ALL of the requirements.
Ramza360 17-Feb-17 14:38pm    
Well its at least very fast, albeit inefficient.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900