
What's Wrong With My CRC?

20 Jan 2016 · CPOL · 4 min read
I'm working on a connectivity library for IoT devices. A serious part of every communication protocol is the data integrity check.

This article is for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

Get access to the new Intel® IoT Developer Kit, a complete hardware and software solution that allows developers to create exciting new solutions with the Intel® Galileo and Intel® Edison boards. Visit the Intel® Developer Zone for IoT.

Check out the (non) CRC implementation below. What's wrong with it?

I'm working on a connectivity library for IoT devices. A serious part of every communication protocol is the data integrity check. You send a collection of bytes to another machine, and that machine has to know that there were no errors on the way. TCP/IP already performs its own checksumming, but a TCP socket is basically a stream: you don't really know where one buffer begins and ends. By adding our own integrity check we can verify that we are looking at complete buffers.

The basic concept is to have some correlation between the first and the last bytes. For example, if the first 2 bytes hold the length of the buffer and the last two bytes are 0x0102, we can keep scanning the stream until the first two bytes point at a trailing 0x0102. The problem is that any number we choose can also appear inside the buffer as part of the data. What if the payload is 10,000 bytes of 0x0102? We will find 0x0102, assume it is the length, jump forward, and find 0x0102 again. A magic number alone is probably one of the worst ways to go. Length is a characteristic of the buffer; we need another characteristic of the buffer at the end as well. The preferred method is to use all the data in the buffer to generate an integrity check value. The common options are a checksum, XOR, a CRC, and a hash. The question is how much CPU and RAM each one costs versus the quality of error detection it buys.
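The length-plus-trailer framing described above can be sketched in C. The frame layout and all names here are my own illustration, not a specific protocol:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame layout (illustration only):
 *   [len lo][len hi][payload ...][check lo][check hi]
 * where len counts payload bytes and check is some integrity value
 * computed over the payload. */

/* Trivial stand-in integrity function: XOR of all payload bytes. */
static uint16_t xor_check(const uint8_t *p, size_t n)
{
    uint16_t x = 0;
    while (n--) x ^= *p++;
    return x;
}

/* Try to recognize a complete frame at offset 'start' of a stream buffer.
 * Returns the payload length, or -1 if no valid frame is there yet. */
static int find_frame(const uint8_t *buf, size_t avail, size_t start,
                      uint16_t (*check)(const uint8_t *, size_t))
{
    if (avail < start + 4) return -1;                       /* too short for header + trailer */
    uint16_t len = (uint16_t)(buf[start] | (buf[start + 1] << 8));
    if (start + 2 + (size_t)len + 2 > avail) return -1;     /* frame not fully received */
    uint16_t got  = (uint16_t)(buf[start + 2 + len] | (buf[start + 2 + len + 1] << 8));
    uint16_t want = check(buf + start + 2, len);
    return (got == want) ? (int)len : -1;                   /* trailer must match the payload */
}
```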

A checksum was originally a simple addition of all bytes. The problem is that 0x80+0x10 is the same as 0x10+0x80: a single changed byte will probably be detected, but there is a real chance that one or more other changed bytes compensate and produce the same sum. Here is another example: 0x10+0x20+0x30 has the same checksum as 0x60+0x00+0x00, clearly a different buffer.
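A minimal additive checksum makes the collision above easy to demonstrate (sketch, names mine):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simple additive checksum, as described above: just sum the bytes. */
static uint8_t checksum8(const uint8_t *p, size_t n)
{
    uint8_t sum = 0;
    while (n--) sum += *p++;   /* order and distribution of bytes is lost */
    return sum;
}
```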

Using XOR is very simple and cheap for the CPU. The problem is that 0x80^0x80 gives the same result as 0x81^0x81. This means that a single error in the buffer will be detected, but there is a good chance that a second error hides the first. XOR only catches an error in a given bit position if that position is flipped an odd number of times across the whole buffer.
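The same demonstration for XOR: flipping the same bit in two different bytes cancels out (sketch, names mine):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XOR of all bytes: each output bit is the parity of that bit column. */
static uint8_t xor8(const uint8_t *p, size_t n)
{
    uint8_t x = 0;
    while (n--) x ^= *p++;
    return x;
}
```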

CRC (Cyclic Redundancy Check) is probably the best way to go. There is a nice algorithm which I will detail below. It is more expensive in CPU and RAM but much more reliable: every error propagates through the remaining computation and keeps producing completely different values, so it is very hard for one error to be compensated by another. If the buffer holds 0x12 instead of 0x10, there is no single byte you can change at some other location to repair the check value. CRC16 produces a 16-bit number and CRC32 a 32-bit number, which means that with CRC16 a corrupted buffer has roughly a 1 in 64K chance of passing undetected. If one changed byte produces a different CRC16, it would take several coordinated changes elsewhere to compensate, because one byte cannot repair a 16-bit value.

Advanced hash functions usually consume too much RAM and CPU, and in the end the output width is what matters: with a 16-bit integrity value, even the best algorithm will not save you from a 1 in 64K collision ratio.

I googled "crc16 algorithm c" and found several results. The first to demonstrate the concept is here:

uint16_t crc16_update(uint16_t crc, uint8_t a)
{
   int i;
   crc ^= a;
   for (i = 0; i < 8; ++i)
   {
      if (crc & 1) crc = (crc >> 1) ^ 0xA001;
      else crc = (crc >> 1);
   }
   return crc;
}
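For reference, looping this update over a whole buffer with an initial value of 0 yields the variant commonly called CRC-16/ARC (0xA001 is the reflected form of the 0x8005 polynomial); its standard check value over the ASCII string "123456789" is 0xBB3D. A self-contained sketch (the wrapper name is mine):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-byte update: shift right 8 times, XOR in the reflected polynomial
 * 0xA001 whenever the bit shifted out is 1. */
static uint16_t crc16_update(uint16_t crc, uint8_t a)
{
    int i;
    crc ^= a;
    for (i = 0; i < 8; ++i)
    {
        if (crc & 1) crc = (crc >> 1) ^ 0xA001;
        else         crc = crc >> 1;
    }
    return crc;
}

/* CRC-16/ARC over a buffer: init 0, no final XOR. */
static uint16_t crc16_arc(const uint8_t *buf, size_t len)
{
    uint16_t crc = 0;
    while (len--) crc = crc16_update(crc, *buf++);
    return crc;
}
```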

This example is good because it shows you the cycles: for every bit we shift, and depending on the bit shifted out we either XOR with the polynomial constant or not. This loop costs CPU, so we can instead precompute the result for every possible byte in a 256-entry lookup table, as seen here:

unsigned short crctable[256] = {
   0x0000, 0x1189, /* ... 254 more precomputed entries ... */
};

while (len--)
   crc = (crc << 8) ^ crctable[((crc >> 8) ^ *c++) & 0xFF];
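The table does not have to be hard-coded; it can be generated at startup from the polynomial. Here is a sketch for the MSB-first 0x1021 polynomial used by the C# function later in the article (note that a table starting 0x0000, 0x1189 belongs to a different, reflected CRC-16 variant; all names here are mine):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint16_t crctable[256];

/* Build the per-byte table for the MSB-first polynomial 0x1021:
 * each entry is the bitwise CRC of that byte value placed in the top 8 bits. */
static void crc16_build_table(void)
{
    for (int b = 0; b < 256; ++b)
    {
        uint16_t crc = (uint16_t)(b << 8);
        for (int i = 0; i < 8; ++i)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
        crctable[b] = crc;
    }
}

/* Table-driven CRC-16 with init 0 (a.k.a. CRC-16/XMODEM):
 * one table lookup per byte instead of eight shift/XOR steps. */
static uint16_t crc16_xmodem(const uint8_t *c, size_t len)
{
    uint16_t crc = 0;
    while (len--)
        crc = (uint16_t)((crc << 8) ^ crctable[((crc >> 8) ^ *c++) & 0xFF]);
    return crc;
}
```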

I tried to find something that I could use on an Arduino device without wasting CPU and without using a large portion of the device's memory. Here is what I got (translated from C# without compiling; I'll explain why below):

myCrc ^= *buffer++;
myCrc = (myCrc) ^ (myCrc << 5) ^ (myCrc << 8);
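Wrapped as a complete C function (the wrapper name and the explicit 16-bit truncation are mine, matching the casts in the C# version below), the proposal looks like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The proposed two-line shift/XOR mix applied over a whole buffer. */
static uint16_t my_crc(const uint8_t *buffer, size_t len)
{
    uint16_t myCrc = 0;
    while (len--)
    {
        myCrc ^= *buffer++;
        /* cast truncates to 16 bits, as the C# (ushort) cast does */
        myCrc = (uint16_t)(myCrc ^ (myCrc << 5) ^ (myCrc << 8));
    }
    return myCrc;
}
```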

How is it possible that I don't know about such an implementation, and that it doesn't come up as the first hit in a Google search? What am I missing here?

The reason I used C# was for the tests (and because this is the project that I had open when I wanted to test this solution).

Here is the function:

private static ushort CRC16(byte[] buffer, out ushort crc, out ushort myCrc)
{
   myCrc = 0;
   crc = 0;
   int pos = 0;
   while (pos < buffer.Length)
   {
      crc ^= (ushort)(buffer[pos] << 8);
      for (int i = 0; i < 8; ++i)
      {
         if ((crc & 0x8000) != 0) crc = (ushort)((crc << 1) ^ 0x1021);
         else crc = (ushort)(crc << 1);
      }

      myCrc ^= (ushort)(buffer[pos]);
      myCrc = (ushort)((myCrc) ^ (myCrc << 5) ^ (myCrc << 8));
      //myCrc ^= *buffer++;
      //myCrc = (myCrc) ^ (myCrc << 5) ^ (myCrc << 8);

      pos++;
   }
   return (crc);
}

Here is the tester code:

      private static void BenchmarkCRC16()
      {
         ushort[] crcTable = new ushort[0x10000];
         ushort[] myCrcTable = new ushort[0x10000];
         for (int a = 0; a < 0x17000; a += 13)
         {
            if (0 == (a % 256)) Console.WriteLine("a = 0x" + a.ToString("X"));
            ushort ua = (ushort)a;
            for (int i = 0; i < 0x10000; i++)
            {
               ushort u = (ushort)i;

               ushort crc16 = 0;
               ushort myCrc = 0;
               CRC16(new byte[] { (byte)(u & 0xFF), (byte)((u >> 8) & 0xFF), (byte)(ua & 0xFF), (byte)((ua >> 8) & 0xFF) }, out crc16, out myCrc);
               crcTable[i] = crc16;     // store the results so the spread can be inspected below
               myCrcTable[i] = myCrc;
            }
         }
         List<int>   crcUtilization = new List<int>();
         List<int> myCrcUtilization = new List<int>();
         for (int i = 0; i < 0x10000; i++)
         {
            bool ok = false;
            for (int t = 0; t < crcUtilization.Count; t++) { if (crcUtilization[t] == crcTable[i]) { ok = true; break; } }
            if (!ok) crcUtilization.Add(crcTable[i]);
            ok = false;
            for (int t = 0; t < myCrcUtilization.Count; t++) { if (myCrcUtilization[t] == myCrcTable[i]) { ok = true; break; } }
            if (!ok) myCrcUtilization.Add(myCrcTable[i]);
         }
         // BREAKPOINT HERE and check out crcUtilization and myCrcUtilization (or print)
      }

      private static void CRCTest(byte[] buffer)
      {
         ushort crc16 = 0;
         ushort myCrc = 0;
         CRC16(buffer, out crc16, out myCrc);
      }

The tests show that myCrc and crc16 both spread their values evenly, so every combination of bytes in the original buffer maps to a well-distributed number within the 64K limit. For example, if you comment out the line "myCrc = (ushort)((myCrc) ^ (myCrc << 5) ^ (myCrc << 8));" in my test function (leaving only the XOR), you will see a few values hit over and over and many values never used at all (zero).

If such an implementation is sound, it also looks more amenable to SIMD (e.g., AVX) vectorization than a per-bit loop or a lookup table.

I am open to any suggestions or thoughts.


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By
Intel
United States

Comments and Discussions

Question: More to CRCs
feanorgem, 22-Jan-16 15:35
You mentioned TCP, but you are using 16-bit CRCs, which were more popular with modem-era protocols (XMODEM, KERMIT, CCITT); note that the familiar 32-bit CRC lives at the Ethernet layer, while TCP itself uses a 16-bit ones'-complement checksum. Since you are using Intel processors, there are more efficient methods. If your processor supports the SSE4.2 instruction extensions, then you can use the built-in crc32 instruction via intrinsic functions (_mm_crc32_u8, _mm_crc32_u16, _mm_crc32_u32, _mm_crc32_u64), which can process one to eight bytes in a single instruction. The polynomial for this instruction is fixed at 0x1EDC6F41 (CRC-32C).

Also, you should know that CRCs differ in the types of errors they detect as well as the sequences to which they are vulnerable. Some CRCs shift left while others shift right; they can use the same polynomial, but then you have to reflect its bits. Some CRCs initialize to all zeros while others initialize to all ones, and some invert the data bits or the result while others do not. There are reasons for each of these options: some variants are vulnerable to runs of leading zero bytes, others to runs of trailing zero bytes.
