Posted 22 Jan 2019

# IEEE 754 Conversion

Two ways to pack and unpack a 32-bit IEEE 754 value

## Introduction

This article shows how to convert a `float` value into the 32-bit integer that holds its IEEE 754 representation, and how to convert it back.

I will show two ways. The first is faster than the second, particularly for the `unpack` function.

## Background

Sometimes you need to send a `float` value over a protocol (serial or network) that cannot carry `float` values because it supports only integers (signed or unsigned). This is very common when developing microcontroller projects.

In my own experience, I set up serial communication between two different microcontroller families (STM32F3 and STM32F7) using the Modbus protocol.

In this scenario, the sender needs a way to convert a `float` into an integer, and the receiver needs a way to convert that integer back into the original `float` value.

A very common way to do this is using the IEEE 754 conversion.
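On a register-based protocol such as Modbus, whose registers are 16 bits wide, the packed 32-bit value still has to be split across two registers. The following is a minimal sketch of that step, not part of the original code: the helper names `u32_to_regs` and `regs_to_u32` are hypothetical, and the "high word first" order is an assumption that must match what the other device expects.

```c
#include <stdint.h>

/* Hypothetical helper: split a packed 32-bit value into two 16-bit registers.
   Assumption: the high word is sent first; check what the peer expects. */
static void u32_to_regs(uint32_t value, uint16_t regs[2])
{
    regs[0] = (uint16_t)(value >> 16);      /* high word */
    regs[1] = (uint16_t)(value & 0xFFFFu);  /* low word  */
}

/* Hypothetical helper: rebuild the 32-bit value on the receiving side. */
static uint32_t regs_to_u32(const uint16_t regs[2])
{
    return ((uint32_t)regs[0] << 16) | (uint32_t)regs[1];
}
```

The sender would call `u32_to_regs` on the packed value, and the receiver would call `regs_to_u32` before unpacking it.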

The code targets 32-bit values but can easily be extended to 64 bits, which may be necessary if you need more precision.
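As a rough illustration of the 64-bit extension, the sketch below reinterprets a `double` through a union with a `uint64_t` member. The names `pack754_64`, `unpack754_64` and `UDouble754` are hypothetical and do not appear in the original code; the sketch assumes a C compiler (reading a different union member is allowed in C, not in C++) and a platform where `double` is the 64-bit IEEE 754 format.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: double is the 64-bit IEEE 754 format on this platform. */
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits wide");

/* Hypothetical 64-bit counterpart of the 32-bit union used in this article. */
typedef union
{
    double   d;
    uint64_t bits;
} UDouble754;

/* Return the 64-bit pattern that represents d. */
uint64_t pack754_64(double d)
{
    UDouble754 u;
    u.d = d;
    return u.bits;
}

/* Return the double whose representation is the given 64-bit pattern. */
double unpack754_64(uint64_t bits)
{
    UDouble754 u;
    u.bits = bits;
    return u.d;
}
```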

## Using the Code

This code is quite portable because it is written in standard C (I use it on microcontrollers, Windows, Linux and other embedded targets).

I've created two files (header and code) and two functions.

Here are their prototypes and explanations.

The first function, which I have called `pack`, returns a 32-bit unsigned value that is the representation of the input `float` value.

```c
uint32_t pack754_32(float f);
```

The `unpack` function returns the `float` value whose representation is the 32-bit unsigned value passed to it.

```c
float unpack754_32(uint32_t floatingToIntValue);
```

If the input value is not the bit pattern of a finite number (IEEE 754 also encodes infinity and NaN), the returned `float` is infinity or NaN rather than an ordinary value.
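A receiver that wants to reject such inputs before unpacking can test the exponent field, which is all ones for infinity and NaN. This is a small sketch of that check; the function name `bits_are_finite` is hypothetical and not part of the original code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when the 32-bit pattern encodes a finite number.
   In the 32-bit IEEE 754 format the exponent occupies bits 30..23, and an
   exponent of 0xFF marks infinity (mantissa zero) or NaN (mantissa non-zero). */
static bool bits_are_finite(uint32_t bits)
{
    return ((bits >> 23) & 0xFFu) != 0xFFu;
}
```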

To keep the code portable across platforms, I've used the fixed-width type `uint32_t` from `<stdint.h>` (a 32-bit unsigned integer) rather than `unsigned long`, whose size varies between platforms.

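```c
#include <stdint.h>

/* FIRST WAY (faster): reinterpret the bits through a pointer cast.
   Note: this cast breaks C's strict-aliasing rule, so the result is
   formally undefined; it relies on the compiler tolerating it
   (e.g. GCC/Clang with -fno-strict-aliasing). */

uint32_t pack754_32(float f)
{
    uint32_t *pfloatingToIntValue;
    pfloatingToIntValue = (uint32_t *)&f;   /* explicit cast: float* -> uint32_t* */

    return *pfloatingToIntValue;
}

float unpack754_32(uint32_t floatingToIntValue)
{
    float *pf, f;
    pf = (float *)&floatingToIntValue;      /* explicit cast: uint32_t* -> float* */
    f = *pf;

    return f;
}
```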
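Reading a `float` through a `uint32_t` pointer is not guaranteed to work under C's aliasing rules, so an optimizing compiler may mishandle it. A commonly used alternative is to copy the bytes with `memcpy`, which compilers typically reduce to a single register move. The sketch below is a variation on the first way, not the original code; the `_memcpy` suffix in the function names is only there to keep the two versions apart.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is the 32-bit IEEE 754 format on this platform. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Copy the bytes of f into a 32-bit unsigned value. */
uint32_t pack754_32_memcpy(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Copy the bytes of the 32-bit unsigned value back into a float. */
float unpack754_32_memcpy(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```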

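The second way uses a union together with bitwise operations to convert a `float` into a 32-bit unsigned value and vice versa. Writing `f` into the union exposes its sign, exponent and mantissa through the bit-field struct. Note that bit-field layout is implementation-defined: the field order below assumes the compiler allocates bit-fields starting from the least significant bit, as GCC does on little-endian targets.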

```c
typedef union UnFloatingPointIEEE754
{
    struct
    {
        unsigned int mantissa : 23;
        unsigned int exponent : 8;
        unsigned int sign : 1;
    } raw;
    float f;
} UFloatingPointIEEE754;
```
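The same three fields can also be pulled out of the packed value with shifts and masks, which does not depend on how the compiler lays out bit-fields. The sketch below is illustrative only: `Fields754`, `split754_32` and `value_of_normal` are hypothetical names, and `value_of_normal` handles only normal numbers (exponent field between 1 and 254), using the formula (-1)^sign × (1 + mantissa / 2^23) × 2^(exponent - 127).

```c
#include <stdint.h>

/* Hypothetical container for the three fields of a 32-bit IEEE 754 value. */
typedef struct
{
    uint32_t sign;      /* bit 31 */
    uint32_t exponent;  /* bits 30..23, biased by 127 */
    uint32_t mantissa;  /* bits 22..0 */
} Fields754;

/* Split a packed value into its fields with shifts and masks. */
static Fields754 split754_32(uint32_t bits)
{
    Fields754 f;
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.mantissa = bits & 0x7FFFFFu;
    return f;
}

/* Rebuild the numeric value of a normal number from its fields. */
static float value_of_normal(Fields754 f)
{
    float value = 1.0f + (float)f.mantissa / 8388608.0f;  /* 8388608 = 2^23 */
    int e = (int)f.exponent - 127;                        /* remove the bias */

    for (; e > 0; e--)
        value *= 2.0f;
    for (; e < 0; e++)
        value /= 2.0f;

    return f.sign ? -value : value;
}
```

For `0x3FB4FDF4`, the pattern used in the test function of this article, the fields are sign 0, exponent `0x7F` and mantissa `0x34FDF4`.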

The bitwise macros extract individual bits in order to build the exponent and mantissa fields according to the IEEE 754 layout.

I've used `#define` macros instead of functions or inline functions to avoid function-call overhead, although a modern optimizing compiler will usually inline a small function just as well.

```c
/* Arguments and results are fully parenthesized so the macros
   expand safely inside larger expressions. */
#define NTH_BIT(b, n) (((b) >> (n)) & 0x1)

/* Keeps the low 8 bits of b (same as (b) & 0xFF) */
#define BYTE_TO_BIN(b)      ( ((b) & 0x80) | \
                              ((b) & 0x40) | \
                              ((b) & 0x20) | \
                              ((b) & 0x10) | \
                              ((b) & 0x08) | \
                              ((b) & 0x04) | \
                              ((b) & 0x02) | \
                              ((b) & 0x01) )

/* Keeps the low 23 bits of b (same as (b) & 0x7FFFFF) */
#define MANTISSA_TO_BIN(b)  ( ((b) & 0x400000) | \
                              ((b) & 0x200000) | \
                              ((b) & 0x100000) | \
                              ((b) &  0x80000) | \
                              ((b) &  0x40000) | \
                              ((b) &  0x20000) | \
                              ((b) &  0x10000) | \
                              ((b) &   0x8000) | \
                              ((b) &   0x4000) | \
                              ((b) &   0x2000) | \
                              ((b) &   0x1000) | \
                              ((b) &    0x800) | \
                              ((b) &    0x400) | \
                              ((b) &    0x200) | \
                              ((b) &    0x100) | \
                              ((b) &     0x80) | \
                              ((b) &     0x40) | \
                              ((b) &     0x20) | \
                              ((b) &     0x10) | \
                              ((b) &     0x08) | \
                              ((b) &     0x04) | \
                              ((b) &     0x02) | \
                              ((b) &     0x01) )
```
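For comparison, the same three operations can be written as `static inline` functions, which keep type checking and evaluate their arguments exactly once. This is only a sketch of that option, assuming a C99 or later compiler; the names `nth_bit`, `exponent_bits` and `mantissa_bits` are hypothetical and are not used elsewhere in the article.

```c
#include <stdint.h>

/* Return bit n of value (n must be below 32). */
static inline uint32_t nth_bit(uint32_t value, unsigned int n)
{
    return (value >> n) & 0x1u;
}

/* Keep the low 8 bits: the width of the exponent field. */
static inline uint32_t exponent_bits(uint32_t value)
{
    return value & 0xFFu;
}

/* Keep the low 23 bits: the width of the mantissa field. */
static inline uint32_t mantissa_bits(uint32_t value)
{
    return value & 0x7FFFFFu;
}
```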

Finally, here are the definitions of the `pack`/`unpack` functions, which use the macros above to build the desired value.

The `pack` function relies on the union: assigning `f` to it fills in the sign, exponent and mantissa fields automatically.

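```c
uint32_t pack754_32(float f)
{
    UFloatingPointIEEE754 ieee754;
    uint32_t floatingToIntValue = 0;

    ieee754.f = f;
    /* Cast to uint32_t before shifting: the narrow bit-fields promote to int,
       and shifting the sign into bit 31 of an int would overflow. */
    floatingToIntValue =
        ((((uint32_t)NTH_BIT(ieee754.raw.sign, 0) << 8) |
          (uint32_t)(BYTE_TO_BIN(ieee754.raw.exponent))) << 23) |
        (uint32_t)(MANTISSA_TO_BIN(ieee754.raw.mantissa));
    return floatingToIntValue;
}
```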

The `unpack` function uses bitwise operations to rebuild the sign, exponent and mantissa from the unsigned integer, according to the IEEE 754 standard, and stores them in the union.

```c
float unpack754_32(uint32_t floatingToIntValue)
{
    UFloatingPointIEEE754 ieee754;
    unsigned int mantissa = 0;
    unsigned int exponent = 0;
    unsigned int sign = 0;

    sign = NTH_BIT(floatingToIntValue, 31);

    /* Exponent: bits 30..23, most significant bit first */
    for (int ix = 0; ix < 8; ix++)
        exponent = (exponent << 1) | NTH_BIT(floatingToIntValue, (30 - ix));

    /* Mantissa: bits 22..0, most significant bit first */
    for (int ix = 0; ix < 23; ix++)
        mantissa = (mantissa << 1) | NTH_BIT(floatingToIntValue, (22 - ix));

    ieee754.raw.sign = sign;
    ieee754.raw.exponent = exponent;
    ieee754.raw.mantissa = mantissa;
    return ieee754.f;
}
```

## How to Use It

I have also provided a very simple test function that packs and unpacks some values.

```c
void TestPackUnpack(void)
{
    uint32_t n;
    float f;

    /* Unpack a known bit pattern */
    n = 0x3FB4FDF4;   /* f = 1.414 */
    f = unpack754_32(n);

    /* Each pair packs a value and unpacks it again; f should match the input */
    n = pack754_32(1.414);
    f = unpack754_32(n);

    n = pack754_32(-1.259921);
    f = unpack754_32(n);

    n = pack754_32(0.58);
    f = unpack754_32(n);

    n = pack754_32(-0.588);
    f = unpack754_32(n);

    n = pack754_32(2);
    f = unpack754_32(n);

    n = pack754_32(-3);
    f = unpack754_32(n);
}
```

## Points of Interest

I think this article highlights some important features of the C language, such as pointer and union conversions, bitwise operators and so on.

The IEEE 754 conversion method can also be used to convert integers. This way, if my protocol doesn't support `float` or other floating-point values, I can always use these methods to share the information over the protocol.

Written By
Software Developer (Senior) Snap-On Equipment
Italy
Senior Software Engineer & Project Manager
