
Thanks for the reply man. It was helpful.
Jeremy Falcon





Financial apps as far as I know use four decimal digits during calculations to avoid problems with only two, then round at the end.





Yeah, someone else just posted this. I'm gonna do the same then. Thanks man.
Jeremy Falcon





"project that requires complete accuracy on numbers"
A couple of thoughts. Is the above requirement not impossible on a binary system? By definition, you are going to lose precision, be it float, double, double double... how far do you want to go?
For me, I work a lot in machine HMIs. Some users want metric, others want English. I've always had a requirement to allow the user to switch between units and maintain what is displayed. For example, 1" is 25.4 mm. If I switch between metric and English, the value must be consistent.
As for complete accuracy, this for me has always fallen into fixed point arithmetic to avoid rounding errors. COBOL has been mentioned. I've done COBOL, a very long time ago, but as I recall, it did fixed point arithmetic very well. Or I might be missing something...
Please elaborate on what you mean by "complete accuracy". This sounds like a requirement from someone who really does not understand their request, sort of like a rare steak, but the temp should be 175F...
Charlie Gilley
“They who can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety.” BF, 1759
Has never been more appropriate.





You're overthinking it man. The numbers need to be correct, as verifiable by secondary or tertiary means.
Jeremy Falcon





you could do this:
decimal result = 1.0m * 2.0m;
or this:
double resulta = Math.Round(x + y, 5);
".45 ACP  because shooting twice is just silly"  JSOP, 2010  You can never have too much ammo  unless you're swimming, or on fire.  JSOP, 2010  When you pry the gun from my cold dead hands, be careful  the barrel will be very hot.  JSOP, 2013





Rounding with every calculation is what I was doing. I decided to move to just using integers and cents. Thanks for the reply though.
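For anyone following along, here is a minimal sketch of the "integers and cents" idea in TypeScript. The names `Cents`, `addCents`, `applyRateBps` and `formatCents` are made up for illustration and are not from any library; it also assumes amounts stay well inside the safe integer range of a JavaScript number.

```typescript
// Hypothetical helpers: every amount is a whole number of cents in a plain number.
type Cents = number;

// Adding whole cents is exact while values stay within the safe integer range.
function addCents(a: Cents, b: Cents): Cents {
  return a + b;
}

// Apply a rate given in basis points (1 bps = 0.01%) and round once to a whole cent.
// Math.round rounds halves toward +Infinity, which may not match every accounting rule.
function applyRateBps(amount: Cents, bps: number): Cents {
  return Math.round((amount * bps) / 10000);
}

// Formatting is for display only; the stored value never holds a fraction.
function formatCents(amount: Cents): string {
  const sign = amount < 0 ? "-" : "";
  const abs = Math.abs(amount);
  const dollars = Math.floor(abs / 100);
  const cents = String(abs % 100).padStart(2, "0");
  return `${sign}${dollars}.${cents}`;
}

const cartTotal = addCents(10, 20);        // 0.10 + 0.20 held as 10 and 20 cents
const salesTax = applyRateBps(1999, 825);  // 8.25% of 19.99
```

The only rounding happens inside `applyRateBps`, so there is a single, easy-to-audit place where a fraction of a cent can appear and disappear.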
Jeremy Falcon





If you're using C# you can use decimal types and cast if/when you need to go back to floats/doubles.





I'm not, but yeah that's actually a great feature of C#. Turns out COBOL has that type too, found that one out.
Jeremy Falcon





It will largely depend on your application and requirements.
For currency type applications, consider using a BCD (Binary-Coded Decimal, used in COBOL) package. See below for references.
For integer type applications, there are a few "large" int packages.
For scientific applications, there are a number of packages for large number processing.
************ BCD references ********************
https://web.archive.org/web/20081102170717/http://webster.cs.ucr.edu/AoA/Windows/HTML/AdvancedArithmetica6.html#1000255
https://handwiki.org/wiki/Binarycoded_decimal#EBCDIC_zoned_decimal_conversion_table
Notes:
1) BCD numbers can be packed (2 digits/byte) or unpacked (1 digit per byte)
2) The low order byte (right most) of packed is nnnnssss where nnnn is the low order digit and ssss is the sign (0x0D for negative, 0x0F for positive)
3) The spec is (www,ddd) where www is the total number of digits and ddd is the digits to the right of the decimal point. E.g.: 5,2 is a 5 digit number with 2 digits to the right of the decimal point: "123.45". This field would require 3 bytes packed, 6 bytes unpacked.
4) From IBM: For a field or array element of length N, if the PACKEVEN keyword is not specified, the number of digits is 2N - 1; if the PACKEVEN keyword is specified, the number of digits is 2(N - 1).
5) Some documentation refers to BCD as DECIMAL but others use DECIMAL to refer to floating point.
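To make notes 1 to 3 concrete, here is a rough TypeScript sketch of packing a scaled integer into packed BCD. `packBcd` and `unpackBcd` are hypothetical names, not part of any package listed here, and the sign nibbles simply follow note 2 (0xD negative, 0xF positive).

```typescript
// Hypothetical helper: pack a scaled integer (12345 for "123.45" in a 5,2 field)
// into packed BCD, two digits per byte with the sign in the last nibble.
function packBcd(scaledValue: number, totalDigits: number): Uint8Array {
  if (!Number.isInteger(scaledValue)) throw new RangeError("value must be an integer");
  const digits = String(Math.abs(scaledValue)).padStart(totalDigits, "0");
  if (digits.length > totalDigits) throw new RangeError("value does not fit in the field");
  // Nibble sequence: every digit, then the sign nibble.
  const nibbles = digits.split("").map(Number);
  nibbles.push(scaledValue < 0 ? 0x0d : 0x0f);
  // An odd nibble count gets a leading zero so the bytes come out whole.
  if (nibbles.length % 2 !== 0) nibbles.unshift(0);
  const bytes = new Uint8Array(nibbles.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = (nibbles[2 * i] << 4) | nibbles[2 * i + 1];
  }
  return bytes;
}

// Hypothetical inverse: read the digits back and apply the sign nibble.
function unpackBcd(bytes: Uint8Array): number {
  let value = 0;
  for (let i = 0; i < bytes.length; i++) {
    const high = bytes[i] >> 4;
    const low = bytes[i] & 0x0f;
    value = value * 10 + high;
    if (i < bytes.length - 1) {
      value = value * 10 + low;
    } else if (low === 0x0d) {
      value = -value;
    }
  }
  return value;
}

const packed = packBcd(12345, 5); // three bytes: 0x12 0x34 0x5F
```

The decimal point is never stored; the 5,2 spec is what tells the reader of the field that 12345 means 123.45.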
********************* For large int ********************
Microsoft SafeInt package
SafeInt Class - Microsoft Learn
The decNumber package can handle decimal integer numbers of user-defined precision
GitHub - dnotq/decNumber: Decimal Floating Point decNumber C Library by IBM Fellow Mike Cowlishaw
(I have not yet used or investigated the CRAN project.)
CRAN - Package VeryLargeIntegers
******************** For Floating Point ********************
Floating point gets very complex and confusing because there has never been a really good, consistent standard for floating point. Most conventions are vendor/system design dependent.
Here's a list of packages and some brief explanations:
On x86, GCC stores and performs operations on a variable defined as long double as fp80 (10 bytes of data), but with padding each variable occupies 16 bytes on x86-64.
General reference: https://en.wikipedia.org/wiki/List_of_arbitraryprecision_arithmetic_software
https://en.wikipedia.org/wiki/Arbitraryprecision_arithmetic
GCC Floating point: https://gcc.gnu.org/onlinedocs/gcc/extensionstotheclanguagefamily/additionalfloatingtypes.html
Floating point specifications: https://speleotrove.com/decimal/dbspec.html
Performance specs: https://speleotrove.com/decimal/dpquad.html
http://speleotrove.com/decimal/decbits.html
IBM decimal arithmetic package: https://github.com/hercules390/decNumbericu368
Half precision: https://en.wikipedia.org/wiki/Halfprecision_floatingpoint_format
Half precision software (16 bit) https://half.sourceforge.net/
Comparison BID vs DFP:
1) https://www.researchgate.net/publication/224114304_Performance_analysis_of_decimal_floatingpoint_libraries_and_its_impact_on_decimal_hardware_and_software_solutions
2) http://iccd.et.tudelft.nl/Proceedings/2007/Papers/3.3.1.pdf
3) libdfp source: https://github.com/libdfp/libdfp

16 bit (half) 
The IEEE 754 standard specifies a binary16 as having the following format:
* Sign bit: 1 bit
* Exponent width: 5 bits
* Significand precision: 11 bits (10 explicitly stored)
The format is laid out, from most to least significant bit, as the sign bit, then the 5 exponent bits, then the 10 stored significand bits.
The format is assumed to have an implicit lead bit with value 1 unless the exponent field is stored with all zeros. Thus, only 10 bits of the significand appear in the memory format but the total precision is 11 bits. In IEEE 754 parlance, there are 10 bits of significand, but there are 11 bits of significand precision (log10(2^11) ≈ 3.311 decimal digits, or 4 digits ± slightly less than 5 units in the last place).
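The binary16 description above can be turned into a small decoder. This is only a sketch in TypeScript; `decodeBinary16` is a made-up name, and the input is assumed to be the raw 16-bit pattern as an integer from 0 to 0xFFFF.

```typescript
// Decode a raw 16-bit pattern as an IEEE 754 binary16 (half precision) value.
function decodeBinary16(bits: number): number {
  const sign = (bits >> 15) & 0x1 ? -1 : 1;
  const exponent = (bits >> 10) & 0x1f; // 5 exponent bits, bias 15
  const fraction = bits & 0x3ff;        // 10 explicitly stored significand bits
  if (exponent === 0) {
    // All-zero exponent: no implicit leading 1 (zero or subnormal).
    return sign * fraction * Math.pow(2, -24);
  }
  if (exponent === 0x1f) {
    // All-one exponent: infinity when the fraction is zero, otherwise NaN.
    return fraction === 0 ? sign * Infinity : NaN;
  }
  // Normal number: the implicit leading 1 gives 11 bits of precision.
  return sign * (1 + fraction / 1024) * Math.pow(2, exponent - 15);
}

const halfOne = decodeBinary16(0x3c00); // exponent field 15, fraction 0
const halfMax = decodeBinary16(0x7bff); // largest finite binary16 value
```

The subnormal branch is where the "unless the exponent field is stored with all zeros" clause shows up: the leading 1 is dropped and the exponent is pinned at its minimum.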
32 bit format 
https://en.wikipedia.org/wiki/Singleprecision_floatingpoint_format
64 bit format 
https://en.wikipedia.org/wiki/Doubleprecision_floatingpoint_format
80 bit format
https://en.wikipedia.org/wiki/Extended_precision#x86_extended_precision_format
IBM HFP
https://en.wikipedia.org/wiki/IBM_hexadecimal_floatingpoint
96 bit format
96 bit FP occupies 128 bits in x86 platforms
The Motorola 6888x math coprocessors and the Motorola 68040 and 68060 processors support this same 64-bit significand extended precision type (similar to the Intel format although padded to a 96-bit format with 16 unused bits inserted between the exponent and significand fields). The follow-on Coldfire processors do not support this 96-bit extended precision format.
The FPA10 math coprocessor for early ARM processors also supports this extended precision type (similar to the Intel format although padded to a 96-bit format with 16 zero bits inserted between the sign and the exponent fields), but without correct rounding.





Thanks for this. I should probably say, for my use case in particular, I'm in a Node project. But, it's cool to know these libs exist. Granted, I could make a C/C++ module and use that within Node, but for this project at least I'm trying to keep it zippy since JavaScript isn't as fast as C/C++.
Jeremy Falcon





And you accuse me of overthinking it
Charlie Gilley





After decades of writing software for industrial, medical, financial and LoB (Line of Business) applications, I found that the following guidelines work:
1) Financial and money, I always use the decimal type for currency, and the smallest sized type that affords me the precision I need. So why would I use any other type in a currency/money app?
Simple example: I'm writing a trading application, where the strike price will be stored in a decimal type, and the number of shares will be stored in a float type. Why not use a decimal type for the number of shares? Because there's no guarantee that it will be 3 places to the right of the decimal (that's typical, but not a hard and fast rule). I chose float because it's the smallest type that offers the precision I seek. By smallest I mean that a double is typically twice the size of a float.
For those that are tempted to respond that floats are 32 bits, and doubles are 64 bits, not necessarily. That's a very PC centric view.
Note: These guidelines typically, but not always, apply to LoB
2) For medical and industrial, which usually require floating point precision to store values that may not be the same as the formatting to the display, I use floats and doubles, using the smallest type that affords the precision required by the application under development.
What do I mean by the smallest type and precision?
The size of the type refers to how large the floating point type has to be in order to maintain the level of precision (the number of places after the decimal point) without appreciable loss to rounding and implicit conversions (more on that below).
Caveats:
There are several other considerations when choosing and writing floating point code.
A) Rounding loss: This refers to how precise a resulting value is after some operation is performed on it. This is not limited to mathematical operations only (multiplication, division), this also applies to any library calls used to generate a new value e.g. sqrt(...).
B) Conversions: Be very very careful about mixing types i.e. decimal, float and double. When a smaller type is promoted to a larger type, the conversion itself is exact, but the larger type then shows extra digits that look like "precision" and are really just the smaller type's rounding error, so the new value can stray farther from the true value than it appears.
So for example:
float pi = 3.1415927f;
float radius = 5.2f;
double circumference = 2.0f * radius * pi;
circumference may be holding a value that looks to be precise, but the inputs were floats, so only about 7 significant digits are meaningful; the digits beyond that in the double are leftover float rounding error, not real precision.
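The same effect can be reproduced in TypeScript, which has no float type but does have `Math.fround` to round a value to the nearest 32-bit float. The sketch below is only an emulation of the snippet above under that assumption; the variable names are made up.

```typescript
// Math.fround rounds to the nearest 32-bit float and returns the result as a double,
// which mimics "store in a float, then promote to double".
const radiusAsFloat = Math.fround(5.2);
const piAsFloat = Math.fround(3.1415927);

// Promotion is exact: rounding an already-float value again changes nothing.
const promotionIsExact = Math.fround(radiusAsFloat) === radiusAsFloat;

// Emulate float arithmetic by rounding back to float after each operation.
const circumferenceFloat = Math.fround(Math.fround(2 * radiusAsFloat) * piAsFloat);

// The same formula carried out in double precision throughout.
const circumferenceDouble = 2 * 5.2 * Math.PI;

// The two results differ slightly; the error came from the float inputs,
// not from the float-to-double conversion.
const drift = Math.abs(circumferenceFloat - circumferenceDouble);
```

Printing `radiusAsFloat` shows a long tail of digits after 5.2; those digits are the float's rounding error made visible, which is what looks like extra precision after promotion.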
C) Truncations: Conversely storing the result of a larger type in a smaller type will remove the precision to the right of the decimal point.
So for example:
Formula: Simple Interest (SI) = Principal (P) x Rate (R) x Time (T) / 100.
float si;
double principal;
double rate;
int time;
si = (float)(principal * rate * time / 100);
We have to watch two types that are involved in this calculation: time, which is an int type, and si (simple interest), which is a float type.
Why would we use float instead of double for the simple interest si?
Suppose you have a method that takes a float as one of its arguments:
float DoSomeCalcOnSimpleInterest(float si)
{
...
}
The only type that this call will accept is a float, so the result must be a float.
Why not make all of the arguments type float? You might need the extra precision during the calculation, even if the answer can be sufficiently represented in a float.
D) Illegal operations: Examples typically include division by 0, generating NaNs (that's a whole other ball of wax), etc. Typically the way to guard against these edge cases, in addition to parameter validation, is to frame the code in an exception handler. This might not always be possible, such as in embedded development where the language support for exception handling may not be included as memory space is a precious commodity.
You might also be writing code on a target processor that does not trap arithmetic exceptions i.e. there is no support to trap/propagate exceptions to the upper layer where the application lives. These insidious types of errors/bugs are difficult to trace if care is not taken in the development of the codebase.
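As a side note for the JavaScript/TypeScript case: floating point division by zero does not throw there, it quietly produces Infinity or NaN, so an exception handler alone will not catch it. A minimal sketch of an explicit guard follows; `safeDivide` is a hypothetical helper, not a library function.

```typescript
// In JavaScript these do not throw; they produce special values instead.
const rawQuotient = 1 / 0;   // Infinity
const rawUndefined = 0 / 0;  // NaN

// Hypothetical guard: reject bad operands and bad results explicitly.
function safeDivide(numerator: number, denominator: number): number {
  if (!Number.isFinite(numerator) || !Number.isFinite(denominator)) {
    throw new RangeError("operands must be finite numbers");
  }
  if (denominator === 0) {
    throw new RangeError("division by zero");
  }
  const quotient = numerator / denominator;
  if (!Number.isFinite(quotient)) {
    throw new RangeError("result overflowed");
  }
  return quotient;
}
```

Checking the result as well as the operands matters because a NaN that slips through compares unequal to everything, including itself, and tends to surface far from where it was created.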
This missive is not meant to be an exhaustive treatise into implementing floating point support in an application. It should, however, provide you with starting guidelines as you design and develop your software.
If you are developing in C++ or C#, you might want to take a look at this:
Implicit Operators in C#: How To Simplify Type Conversions
(https://www.codeproject.com/Articles/5378558/ImplicitOperatorsinCsharpHowToSimplifyType?msg=5991279#xx5991279xx)





Thanks for the reply Stacy. These are all great points. For this project, I'm in JavaScript/TypeScript and dealing with money. So there is no decimal type. But, after this chat I decided to just add two extra decimal places of resolution. So, I'll store a currency amount as 1.1234 and only round it off to 2 during reporting.
Stacy Dudovitz wrote: Conversions: Be very very careful about mixing types i.e. decimal, float and double.
Tru dat. Not sure about C#, but in JavaScript/TypeScript I only have one level of precision from a data type. As a bonus though, there is a cool way to help avoid mixing faux types.
export type Distinct<T, DistinctName> = T & {
__TYPE__: DistinctName
};
export type NumericTypeOne = Distinct<number, 'NumericTypeOne'>;
export type NumericTypeTwo = Distinct<number, 'NumericTypeTwo'>;
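A quick sketch of how that branding pattern might be used for money. `UsdCents`, `EurCents`, `usd`, `eur` and `addUsd` are made-up names for illustration; the brand exists only at compile time and adds nothing at runtime.

```typescript
// Same branding pattern as above, repeated so the snippet stands alone.
type Distinct<T, DistinctName> = T & { __TYPE__: DistinctName };

// Hypothetical currency brands.
type UsdCents = Distinct<number, "UsdCents">;
type EurCents = Distinct<number, "EurCents">;

// The cast is the single place where a raw number becomes a branded one.
const usd = (n: number): UsdCents => n as UsdCents;
const eur = (n: number): EurCents => n as EurCents;

// Arithmetic on branded numbers yields a plain number, so the brand is re-applied.
function addUsd(a: UsdCents, b: UsdCents): UsdCents {
  return (a + b) as UsdCents;
}

const itemPrice = usd(1999);
const shippingFee = usd(500);
const euroFee = eur(250);
const orderTotal = addUsd(itemPrice, shippingFee);

// addUsd(itemPrice, euroFee); // rejected by the compiler: EurCents is not UsdCents
```

Since the brand is erased when compiling, `orderTotal` is an ordinary number at runtime and costs nothing extra.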
Stacy Dudovitz wrote: Implicit Operators in C#: How To Simplify Type Conversions
If I'm ever in C/C++ land again I'll check it out. Thanks.
Jeremy Falcon





I was a bit alarmed by your reply and solution below:
For this project, I'm in JavaScript/TypeScript and dealing with money. So there is no decimal type. But, after this chat I decided to just add two extra decimal places of resolution. So, I'll store a currency amount as 1.1234 and only round it off to 2 during reporting.
There are two possible solutions:
1) If you are always/only going to traffic in money, a more robust solution would be to use integer math and display formatting.
As an example, the value '$1.23' would be stored in an integer of sufficient size to house the min/max dollar value you wish to traffic in. Using RegEx, it would be trivial to strip off the '$' and '.', yielding a value of the price offset by a factor of 100. To display, you could use simple formatting. You can store the values as-is to a data store, or, if you require marshaling of values, divide the value by 100 by first casting the value to float and dividing by 100f.
In this case, I would use Number or BigInt. A quick search on the largest integer type gives the following results:
The biggest integer type in JavaScript is BigInt. It was introduced in ECMAScript 2020. BigInts can store integers of any size, while the Number type can only store integers exactly between -(2^53 - 1) and 2^53 - 1.
2) You could incorporate decimal.js into your project, which will provide you with the decimal type you seek.
You can find that here:
https://mikemcl.github.io/decimal.js/
Whichever way you choose, I would implore you NOT to add arbitrary/additional numbers to the right of the decimal place. It will come back to bite you!
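For option 1, a rough TypeScript sketch of the regex parse and the Number limit might look like this. `parseDollarsToCents` is a hypothetical helper, and the accepted input format (optional minus, optional '$', at most two decimals) is an assumption made for the example.

```typescript
// Hypothetical helper: turn a string such as "$1.23" into integer cents (123).
function parseDollarsToCents(text: string): number {
  const match = /^(-?)\$?(\d+)(?:\.(\d{1,2}))?$/.exec(text.trim());
  if (match === null) {
    throw new SyntaxError(`not a currency amount: ${text}`);
  }
  const sign = match[1] === "-" ? -1 : 1;
  // Pad "5" to "50" so that "$1.5" means one dollar fifty.
  const fraction = (match[3] ?? "").padEnd(2, "0");
  const cents = Number(match[2]) * 100 + Number(fraction);
  // Number is exact only for integers up to 2^53 - 1 in magnitude.
  if (!Number.isSafeInteger(cents)) {
    throw new RangeError("amount is too large for an exact Number");
  }
  return sign * cents;
}

const parsedPrice = parseDollarsToCents("$1.23");

// Past Number.MAX_SAFE_INTEGER, neighbouring integers collapse onto the same value.
const collapsed = Number.MAX_SAFE_INTEGER + 1 === Number.MAX_SAFE_INTEGER + 2;

// BigInt has no such limit, but it cannot be mixed with Number in arithmetic.
const hugeCents = BigInt(Number.MAX_SAFE_INTEGER) + BigInt(2);
```

For everyday currency amounts a Number holding cents has plenty of headroom; BigInt only becomes interesting if totals could approach the 2^53 limit.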





I started with integer math. Moved away from it after finding out that doing it to 4 places makes it match up well enough. About the library, decimal.js, it's too bloated for my needs (fast calculations) and doesn't really offer anything that I didn't have when I just was rounding everything.
Jeremy Falcon





Maybe you want a currency library? You can control the rounding too with this.
currency.js





Wordle 1,006 5/6*
🟨⬛⬛⬛🟩
🟩🟩🟩⬛🟩
🟩🟩🟩⬛🟩
🟩🟩🟩⬛🟩
🟩🟩🟩🟩🟩





Wordle 1,006 4/6
🟨🟨⬜🟨⬜
🟨⬜🟨🟨⬜
⬜⬜🟩🟩🟩
🟩🟩🟩🟩🟩





Wordle 1,006 2/6
🟩⬛⬛⬛⬛
🟩🟩🟩🟩🟩





Wordle 1,006 4/6
⬜🟨⬜🟨🟩
🟩⬜🟩⬜🟩
🟩⬜🟩🟩🟩
🟩🟩🟩🟩🟩





Wordle 1,006 3/6
🟨🟨🟩⬛⬛
🟩🟩🟩⬛🟩
🟩🟩🟩🟩🟩
Jeremy Falcon





Wordle 1,006 5/6
🟩⬜🟩⬜🟩
🟩⬜🟩⬜🟩
🟩⬜🟩🟩🟩
🟩🟩🟩⬜🟩
🟩🟩🟩🟩🟩
Fat fingered the penultimate attempt.





Wordle 1,006 4/6*
🟨⬜⬜🟨🟩
🟩🟩🟩⬜🟩
🟩🟩🟩⬜🟩
🟩🟩🟩🟩🟩
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
"Common sense is so rare these days, it should be classified as a super power" - Random T-shirt
AntiTwitter: @DalekDave is now a follower!





⬜⬜🟩⬜🟩
⬜⬜⬜⬜🟨
⬜⬜🟩🟨🟩
🟩🟩🟩🟩🟩
In a closed society where everybody's guilty, the only crime is getting caught. In a world of thieves, the only final sin is stupidity. - Hunter S Thompson - RIP




