Currently, there are a lot of libraries that offer arbitrary precision arithmetic types (be it integer or float) with a complete interface, high performance, and reliability (GNU MP, Boost.Multiprecision, Crypto C++). However, adding them as a dependency to your project is not a singlestep task. Small standalone libraries do exist, but they usually have certain flaws: the base is a power of 10, it is necessary to compile sources (.cpp), they are not reliably tested, and so on. This article introduces a new library, called Arbitrary Precision, which offers a reliable, flexible, and relatively fast long integer class.
About the Library
Arbitrary Precision (AP) is available on GitHub: https://github.com/arbitraryprecision/ap. Its philosophy is intuitiveness. The library must be easy to add to a project, easy to use, and easy to have as a dependency.
Core traits of the library are:
 Headeronly by default. Source compilation is an optional feature.
 Crossplatform (build verified for MSVC and GCC).
 Supports multiple standards (C++11, C++14, C++17).
 No warnings during compilation.
 Thoroughly tested.
Version 1.2.0 offers two types  signed and unsigned fixedsize long integer. Both have the following interface:
 Copy and move constructors. Both are crossclass  it is possible to copy or move integer of any size and signedness to any other integer.
 Constructors, methods, and operators for
std::string
and const char*
. The verified supported base range is [2, 16].  Arithmetic, bitwise, and comparison operators are overloaded to work with long integers of different sizes or signedness.
 Exception from the above: bit shift operators accept only unsigned long long as a right operand (temporarily).
 All operators and constructors for the basic integer types are overloaded.
istream
/ostream
overloads, with srd::ios_base::fmtflags
detection for base.
Using the Code
Library integration is simple. It is needed to clone the repository and include "ap.hpp" header.
In the global scope, two templates are available: ap_int<size_t>
and ap_uint<size_t>
. Template parameter is a type size in bits.
There is no need to describe and demonstrate the behavior in detail since the types behave like int
. Short example:
#include <iostream>
#include <ap/ap.hpp>
ap_int<256> fact(ap_uint<256> val)
{
ap_int<256> result = 1; for (ap_int<128> i = 1; i <= val; ++i) {
result *= i; }
return result;
}
int main()
{
std::cout << fact(30) << std::endl; std::cout << fact(ap_int<128>(30)) << std::endl; std::cout << fact(ap_uint<256>(30)) << std::endl; std::cout << fact(ap_int<256>(30)) << std::endl; return 0;
}
What to remember when using AP integers:
 Signed integers behave as if they have two's complement representation. There is no extra sign bit.
 Division by zero triggers the native behavior of a platform.
 Bit shifts are arithmetic. If a shift value exceeds the size of an integer, the result is
0
. Exception: if it is a right shift and the integer is negative, the result is 1
(signfill).  Size in bits must be greater than the size of unsigned
long long
.  Size in bits should be a multiple of 64. it is not mandatory, but there is no need to play with fire.
Long integer types also offer flexible string conversion interface. The most flexible method to convert from any string is "set
":
integer& set(const char* str,
index_t size = 0, index_t base = 0, const char* digits = AP_DEFAULT_STR);
There are overloads available, which call this method:
integer& set(const std::string& str, index_t base = 0, const char* digits = AP_DEFAULT_STR)
explicit integer(const std::string& str, index_t base = 0,
const char* digits = AP_DEFAULT_STR)
explicit integer(const char* str, index_t size = 0, index_t base = 0,
const char* digits = AP_DEFAULT_STR);
integer& operator=(const std::string& str)
integer& operator=(const char* str)
There are two methods to convert to std::string
:
std::string str(index_t base = AP_DEFAULT_STR_BASE,
const char* digits = AP_DEFAULT_STR) const;
explicit operator std::string() const;
Example:
#include <iostream>
#include "ap/ap.hpp"
int main()
{
ap_int<128> a{"0b1"}; ap_int<128> b{std::string("Z"), 3, "XYZ"}; ap_int<128> c;
c = "3"; ap_int<128> d;
d.set("1009736", 4, 2);
std::cout << std::string(a) << ' ' << std::string(b)
<< ' ' << std::string(c) << ' ' << std::string(d) << '\n';
std::cout << a.str(2) << ' ' << b.str(2)
<< ' ' << c.str(2) << ' ' << d.str(2) << '\n';
std::cout << a.str(3, "XYZ") << ' ' << b.str(3, "XYZ")
<< ' ' << c.str(3, "XYZ") << ' ' << d.str(3, "XYZ") << '\n';
}
Things to remember:
 Regardless of the
base
, the resulting string
has "sign and magnitude" representation.  If the
base
is binary, octal, or hexadecimal, the corresponding prefix is always appended.
I/O: iostream
operators are available
Input: reads std::string
from an input stream and converts it to a long integer
Output: converts to the base, specified by format flags, and then writes to an output stream
Source Compilation Mode
If macro AP_USE_SOURCES
is defined, then .cpp files must be added to build and compiled. If you prefer the classical way of source code organization, this option is for you. Note that if .cpp files are added to build, but AP_USE_SOURCES
is not defined, compile guard will handle this. Consider pair of sources: asm.hpp with declarations and asm.cpp with definitions:
#ifndef DEOHAYER_AP_ASM_HPP
#define DEOHAYER_AP_ASM_HPP
#ifndef AP_USE_SOURCES
#define DEOHAYER_AP_ASM_CPP 0 // For compile guard compatibility.
#include "asm.cpp"
#endif
#endif
#ifndef DEOHAYER_AP_ASM_CPP
#define DEOHAYER_AP_ASM_CPP 1 // Source compilation case.
#endif
#if (DEOHAYER_AP_ASM_CPP == 1) == defined(AP_USE_SOURCES)
#endif
Performance
Measurement is straightforward. For the given bit size of an integer:
 Generate the left operand, filled by 25%50%. In other words, any bits beyond the lower half will be 0.
 Generate the right operand, filled by 1225%.
 Perform the operation, measure time in nanoseconds.
 Steps above repeated 10000 times, total time is a KPI.
This approach does trigger long division and guarantees that multiplication of the size fits into the type (complete multiplication will be done). The table shows the ratio "AP/Boost" for the given integer bit size and operation:
Bit size
 +
 
 *
 /
 %
 <<
 >>

128
 1.74487
 2.23426
 2.43082
 6.32478
 5.87538
 2.17034
 1.6978

512
 1.23195
 1.43016
 1.16948
 0.862215
 0.96964
 1.43523
 1.63936

4096
 0.960102
 1.12024
 0.980719
 0.444539
 0.487134
 1.21475
 1.38079

10240
 1.41084
 1.23715
 0.933805
 0.380653
 0.408858
 1.32783
 1.36085

Compiled with GCC version: Ubuntu 7.5.03ubuntu1~18.04.
AP is considerably slower only on 128bit values because it is not optimized for such cases.
In all other cases, AP is on par with Boost. Linear functions are slightly slower. Nonlinear ones are faster.
Note: https://github.com/arbitraryprecision/ws/blob/master/src/measure/measure.cpp was used for the measurements. But because Boost uses __int128
internally, I had to switch AP to this type too (in root CMakeLists.txt):
add_executable(measure ${MEASURE_SRC_FILES})
target_compile_options(measure PUBLIC
DAP_WORD=unsigned\ long\ long
DAP_DWORD=unsigned\ __int128
)
__int128
is not fully supported yet  it breaks interaction with the basic types. However, it has no negative impact in other cases, therefore suitable for the measurement.
Implementation
Arbitrary Precision has three fundamental types:
word_t
 an array of word_t
represents a long integer value. Type can be set via AP_WORD
macro. dword_t
 holds a result of an operation on two word_t
s. It allows tracking carry, borrow, etc. Type can be set via AP_DWORD
macro. index_t
 same as size_t
.
There are five core concepts:
dregister<T>
(core.hpp)  POD struct, a lightweight sign and magnitude representation of the long integer. Consists of: words
 pointer to an array of word_t
; capacity  array size; size
 number of digits which represent the current value; sign
 indicates whether the value is negative. T
is a type of the pointer. Either const word_t*
or word_t*
. integer<size_t, bool>
(integer.hpp)  container that owns an array of words. ap_int<>
and ap_uint<>
are partial specializations of this template.  Algorithm functions (asm.*)  pure algorithms. They put strict constraints on their
dregister<>
operands, work only with nonnegative values. Responsibility: perform actual operation wordwise.  Internal functions (integer_api.*)  two sets of functions. Each set treats all
dregister<>
operands either as signed or unsigned. No constraints on operands. Responsibility  adjust operands, call algorithm, perform normalization afterwards.  External functions (integer.hpp)  methods and operators of the
integer<>
class. Responsibility  grant crosstype interaction and call internal functions.
Normalization is a process, which ensures that dregister<>
, represents the value correctly (is normalized):
size
must be set in a way, that words[size  1]
is not 0. Size 0 means that integer represents value 0. sign
must be 0 if size
is 0.  (signed only) Value represented by
dregister<>
must be in range [2^{n}, 2^{n}  1], where n  capacity
of the dregister<>
in bits. It means granting two's complement behavior for the library user.
Testing
The problem is how to cover all the possible scenarios and still have a transparent, reproducible approach. The solution is to use all possible parameter combinations:
capacity
. 2 cases: large (512 bits) or small (256 bits), for 3 operands (left, right, out). 8 combinations. This is the only parameter where the output dregister<>
is considered. The output might be too small to hold the value  wrapping behavior is tested too. size
. 5 cases: 1 word, 25%, 50%, 75%, 100% of the total capacity is used. 25 combinations for the two operands.  signedness (not sign). 2 cases: signed and unsigned. 4 combinations for the two operands.
 Bit pattern. 10 cases, shown in the table. 100 combinations for the two operands.
Name  Pattern 
All zeros  00000000 00000000 ... 00000000 00000000 
All ones  11111111 11111111 ... 11111111 11111111 
Small chess 0  01010101 01010101 ... 01010101 01010101 
Small chess 1  10101010 10101010 ... 10101010 10101010 
Large chess 0  00000000 11111111 ... 00000000 11111111 
Large chess 1  11111111 00000000 ... 11111111 00000000 
Only LSB  00000000 00000000 ... 00000000 00000001 
Only MSB  10000000 00000000 ... 00000000 00000000 
Missing LSB  11111111 11111111 ... 11111111 11111110 
Missing MSB  01111111 11111111 ... 11111111 11111111 
The total number of cases for a single binary operation is 80000. Testing is performed under valgrind on internal and external functions. All tests pass without any complaints from valgrind.
Future Improvements
 C++20 compliance
 Support of implementationspecific integer types (
__int128
)  Optimization of different routines
 Documentation
 More functions (
sqrt
, lcm
, gcd
, pow
, etc.)  Floatingpoint type
 Assembly optimization
History
 13^{th} December, 2021: Initial version
Bachelor of Computer Science, Ivan Franko National University of Lviv, Ukraine.
Primary specializations: C++, OOP, Data Structures and Algorithms.
Secondary specializations: C, Linux, AOSP, Hardware Virtualization.