What should be the API for integer factorizing function?

Question

0.00/5 (No votes)

See more:

I'm writing a C++ library with high-performance operations on integers.
One of the functions will be integer factorization.

Now, I stumbled on the problem, what should the API for the function be?

32-bit unsigned integers can have up to 10 different prime factors.
64-bit unsigned integers can have up to 16 different prime factors.

There can be only one prime factor >= sqrt(number)

The easiest solution would be to just return a std::vector of pairs {prime, power} - but I don't want to perform any allocations. I don't want to depend on any external library.
Also, I want the function to be constexpr, and C++14 standard library limitations disallow usage of std::array here.

But that leads to a problem - the sizeof of the returned value will be pretty large, and even with RVO (return value optimization) it seems not ideal.

What API should this function have? A C-like one, when user provides an output buffer / output iterator ? Returning some class that has the values compressed, and provides a forward iterator to go through them?

Or am I overthinking things, and should just return a struct-wrapped array with all the pairs prime-power and don't care about the size?

This question is opinion-based, so there is no a good or bad answer, but I'd like to hear some opinions about it before deciding.

What I have tried:

For uint32_t, the simplest solution:

C++

struct FactorU32 {
    uint32_t prime;
    uint32_t power;
};

class FactorizationU32 {
    FactorU32 factors[10];
    size_t count;
    // here some accessors, iterators etc.
};

FactorizationU32 factorize(uint32_t n);

For uint64_t, the simplest solution:

C++

struct FactorU64 {
    uint64_t prime;
    uint64_t power;
};

class FactorizationU64 {
    FactorU64 factors[16];
    size_t count;
    // here some accessors, iterators etc.
};

FactorizationU64 factorize(uint64_t n);

One of the technics that would reduce return value size, but would require custom iterators for accessing:

C++

class FactorizationU32 {
    uint16_t primes[10];
    // if powers[i] == -1, then the prime is (primes[i] << 16) | primes[i + 1]
    uint8_t powers[10];
    size_t count;
    // here accessors, custom iterators etc.
};

(and equivalent for uint64_t)

Posted 27-Aug-21 0:56am

Member 15338211

Updated 27-Aug-21 5:18am

v2

Add a Solution

Comments

armagedescu 27-Aug-21 11:29am

>>This question is opinion-based
That's classification on stackoverflow.
>>The easiest solution would be to just return a std::vector of pairs {prime, power}:
no it is not. It is std::map.
>>but I don't want to perform any allocations:
what is exactly the reason for that?
>>I don't want to depend on any external library.
STL is standard library, it is not external. Use any part of STL that is not experimental.
>>I want the function to be constexpr
constexpr doesn't bring any performance improvements unless it can be calculated at compile time. Your problem by definition run time, not compile time. You even should take in consideration dynamic programming.

Member 15338211 27-Aug-21 12:20pm

@armagedescu I'm aware of that. Still, I want my function to be constexpr. I don't have any problems with implementing it, I just can't decide on the API. The library is not just for me - but for general use, that's why API is important. I strive for it to be safe, easy and intuitive to use.

armagedescu 28-Aug-21 6:05am

Ok.
>>just can't decide on the API
Any API which is standard. Especially any part of STL that is not on experimental is a good choice. That will not impose any special prerequisites and thirdparties to any user which is not you.
>>Still, I want my function to be constexpr. I don't have any problems with implementing
I am curious. Are you using any kind numeric database? Will you make this library public?

Member 15338211 28-Aug-21 7:34am

Yes, public library under MIT license. There is quite a lot of number-theoretic algorithms that are not to be found implemented efficiently anywhere (with currently available libraries being at least 8x slower than possible). I'm implementing it with usage of assembly and compiler intrinsics for different platforms. I try to avoid allocations whenever possible, so that it can be used on embedded systems without global allocators too.

Patrice T 27-Aug-21 13:31pm

What is your algorithm for factorization ?

Member 15338211 28-Aug-21 5:47am

Modified Pollard's Rho algorithm with GCD skipping, Brent's cycle finding and Montgomery multiplication + Miller-Rabin deterministic prime 5-test with precomputed magic tables for all 64-bit base 2 SPRP pseudoprimes.

Patrice T 28-Aug-21 15:08pm

Nice.
On my side, I have been playing with compression on the list of primes, and with optimization of the trial division.
Integer Factorization: Dreaded List of Primes[^]
Integer Factorization: Reversing the Multiplication[^]

Member 15338211 30-Aug-21 8:41am

Nice articles! I'm using a bitfield of odd numbers as a moving sieve for prime generation. I didn't think of a compression using a wheel as you did before, but it might be considerably slower due to all the branches.

1 solution

Add a Solution

Add your solution here

Treat my content as plain text, not as HTML

Preview 0

…

Existing Members

Sign in to your account

...or Join us

Download, Vote, Comment, Publish.

Your Email
Password
Forgot your password?

Your Email
This email is in use. Do you need your password?
Optional Password

I have read and agree to the Terms of Service and Privacy Policy
Please subscribe me to the CodeProject newsletters

When answering a question please:

Read the question carefully.
Understand that English isn't everyone's first language so be lenient of bad spelling and grammar.
If a question is poorly phrased then either ask for clarification, ignore it, or edit the question and fix the problem. Insults are not welcome.
Don't tell someone to read the manual. Chances are they have and don't get it. Provide an answer or move on to the next question.

Let's work to help developers, not make them feel stupid.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Rick York · Answer 1 · 2021-08-27T05:18:00

I think your fourth and fifth sentences are incorrect. A 32-bit value can have up 31 prime factors if they are all 2. Similarly a 64-bit value can have up to 63 prime factors. Unless I am mistaken, you can not have more than that number of factors for any values of those types. When I have done this kind of thing I usually pass in a pointer to an array of integers that has 64 items and all are initialized to zero. The array will be fill with all prime factors of the input value and, due to the nature of the factoring algorithm, the factors will be automatically sorted with smallest ones first. If a factor is used multiple times then it appears multiple times in the array. If the input is 32768 then 2 will appear fifteen times. The algorithm returns the number of factors, fifteen for 32768.