Click here to Skip to main content
15,672,399 members
Articles / Productivity Apps and Services / Microsoft Office / Microsoft Excel
Article
Posted 18 Sep 2018

Stats

7.8K views
2 bookmarked

How Baseball Led to a Quest for Software Coding Optimization

Rate me:
Please Sign up or sign in to vote.
4.93/5 (9 votes)
18 Sep 2018CPOL26 min read
A search for efficient algorithms to find Ruth-Aaron pairs

Background

My story begins on February 6, 1895, in Baltimore, Maryland. On that day, a baby boy by the name of George Herman Ruth Jr. was born. At that time, no one could know the fame that would come his way. Later nicknamed “The Bambino” and “The Sultan of Swat” and “Babe”, he began to show proficiency in the game of baseball. On July 11, 1914, at the age of 19, he made his major league debut as a pitcher with the Boston Red Sox. But wanting to have more play time, he was allowed to convert to an outfielder. In 1919, he broke the MLB single-season home run record. In 1920, Ruth was sold to the New York Yankees where he continued to hit home runs. By the time he retired in 1935, he had 714 lifetime home runs, a record that seemingly could not be broken.

The next part of my story begins in Mobile, Alabama, on February 5, 1934, one day shy of 39 years after Babe Ruth was born, and the last year he played for the New York Yankees. (He played part of another year for the Boston Braves in 1935.) On that day, a baby boy by the name of Henry Louis “Hank” Aaron was born. Also known as “Hammer” or “Hammerin’ Hank”, he made his major league debut on April 13, 1954, at the age of 20. Over the next 22 years, he would play for the Milwaukee Braves (picking up with the Braves where Ruth left off), the Atlanta Braves, and the Milwaukee Brewers. Aaron was a home run hitter like Ruth. By the end of the 1973 season, he had hit 713 lifetime home runs, just one shy of the 38-year-old record set by Ruth.

You can imagine the fervor in Atlanta between the 1973 and 1974 seasons, with many fans anticipating Hank breaking Babe’s lifetime home run record. You could ask, “When’s 715 going to happen?” and everyone knew what you were talking about. It turned out that April 8, 1974, was the answer to that question as Aaron eclipsed Ruth’s record by hitting his 715th home run for the major leagues.

One person that took notice of the excitement, and especially the numbers, was a young assistant professor of mathematics at the University of Georgia by the name of Carl Pomerance. He noticed that the product of 714 and 715 was also the product of the first seven prime numbers. Then a student of a colleague noticed that the sum of 714’s prime factors was equal to the sum of 715’s prime factors. A prime number is evenly divisible by only 1 and itself. Here’s the information in a nutshell.

2 x 3 x 5 x 7 x 11 x 13 x 17 = 714 x 715 = 510,510

714 = 2 x 3 x 7 x 17

715 = 5 x 11 x 13

Sum of 714’s prime factors = 2 + 3 + 7 + 17 = 29

Sum of 715’s prime factors = 5 + 11 + 13 = 29

Interestingly, the sum of the prime factors, 29, is also a prime number, and the product of 714 and 715 is interesting in that it is a repeated 510. One for Ruth and one for Aaron, perhaps? The prime factors of 510 are 2, 3, 5, and 17, which add up to 27. As baseball fans probably know, April 27 is known as National Babe Ruth Day. It was on that day in 1947 that Ruth spoke to a crowd of almost 60,000 people at Yankee Stadium while suffering from the pains of cancer. Concerning Hank Aaron, Wikipedia says that 1955 marked the beginning of the “prime” of his career. In that year, he hit 27 home runs and was honored in various ways. Coincidence?

It turns out that the number of occurrences of two consecutive integers having prime factors that sum up to the same value is very rare. These became known as Ruth-Aaron Pairs. Even more rare is having an RA Pair whose combined prime factors are consecutive. In fact, it seems that the 714-715 combination is unique in this respect except for the trivial pair of 5-6 where the prime factors are 2, 3, and 5. I call it trivial because 5 has only one prime factor. By the way, for you true geeks, notice that Hank Aaron was born on the 5th day of the month and Babe Ruth was born on the 6th day of the month. So, perhaps the unique 5-6 pair should be called an Aaron-Ruth Pair. And for the extreme geek, notice that both Ruth and Aaron were born in the second month of the year, and 2 is the first prime number.

My Introduction to Ruth-Aaron Pairs

I first read about Ruth-Aaron Pairs in a book loaned to me by a friend who purchased the book for 75 cents at a used bookstore. The book was “The Man Who Loved Only Numbers: The Story of Paul Erdos and the Search for Mathematical Truth” by Paul Hoffman. I became so intrigued by the idea of RA Pairs, I decided to use my computer to find these pairs for myself. This was the beginning of a two-month search for programming optimization. Although the process of finding RA Pairs is straightforward, doing so in the most efficient way possible is not.

Let me first start by giving you an overview of the algorithm I used to find RA Pairs.

  1. Step through the integers starting with 2, setting this value to be a Ruth number candidate.
  2. Add 1 to the Ruth number candidate to get the Aaron number candidate.
  3. Find the prime factors for the Ruth number candidate and sum them up.
  4. Find the prime factors for the Aaron number candidate and sum them up.
  5. If the two sums are equal, the numbers are a Ruth-Aaron Pair.
  6. Additionally, the prime factors can be checked to see if they are unique, meaning no factors were duplicated, within and across the Ruth-Aaron Pairs.
  7. Further, if the prime factors are unique, they can be checked for consecutiveness.

Using Excel

My search began using Microsoft Excel 365 and its built-in programming language, Visual Basic for Applications (VBA). I won’t get into the programming specifics until later as there were a lot of twists and turns throughout this entire process. For now, I just want to give you an idea of the things I tried and how they fared.

Let me start by saying that my desktop computer is still fairly powerful even though it is now six years old. It has an Intel Quad Core i7-3770K 3.50 GHz CPU that is overclocked to 4.20 GHz with liquid cooling and has 32 GB of DDR3/1600MHz Dual Channel Memory. It is running the latest version of Windows 10.

Given that an RA Pair search involves prime numbers, it was logical that I would need a list of primes to start with. I originally generated primes myself and put them on a worksheet in my Excel workbook.

I set up another sheet where the starting and ending Ruth numbers could be selected. As the program found RA Pairs, data about those pairs were added to the sheet as shown below. The Time column allowed me to determine how fast the program was running.

Image 1

Full size image: Figure1.jpg

The algorithm for finding the prime factors for each candidate Ruth and Aaron number was as follows:

  1. Check if the Ruth or Aaron candidate is prime. If so, the only prime factor is itself. Done.
  2. If the candidate is not prime, start with the first prime, 2.
  3. If the candidate is evenly divisible by this prime, add it to a factors array. Make the quotient of the division the new number to test.
  4. Keep testing the same prime, adding it again to the factors array each time it is evenly divisible. Each time make the quotient the new number to test.
  5. When that prime is no longer evenly divisible into the number, move to the next prime.
  6. Repeat Steps 3 – 5, until the quotient is itself prime and is no longer evenly divisible by any number other than itself. Add this final quotient to the factors array.
  7. The factors array will now contain all the prime factors for the original number.

You might think that determining whether a number is prime in Step 1 would simply entail searching the prime number list and, if found, flagging the candidate number as prime. But this proved to be very slow, especially for large numbers, even when the primes were first read into an array. A better way is to begin dividing the candidate number by prime numbers, starting with 2. If just one prime is evenly divisible into the candidate, it is not prime. If, however, a prime whose value is more than half of the candidate number is reached and none were evenly divisible into the candidate, then the candidate is itself prime.

Using these methods for finding primes and RA Pairs, I generated the RA Pairs for the first million integers and, to compare speed, the million integers between 9 and 10 million. Here are the results:

Range of Values Searched for RA Pairs Time to Complete
2 – 1,000,000 0 h 31 m 25 s
9,000,001 – 10,000,000 4 h 44 m 33 s

As expected, with larger numbers the time to find RA Pairs dramatically increased. This was due to the greatly increased number of calculations needed to test for primes and to find prime factors.

I used Windows Task Manager to see how much of my CPU’s processing power was being used as the Excel program was running. It was only about 18%. I wondered if I could speed things up by running multiple instances of the Excel program simultaneously. Since I had a four-core CPU with each core capable of two hyperthreads, I decided that eight instances would be the maximum number I could run and still get speed increases. When I tested two instances, the CPU usage went up to 36%, which was encouraging. But it leveled out at about 45-50% for three or more, not much of a speed gain. And it was a real pain setting up all the instances and splitting the duties between them.

Using Visual Basic .NET

At this point, I decided that to get any more speed, I would have to move the program from VBA to Visual Basic .NET since VBA does not fully compile as does VB.NET. I had programmed a lot back in the days of Visual Basic 6 and earlier, but very little with VB.NET. But what the heck, this was an opportunity to learn. VB.NET 2017 Community Edition was installed on my computer, so I took my existing VBA program and copied it over to a VB.NET program. I had to build my own interface and modify the code for everything to work. I ultimately decided that the prime numbers needed to be loaded into an array from a text file rather than from an Excel sheet, but I still thought using Excel for exporting the final results of the RA Pairs search was a good way to go. With everything running, I ran a test to compare against the result presented earlier for the VBA program.

Range of Values Searched for RA Pairs Time to Complete
2 – 1,000,000 0 h 1 m 31 s

This meant that the VB.NET program was over 20 times faster than the VBA program without making any procedural changes to the code. Fantastic! But my need for speed was not yet satisfied.

Multi-Threading

The first thing that came to mind to further increase the search speed was to break the search down into multiple tasks and use multi-threading. This is similar to using multiple instances of Excel, but much easier to implement. Multi-threading is something I had been fascinated by but had never actually implemented in any program. So, it took a bit of research. However, once understood, it was really very simple. The hardest part was figuring out how to divvy up the search between the threads. Once again, since my CPU can handle eight hyperthreads, I guessed that eight threads would be a good maximum. To be safe, however, I allowed the program to run up to 16. Upon making the coding changes to handle the multi-threading, I obtained the following results.

Number of Threads Time to Find RA Pairs from 2 to 1,000,000
1 1 m 36 s
2 0 m 43 s
3 0 m 38 s
4 0 m 28 s
5 0 m 25 s
6 0 m 24 s
7 0 m 21 s
8 0 m 19 s
9 – 16 0 m 19 s

As you can see, my intuition was right. Eight threads maximized the speed. More threads added no additional speed. I also discovered that at eight threads or higher, the CPU was being utilized at 100%. I noticed that when running one thread, the speed was a bit slower than when threading was not being used. I assumed this was due to the overhead associated with threading. But the increase in speed using multiple threads more than compensated for the overhead slowdown.

Since the speed of finding RA Pairs decreases with larger numbers, I wrote my program to automatically split the search such that each thread had both small and large numbers. For example, if I chose to run four threads, the work would be divided as follows.

Thread 1 Pairs Search Thread 2 Pairs Search Thread 3 Pairs Search Thread 4 Pairs Search
(2,3), (6,7), (10,11), … (3,4), (7,8), (11,12), … (4,5), (8,9), (12,13), … (5,6), (9,10), (13,14), …

By dividing the work up this way, each thread had almost the same amount of duties to carry out. But as it turns out, some threads end up running faster because some pairs require less work to test. I could keep up with each thread’s progress via 16 progress bars on my program interface. Since some threads run faster than others, thus finishing early, having 16 threads was useful. With this many initial threads, over half the work could be finished (eight threads complete) before a slow-down occurs. If only eight threads are initially running, efficiency begins to decline starting with the first finished thread. So, having 16 threads available proved to be valuable.

Prime Testing

To recap, I had originally built my RA Pairs software in Excel using VBA and it took about 31.5 minutes to search the first one million potential pairs. I then converted the program to VB.NET and added multi-threading. Using eight or more threads reduced the search time for the first million potential pairs down to 19 seconds, an almost 100 times speed increase. Not bad. But once again, I was not satisfied. I believed there had to be a way to make the search even faster. And since the RA Pairs program spent most of its time either determining if Ruth and Aaron candidates were prime or finding the prime factors of those candidate numbers, I was pretty sure that was where I needed to start looking for more optimized algorithms.

I noticed that the Array class in VB.NET had a method named IndexOf that could search an array for a value. I thought that if this method had been optimized in some way, perhaps it could be used to determine if a number was prime by searching the Prime array for the number being tested. In actuality, the IndexOf method was extremely slow.

Methods of Determining if a Number is Prime

I began thinking about the algorithm I was currently using, which was to start with the lowest prime, 2, and keep stepping through the primes to see if any were evenly divisible into the number being tested. The search was to stop at half the number since no number is evenly divisible by a number bigger than half itself. Here’s the code used for this test.

VB.NET
Private Function IsPrime(n As Integer) As Boolean
    'n/2 Method
    Dim i As Integer
    Dim endVal As Integer = n / 2

    'loop through prime array, if n is evenly divisible
    'by one of these primes, then n is not prime

    i = 0

    Do While PrimeArray(i) <= endVal
        If n Mod PrimeArray(i) = 0 Then
            Return False
        Else
            i += 1
        End If
    Loop

    Return True

End Function

As I thought about how I was limiting the search for evenly divisible primes to n/2, it dawned on me that once 3 had been checked for divisibility, then no primes greater than n/3 needed to be checked. And once 5 had been checked then none greater than n/5 needed to be checked. So, what was the smallest value beyond which no check was needed? Well it would be where the prime number being checked, p, was equal to n/p. To find that point, simply set p equal to n/p and solve for p.

p = n / p
p^2 = n
p = sqrt(n)

Aha! It all made sense. There was no need to check any prime numbers for even divisibility beyond the square root of the number being tested. For if there is any prime beyond this point that is evenly divisible into the number being tested, there would have to be one or more primes smaller than the square root to multiply it by to get the number. And those would have already been tested.

To implement this algorithm, I simply changed the code above to set endVal equal to Sqrt(n) rather than n/2. Another way to obtain the square root of a number is to take the log base 10 of the number, divide that by 2, and then raise 10 to the result, or 10^(Log10(n)/2). I thought perhaps this calculation might be faster than using the square root function.

I needed to test all these different methods of finding primes to see how fast each one was. I wrote a separate program to do this. I had created a text file containing all the prime numbers between 2 and 100,000,000 that could be loaded into an array for testing the different algorithms. (Note: Creating a prime number generator is easy, but they are also available online.) Just to compare with the IndexOf method, I wrote my own routine to directly search the array. In this routine, I first eliminated even numbers by checking n modulo 2, then searched the prime array for the number being tested, stopping the search when the number was found or the array value became larger than the number. Here are the results:

Method Used to Determine if Number is Prime 1 Time to Test Numbers from 2 to 10,000 Time to Test Numbers from 2 to 1,000,000 Time to Test Numbers from 2 to 100,000,000
IndexOf 15.839 s Too Long Way Too Long
Direct Array Search 0.007 s 39.287 s Too Long
Search to n/2 0.002 s 4.576 s Too Long
Search to 10^(Log10(n)/2) 0.001 s 0.104 s 24.193 s
Search to Sqrt(n) 0.001 s 0.053 s 19.191 s

1 It took 1.508 seconds to load the file containing the primes in the first 100 million integers.

As you can see, the method of searching for an evenly divisible prime number up to the square root of the number being tested was indeed the fastest. However, I still felt that there had to be yet another faster way. I came up with two additional ideas to try.

More Methods of Determining if a Number is Prime

One idea was to predetermine the prime-ness of all integers up to 100,000,000 and store that information in a file. That file could be loaded into an array and all that would have to be done to decide if a number is prime would be to check the array value at the index represented by the number being tested. To illustrate, suppose that the array consists of Boolean values. It would look like this:

a(0) = False
a(1) = False
a(2) = True
a(3) = True
a(4) = False
a(5) = True
a(6) = False
a(7) = True
…

Using this array, testing a number for prime-ness is simple.

VB.NET
Private Function IsPrime(n as Integer) as Boolean
      Return a(n)
End Function

It’s so simple, in fact, that the function is not even needed; just put a(n) wherever a test is needed. I initially stored the words True and False on separate lines in the file, but soon discovered that reading that text and converting it to a Boolean array was slow. Ultimately, I found that storing zeros and ones and reading those into a character array, which I call the IsPrime array, was more efficient. However, having a separate line for each value made the file three times larger than necessary since each line had a CR and a LF character at the end. I cut the file size down by storing the zeros and ones all on one line in the file and then reading each value into the character array one at a time.

The second idea made use of the fact that when testing the values for prime-ness in the Ruth-Aaron Pairs search, the numbers being tested are always getting bigger. By setting up a static index value, the Prime array could be searched until the number being tested was found or exceeded. The current index into the array would remain until the next test of a higher number. The search would then continue from that index. This eliminated searching through lower values in the array for each tested number.

Here’s the results from these two tests:

Method Used to Determine if Number is Prime Time to Test Numbers from 2 to 10,000 Time to Test Numbers from 2 to 1,000,000 Time to Test Numbers from 2 to 100,000,000
IsPrimeArray Index 2 0.001 s 0.036 s 2.883 s
Static Index Search 0.000 s 0.003 s 0.306 s

2 It took 1.312 seconds to load a file containing prime information for integers through 100 million.

Obviously, the Static Index Search was the fastest method and the one that should be implemented in the RA Pairs program. However, doing this when multi-threading presented a problem. I couldn’t have just one static index for all threads as some threads could get ahead of others and mess up the index. Therefore, I had to create an array of indexes, one for each thread. But for some reason, this ran extremely slow. I believe it had something to do with each thread attempting to modify the same array at nearly the same time. So, I settled for the IsPrimeArray Index method, which was still almost seven times faster than the square root method for testing pairs up to 100 million. But it did mean having to read in yet another file into an array.

You might be asking yourself why I need to have both a Prime array that only contains prime numbers and an IsPrime array containing a “0” or a “1” for each integer indicating its prime-ness. Well, the first array makes it very fast and easy to find prime factors and determine if they are in consecutive order. The second makes it fast and easy to determine if a number is prime. But still, I did not like the idea that I had two files with redundant information. So, rather than reading a separate file into the Prime array, I decided to just generate that array while reading the other file into the IsPrime array by adding the index value to the Prime array when the value at that index is “1”.

When reading a file that contains zeros and ones for the first 100 million integers, the routine takes less than four seconds. When reading a file with prime information for the first billion integers, it took about 28 seconds.

I tried one other method of reading the IsPrime file into the program. Rather than read the zeros and ones into an IsPrime array, I tried reading the entire file into an IsPrime string. However, when attempting to read the file with information about the first billion integers, VB.NET threw an error, informing me of a memory overflow. I had been compiling the program for 32-bit computers, so switched to compiling for 64-bit to see if strings could be longer. It worked, and it only took about 18 seconds to load the file into a string (3 seconds) and create the Prime array (15 seconds), a 35% increase in speed over the previous method. The method of checking if a number is prime had to be changed from indexing into an array to indexing into a string. But this too showed an increase in speed.

Upon further thought, I realized that the Prime array did not need to include all the primes represented in the IsPrime string. While I would need to test large numbers for being prime, I would only need numbers in the Prime array up to the square root of the largest number represented in the IsPrime array. This is for the same reasons described earlier. By limiting the Prime array size, the time for loading the IsPrime string and creating the Prime array went down to about 3.5 seconds (3 seconds for loading the IsPrime string and 0.5 seconds for creating the Prime array). Here’s the code for the final routine.

VB.NET
    Private Sub LoadIsPrimeStr()
        Dim fn As String = "\IsPrimes-1,000,000,000.txt"
        Dim i As Integer = 0
        Dim j As Integer = 0
        Dim length As Integer
        Dim maxPrimeToLoad As Integer

        'read in IsPrime string
        IsPrimeStr = My.Computer.FileSystem.ReadAllText(txtPath.Text & fn)

        'get length of the string
        length = IsPrimeStr.Length

        'make Prime array bigger than needed, less than 6% of numbers are prime
        ReDim PrimeArray(0.06 * length)

        'populate PrimeArray up to sqrt of quantity in IsPrimeStr
        maxPrimeToLoad = Math.Round(Math.Sqrt(length) + 1)

        For i = 0 To maxPrimeToLoad
            If IsPrimeStr(i) = "1" Then
                PrimeArray(j) = i
                j += 1
            End If
        Next

        'redim to just what was loaded and free up memory
        ReDim Preserve PrimeArray(j - 1)
        GC.Collect()

    End Sub

Finding Prime Factors

Some of the lessons I learned while testing methods of determining the prime-ness of a number I put to use in the routine for finding a number’s prime factors. Here’s what the final iteration of the routine looks like:

VB.NET
    Private Function GetPrimeFactors(n As Integer, ByRef a() As Integer) As Integer
        Dim i As Integer = 0
        Dim k As Integer = 0
        Dim q As Integer = n
        Dim sum As Integer = 0

        ReDim a(0)

        If IsPrimeStr(n) = "1" Then 'number is prime
            a(k) = n
            Return n
        Else 'loop through primes to determine factors for this number
            Do
                If q Mod PrimeArray(i) = 0 Then
                    ReDim Preserve a(k)
                    a(k) = PrimeArray(i)
                    sum = sum + a(k) 'sum up factors
                    q = q / PrimeArray(i)
                    If q = 1 Then Exit Do
                    k += 1
                    If IsPrimeStr(q) = "1" Then 'quotient is prime so finished
                        ReDim Preserve a(k)
                        a(k) = q
                        sum += q
                        Exit Do
                    End If
                Else
                    i += 1
                End If
            Loop

            Return sum

        End If

    End Function

The Final Results

So, after implementing all these optimization changes to the RA Pairs program, what kind of speed did I obtain? Here’s a table showing where I started and ended.

RA Pairs Search Method Time to Complete for 2 to 1,000,000
Excel and VBA with no optimizations 0 h 31 m 25 s
VB.NET with no procedural changes from above 0 h 1 m 31 s
VB.NET with 8 threads 0 h 0 m 19 s
VB.NET with 16 threads using optimization 0 h 0 m 0.271 s
VB.NET compiled to 64-bit with 16 threads using all optimizations including InPrimeStr 0 h 0 m 0.224 s

WOW! From 31.5 minutes to 224 milliseconds. That’s quite an increase in speed. Almost 8500x faster. Given this level of speed, I could test for RA Pairs up to even larger numbers. Using the IsPrime file with prime information for numbers up to a billion, I found the RA Pairs up to that value. It took a second over 6 minutes to find the RA Pairs. There are 6,810 if you are interested.

One Approach Not Mentioned

At one point in my search for optimization, I realized that there were certain facts about integers that did not change, such as whether they are prime, what their prime factors are, the sum of those factors, if their prime factors are unique, and whether they are consecutive. So, why not precalculate that information, store it in a file, then read that file into the RA Pairs program and iterate through it to determine RA Pairs rather than having to calculate everything each time.

The idea worked, but there were two caveats: the time it took to load the file into an array and the amount of memory that array consumed. I could not load the entire file into an array before VB.NET balked, telling me there was a memory overflow. But even so, with all the optimizations I ultimately added to the program, it actually ran faster running the calculations on the fly after taking into account the load time of the precalculated values file.

Much Ado About Nothing?

At this point, you might be asking, “Why bother with all these optimizations? The Ruth-Aaron Pairs don’t change. Why not just let the program run for a week to generate the pairs, and then be done with it?” To that I say, “Good point!”

For me, this whole journey has been about learning. It was thrilling for me to find that one little change that would greatly speed up the program. And keep in mind also that with this increased speed, I can now find way more RA Pairs in a week then I could before the modifications.

But there is yet more to the story. Officially, a Ruth-Aaron Pair must be two consecutive numbers whose prime factors sum to the same value. However, this can be extended to what I call pseudo Ruth-Aaron Pairs. What about finding two numbers separated by two whose prime factors sum to the same value? Or separated by three, or four, or a million? Wouldn’t that be interesting also? With the greatly increased speed comes the ability to quickly search for all these pseudo RA Pairs.

Here’s the interface to my program:

Image 2

Full size image: Figure2.jpg

The Gap Between Pairs value is the amount of separation in the pseudo Ruth and Aaron values. Notice too that there is a place to enter the name of a batch file. This is another nice feature I added where you can fill out information about run parameters in an Excel template file. The program can read that file and make multiple runs unattended. The Factors File is the file containing the precalculated prime and prime factor information mentioned in the previous section. To use it, the Run Type must be set to RAPC.

Earlier, I presented what the output table in Excel looked like when the RA Pairs program was in VBA. That table changed over time. Here is what it currently looks like:

Image 3

Full size image: Figure3.jpg

Also, in the same file is another sheet for recording statistics.

Image 4

Full size image: Figure4.jpg

Latest Additions

The latest code I have added to my RA Pairs program gives it the ability to find All Factors for the RA Pairs, rather than just the prime factors, and determine if their sums equal each other. I wondered if there were any RA Prime Factors Pairs that were also RA All Factors Pairs. I haven’t found any yet. Here’s a table showing the last few Ruth-Aaron All Factors Pairs in the first billion integers.

Image 5

Full size image: Figure5.jpg

The algorithm for finding all factors was similar to the one for finding prime factors. However, since all factors include any integer, whether prime or not, the IsPrimeArray or IsPrimeStr test could not be used. However, the square root test was very useful. Here’s the algorithm I used where n is the number being factorized.

  1. If n = 1, then the only factor is 1.
  2. If n is prime, then the only factors are 1 and n.
  3. Otherwise, step through integers, i, from 1 to the square root of n. Step by 1 for even n, 2 for odd n, since odd numbers can only have odd factors.
  4. If i is evenly divisible into n, then i and n/i are factors.
  5. If n is the square of an integer, then the last factor will be Sqrt(n).

There are 533 Ruth-Aaron All Factors Pairs in the first billion integers. Interestingly, none of the Ruth or Aaron values in those pairs were squares of integers as accounted for in the last step above. I don’t know if this is true for all pairs or not. Perhaps a mathematician somewhere has proven this to be the case, but I am unaware of this proof.

Conclusion

This project has been great fun for me. I haven’t been this excited about programming since my days creating shareware with Visual Basic 6. Before that, I had a lot of fun creating shareware for the Amiga computer using C and C++ and writing articles about coding for Amiga computer magazines.

I hope you found my program optimization journey interesting, and that it gave you some ideas for your own programming.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Retired
United States United States
Randy C. Finch has a Masters degree in Chemical Engineering and worked in that capacity for several years after graduating from the University of Louisville in 1978. However, when the personal computer revolution hit in the early 1980's his interests shifted. Fortunately, he was able to move into computer work at the company for which he worked. Over the past 35 years he has programmed computers both at his job and at home. He became very proficient at Visual Basic, VBA, C, and C++. He also spent many years writing programs for the SAS System. He presented many papers at SAS conferences and eventually became a member of the Executive Council of the SouthEast SAS Users Group, co-chairing a conference in 1998. He spent many years writing Microsoft Office applications, mostly in Excel and Access. He has had over 50 papers published in journals and magazines, some work-related and others personal. He also wrote a regular column about Amiga computers in a computer journal for two years. Apart from his writing about computers, he has written poetry and has three published books; one humor book, one novel, and one non-fiction book. He enjoys photography and has won several awards for his photos as well as having many published. In his retirement years, he has recently become interested in programming with Visual Basic.NET.

Comments and Discussions

 
GeneralMy vote of 5 Pin
L Hills24-Sep-18 3:14
L Hills24-Sep-18 3:14 
QuestionWhy did you not use C++? Pin
Felix Keil20-Sep-18 18:43
Felix Keil20-Sep-18 18:43 
AnswerRe: Why did you not use C++? Pin
Randy C Finch21-Sep-18 8:53
Randy C Finch21-Sep-18 8:53 
GeneralMy vote of 5 Pin
dmjm-h20-Sep-18 11:24
dmjm-h20-Sep-18 11:24 
Well-written, a good read.
GeneralMy vote of 5 Pin
Peter Adam20-Sep-18 9:43
professionalPeter Adam20-Sep-18 9:43 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.