## Introduction

This article describes how to solve a logic problem using a Genetic Algorithm. It assumes no prior knowledge of GAs. In fact, half of this article is dedicated to explaining the internal structure of a Genetic Algorithm.

### So what is the problem domain we are trying to solve?

Well, GAs can be used to solve many problems. They have been used to grow new mathematical syntax trees and to train multi-layer neural networks, to name but two instances.

However, for this example, I have used a simple card-splitting exercise, which is detailed here:

- You have 10 cards numbered 1 to 10
- You have to divide them into two piles so that:
  - The sum of the first pile is as close as possible to 36.
  - And the product of all in the second pile is as close as possible to 360.

Now, I am not saying that this could **not** be done by hand, using old-fashioned brain juice. It is just better suited to a GA, since ten cards can be split into two piles in 2^10 = 1,024 different ways, and trying them one by one is tedious. That is still manageable for this simple problem, but the number of combinations grows very quickly for a more difficult one. Suffice to say, it is just good fun to do it with a GA. So, let's carry on.
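To get a feel for the size of the search space, here is a minimal brute-force sketch that simply tries every split. It is not part of the demo project: the class `CardSplitBruteForce` and the method `FindExactSplits` are hypothetical names made up for this illustration, and the bit-mask encoding is an assumption of the sketch (bit `i` set means card `i + 1` goes into the product pile).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of the demo project.
public static class CardSplitBruteForce
{
    // Tries every way of splitting cards 1..cards into two piles and
    // returns the splits whose sum pile totals sumTarget and whose
    // product pile multiplies to prodTarget.
    // Each split is a bit mask: bit i set => card (i + 1) is in the product pile.
    public static List<int> FindExactSplits(int cards, int sumTarget, long prodTarget)
    {
        List<int> hits = new List<int>();
        for (int mask = 0; mask < (1 << cards); mask++)
        {
            int sum = 0;
            long prod = 1;
            for (int i = 0; i < cards; i++)
            {
                if ((mask & (1 << i)) != 0)
                {
                    prod *= (i + 1);   // card goes to the product pile
                }
                else
                {
                    sum += (i + 1);    // card goes to the sum pile
                }
            }
            if (sum == sumTarget && prod == prodTarget)
            {
                hits.Add(mask);
            }
        }
        return hits;
    }
}
```

Calling `CardSplitBruteForce.FindExactSplits(10, 36, 360)` walks all 1,024 masks. Exhaustive search like this is fine for ten cards, but the loop doubles in length with every card added, which is where a GA starts to earn its keep.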

### So what is a Genetic Algorithm?

Well, Wikipedia says this:

A genetic algorithm is a search technique used in computing, to find true or approximate solutions to optimization and search problems, and is often abbreviated as GA. Genetic algorithms are categorized as global search heuristics. Genetic algorithms are a particular class of evolutionary algorithms that use techniques inspired by evolutionary biology such as inheritance, mutation, selection, and crossover (also called recombination).

Genetic algorithms are implemented as a computer simulation in which a population of abstract representations (called chromosomes or the genotype or the genome) of candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem evolves towards better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. The evolution usually starts from a population of randomly generated individuals, and happens in generations. In each generation, the fitness of every individual in the population is evaluated, multiple individuals are stochastically selected from the current population (based on their fitness), and modified (recombined and possibly mutated) to form a new population. The new population is then used in the next iteration of the algorithm.

Follow that? If not, let's try a diagram. (**Note** that this is a Microbial GA. There are lots of GA types, but I just happen to like this one, and it is the one this article uses.)

I prefer to think of a GA as a way of quickly (well, maybe quite slowly, depending on the problem) trying out some of the evolutionary techniques that Mother Nature has always had.

### So how does this translate into an algorithm (this article uses a Microbial GA, but there are many other varieties)?

The basic operation of the Microbial GA training is as follows:

- Pick two genotypes at random
- Compare Scores (Fitness) to come up with a Winner and Loser
- Go along the genotype and, at each locus (point):
  - With some probability (randomness), copy from Winner to Loser (overwrite)
  - With some probability (randomness), mutate that locus of the Loser

So **only** the Loser gets changed, which gives a version of elitism for free: the best of the breed always remains in the population.

That's it. That is the complete algorithm.
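The steps above can be sketched as a single tournament. This is a minimal illustration under stated assumptions, not the demo project's code: the class `MicrobialStep`, the method `Tournament`, and the `error` delegate are hypothetical names, genes are assumed to be binary (0 or 1), and a lower error is assumed to mean a fitter member.

```csharp
using System;

// Hypothetical sketch of one Microbial GA tournament (illustration only).
public static class MicrobialStep
{
    // Runs one tournament between population members a and b.
    // error : scores a member by index; lower is better (assumption).
    // rec   : probability of copying a locus from Winner to Loser.
    // mut   : probability of flipping a locus of the Loser.
    // Returns the index of the Loser, the only member that was changed.
    public static int Tournament(int[,] gene, int a, int b,
                                 Func<int, double> error,
                                 double rec, double mut, Random rnd)
    {
        // Compare scores to come up with a Winner and a Loser.
        int winner = error(a) < error(b) ? a : b;
        int loser = (winner == a) ? b : a;

        // Go along the genotype, locus by locus.
        int len = gene.GetLength(1);
        for (int i = 0; i < len; i++)
        {
            // Recombination: overwrite the Loser's locus with the Winner's.
            if (rnd.NextDouble() < rec)
            {
                gene[loser, i] = gene[winner, i];
            }
            // Mutation: flip the Loser's (binary) locus.
            if (rnd.NextDouble() < mut)
            {
                gene[loser, i] = 1 - gene[loser, i];
            }
        }
        return loser;
    }
}
```

Because the Winner's row is only ever read, never written, the best member found so far cannot be lost by this step. That is the "elitism for free" mentioned above.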

But there are some **essential** issues to be aware of, when playing with GAs:

- The genotype **will** be different for a different problem domain
- The Fitness function **will** be different for a different problem domain

These two items **must** be developed again, whenever a new problem is specified.

For example, if we wanted to find a person's favourite pizza toppings, the genotype and fitness would be different from that which is used for this article's problem domain.

These two essential elements of a GA (for this article's problem domain) are specified below.

### 1. The Genotype

```csharp
//the genes array, 30 members, 10 cards each
private int[,] gene = new int[30, 10];
```

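Well, for this article, the problem domain states that we have 10 cards. So, I created a two-dimensional genes array of size 30 × 10. The 30 is the population size, which I simply picked. It could be any size, but should be big enough to allow some dominant genes to form. Each row is one population member, and each of its 10 genes says which pile the corresponding card belongs to: 0 for the sum pile, 1 for the product pile.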
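To make the encoding concrete, here is a small sketch that decodes one member's genes back into card numbers. The class `GenotypeDecoder` and the method `CardsInPile` are hypothetical names introduced for this example. It assumes, as the fitness function in the next section does, that gene `i` describes card `i + 1`, with 0 meaning the sum pile and 1 meaning the product pile.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; the demo project has no such class.
public static class GenotypeDecoder
{
    // Lists the card numbers whose gene equals the given pile (0 or 1).
    // Gene i describes card (i + 1).
    public static string CardsInPile(int[] genes, int pile)
    {
        List<string> cards = new List<string>();
        for (int i = 0; i < genes.Length; i++)
        {
            if (genes[i] == pile)
            {
                cards.Add((i + 1).ToString());
            }
        }
        return string.Join(", ", cards.ToArray());
    }
}
```

For the genes `{ 1, 0, 1, 1, 1, 1, 0, 0, 0, 0 }`, `CardsInPile(genes, 0)` returns `"2, 7, 8, 9, 10"` and `CardsInPile(genes, 1)` returns `"1, 3, 4, 5, 6"`.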

### 2. The Fitness Function

Remembering that the problem domain description stated the following:

- You have 10 cards numbered 1 to 10
- You have to divide them into two piles so that:
  - The sum of the first pile is as close as possible to 36.
  - And the product of all in the second pile is as close as possible to 360.

Well, all that is being done is the following:

- Loop through the population member's genes
- If the current gene has a value of 0, the card belongs to the sum pile (pile 0), so add its number to the running sum
- If the current gene has a value of 1, the card belongs to the product pile (pile 1), so multiply its number into the running product
- Calculate the overall error for this population member. If this member's genotype has an overall error of 0.0, then the problem domain has been solved

```csharp
//evaluate the nth member of the population
//@param n : the nth member of the population
//@return : the score for this member of the population.
//If score is 0.0, then we have a good GA which has solved
//the problem domain
private double evaluate(int n)
{
    //initialise field values
    int sum = 0, prod = 1;
    double scaled_sum_error, scaled_prod_error, combined_error;

    //loop through all genes for this population member
    for (int i = 0; i < LEN; i++)
    {
        //if the gene value is 0, then put it in the sum (pile 0),
        //and calculate sum
        if (gene[n,i] == 0)
        {
            sum += (1 + i);
        }
        //if the gene value is 1, then put it in the product (pile 1),
        //and calculate product
        else
        {
            prod *= (1 + i);
        }
    }

    //work out how good this population member is, based on an overall error
    //for the problem domain
    //NOTE : The fitness function will change for every problem domain.
    scaled_sum_error = (sum - SUMTARG) / SUMTARG;
    scaled_prod_error = (prod - PRODTARG) / PRODTARG;
    combined_error = Math.Abs(scaled_sum_error) + Math.Abs(scaled_prod_error);

    return combined_error;
}
```
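Written as a formula, the score computed by `evaluate` looks like this. The symbols are my own shorthand and do not appear in the code: $S$ is the sum of the cards in pile 0, $P$ is the product of the cards in pile 1, and $E$ is the combined error. The worked lines cover two easy cases: every card in the sum pile, and the solution checked in the results section.

```latex
E = \left|\frac{S - 36}{36}\right| + \left|\frac{P - 360}{360}\right|

% All ten genes are 0: every card is in the sum pile, the product pile is empty.
S = 1 + 2 + \dots + 10 = 55, \qquad P = 1
E = \frac{19}{36} + \frac{359}{360} = \frac{549}{360} = 1.525

% Sum pile {2, 7, 8, 9, 10}, product pile {1, 3, 4, 5, 6}.
S = 36, \qquad P = 360, \qquad E = 0
```

Dividing each error by its target keeps the two terms on a comparable scale. Without that, the product term, which can reach 10! = 3,628,800, would swamp the sum term.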

## Using the code

The demo project attached actually contains a Visual Studio 2005 solution, with the following two classes.

#### Program class

This is the main entry point into the Simple_GeneticAlgorithm application. All this class does is create a new `Simple_GeneticAlgorithm` object and call its `run()` method.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

namespace Simple_GeneticAlgorithm
{
    class Program
    {
        //main access point
        static void Main(string[] args)
        {
            //create a new Microbial GA
            Simple_GeneticAlgorithm GA = new Simple_GeneticAlgorithm();
            GA.run();
            //read a line, to stop the Console window closing
            Console.ReadLine();
        }
    }
}
```

#### Simple_GeneticAlgorithm class

Runs the GA to solve the problem domain.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

namespace Simple_GeneticAlgorithm
{
    public class Simple_GeneticAlgorithm
    {
        //population size
        private int POP = 30;
        //genotype length (one gene per card)
        private int LEN = 10;
        //mutation rate, change it, have a play
        private double MUT = 0.1;
        //recombination rate
        private double REC = 0.5;
        //how many tournaments should be played
        private double END = 1000;
        //the sum pile, end result for the SUM pile
        //card1 + card2 + card3 + card4 + card5, MUST = 36 for a good GA
        private double SUMTARG = 36;
        //the product pile, end result for the PRODUCT pile
        //card1 * card2 * card3 * card4 * card5, MUST = 360 for a good GA
        private double PRODTARG = 360;
        //the genes array, 30 members, 10 cards each
        private int[,] gene = new int[30, 10];
        //used to create randomness (simulates selection process in nature)
        //randomly selects genes
        Random rnd = new Random();

        //empty constructor
        public Simple_GeneticAlgorithm()
        {
        }

        //Runs the Microbial GA to solve the problem domain
        //Where the problem domain is specified as follows
        //
        //You have 10 cards numbered 1 to 10.
        //You have to divide them into 2 piles so that:
        //
        //The sum of the first pile is as close as possible to 36
        //And the product of all in second pile is as close as poss to 360
        public void run()
        {
            //declare pop member a,b, winner and loser
            int a, b, Winner, Loser;

            //initialise the population (randomly)
            init_pop();

            //start a tournament
            for (int tournamentNo = 0; tournamentNo < END; tournamentNo++)
            {
                //pull 2 population members at random
                a = (int)(POP * rnd.NextDouble());
                b = (int)(POP * rnd.NextDouble());

                //have a fight, see who has best genes
                if (evaluate(a) < evaluate(b))
                {
                    Winner = a;
                    Loser = b;
                }
                else
                {
                    Winner = b;
                    Loser = a;
                }

                //Possibly do some gene jiggling, on all genes of loser
                //again depends on randomness (simulating the
                //natural selection
                //process of evolutionary selection)
                for (int i = 0; i < LEN; i++)
                {
                    //maybe do some recombination
                    if (rnd.NextDouble() < REC)
                        gene[Loser, i] = gene[Winner, i];

                    //maybe do some mutation
                    if (rnd.NextDouble() < MUT)
                        gene[Loser, i] = 1 - gene[Loser, i];

                    //then test to see if the new population member
                    //is a winner
                    if (evaluate(Loser) == 0.0)
                        display(tournamentNo, Loser);
                }
            }
        }

        //Display the results. Only called for good GA which has solved
        //the problem domain
        //@param tournaments : the current tournament loop number
        //@param n : the nth member of the population.
        private void display(int tournaments, int n)
        {
            Console.WriteLine("\r\n==============================\r\n");
            Console.WriteLine("After " + tournaments +
                " tournaments, Solution sum pile " +
                "(should be 36) cards are : ");
            for (int i = 0; i < LEN; i++)
            {
                if (gene[n,i] == 0)
                {
                    Console.WriteLine(i + 1);
                }
            }
            Console.WriteLine("\r\nAnd Product pile " +
                "(should be 360) cards are : ");
            for (int i = 0; i < LEN; i++)
            {
                if (gene[n,i] == 1)
                {
                    Console.WriteLine(i + 1);
                }
            }
        }

        //evaluate the nth member of the population
        //@param n : the nth member of the population
        //@return : the score for this member of the population.
        //If score is 0.0, then we have a good GA which has solved
        //the problem domain
        private double evaluate(int n)
        {
            //initialise field values
            int sum = 0, prod = 1;
            double scaled_sum_error, scaled_prod_error, combined_error;

            //loop through all genes for this population member
            for (int i = 0; i < LEN; i++)
            {
                //if the gene value is 0, then put it in
                //the sum (pile 0), and calculate sum
                if (gene[n,i] == 0)
                {
                    sum += (1 + i);
                }
                //if the gene value is 1, then put it in
                //the product (pile 1), and calculate product
                else
                {
                    prod *= (1 + i);
                }
            }

            //work out how good this population member is,
            //based on an overall error
            //for the problem domain
            //NOTE : The fitness function will change
            //       for every problem domain.
            scaled_sum_error = (sum - SUMTARG) / SUMTARG;
            scaled_prod_error = (prod - PRODTARG) / PRODTARG;
            combined_error = Math.Abs(scaled_sum_error) +
                Math.Abs(scaled_prod_error);

            return combined_error;
        }

        //initialise population
        private void init_pop()
        {
            //for entire population
            for (int i = 0; i < POP; i++)
            {
                //for all genes
                for (int j = 0; j < LEN; j++)
                {
                    //randomly create gene values
                    if (rnd.NextDouble() < 0.5)
                    {
                        gene[i,j] = 0;
                    }
                    else
                    {
                        gene[i,j] = 1;
                    }
                }
            }
        }
    }
}
```

### The results

Taking the last good population member results found, let's test it out.

- 2 + 7 + 8 + 9 + 10 = 36 in Pile 0, this is all good
- 1 * 3 * 4 * 5 * 6 = 360 in Pile 1, this is all good

## Points of Interest

I hope this article has demonstrated how to write a simple GA to solve a problem that we as humans would probably find hard to do manually. Remember, this is a simple problem. What would happen if we scaled the problem domain up? A GA really is the way to go.

I will very shortly publish an article on a GA training a multi-layer neural network to solve some logic problems. So if you're into this sort of stuff, watch this space.

## History

- v1.0 - 08/11/06.