Introduction
This article describes an approach that develops a Genetic Algorithm (GA) design in stages, using the Binary F6 function as a test problem. Once the GA has been developed against Binary F6, it can be applied and further refined within the target problem domain, allowing more domain-specific needs to be addressed. This approach was discussed in Davis (1991); see the references below.
The development and testing process involves creating a simple GA and improving it through the addition of new and enhanced GA features. Seven GAs are developed (GA-A to GA-G) for use with the Binary F6 function. Each algorithm is summarised below. The graph shows the relative performance of GA-A to GA-D with respect to the number of evaluations. These four GAs represent the development from a simple generational replacement algorithm to an elitist approach with duplicate handling. GA-E to GA-G are steady state (Davis, 1991) algorithms. These are discussed separately.
Binary F6
The image shows a plot of the Binary F6 function (Schaffer et al., 1989), with an input range of -100 to +100. It has many local minima and is ideal for testing a GA. The basic code for solving the Binary F6 is shown at the bottom of the article.
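For reference, the function can be expressed as a standalone method. The fitness used throughout this article is 1 minus the F6 value, so the global optimum at the origin yields a fitness of 1.0. This is a minimal sketch matching the evaluation in the full listing at the bottom of the article:

```csharp
using System;

public static class F6
{
    // Schaffer's Binary F6 function; the GA maximises 1 - F6, so the
    // global optimum at x = 0, y = 0 yields a fitness of 1.0.
    public static double Fitness(double x, double y)
    {
        var temp1 = Math.Sin(Math.Sqrt(x * x + y * y));
        var temp2 = 1 + 0.001 * (x * x + y * y);
        var result = 0.5 + (temp1 * temp1 - 0.5) / (temp2 * temp2);
        return 1 - result;
    }
}
```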
Fitness and Terminate Functions
Before we start creating the GA examples, we need some suitable Fitness and Terminate functions. It is not necessary to understand these functions in detail; we can simply make use of them. They can be found in the basic GA code at the bottom of the page.
GA Types Under Test
GA | Type | Operators | Description |
---|---|---|---|
A | Generational Replacement | Crossover (0.65), Mutation (0.008) | Starting point for the GA development process. |
B | Generational Replacement | Crossover (0.65), Mutation (0.008) | Same as A except that Linear Normalisation was applied. |
C | Elitist | Crossover (0.65), Mutation (0.008) | Same as B except that an Elitist approach was adopted (5%). |
D | Elitist without duplicates | Crossover (0.65), Mutation (0.008) | Same as C except that duplicates were prevented from entering the population. |
E | Steady State | Crossover (0.65), Mutation (0.008) | Steady State where only two children were added to the population at each generation. |
F | Steady State without duplicates | Crossover (0.65), Mutation (0.008) | Same as E except that duplicates were prevented from entering the population. |
G | Steady State without duplicates | Crossover (0.8), Mutation (0.04) | Same as F except that the crossover and mutation parameters were modified. |
Generational Replacement (GA-A)
The first algorithm, GA-A, is the simplest of the algorithm designs tested in this research. It uses the Single Point Crossover and Binary Mutation genetic operators. Each generation is subjected to the genetic operators to create a new generation, and this new generation replaces the current one. The Roulette Wheel parent selection method was chosen for these algorithms. Roulette Wheel selection provides a means by which the probability of a solution being selected as a parent is related to the fitness value of that solution. Therefore, the fitter an individual is, the more likely it is to be selected as a parent.
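Conceptually, Roulette Wheel selection can be sketched as follows. This is a minimal illustration of the idea, not the GAF's internal implementation:

```csharp
using System;
using System.Linq;

public static class Roulette
{
    // Returns the index of the selected individual; the chance of
    // selection is proportional to the individual's share of the
    // total fitness. A minimal sketch, not the GAF implementation.
    public static int Select(double[] fitness, Random rng)
    {
        var total = fitness.Sum();
        var spin = rng.NextDouble() * total;
        var cumulative = 0.0;

        for (var i = 0; i < fitness.Length; i++)
        {
            cumulative += fitness[i];
            if (spin <= cumulative)
                return i;
        }

        return fitness.Length - 1; // guard against rounding error
    }
}
```

Over many selections, an individual holding 90% of the total fitness will be chosen roughly nine times as often as one holding 10%.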
The advantage of this type of algorithm is that it is simple to implement. However, the disadvantage is that, as each generation is replaced, any extremely fit individuals from one generation may not pass through to the next without being modified. Although a probability factor determines the degree to which the operators are applied, there is no mechanism to protect the best individuals within the population.
Linear Normalisation (GA-B)
GA-B is similar to GA-A except that it introduces linear normalisation. Linear normalisation, as implemented within the GAF, simply orders the solutions by decreasing evaluation and creates a fitness based on this ordering. For example, in a population of 100, the fittest solution would receive a fitness of 1.0 and each subsequent solution would decrease linearly in value, down to the worst solution with a fitness of 0. The starting point and decrement parameters of this approach are fixed within the GAF.
To add linear normalisation, simply change the useLinearlyNormalisedFitness parameter to true in the constructor of the Population object.
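For example, assuming the same constructor arguments as the listing at the bottom of the article, where the fourth positional parameter corresponds to useLinearlyNormalisedFitness:

```csharp
// fourth parameter enables linear normalisation
var population = new Population(100, 44, false, true);
```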
Linearly normalising fitness in this way helps maintain natural selection. Natural selection occurs when the fittest individuals are selected as parents for the next generation in proportion to their fitness. However, in some problem domains, where the difference between the fitness value of the fittest and the other solutions is very small, each has a similar probability of being selected as a parent. By normalising the fitness of the population between a minimum and maximum value, natural selection can be improved.
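The rank-based scheme described above can be sketched as follows. This is an illustration of the idea only; the GAF's starting point and decrement are fixed internally and may differ:

```csharp
using System;
using System.Linq;

public static class Normaliser
{
    // Rank-based fitness: the best evaluation receives 1.0 and each
    // subsequent rank decreases linearly, down to 0.0 for the worst.
    // A sketch of the idea, not the GAF's internal implementation.
    public static double[] LinearlyNormalise(double[] evaluations)
    {
        var n = evaluations.Length;

        // indices ordered from best evaluation to worst
        var order = Enumerable.Range(0, n)
            .OrderByDescending(i => evaluations[i])
            .ToArray();

        var fitness = new double[n];
        for (var rank = 0; rank < n; rank++)
            fitness[order[rank]] = (double)(n - 1 - rank) / (n - 1);

        return fitness;
    }
}
```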
Elitism (GA-C)
GA-C expands upon GA-A and GA-B in that it introduces Elitism. Elitism allows the best solutions to pass through to the next generation without being modified. GA-C used a percentage value, 5%, which, with a population of 100, means that the top 5 individuals form part of the next generation. This means that any future generation is at least as good as the current generation.
GA-C helps protect good individuals. However, it can be shown that as the percentage of Elites is increased, the number of duplicates within the population increases. This is normal behaviour for an approach that does not explicitly handle duplicates and is simply due to the convergence of the algorithm (Davis, 1991). However, the GAF uses a globally unique identifier (GUID) for each chromosome; the GUID can be used to trace the origin of a chromosome and, in this case, highlights a phenomenon other than convergence. The following output represents the first 15 chromosomes from a run using GA-C. It can be seen that there are many cases of the same chromosome existing more than once. For example, chromosomes 1 and 2 represent the same solution but are derived from different chromosomes, as their GUIDs differ; this is due to convergence. However, chromosomes 3 and 9 are the same chromosome; these have been duplicated due to the combination of Elitism and the crossover probability settings.
# | GUID | Chromosome |
---|---|---|
1 | ccd5ba05-f0a6-43c9-a8b8-78f206dd6341 | 01111110010000110001010111110001100000110011 |
2 | d16cfab2-797f-4821-a967-ee441a9ea33a | 01111110010000110001010111110001100000110011 |
3 | 41b88bd6-c872-4c8e-ae0b-2944fa421fdc | 01111110010000110001010111110001100000110011 |
4 | bf9bbb00-da6d-40e9-9acd-ad1eb5df7d8d | 01111110010000110001010111110001100000110011 |
5 | 4c9bc696-824f-471b-ad36-1584dfdcddee | 01111110010000110001010111110001100000110011 |
6 | 227ec944-b8d7-4020-a046-a9f49293d9c8 | 01111110010000110001010111110001100000110011 |
7 | 948ba7b9-964f-46f6-8704-7c05786acffe | 01111110010000110001010111110001100000110011 |
8 | 7b390c2c-9ab4-4ef8-bbe0-fa0b5992fb09 | 01111110010000110001010111110001100000110011 |
9 | 41b88bd6-c872-4c8e-ae0b-2944fa421fdc | 01110110010000110001010111110001100000110011 |
10 | bf9bbb00-da6d-40e9-9acd-ad1eb5df7d8d | 01111110010000010001010111110001000000110011 |
11 | 4c9bc696-824f-471b-ad36-1584dfdcddee | 01111110010000110001010111100001100000110001 |
12 | 227ec944-b8d7-4020-a046-a9f49293d9c8 | 01111110010000110001110111110001100000110011 |
13 | 948ba7b9-964f-46f6-8704-7c05786acffe | 01111110010000110001010111111001110000110011 |
14 | 1d24c8d6-b6c7-4b6c-bb17-f9da544499f4 | 01111110010000010000010111010001100010111011 |
15 | 7b390c2c-9ab4-4ef8-bbe0-fa0b5992fb09 | 11111110010000100001010111110001110000110011 |
Further investigation shows that duplicates can occur due to the combination of the Elite and Crossover genetic operators. The Elite implementation simply takes the top percentage of solutions, based on fitness, and adds them to the next generation’s population without modification. The remaining solutions are generated from parents selected using Roulette Wheel selection. Roulette Wheel selection favours the better solutions as parents, so in many cases the parents selected were those already identified as Elite. In cases where the probability of crossover was not satisfied, the selected parents were allowed to pass into the next generation’s population without modification. This process allows Elites to be added to the next generation’s population more than once.
Duplicate Handling (GA-D)
Modifying GA-C to prevent duplicates from being added to the population is simply a case of changing the constructor parameter for the appropriate operators.
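For example, assuming the boolean argument passed to the operator constructors (shown as true in the listing at the bottom of the article) controls whether duplicates are allowed into the population:

```csharp
// pass false to prevent duplicates entering the population
var crossover = new Crossover(crossoverProbability, false)
{
    CrossoverType = CrossoverType.SinglePoint
};
var mutation = new BinaryMutate(mutationProbability, false);
```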
Steady State Algorithm (GA-E)
The algorithm GA-E is what Davis (1991) describes as a Steady State Genetic Algorithm. This type of algorithm maintains the population across generations, replacing only one or two individuals at each generation, where the new children are considered fitter than the worst of those in the population. This approach has been shown to work better when combined with techniques that prevent duplicates being added to the population. However, in the experiments carried out during this project, this could not be determined.
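The steady-state replacement step can be sketched as follows. This is a simplified illustration of the idea, not the GAF implementation:

```csharp
using System.Collections.Generic;

public static class SteadyState
{
    // Insert a child into the population only if it is fitter than
    // the current worst member, which it then replaces. A simplified
    // sketch of the steady-state replacement step.
    public static bool TryReplaceWorst(List<double> populationFitness, double childFitness)
    {
        // find the index of the least fit member
        var worstIndex = 0;
        for (var i = 1; i < populationFitness.Count; i++)
            if (populationFitness[i] < populationFitness[worstIndex])
                worstIndex = i;

        if (childFitness <= populationFitness[worstIndex])
            return false; // child is no better than the worst; discard it

        populationFitness[worstIndex] = childFitness;
        return true;
    }
}
```

Because only one or two children are produced per generation, the rest of the population passes through each generation untouched.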
Steady State Algorithm without Duplicates (GA-F)
GA-F is the same as GA-E but with duplicates prevented from entering the population. Modifying GA-E to prevent duplicates from being added to the population is simply a case of changing the constructor parameter for the appropriate operators.
Steady State Algorithm without Duplicates (GA-G)
In the Steady State configuration, only a few children are added at each generation; the majority of the population is therefore protected from crossover and mutation, which means that the crossover and mutation probability values can be safely increased. GA-G is the same as GA-F except that the crossover and mutation probability settings are significantly increased. The results show the performance of these algorithms when used with the Binary F6 function.
Noisy Environments
The Binary F6 evaluation is not noisy; that is, the fitness value would not change if the solution were re-evaluated. In situations such as these, the evaluation process can be optimised so that solutions are not re-evaluated at each generation. The GAF handles this type of optimisation internally; as a result, adding only two children at each generation requires only two evaluations per generation. Naturally, this approach uses many more generations than the other algorithms described.
The disadvantage of this approach appears with noisy evaluations, for example where a re-evaluation of a solution would return a different result from the previous evaluation. In this case the optimisation described above cannot be applied, and each generation needs to be fully evaluated. Due to the large number of generations involved, this reduces performance compared to algorithms GA-A to GA-D, described above.
For noisy environments, each solution should be evaluated at each generation. This can be achieved by setting the reEvaluateAll parameter to true in the constructor of the Population object.
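Assuming the same constructor arguments as the listing at the bottom of the article, where the third positional parameter corresponds to reEvaluateAll:

```csharp
// third parameter forces a full re-evaluation at each generation
var population = new Population(100, 44, true, false);
```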
- Davis, L. (1991) Handbook of Genetic Algorithms. New York: Van Nostrand Reinhold
- Schaffer, J. D., Caruana, R. A., Eshelman, L. J. and Das, R. (1989) A study of control parameters affecting online performance of genetic algorithms for function optimization. In: Davis, L. (1991) Handbook of Genetic Algorithms. New York: Van Nostrand Reinhold
This code and the supporting project is available from BitBucket.
<code>using System;
using GAF;
using GAF.Operators;

namespace BinaryF6
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            const double crossoverProbability = 0.85;
            const double mutationProbability = 0.08;
            const int elitismPercentage = 5;

            // 100 chromosomes of 44 bits; reEvaluateAll and
            // useLinearlyNormalisedFitness are both disabled
            var population = new Population(100, 44, false, false);

            // create the genetic operators
            var elite = new Elite(elitismPercentage);

            var crossover = new Crossover(crossoverProbability, true)
            {
                CrossoverType = CrossoverType.SinglePoint
            };

            var mutation = new BinaryMutate(mutationProbability, true);

            // create the GA itself
            var ga = new GeneticAlgorithm(population, EvaluateFitness);

            // subscribe to the generation complete event
            ga.OnGenerationComplete += ga_OnGenerationComplete;

            // add the operators to the GA process pipeline
            ga.Operators.Add(elite);
            ga.Operators.Add(crossover);
            ga.Operators.Add(mutation);

            // run the GA
            ga.Run(TerminateAlgorithm);
        }

        public static double EvaluateFitness(Chromosome chromosome)
        {
            double fitnessValue = -1;
            if (chromosome != null)
            {
                // range constant used to map each 22-bit half
                // of the chromosome to the range -100 to +100
                var rangeConst = 200 / (System.Math.Pow(2, chromosome.Count / 2) - 1);

                // decode x and y from the two halves of the chromosome
                var x1 = Convert.ToInt32(chromosome.ToBinaryString(0, chromosome.Count / 2), 2);
                var y1 = Convert.ToInt32(chromosome.ToBinaryString(chromosome.Count / 2, chromosome.Count / 2), 2);

                var x = (x1 * rangeConst) - 100;
                var y = (y1 * rangeConst) - 100;

                // the F6 function as defined by Schaffer et al. (1989)
                var temp1 = System.Math.Sin(System.Math.Sqrt(x * x + y * y));
                var temp2 = 1 + 0.001 * (x * x + y * y);
                var result = 0.5 + (temp1 * temp1 - 0.5) / (temp2 * temp2);

                // invert so that the GA maximises fitness
                fitnessValue = 1 - result;
            }
            else
            {
                throw new ArgumentNullException("chromosome", "The specified Chromosome is null.");
            }

            return fitnessValue;
        }

        public static bool TerminateAlgorithm(Population population, int currentGeneration, long currentEvaluation)
        {
            return currentGeneration > 1000;
        }

        private static void ga_OnGenerationComplete(object sender, GaEventArgs e)
        {
            // decode the best solution for display
            var chromosome = e.Population.GetTop(1)[0];

            var rangeConst = 200 / (System.Math.Pow(2, chromosome.Count / 2) - 1);
            var x1 = Convert.ToInt32(chromosome.ToBinaryString(0, chromosome.Count / 2), 2);
            var y1 = Convert.ToInt32(chromosome.ToBinaryString(chromosome.Count / 2, chromosome.Count / 2), 2);
            var x = (x1 * rangeConst) - 100;
            var y = (y1 * rangeConst) - 100;

            Console.WriteLine("x:{0} y:{1} Fitness:{2}", x, y, e.Population.MaximumFitness);
        }
    }
}
</code>