Genetic programming is a branch of genetic algorithms. The
main difference between genetic programming and genetic algorithms is
the representation of the solution. Genetic programming creates computer
programs in the lisp or scheme computer languages as the solution.
Genetic algorithms create a string of numbers that represent the
solution. Genetic programming, uses four steps to solve problems: 1)
Generate an initial population of random compositions of the functions
and terminals of the problem (computer programs).
2) Execute
each program in the population and assign it a fitness
value according to how well it solves the problem.
3) Create a
new population of computer programs. i) Copy the best existing
programs ii) Create new computer programs by mutation. iii)
Create new computer programs by crossover(sexual
reproduction).
4) The best computer program that appeared in any
generation, the best-so-far solution, is designated as the result of
genetic programming [Koza
1992].
 Figure 1: genetic programming Flowchart.
Fitness
Function The most difficult and most important concept of
genetic programming is the fitness function. The fitness function
determines how well a program is able to solve the problem. It varies
greatly from one type of program to the next. For example, if one were
to create a genetic program to set the time of a clock, the fitness
function would simply be the amount of time that the clock is wrong.
Unfortunately, few problems have such an easy fitness function; most
cases require a slight modification of the problem in order to find
the fitness.
Gun Firing Program A more
complicated example consists of training a genetic program to fire a
gun to hit a moving target. The fitness function is the distance that
the bullet is off from the target. The program has to learn to take
into account a number of variables, such as wind velocity, type of gun
used, distance to the target, height of the target, velocity and
acceleration of the target. This problem represents the type of
problem for which genetic programs are best. It is a simple fitness
function with a large number of variables.
Water Sprinkler
System Consider a program to control the flow of water through
a system of water sprinklers. The fitness function is the correct
amount of water evenly distributed over the surface. Unfortunately,
there is no one variable encompassing this measurement. Thus, the
problem must be modified to find a numerical fitness. One possible
solution is placing water-collecting measuring devices at certain
intervals on the surface. The fitness could then be the standard
deviation in water level from all the measuring devices. Another
possible fitness measure could be the difference between the lowest
measured water level and the ideal amount of water; however, this
number would not account in any way the water marks at other measuring
devices, which may not be at the ideal mark.
Maze Solving
Program If one were to create a program to find the solution to
a maze, first, the program would have to be trained with several known
mazes. The ideal solution from the start to finish of the maze would
be described by a path of dots. The fitness in this case would be the
number of dots the program is able to find. In order to prevent the
program from wandering around the maze too long, a time limit is
implemented along with the fitness function.
Functions and Terminals The terminal
and function sets are also important components of genetic
programming. The terminal and function sets are the alphabet of the
programs to be made. The terminal set consists of the variables and
constants of the programs. In the maze example, the terminal set would
contain three commands: forward, right and left. The function set
consists of the functions of the program. In the maze example the
function set would contain: If "dot" then do x else do y. In the gun
firing program is the terminal set would be composed of the different
variables of the problem. Some of these variables could be the
velocities and accelerations of the gun, the bullet and target. The
functions are several mathematical functions, such as addition,
subtraction, division, multiplication and other more complex
functions.
Crossover
Operation Two primary operations exist for modifying structures
in genetic programming. The most important one is the crossover
operation. In the crossover operation, two solutions are sexually
combined to form two new solutions or offspring. The parents are
chosen from the population by a function of the fitness of the
solutions. Three methods exist for selecting the solutions for the
crossover operation.
The first method uses probability based on
the fitness of the solution. If is the fitness of the solution Si
and
- is the total sum of all the members of the population, then the
probability that the solution Si will be copied to the next generation
is [Koza 1992]:
Another method for selecting the solution to be copied is
tournament selection. Typically the genetic program chooses two
solutions random. The solution with the higher fitness will win. This
method simulates biological mating patterns in which, two members of
the same sex compete to mate with a third one of a different sex.
Finally, the third method is done by rank. In rank selection,
selection is based on the rank, (not the numerical value) of the
fitness values of the solutions of the population [Koza
1992].
The creation of the offsprings from the crossover
operation is accomplish by deleting the crossover fragment of the
first parent and then inserting the crossover fragment of the second
parent. The second offspring is produced in a symmetric manner. For
example consider the two S-expressions in Figure 2, written in a
modified scheme programming language and represented in a
tree.
Figure 2: Crossover operation for genetic
programming. The bold selections on both parents are swapped to create
the offspring or children. (The child on the right is the parse tree
representation for the quadratic equation.)
Figure 3: This figure illustrates
one of the main advantages of genetic
programming over genetic algorithms. In genetic
programming identical parents can yield different
offspring, while in genetic algorithms identical
parents would yield identical offspring. The bold
selections indicate the subtrees to be
swapped.
An important improvement that genetic programming displays
over genetic algorithms is its ability to create two new solutions
from the same solution. In the Figure 3 the same parent is used twice
to create two new children.
Mutation Mutation is another important
feature of genetic programming. Two types of mutations are possible.
In the first kind a function can only replace a function or a terminal
can only replace a terminal. In the second kind an entire subtree can
replace another subtree. Figure 4 explains the concept of
mutation:
Figure 4: Two different types of mutations. The top
parse tree is the original agent. The bottom left parse tree illustrates
a mutation of a single terminal (2) for another single terminal (a). It
also illustrates a mutation of a single function (-) for another single
function (+). The parse tree on the bottom right illustrates a the
replacement of a subtree by another subtree.
Summary Genetic programming is much more
powerful than genetic algorithms. The output of the genetic algorithm
is a quantity, while the output of the genetic programming is a
another computer program. In essence, this is the beginning of
computer programs that program themselves.
Genetic programming
works best for several types of problems. The first type is where
there is no ideal solution, (for example, a program that drives a
car). There is no one solution to driving a car. Some solutions drive
safely at the expense of time, while others drive fast at a high
safety risk. Therefore, driving a car consists of making compromises
of speed versus safety, as well as many other variables. In this case
genetic programming will find a solution that attempts to compromise
and be the most efficient solution from a large list of
variables.
Furthermore, genetic programming is useful in
finding solutions where the variables are constantly changing. In the
previous car example, the program will find one solution for a smooth
concrete highway, while it will find a totally different solution for
a rough unpaved road.
 REFERENCES
Koza,
John R. 1992. Genetic Programming: On the Programming of
Computers by Means of Natural Selection. Cambridge, MA: The MIT
Press.
Cramer, Nichael Lynn: "A Representation for the Adaptive
Generation of Simple Sequential Programs", Proceedings, International
Conference on Genetic Algorithms and their Applications, July 1985
[CMU], pp183-187.
|