# Genetic Algorithms (GA)

• GAs can help with problems where finding a solution by other means would take too long
• GA techniques use the principles of evolution to search for solutions to algorithmic problems
• The process in nature works as follows:

• Species need to be able to reproduce

• Reproduction is simply the execution of the encoded rules necessary to build an organism
• These rules are stored in strings of DNA called chromosomes
• Chromosomes are built up from small modular sequences called genes

• Genes are various permutations of the four nucleotide bases (T, A, C, G); each gene holds information about the settings, or alleles, for a particular trait
• When two parents reproduce, their DNA is split and the child’s DNA comes half from each of them. This is called crossover or genetic recombination
• The problem of bad traits and good traits

• The quality measure of any one creature’s mix of traits is called fitness
• When a gene within a child organism changes so that it is completely new and cannot be traced to either parent, this is called mutation
• The process can be thought of as an "evolutionary search": we still search across the field of all possible solutions to a given game problem, but the search method imitates what happens in evolution
• The method is split into two parts: evolving a solution and using a solution
• Evolving a solution is usually performed during production of the game, although some games (Creatures series, Black & White) include GA learning in the shipped game in a limited way
• GAs are computationally expensive
• GAs belong to the class of stochastic search methods (discussed in the appendix)
• Problem of local maxima versus the global maximum
• GAs separate the algorithm from the problem representation, which lets them find solutions in systems of mixed variable types (discrete and continuous variables)
• The common data structure used is a string, but any data structure can be used if:

• Each individual can encode a complete solution
• Genetic operations can be constructed for the data structure
• Basic Genetic Method:

• Initialize:

• Initialize a starting population of individuals
• These individuals can be generated randomly or with promising values from previous knowledge
• Size of the population depends on: available resources, how much time can be devoted, and what seems to work
• Evaluation:

• Evaluate each individual’s success within the problem space
• Each individual is evaluated using a fitness function that returns a value indicating the overall performance of this individual
• A fitness test that requires each individual to play for 5 minutes (so that it gets a chance to act), given a population of 100, would thus require 8.33 hours per generation
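The 8.33-hour figure is just population size times test length. A minimal sketch of that arithmetic follows; the function name and the `parallel_workers` parameter are illustrative assumptions (the notes only state the serial case).

```python
import math


def generation_hours(population_size, minutes_per_individual, parallel_workers=1):
    """Wall-clock hours needed to evaluate one generation.

    parallel_workers is an assumption added for illustration: it models
    running several fitness tests side by side.
    """
    # Each worker evaluates its share of the population one after another.
    batches = math.ceil(population_size / parallel_workers)
    return batches * minutes_per_individual / 60.0


print(f"{generation_hours(100, 5):.2f}")  # 8.33
```

Under these assumptions, four workers would cut the same generation to roughly 2.1 hours, which is one reason the cost of the fitness test tends to dominate the design.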
• Generation:

• Generate new individuals using reproduction
• Selection is another important part of the genetic process
• If you always select the best individuals, the search can converge on a local maximum; selecting with some randomness keeps diversity and can lead to better solutions
• There are three common methods to generate children:

• Crossover (sexual reproduction)
• Mutation (genetic variation)
• Elitism: taking the fittest individuals from the last generation and carrying them into the next
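The initialize / evaluate / generate loop above can be sketched as follows. This is a minimal illustration, not the method from the book: the bit-string genome, the toy `count_ones` fitness, the 2-way tournament, and all parameter values are assumptions chosen to keep the example short.

```python
import random


def evolve(fitness, genome_length=20, population_size=30, generations=40,
           crossover_rate=0.7, mutation_rate=0.02, elite_count=1, seed=0):
    rng = random.Random(seed)
    # Initialize: a random starting population of bit strings.
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Evaluation: rank individuals by fitness, best first.
        ranked = sorted(population, key=fitness, reverse=True)
        # Generation, part 1: elitism copies the best individuals unchanged.
        next_generation = [genome[:] for genome in ranked[:elite_count]]
        while len(next_generation) < population_size:
            # Selection: 2-way tournament (one of several possible schemes).
            parent_a = max(rng.sample(population, 2), key=fitness)
            parent_b = max(rng.sample(population, 2), key=fitness)
            child = parent_a[:]
            # Generation, part 2: single-point crossover with some probability.
            if rng.random() < crossover_rate:
                point = rng.randint(1, genome_length - 1)
                child = parent_a[:point] + parent_b[point:]
            # Generation, part 3: mutation flips each bit with a small chance.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_generation.append(child)
        population = next_generation
    return max(population, key=fitness)


def count_ones(genome):
    """Toy fitness function (an assumption): number of 1 bits."""
    return sum(genome)


best = evolve(count_ones)
```

Because the elite is carried over untouched, the best fitness in the population never decreases from one generation to the next.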
• Representing the Problem:

• Gene and Genome:

• Determine the structure of the genes in your problem: which traits are included, what their alleles are (on/off, real value), what their ranges are, and whether any of these traits depend on each other
• This part of the representation is the most important
• Advantages (many good solutions) and disadvantages (longer search, unwanted solutions) of a big search space for GA (p. 447)
• Example: Gene and Genome of Pac-Man player (447)
• GAs are good for problems where you can’t formulate a good way of determining solutions
• Fitness Function:

• Pac-Man fitness example (499)
• Your fitness function is really the essence of what your GA is trying to optimize, so take care in its design
• Fitness data should be scaled to prevent premature convergence (PMC) and stagnation

• PMC occurs when exceptional individuals appear in early generations and take over the population, so that later offspring are no better than their parents
• Stagnation occurs when many individuals have similarly high fitness values, so selection can no longer tell them apart
• Some ways of scaling the data:

• Sigma Truncation:

• $F' = F - (\hat{F} - c \cdot \sigma)$, where:

• F’ is the scaled fitness value and F is the raw fitness value
• F^ is the average fitness
• Sigma (σ) is the population standard deviation
• c is a reasonable multiplier (usually between 1 and 3)
• Negative results are set to 0
• Rank Scaling:

• Replaces each fitness score with its position in the sorted order of fitness scores
• Sharing Scaling:

• Only similar individuals are scaled
• The number of genes that are shared among many genomes is recorded
• Genomes are then grouped by how many shared genes they have
• The fitness score of each genome is scaled by the number of other genomes in its sharing group
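Two of the scaling methods above are short enough to sketch directly. The sigma truncation function assumes the common form F' = F - (average - c * sigma) with negatives clamped to 0; the function names and the tie handling in rank scaling are illustrative assumptions.

```python
import statistics


def sigma_truncation(fitnesses, c=2.0):
    """Scale raw fitness as F' = F - (mean - c * sigma), clamped at 0."""
    mean = statistics.fmean(fitnesses)
    sigma = statistics.pstdev(fitnesses)  # population standard deviation
    return [max(0.0, f - (mean - c * sigma)) for f in fitnesses]


def rank_scaling(fitnesses):
    """Replace each score with its 1-based position in ascending order.

    Ties keep their original relative order (an assumption; other
    tie-breaking rules are possible).
    """
    order = sorted(range(len(fitnesses)), key=lambda i: fitnesses[i])
    ranks = [0] * len(fitnesses)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


print(rank_scaling([10.0, 500.0, 12.0, 11.0]))  # [1, 4, 3, 2]
```

Rank scaling shows why these methods help against premature convergence: the outlier score of 500 ends up only one rank above its nearest rival instead of dominating selection.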
• Reproduction:

• There are two main reproduction types:

• Generational reproduction

• Uses the last generation as a tool to create the next, by copying directly (elitism), crossover, or mutation; the original generation is completely replaced
• Steady-state reproduction

• A few individuals created via crossover or mutation replace specific individuals in the original generation, while the main body of the population remains unchanged
• Who is replaced: the worst, random individuals, the most similar, or the parents
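The difference between the two reproduction types can be sketched as two small functions. The second type, where only a few individuals are replaced, is often called steady-state reproduction (a name assumed here). `make_child` is a hypothetical callback standing in for selection plus crossover and mutation, and "replace the worst" is only one of the replacement policies listed above.

```python
def generational_step(population, make_child, elite=()):
    """Build a completely new generation; only the elite is copied over."""
    next_population = [list(genome) for genome in elite]
    while len(next_population) < len(population):
        next_population.append(make_child(population))
    return next_population


def steady_state_step(population, fitness, make_child):
    """Replace only the worst individual; everyone else is unchanged."""
    worst = min(range(len(population)), key=lambda i: fitness(population[i]))
    next_population = list(population)
    next_population[worst] = make_child(population)
    return next_population
```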
• Elitism Technique:

• Advantages: ensures that we don’t lose the best individual in any population (by altering the selection routine)
• Disadvantages: lessens diversity and speeds up convergence
• Selection Techniques:

• Roulette Wheel Selection:

• The genome with the highest score has the largest chance of being selected

• This genome gets the biggest slice of the roulette wheel
• A high-fitness individual may be selected multiple times
• Selection is by random chance, so the fittest individual is not guaranteed to be selected

• For this reason elitism is a common practice in GA genome selection
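Roulette wheel selection can be sketched as a single spin over the running total of fitness. This sketch assumes non-negative fitness values with a positive sum; the function name and the `rng` parameter are illustrative.

```python
import random


def roulette_wheel_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness.

    Assumes all fitness values are >= 0 and at least one is > 0.
    """
    total = sum(fitnesses)
    spin = rng.uniform(0, total)  # where the ball lands on the wheel
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness  # this individual's slice ends here
        if spin < running:
            return individual
    return population[-1]  # guard against floating-point rounding
```

Calling it repeatedly selects with replacement, so a high-fitness individual can be chosen many times while the single fittest one may still be missed in any given round.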
• Stochastic Universal Selection:

• The roulette wheel has a number of evenly spaced pointers (as shown in the figure) equal to the number of wanted offspring; a single spin then selects every individual a pointer lands on
• This technique has the advantage of keeping genetic diversity high
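A minimal sketch of the multi-pointer wheel follows, assuming evenly spaced pointers and one random offset for the whole spin (the usual formulation of this technique). Names and parameters are illustrative, and fitness values are assumed non-negative with a positive sum.

```python
import random


def stochastic_universal_select(population, fitnesses, count, rng=random):
    """Select `count` individuals with one spin of a multi-pointer wheel."""
    total = sum(fitnesses)
    spacing = total / count            # distance between neighbouring pointers
    start = rng.uniform(0, spacing)    # single random offset for all pointers
    selected = []
    index = 0
    running = fitnesses[0]             # end of the current individual's slice
    for k in range(count):
        pointer = start + k * spacing
        # Advance to the individual whose slice contains this pointer.
        while running < pointer and index < len(population) - 1:
            index += 1
            running += fitnesses[index]
        selected.append(population[index])
    return selected
```

With fitness values 1, 1, 2, 4 and eight pointers, each individual is picked in exact proportion to its slice, which is the sense in which this method avoids the luck of repeated independent spins.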
• Tournament Selection:

• A number of individuals (k) are randomly drawn from the pool
• The individual with the highest fitness goes to the next generation
• Then all of them go back into the pool (the selected individual could instead be removed)
• The previous steps are repeated several times
• Notes:

• If the tournament size is larger, weak individuals have a smaller chance of being selected
• A 1-way tournament is equivalent to random selection
• This technique works well on parallel architectures
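One round of tournament selection is only a couple of lines. In this sketch the contestants are drawn without replacement within a round and everyone returns to the pool afterwards; the function name and the default `k` are assumptions.

```python
import random


def tournament_select(population, fitness, k=3, rng=random):
    """Draw k random individuals and return the fittest of them."""
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)
```

With `k=1` this reduces to picking a random individual, and raising `k` increases selection pressure because weak individuals rarely win a larger tournament.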
• After choosing individuals for crossover:

• Generate a random number (between 0 and 1) for each pair
• If this number is less than the crossover rate (say 0.7), the crossover operation is applied and two offspring are generated
• Otherwise the pair is copied into the next generation without alteration
• Suggestion: give each crossover technique its own crossover operator value, so that the technique is used when that value comes up
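The crossover-rate check for each pair can be sketched as below. `swap_tails` is only a placeholder operator for the example, and the names and the 0.7 default are assumptions taken from the "say 0.7" in the notes.

```python
import random


def swap_tails(parent_a, parent_b, rng):
    """Placeholder crossover operator: swap everything after a random point."""
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:], parent_b[:point] + parent_a[point:]


def breed_pair(parent_a, parent_b, crossover=swap_tails, crossover_rate=0.7, rng=random):
    """Apply crossover with probability crossover_rate, else copy the pair."""
    if rng.random() < crossover_rate:
        return crossover(parent_a, parent_b, rng)
    # No crossover this time: both parents pass through unaltered.
    return list(parent_a), list(parent_b)
```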
• Which crossover operator is applied to a pair depends on several things:

• The type of variable structure your genomes use
• A healthy dose of experimentation
• Some of the binary crossover operators are the following:

• Single Point Crossover:

• A position is randomly chosen somewhere along the length of the genome, and all the genes after this position are swapped between the parents
• Two Point Crossover:

• Same as single point, except that the genes between two points are swapped
• Uniform Crossover (Every Point):

• Could also be called multipoint crossover: like two-point crossover but with many points, so every gene position may be swapped
• This technique can look like mutation
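The three binary crossover operators can be sketched on list genomes as follows. Function names, the `swap_probability` default, and the use of Python lists are assumptions; two-point crossover here needs genomes of at least three genes.

```python
import random


def single_point_crossover(a, b, rng=random):
    """Swap all genes after one random cut point."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]


def two_point_crossover(a, b, rng=random):
    """Swap only the genes between two random cut points."""
    first, second = sorted(rng.sample(range(1, len(a)), 2))
    return (a[:first] + b[first:second] + a[second:],
            b[:first] + a[first:second] + b[second:])


def uniform_crossover(a, b, swap_probability=0.5, rng=random):
    """Decide independently at every position whether to swap the genes."""
    child_a, child_b = list(a), list(b)
    for i in range(len(a)):
        if rng.random() < swap_probability:
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b
```

In all three, every position in the two children still holds one gene from each parent, so no genetic material is created or lost; only its arrangement changes.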
• Some of the continuous value variable crossover operators:

• Discrete Crossover: swaps the variable values between individuals
• Intermediate Crossover:

• Offspring variable values are calculated as $\text{offspring} = \text{parent}_1 + \alpha \, (\text{parent}_2 - \text{parent}_1)$
• The scaling factor $\alpha$ is chosen randomly from the interval [-d, 1 + d]. Normal intermediate crossover uses d = 0
• Line Crossover: same as intermediate crossover, but all variables share the same scaling factor
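Intermediate and line crossover differ only in whether the scaling factor is drawn per variable or once per offspring. The sketch assumes the common form offspring = parent1 + alpha * (parent2 - parent1) with alpha drawn uniformly from [-d, 1 + d]; the function names are illustrative.

```python
import random


def intermediate_crossover(a, b, d=0.0, rng=random):
    """Blend each variable with its own random scaling factor."""
    return [x + rng.uniform(-d, 1 + d) * (y - x) for x, y in zip(a, b)]


def line_crossover(a, b, d=0.0, rng=random):
    """Blend all variables with one shared scaling factor."""
    alpha = rng.uniform(-d, 1 + d)
    return [x + alpha * (y - x) for x, y in zip(a, b)]
```

With `d=0` every offspring value lies between the two parent values; a positive `d` lets offspring land slightly outside that range. Line crossover places the offspring on the straight line through both parents, which is where the name comes from.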
• Some of the order specific crossover operators:

• Partially Mapped Crossover (PMX):

• You map a substring from one parent onto the other; then, during crossover, whenever you find a mapped gene you swap it with its mapped value
• An example is shown in the figure below
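One way to read the PMX description above is as a series of swaps: each position in the chosen substring pairs a gene from one parent with a gene from the other, and both children swap those two genes wherever they occur. The sketch below follows that swap-based reading; it is an assumption about the intended variant, and other formulations of PMX exist. The genomes in the usage lines are made-up example permutations.

```python
def pmx_by_swapping(parent_a, parent_b, start, end):
    """Swap-based PMX over the half-open substring [start, end)."""
    child_a, child_b = list(parent_a), list(parent_b)
    for position in range(start, end):
        # The mapping for this position: gene_a <-> gene_b.
        gene_a, gene_b = parent_a[position], parent_b[position]
        if gene_a == gene_b:
            continue
        # Apply the mapping in both children by swapping the two genes.
        for child in (child_a, child_b):
            i, j = child.index(gene_a), child.index(gene_b)
            child[i], child[j] = child[j], child[i]
    return child_a, child_b


mum = [2, 5, 0, 3, 6, 1, 4, 7]
dad = [3, 4, 0, 7, 2, 5, 1, 6]
print(pmx_by_swapping(mum, dad, 3, 6))
# ([6, 1, 0, 7, 2, 5, 4, 3], [7, 4, 0, 3, 6, 1, 5, 2])
```

Because only swaps are used, each child stays a valid permutation, which is the whole point of order-specific operators.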
• Order-Based Crossover:

• As in the figure, choose several random genes from P1 and impose the order in which they appear on the same genes within P2, swapping values as needed
• Position-Based Crossover:
• After crossover, mutation occurs:
• Mutation means applying an operator to genes in the offspring
• The mutation rate, or the chance that a gene will be mutated, can vary widely depending on the problem
• Many academic papers have discussed this issue (455)
• The specific type of mutation operator to apply depends on the structure of the genome
• Common mutation operators for order-specific genomes:

• Exchange: swap a gene with another
• Displacement: select two random positions within a genome to define a substring, then remove it and re-insert it at another position
• Insertion: works like displacement but on a single gene. This is the recommended technique
• Non-order-specific operators include the following:

• Binary Mutation: flip a bit within the genome
• Real-value Mutation: add a delta value to the gene
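The mutation operators listed above can be sketched on list genomes as follows. Function names, the default rates, and `max_delta` are assumptions for illustration; suitable rates depend heavily on the problem.

```python
import random


def exchange_mutation(genome, rng=random):
    """Order-specific: swap two randomly chosen genes."""
    mutated = list(genome)
    i, j = rng.sample(range(len(mutated)), 2)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


def displacement_mutation(genome, rng=random):
    """Order-specific: cut out a random substring and re-insert it.

    The re-insertion point is random, so the genome can occasionally
    come back unchanged.
    """
    mutated = list(genome)
    start, end = sorted(rng.sample(range(len(mutated) + 1), 2))
    segment = mutated[start:end]
    rest = mutated[:start] + mutated[end:]
    insert_at = rng.randint(0, len(rest))
    return rest[:insert_at] + segment + rest[insert_at:]


def insertion_mutation(genome, rng=random):
    """Order-specific: displacement applied to a single gene."""
    mutated = list(genome)
    gene = mutated.pop(rng.randrange(len(mutated)))
    mutated.insert(rng.randint(0, len(mutated)), gene)
    return mutated


def binary_mutation(bits, rate=0.01, rng=random):
    """Non-order-specific: flip each bit with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in bits]


def real_value_mutation(values, rate=0.1, max_delta=0.5, rng=random):
    """Non-order-specific: add a small random delta to some genes."""
    return [v + rng.uniform(-max_delta, max_delta) if rng.random() < rate else v
            for v in values]
```

The order-specific operators only rearrange genes, so a genome that encodes a permutation (a route, a build order) remains valid after mutation.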
• Remember, GA are all about experimentation and finding out what operators to use as well as
• For GAs, the more time you put into them, the better the results you will get
• Problem with highly scaled game time: when the game is sped up, some calculations such as collision detection may be missed (473)
• GAs are a brute-force method that can find solutions in very difficult or computationally expensive areas of game AI
• Areas of using GA:

• When you have a number of parameters that interact in highly nonlinear or surprising ways
• When you have many local maxima and are searching for the best one (car physics parameters example)
• Solutions that involve discontinuous output (needs explanation, page 474)
• When the actual computation of a decision might be too costly (for example, substituting for a heavy math function)
• Pros of Genetic Algorithms:

• Easy to set up
• Easy to start getting results
• GAs are often a very strong optimization algorithm, meaning that they can often find optimal or near-optimal solutions
• They tend to find global maxima rather than getting stuck at local maxima
• They can operate in parallel
• Cons of Genetic Algorithms:

• Time consuming
• GAs perform evolution offline
• Hit-or-miss performance: finding a suitable combination requires time and experimentation
• Weak definition of success and failure:

• It’s hard to know where the flaw in your structure is
• It’s hard to tell the difference between buggy code and un-evolved code
• No guarantee of an optimal solution
• Tough to tune, and even tougher to add functionality

• Ant Colony Algorithms:

• Collective intelligence: ants help each other, so the colony as a whole seems intelligent
• Ants put scent marks on trails that lead to food so that other ants go there
• Here we are building solutions based on the successes of the entire population rather than the success of one individual
• Co-evolution (478):

• Concept of cooperative and competitive evolution
• In cooperative evolution, fitness increases when two creatures work together; in competitive evolution, one creature’s fitness increases while the other’s decreases
• This type of GA could be used in RTS civilization games

• During the evolution process, the crossover or mutation rates are themselves influenced and changed
• The GA also tries to apply the most suitable operators to the genomes
• Genetic Programming:

• Here we evolve a program (code) rather than parameters
• If we use data-driven game AI (scripting), where the data is a small set of program instructions that represents behavior, this technique could be used to evolve AI character scripts instead of having to write them by hand
• Design Considerations:

• Types of Solutions: GAs are more suitable for the tactical level
• Agent Reactivity: GAs can support reactivity in your system
• System Realism: because GAs produce optimized solutions, the result sometimes behaves in an unrealistic manner
• Genre: could be used in RTS games for build orders and for finding working tactics (such as rushing)
• Platform: not an issue, because the work is mostly done offline
• Development Limitations:

• This is the real concern, because:

• GAs are not debuggable in any real sense
• Do you really have the time and patience to search for an optimal solution across different parameters, operators, and gene designs?
• What if feedback on the game requires changes? That is hard to accommodate
• Entertainment Limitations:

• Game difficulty levels could be handled by a separate GA for each level, each with its own fitness function
• Appendix:

• Stochastic optimization (SO) methods are optimization algorithms which incorporate probabilistic (random) elements, either in the problem data (the objective function, the constraints, etc.), or in the algorithm itself (through random parameter values, random choices, etc.), or in both. The concept contrasts with deterministic optimization methods, where the values of the objective function are assumed to be exact, and the computation is completely determined by the values sampled so far