Genetic Algorithms and Evolutionary Approaches
Presented by Melih Sözdinler For CHE 516
Applications
When you look at Wikipedia:
− You will see a huge number of problems that have been solved using Genetic Algorithms (GA).
− GAs are listed as applied to 57 different problems.
In the famous Artificial Intelligence book “Artificial Intelligence: A Modern Approach”, the authors say that GAs are the third approach to try before proposing an algorithm. Second? First?
Applications
Some Applications
− Plant floor layout
− Gene expression profiling analysis
− Mutation testing
− Protein folding and protein/ligand docking
− Traveling Salesman Problem
− Bioinformatics: RNA structure prediction
− Bioinformatics: multiple sequence alignment
− Building phylogenetic trees
− Game theory equilibrium resolution
− Molecular structure optimization
Presentation Outline
Evolutionary Algorithms: definitions and examples
− Genetic Algorithms
− Evolutionary Programming
− Evolutionary Strategies
Genetic Algorithms and their applications
Clustering and Biclustering (optional, if time is available)
GA for Protein Conformation
Genetic Algorithms
− Start with a population, like other search methodologies.
− The population has k randomly generated states.
− Each state is represented as a string over an alphabet. What would be the alphabet for our conformation problem?
− Fitness function
− Crossover, reproduce and mutate
Genetic Algorithms
Selection probability is proportional to fitness, e.g. 24/(24+23+20+11) = 31%
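The 31% figure can be reproduced with a few lines of Python. This is only a sketch of fitness-proportionate (roulette-wheel) selection; the function names `selection_probabilities` and `select_parent` are made up for illustration, and the four fitness values are the ones in the fraction above.

```python
import random

def selection_probabilities(fitnesses):
    """Chance of picking each individual, proportional to its fitness."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

def select_parent(population, fitnesses, rng=random):
    """Pick one individual by roulette-wheel selection."""
    return rng.choices(population, weights=fitnesses, k=1)[0]

fitnesses = [24, 23, 20, 11]
probs = selection_probabilities(fitnesses)
print([round(100 * p) for p in probs])  # [31, 29, 26, 14]
```

Fitter states are picked more often, but weaker states still have a chance, which keeps some diversity in the population.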
Goal of Genetic Algorithms
A very simple goal:
− Optimize your objective function
− Utilize the fitness function
− Search for the best fitness (survival of the fittest) and objective maximization or minimization.
Genetic Algorithm Pseudocode
Reproduce, Crossover, Mutate
Reproduce
−
Crossover
− Two random parents from the population come together. Their DNA (alphabet strings) is divided and the parts are exchanged with each other.
Mutate
− With small probability, a position in the DNA state (input alphabet string) is changed.
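The three operators can be put together in a minimal GA loop. The sketch below makes several assumptions that are not on the slides: states are bit strings, the fitness is the toy "count the 1s" function, and all function names and parameter values (`k`, `length`, `generations`, `rate`) are chosen only for illustration.

```python
import random

def fitness(individual):
    """Toy fitness (an assumption): number of 1s in the bit string."""
    return sum(individual)

def reproduce(population, rng):
    """Select two parents with probability proportional to fitness."""
    weights = [fitness(ind) + 1 for ind in population]  # +1 avoids all-zero weights
    return rng.choices(population, weights=weights, k=2)

def crossover(parent_a, parent_b, rng):
    """Split both strings at one random point and exchange the tails."""
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:], parent_b[:point] + parent_a[point:]

def mutate(individual, rate, rng):
    """Flip each symbol independently with a small probability."""
    return [1 - g if rng.random() < rate else g for g in individual]

def genetic_algorithm(k=20, length=16, generations=60, rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(k)]
    best = max(population, key=fitness)
    for _ in range(generations):
        next_population = []
        while len(next_population) < k:
            a, b = reproduce(population, rng)
            for child in crossover(a, b, rng):
                next_population.append(mutate(child, rate, rng))
        population = next_population[:k]
        best = max(population + [best], key=fitness)  # remember the best state seen
    return best

best = genetic_algorithm()
print(fitness(best), best)
```

For a real problem only `fitness` and the state encoding need to change; the loop itself stays the same.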
Crossover Techniques
1 point crossover
N point crossover
Uniform crossover
1-point crossover
− Choose a random point on the two parents
− Split parents at this crossover point
− Create children by exchanging tails
− Crossover probability typically in range (0.6, 0.9)
n-point crossover
− Choose n random crossover points
− Split along those points
− Glue parts, alternating between parents
− Generalisation of 1-point (still some positional bias)
Uniform crossover
− Assign 'heads' to one parent, 'tails' to the other
− Flip a coin for each gene of the first child
− Make an inverse copy of the gene for the second child
− Inheritance is independent of position
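The three techniques differ only in how the cut positions are chosen. A small sketch, assuming parents are equal-length Python lists; the function names are illustrative and not taken from any library.

```python
import random

def one_point(a, b, rng):
    """Cut both parents at one random point and exchange the tails."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

def n_point(a, b, n, rng):
    """Cut at n distinct random points and alternate between parents."""
    points = sorted(rng.sample(range(1, len(a)), n))
    child1, child2 = [], []
    start, swap = 0, False
    for end in points + [len(a)]:
        src1, src2 = (b, a) if swap else (a, b)
        child1 += src1[start:end]
        child2 += src2[start:end]
        start, swap = end, not swap
    return child1, child2

def uniform(a, b, rng):
    """Flip a coin per gene; the second child gets the inverse choice."""
    child1, child2 = [], []
    for gene_a, gene_b in zip(a, b):
        if rng.random() < 0.5:  # heads: first child copies parent a
            child1.append(gene_a)
            child2.append(gene_b)
        else:                   # tails: first child copies parent b
            child1.append(gene_b)
            child2.append(gene_a)
    return child1, child2

rng = random.Random(0)
a, b = list("AAAAAAAA"), list("BBBBBBBB")
for children in (one_point(a, b, rng), n_point(a, b, 3, rng), uniform(a, b, rng)):
    print("".join(children[0]), "".join(children[1]))
```

With all-A and all-B parents the printed children make the cut points visible directly.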
Why Genetic Algorithms
− They are randomized, and in some cases generating random children may be useful.
− They help when your search space is too large to search exhaustively and there are several local minima.
− Fast convergence and natural randomization using the original states.
− May quickly converge to the desired fitness level.
− Selection pressure
Evolutionary Programming (EP)
− Common to each class: creating a population, evaluation with a fitness function, and hopefully the population evolves into better populations.
− No crossover operator in EP.
− Tournament stage: 2k individuals, the k parents and the k children bred from the parents using mutation.
  − Each individual is compared to M individuals and win/loss tables are produced. Ranking occurs.
  − M is important. It should be neither too large nor too small (it affects selection pressure and running time).
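A rough sketch of one EP generation follows. It assumes that each of the k parents breeds one child by Gaussian mutation, so that 2k individuals enter the tournament; the toy objective, the mutation width `sigma` and the function names are assumptions made for illustration only.

```python
import random

def fitness(x):
    """Toy objective (an assumption): larger is better, maximum at x = 3."""
    return -(x - 3.0) ** 2

def ep_generation(parents, m, rng, sigma=0.5):
    """One EP step: mutate, run the tournament against m opponents, rank, keep k."""
    # Each parent breeds one child by mutation only (no crossover in EP).
    children = [p + rng.gauss(0.0, sigma) for p in parents]
    pool = parents + children  # 2k individuals enter the tournament
    wins = []
    for i, individual in enumerate(pool):
        others = pool[:i] + pool[i + 1:]
        opponents = rng.sample(others, m)  # m random opponents
        wins.append(sum(fitness(individual) >= fitness(o) for o in opponents))
    # Rank by number of wins and keep the best k as the next parents.
    ranked = sorted(range(len(pool)), key=lambda i: wins[i], reverse=True)
    return [pool[i] for i in ranked[:len(parents)]]

rng = random.Random(0)
population = [rng.uniform(-10, 10) for _ in range(10)]
for _ in range(50):
    population = ep_generation(population, m=5, rng=rng)
print(max(population, key=fitness))
```

A small `m` makes the ranking noisy (weak selection pressure), while `m` close to 2k − 1 compares everyone with everyone and costs more time per generation.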
Evolutionary Strategies (ES)
− Crossover and mutation are both part of ES.
− For a population of size k, μ = 7k children are intentionally generated.
− Then, over the k + μ individuals, the best k according to the fitness function are chosen as the new population.
− Depending on the implementation, the best k may be chosen using only the μ generated children.
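Both selection variants can be sketched in one function. The example below assumes real-valued vectors as states, a toy fitness with its maximum at the origin, uniform crossover and Gaussian mutation; these choices and the names `es_generation`, `plus` and `sigma` are illustrative assumptions.

```python
import random

def fitness(x):
    """Toy objective (an assumption): maximum at the origin."""
    return -sum(v * v for v in x)

def es_generation(population, rng, plus=True, sigma=0.3):
    """One ES step with k parents and mu = 7k children."""
    k = len(population)
    mu = 7 * k
    children = []
    for _ in range(mu):
        a, b = rng.sample(population, 2)
        child = [rng.choice(pair) for pair in zip(a, b)]    # crossover
        child = [v + rng.gauss(0.0, sigma) for v in child]  # mutation
        children.append(child)
    # plus=True: best k of the k + mu pool; plus=False: best k of the children only.
    pool = population + children if plus else children
    return sorted(pool, key=fitness, reverse=True)[:k]

rng = random.Random(0)
population = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(5)]
for _ in range(30):
    population = es_generation(population, rng)
print(fitness(population[0]))
```

With `plus=True` the best fitness can never get worse from one generation to the next; with `plus=False` old parents are always discarded, which can help the search leave a local optimum.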
Conclusion
− We provided the definitions of the 3 main evolutionary algorithms (GA, EP, ES).
− We gave some examples.
− We provided the key strategies behind the three algorithms.
− 8-puzzle problem solution using that program, but you need relaxation.
BICLUSTERING
Biclustering Definition
− Clustering: groups of “similar” items
− Biclustering: simultaneously cluster two dimensions
− The problem was introduced by [Hartigan 72].
− The problem has been shown to be NP-hard (that is, hard to solve, like the conformation problem).
Clustering Rows
Clustering Columns
Biclustering
Types of Biclusters
All Constant
1  1  1  1
1  1  1  1
1  1  1  1
1  1  1  1
Constant Row Additive
1  1  1  1
2  2  2  2
3  3  3  3
4  4  4  4
Constant Row Multiplicative
1  1  1  1
2  2  2  2
4  4  4  4
8  8  8  8
Constant Column Additive
1  2  3  4
1  2  3  4
1  2  3  4
1  2  3  4
Constant Column Multiplicative
1  2  4  8
1  2  4  8
1  2  4  8
1  2  4  8
Types of Biclusters (cont.)
Coherent Additive Model
1.0  2.0  4.0  5.0
2.0  3.0  5.0  6.0
4.0  5.0  7.0  8.0
5.0  6.0  8.0  9.0
Coherent Multiplicative Model
1.0  2.0  4.0   5.0
2.0  4.0  8.0  10.0
0.4  0.8  1.6   2.0
0.8  1.6  3.2   4.0
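The two coherent models can be checked mechanically. In the sketch below the slide's numbers are read as two 4x4 matrices (an assumption about the original layout), and the helper names `is_additive` and `is_multiplicative` are made up for illustration. A matrix follows the additive model when every entry is a row effect plus a column effect, and the multiplicative model when every entry is a row factor times a column factor.

```python
def is_additive(m, tol=1e-9):
    """True if a_ij = a_i0 + a_0j - a_00 for every entry."""
    return all(abs(m[i][j] - (m[i][0] + m[0][j] - m[0][0])) <= tol
               for i in range(len(m)) for j in range(len(m[0])))

def is_multiplicative(m, tol=1e-9):
    """True if a_ij * a_00 = a_i0 * a_0j for every entry."""
    return all(abs(m[i][j] * m[0][0] - m[i][0] * m[0][j]) <= tol
               for i in range(len(m)) for j in range(len(m[0])))

additive_example = [
    [1.0, 2.0, 4.0, 5.0],
    [2.0, 3.0, 5.0, 6.0],
    [4.0, 5.0, 7.0, 8.0],
    [5.0, 6.0, 8.0, 9.0],
]
multiplicative_example = [
    [1.0, 2.0, 4.0, 5.0],
    [2.0, 4.0, 8.0, 10.0],
    [0.4, 0.8, 1.6, 2.0],
    [0.8, 1.6, 3.2, 4.0],
]
print(is_additive(additive_example), is_multiplicative(multiplicative_example))  # True True
print(is_additive(multiplicative_example), is_multiplicative(additive_example))  # False False
```

The constant-row and constant-column biclusters are special cases of these models in which one of the two effects is the same for every row or column.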
Previous work: Proposed Algorithms
− Cheng and Church’s Algorithm (CC) [Cheng et al’00]
− Order-Preserving Sub Matrix (OPSM) [Ben Dor et al’02]
− Conserved gene expression motifs (xMOTIFs) [Murali et al’03]
− Iterative Signature Algorithm (ISA) [Bergmann et al’03]
− Statistical-Algorithmic Method for Bicluster Analysis (SAMBA) [Tanay et al’02, Sharan et al’03]
− Bimax [Prelic et al’06]
− LEB [Erten et al'09]
Previous work (cont.): Proposed Tools
− Biclustering Analysis Toolbox (BicAT) [Barkow et al’06]
− Click and Expander [Sharan et al’03]
− Bicoverlapper [Santamaria et al’08]
− Biggest [Oliveria et al'09]
− Robinviz [Aladağ et al'10]
Gene Expression Matrix
Saccharomyces cerevisiae: original gene expression matrix (left) and ordered gene expression matrix (right), 2993 genes x 173 conditions.
Gene Expression Matrix and Biclustering
Gene Expression Matrix
          cond1  cond2  ...  cond173
gene1     0.001  0.012  ...  0.890
....      ...    ...    ...  ...
gene2993  0.078  1.120  ...  1.000
Biclusters of Gene Expression Matrix
Bicluster 1
− Genes: gene1 gene3 gene5 gene100 gene120 … gene1100
− Conditions: cond2 cond3 cond4 cond101 … cond173
Bicluster 2
− Genes: gene101 gene103 gene105 gene200 gene220 … gene2100
− Conditions: cond1 cond2 cond5 cond150 … cond173
...
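A bicluster is therefore just a subset of genes together with a subset of conditions, and the corresponding values form a submatrix of the expression matrix. A minimal sketch, using a made-up 4x3 matrix (the values and the helper name `extract_bicluster` are purely illustrative, not data from the slides):

```python
def extract_bicluster(matrix, gene_names, cond_names, genes, conds):
    """Return the submatrix for the chosen genes (rows) and conditions (columns)."""
    row_index = {name: i for i, name in enumerate(gene_names)}
    col_index = {name: j for j, name in enumerate(cond_names)}
    return [[matrix[row_index[g]][col_index[c]] for c in conds] for g in genes]

# Hypothetical toy data: 4 genes x 3 conditions.
gene_names = ["gene1", "gene2", "gene3", "gene4"]
cond_names = ["cond1", "cond2", "cond3"]
matrix = [
    [0.1, 0.2, 0.3],
    [1.1, 1.2, 1.3],
    [2.1, 2.2, 2.3],
    [3.1, 3.2, 3.3],
]

sub = extract_bicluster(matrix, gene_names, cond_names,
                        genes=["gene1", "gene3"], conds=["cond2", "cond3"])
print(sub)  # [[0.2, 0.3], [2.2, 2.3]]
```

Note that the chosen genes and conditions do not have to be adjacent in the original matrix, and the same gene or condition may appear in more than one bicluster.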