Monday, May 14, 2007

butt simple computer models of speciation

this is a computer simulation of an idea i had about how distinct species come about. basically i thought that since creatures can change in so many minute ways (any of their 30,000 genes can change by one letter), there would be an astronomical number of possible kinds of creatures (2^50,000), and since they have so much time to explore all those varieties (2^(number of generations), with generations easily in the 100s of thousands), they would end up being all over the range of variety for their group. BUT since the population size at any one generation is minuscule compared to those numbers (a billion is WAY less than 2^50,000, which is about 10^15,000 = a billion times a billion... about 1,700 times), what is left at any given generation would be a few clumps of critters widely scattered across the range of varieties. hence distinct species.
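the size of these numbers is easy to check with logarithms. here is a minimal sketch in python (not the basic used for the programs further down, just something easy to run). it assumes one binary choice per site, and the function name log10_genomes is made up for this illustration.

```python
import math

def log10_genomes(num_binary_sites: int) -> float:
    """log10 of 2**num_binary_sites, i.e. the power of ten it is roughly equal to."""
    return num_binary_sites * math.log10(2)

sites = 50_000
exponent = log10_genomes(sites)

# a billion is 10**9, so exponent / 9 says how many billions get multiplied together
print(f"2^{sites} is about 10^{exponent:.0f}")
print(f"that is a billion times a billion... about {exponent / 9:.0f} times")
```

a population of a billion is 10^9, so it can only ever occupy a vanishing fraction of that many possible genomes at one time.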


I've finally written the first versions of my computer programs to explore this idea. god it's hard. learning a new programming language, debugging... blech. ain't done it in years.


When i get a chance i will write a more detailed explanation of these ideas, but for now here are the programs and some



program1:
in my first version, the genome was simply a string of binary digits, and each bit could mutate independently. this seemed the more natural way to me.

i start with a population of, say, 100 identical genomes, each = 0000000000

then each generation, each genome replicates into, say, 4 copies, and each copy may randomly mutate one bit, flipping it 0 to 1 or 1 to 0, at some low rate per generation.

then i randomly winnow those 400 back down to 100.

then i display the population by plotting each genome as a point (x,y), taking x = the high n/2 bits and y = the low n/2 bits
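the steps above can be sketched like this. it is a rough python stand-in for program1, not the original basic; the constants follow the "say" values in the text plus the 14-bit genome and 0.07 mutation rate from the listings, and the names next_generation and plot_coords are made up for this sketch.

```python
import random

GENOME_LEN = 14       # bits per genome (assumed, taken from the basic listing)
NUM_OFFSPRING = 4     # copies per parent ("say 4" in the text)
MUTATION_RATE = 0.07  # chance that a copy gets one bit flipped (assumed)

def next_generation(pop, rng):
    """replicate every genome, maybe flip one random bit per copy, then winnow."""
    offspring = []
    for genome in pop:
        for _ in range(NUM_OFFSPRING):
            child = genome
            if rng.random() < MUTATION_RATE:
                child ^= 1 << rng.randrange(GENOME_LEN)  # flip one random bit
            offspring.append(child)
    # random survivors, back down to the original population size
    return rng.sample(offspring, len(pop))

def plot_coords(genome):
    """x = high half of the bits, y = low half of the bits."""
    half = 1 << (GENOME_LEN // 2)
    return genome // half, genome % half

rng = random.Random(1)          # fixed seed so a run can be repeated
pop = [0] * 100                 # 100 identical genomes, all bits zero
for _ in range(50):
    pop = next_generation(pop, rng)

print(sorted(set(plot_coords(g) for g in pop)))
```

random.sample picks survivors without replacement, which does the same job as marking offspring as already taken.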

well, the genes bounce all over the morphospace, and don't bunch at all. any mutation can send them across my morphospace.
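the bouncing comes from the plotting scheme rather than from the mutations themselves: every mutation is a single bit flip, but how far the plotted point moves depends on which bit it was. a small python sketch of that, assuming a 14-bit genome split into two 7-bit halves (jump_for_bit is a made-up helper name):

```python
GENOME_LEN = 14
HALF = 1 << (GENOME_LEN // 2)   # 128: x is the high 7 bits, y is the low 7 bits

def jump_for_bit(bit):
    """how far the plotted point moves, as (dx, dy), when this one bit of genome 0 flips."""
    mutant = 0 ^ (1 << bit)
    return mutant // HALF, mutant % HALF

for bit in range(GENOME_LEN):
    dx, dy = jump_for_bit(bit)
    print(f"flip bit {bit:2d}: point moves by ({dx:2d},{dy:2d})")
```

flipping bit 0 moves the point one pixel, while flipping bit 13 moves it 64 columns, half the width of the display, in a single generation.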

this is puzzling. not sure what i'm doing here yet.

i suppose my method of visually representing the genes is my phenotype.

so i tried:
program2:

basically my genome is simply a pair of numbers (genes) between 0 and 200.

i start with a population of, say, 100 identical genomes, each = (100,100)

then each generation, each genome replicates into, say, 4 copies, and i randomly add +/-1 to each gene, at a low rate.

then i randomly winnow those 400 back down to 100.

i display the population by plotting each genome (x,y)
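a rough python stand-in for the program2 loop described above (again not the original basic). the range 0..200, the start point (100,100) and the 4 copies are the "say" values from the text; the 0.07 mutation rate is borrowed from the listing; mutate_gene and next_generation are names made up for this sketch.

```python
import random

MAX_GENE = 200        # genes live in 0..MAX_GENE
NUM_OFFSPRING = 4     # copies per parent
MUTATION_RATE = 0.07  # chance per gene, per copy, of a +/-1 step (assumed)

def mutate_gene(value, rng):
    """with a low probability, step the gene by +1 or -1, staying inside 0..MAX_GENE."""
    if rng.random() < MUTATION_RATE:
        step = rng.choice((-1, 1))
        if 0 <= value + step <= MAX_GENE:
            value += step
    return value

def next_generation(pop, rng):
    """replicate each (x, y) genome with mutation, then winnow back at random."""
    offspring = [
        (mutate_gene(x, rng), mutate_gene(y, rng))
        for (x, y) in pop
        for _ in range(NUM_OFFSPRING)
    ]
    return rng.sample(offspring, len(pop))

rng = random.Random(1)           # fixed seed so a run can be repeated
pop = [(100, 100)] * 100
for _ in range(200):
    pop = next_generation(pop, rng)

xs = [x for x, _ in pop]
ys = [y for _, y in pop]
print("x range:", min(xs), max(xs), " y range:", min(ys), max(ys))
```

printing the min and max of each gene is a crude substitute for the plot: it shows how far the blob has spread without needing a graphics window.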

easy to program if you are used to the language you are using.

i was hoping these critters would crawl around morphospace slowly and split into groups, but in my first run i got a pretty cohesive blob that morphs and wanders SLOWLY. all the genomes stay close together in morphospace. and they don't spread much, nor split into distinct blobs. NO SPECIES YET! by 4000 generations i did get two distinct groups for a while, but one went extinct.

now i have to analyse my random number generator, analyse the whole process. i can mess with pop size, genome size a little, fecundity and mutation rate.

oy.

one note: i suppose the morphospace for my first program is the corners of an n-dimensional cube and the critters can random walk all over it. the second morphospace is a 200x200 array and the critters do a slower random walk around it.
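one way to put numbers on "slower" is to count the fewest single mutations needed to get from one extreme of each morphospace to the other. a small python sketch, assuming n = 14 bits for the cube and corner-to-corner travel on the 200x200 grid (hamming and grid_steps are made-up helper names):

```python
def hamming(a, b):
    """number of bit flips separating two bitstring genomes (distance on the cube)."""
    return bin(a ^ b).count("1")

def grid_steps(p, q):
    """number of +/-1 gene mutations separating two (x, y) genomes."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

n = 14
print(hamming(0, 2 ** n - 1))            # all-zeros to all-ones: 14 flips
print(grid_steps((0, 0), (200, 200)))    # corner to corner: 400 steps
```

so on the cube no two genomes are ever more than n mutations apart, while on the grid the far corner is hundreds of mutations away.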

odd.


Anyway, this looks like a fun way to explore basic ideas in evolutionary biology.


here's the code:

program2:
'species3.bas mutation of gene (x,y) by +/-1 each gene.
'simple model of high birth rate, low mutation rate, extinction

'also i need a way to mark on the display the fact that i've
'got multiples of the same genome


PopSize=200
NumOffspring=8
dim g(PopSize,1)
dim ng(PopSize*NumOffspring,1)
MiddleGenome=300
taken =1000
MutationRate=.07

'set pop genomes all to 300,300
for n=0 to PopSize-1
g(n,0)=MiddleGenome: g(n,1)=MiddleGenome
next n

Gen=1

do

'reproduce from g to ng with mutations

for n=0 to PopSize-1
  for k=0 to NumOffspring-1
    i=n*NumOffspring+k

    for t=0 to 1 'once for each gene
      ng(i,t)=g(n,t)
      if rnd(1) < MutationRate then
        c=1-2*int(2*rnd(1)) 'either +1 or -1
        'don't mutate past the boundaries 0 and 2*MiddleGenome
        if ng(i,t)+c >= 0 and ng(i,t)+c <= 2*MiddleGenome then
          ng(i,t)=ng(i,t)+c
        end if
      end if
    next t

  next k
next n



'kill off ng back to g, original population size, randomly
CritterNum=PopSize-1
do

  n=int(rnd(1)*NumOffspring*PopSize)
  'pick an offspring at random to survive
  if ng(n,0) <> taken then 'skip offspring that were already picked
    g(CritterNum,0)=ng(n,0)
    g(CritterNum,1)=ng(n,1)
    ng(n,0)=taken
    CritterNum=CritterNum-1
  end if

loop until CritterNum=-1



' display gene pool
'for n=1 to 2: next n

print #1, "cls"
for n=0 to PopSize-1
  x=g(n,0): y=g(n,1)
  plot (x,y)
next n

cls
print Gen
Gen=Gen+1

loop


what's different in program1:

'simple model of high birth rate, low mutation rate, extinction
'my visual representation of morphospace is wrong though,
'because a single mutation in a "high" bit, will send a
'critter across morphospace far from its ancestor.
'what i really want to do is display my genes on a GenomeLen
'dimensional cube.

PopSize=200
NumOffspring=8
dim g(PopSize)
dim ng(PopSize*NumOffspring)
GenomeLen=14
half=2^(GenomeLen/2)
taken =2^GenomeLen+1

'set pop genomes all to 0
for n=0 to PopSize-1
g(n)=0
next n


do
'reproduce from g to ng with mutations

for n=0 to PopSize-1
  for k=0 to NumOffspring-1
    i=n*NumOffspring+k
    ng(i)=g(n)
    if rnd(1) < MutationRate then
      'flip a random bit in the genome
      ng(i)=ng(i) xor 2^int(rnd(1)*GenomeLen)
    end if
  next k
next n




'kill off ng back to g
'same


' display gene pool
for n=0 to PopSize-1
  x=int(g(n)/half): y=g(n)-x*half
  plot x,y
next n

4 comments:

Anonymous said...

I'm no evolutionary biologist, but it seems that each genetic mutation that occurs in an organism may or may not be expressed physically. There are big parts of our DNA that may be just holding data for future use, and sometimes things may slip into place and change could occur very quickly.

I think the above was one of the problems the folks at the GOLEM project faced. They ran millions of hours of tests and decided that simple random mutation then survival of the fittest was not sufficient to explain evolution. The GOLEM creatures never kept old code, they used it or dumped it and that is very different from real animals (and we aren't even touching horizontal gene transfer!)

barry goldman said...

ah yes.. but realize that these are only the first two of about 4000 versions of simulations...

just goofing around getting the feel for some simple mathematics really.

i'll have to check out Golem.

i'm aiming towards:

http://www.nis.atr.jp/~ray/pubs/tierra/

i saw him lecture on it in 91. it's a fascinating little system. i don't know why i've seen so little analysis of it.

Anonymous said...

Goofing off is as good a reason as any I could think of!

I made a simulation on my trusty TI85 a few years ago where a creature had to fill different needs (water, various nutrients, sleep, etc.) These things were picked up in different places so the creature would trot around gorging on something until it filled up and need to trot off in search of something else. The fun part was playing with the constants to make the creature live as long as possible. If it expends one thing too quickly then it can never venture too far from the source in search of the other things it needs.

Every once in a while it would get stuck between two and waver back and forth until it finally croaked!

I've never tried simulating genetics though. I'll settle for doing it vicariously through your blog for now. Post pictures if you can!

barry goldman said...

i suppose i could write some java thing and then people could watch it.

do i want to learn those details too? oy.