Pages Menu

Friday, February 1, 2013

Varying the population size

Here's a nice demonstration of a population bottleneck. A new version of the function, named evolve5(), was written in which the population size starts off as n.pop1, and then suddenly changes to n.pop2 at a specified time.

Here is the evolution of an allele in which the starting frequency is 0.2 and the selective advantage is 0.01. The population remains constant at 5000 throughout the 400 generations. The final allele frequency distribution is wide, with a broad peak centered around 0.4.


Below is the corresponding case when the population hits a "bottleneck". Population size is suddenly reduced from 5000 to 100 at generation 160. (Think dinosaurs.)  Everything else is the same as in the model above. The variance increases at this time and by generation 400 most of the realizations result in loss of the allele in the population. However, in almost all the cases in which the allele is not lost, it becomes fixed in the population. The 'control' model (large, unchanged population, above) would eventually (~1000 generations or more) end up with the allele becoming fixed in the population, but the bottleneck vastly speeded up the evolution.




Finally, I looked at a case in which an allele has negative selection advantage, which would eventually die out in a population with a continuously large population.  The variance caused by the bottleneck at generation 160, when the population drops from 5000 to 100, results in a significant number of realizations (about 25%) in which the allele becomes fixed in the population, quite the opposite of its fate had there not been a bottleneck.  How cool is that!









Sunday, January 20, 2013

Preditor-prey modeling

This could be fun. I wonder if the Monte Carlo code could be modified to make a preditor-prey model. Two populations, each subject to survival advantage forces. But here the survival advantage for one population/species is not fixed, it is a function of it's own population, the population of the other (preditor) species, it's own available food supply, and other environmental forces.

The basic equations are the Lotka–Volterra equations.

Random variable for each population, reproducing as in function evolve(4), with same assumptions about random mating, etc.  But also with preditor rate term in the binomial step.

Of course this is a very well-studied problem, a standard student exercise. I've come across it in a variety of places, including numerous lovely interactive graphic simulations on the web.   There are even Monte Carlo solutions online, so I'm certainly talking about a well-known problem. Still, as an exercise, it will be fun and instructive to do it myself!!

Sunday, January 6, 2013

function mutate1()

mutate1= function( n.gen=50000, n.variables=2, A.init=0.00, mu.init=0.0001, nu.init = 0.00001, title = "Mutation: equilibrating allele frequencies")
{
    # n.samp = 1 means there is no sample population, we're computing the mean.
 
    # mu = mutation rate per generation A to a.
    # nu = mutation rate per generation a to A.
 
    # p = allele frequency of A(for all individuals and for all generations)
 
    # Parameters for environment
    mu = mu.init     # mutation rate per generation A to a.
    nu = nu.init     # mutation rate per generation a to A.
 
    # Initialize matrix
    k = matrix(-999, nr=n.variables, nc=n.gen) #matrix to store values of p
 
    for (i in 1:n.variables)
    {
        k[1,1] = A.init    #  initial freq of A
        k[2,1] = 1- A.init    #  initial freq of a
     
        for (j in 2:(n.gen) )   # n.gen generations
        {
            k[1, j] = (1-mu) *  k[1, j-1] + nu * (1 - k[1, j-1]) # freq of A
            k[2, j] = (1-mu) *  k[2, j-1] + nu * (1 - k[2, j-1]) # freq of a
        }
    }
    #######
    #
    # Plot allele frequency vs. generation
    graphics.off()
    plot(1,1, xlim=c(1,n.gen), ylim=c(0,1), ty='n', xlab='Generation', ylab='Allele Frequency' )
    mtext(line=0, cex=0.7, paste('gen=', n.gen,  'A.init=', A.init,  'a.init=', 1-A.init, 'mu.init=', mu.init, 'nu.init=', nu.init ))
    mtext(line=1, cex=1.0, title)
    #
    lines(1:n.gen, k[1, ], lwd=2, col=1, lty=1)  # allele A, black
    lines(1:n.gen, k[2, ], lwd=2, col=4, lty=2)  # allele a, blue
    legend(45000, 0.9, c("A","a"), cex=0.8,
    col=c(1,4), lty=c(1,1), lwd=2)
 
} ###### END




Saturday, January 5, 2013

function evolve5()

evolve5= function( n.pop1=5000, n.pop2=100, a.pop=0.4, n.gen=400, n.samp=100, f.init=0.2, s.adv=0.01, dom = TRUE, histogram=TRUE, plot.mean=FALSE, color=TRUE, title = "") 


{
#Based on code by Lex Flagel in his blog A Hopeful Monster
#
#      -allows for a linear time-varying population
#      -uses simplified forms of the Hardy-Weinberg expressions for calculating the change of allele frequency  (see Falconer and Mackay 4th ed., p.29, equations 2.7 and 2.8)
#      -case of either no dominance or complete dominance
#      -assumes: s.adv is small, or allele frequency is small.  Usually I use 0 < s.adv < 0.2, as in F&M page 29.

#   evolve5() employs the Heaviside function to create a step in population at a specified time. The population size starts out at n.pop1, and then suddenly changes to n.pop2 at the time in the n.gen generations defined by a.pop
#
#  Usage: 

#   Example of bottleneck (sudden drop in population size):
#   evolve5( n.pop1=5000, n.pop2=100, a.pop=0.4, n.gen=400, n.samp=100, f.init=0.2, s.adv=0.015, dom = TRUE, histogram=TRUE, plot.mean=FALSE, color=TRUE, title = "Example of bottleneck")

#  Counter-example - no bottleneck (population remains large throughout):
#   evolve5( n.pop1=5000, n.pop2=5000, a.pop=0.4, n.gen=400, n.samp=100, f.init=0.2, s.adv=0.015, dom = TRUE, histogram=TRUE, plot.mean=FALSE, color=TRUE, "Population remains large") 





# n.pop1 = starting population size
# n.pop2 = ending population size
# a.pop = normalized time (value between 0 and 1) of population change from n.pop1 to n.pop2 
# n.gen = number of generations to run
# n.samp = number of individual lines (monte carlo realizations) to calculate
# s.adv = reproductive advantage (=0.0 for just random drift)
# f.init = initial allele frequency in population
#
    k = matrix(-999, nr=n.samp, nc=n.gen-1) #matrix to store results
    sz.vector = n.pop1 + (n.pop2-n.pop1)*Heaviside(1:n.gen, as.integer(a.pop*n.gen))

    graphics.off()
#
for (i in 1:n.samp)
 {
#  Falcon & Mackay parameters
      q = f.init  # allele frequency
      p = 1-q
      s = s.adv
#
   for (j in 1:(n.gen-1) )   # n.gen generations require n.gen - 1 samples
{     
      dq = 0.5 * s * q * p                  #  No dominance
      if (dom) {dq = s * q * q * p}      #  Complete dominance
      prob= min((q + dq), 1.0)    #truncate prob from above
      prob= max(prob, 0.0)         #truncate prob from below
      sz = sz.vector[j]                    # get population size from pre-calculated size vector
      q = rbinom(1, size=sz, prob) / sz
      p = 1 - q
#      
      k[i, j] = q  #update matrix
}
}
#
k = cbind(rep(f.init, n.samp), k) #bind first gen. first column of matrix
#
#
#######
##       that’s it for sampling, code below is plotting
#
# Plot all n.samp random walks
dev.new(width=5,height=4)
#jpeg(“randomwalks.jpg”)
plot(1,1, xlim=c(1,n.gen), ylim=c(0,1), ty='n', xlab='generation', ylab='allele freq')
abline(h=0.5) #
mtext(line=0, cex=0.7, paste('pop=', n.pop1, n.pop2, 'gen=', n.gen, 'samp=', n.samp, 'f.init=', f.init, 'adv=', s.adv, 'dom=', dom ))
mtext(line=1, cex=1.0, title)
#
for (i in 1:n.samp){
ic=1
if (color) {ic = round(502 * i/n.samp, digits=0)}
  lines(1:n.gen, k[i, ], lwd=.3, col=ic)  #draw n.samp random walks as lines in color col=ic
  points(n.gen+0.1, k[i, n.gen-1], pch='-'    )  #add red rug on right
}

#
if (histogram) {
#  Histogram of final allele frequencies
dev.new(width=5,height=4)
hist (k[, n.gen], xlim=c(0,1),  main='', xlab="allele freq") 
mtext(line=0, cex=0.7, paste('Gen', n.gen, 'mean=', round(mean(k[, n.gen]),3), 'sigma=', round(sd(k[, n.gen]),3) ) )
mtext(line=1, cex=1, "Final allele frequency distribution" )
}

if (plot.mean)  {
# Plot the mean and std. dev. as function of generation
dev.new(width=5,height=4)
q.mean =  rep(-999, n.gen)
q.std =   rep(-999, n.gen)
q.sqmean = rep(-999, n.gen)
for (j in 1:(n.gen))
   {    # accumulate mean and sd of freq at each generation
          q.mean[j] = mean(k[,j])
          q.sqmean[j] = mean( (k[,j])*(k[,j]) )
          q.std[j] = sd(k[,j])
   }
plot(1,1, xlim=c(1,n.gen-1), ylim=c(0, 1 ), ty='n'  , xlab='Generation', ylab='Means of q and q^2')
lines (1:n.gen,  q.mean,  lwd=2)
lines (1:n.gen,  q.mean-q.std, lwd=0.5, lty=3)
lines (1:n.gen,  q.mean+q.std, lwd=0.5, lty=3)
lines (1:n.gen,  q.sqmean, lwd=2)
}

} ###### END 


###############################
#  Turn off all graphics windows
#  graphics.off()

#

Friday, January 4, 2013

function evolve4()

evolve4= function( n.pop1=1000, n.pop2=1000, n.gen=60, n.samp=100, f.init=0.05, s.adv=0.2, dom = FALSE, histogram=FALSE, plot.mean=TRUE, color=FALSE, title = "") 

{
#Based on code by Lex Flagel in his blog A Hopeful Monster
#
#      -allows for a linear time-varying population
#      -uses simplified forms of the Hardy-Weinberg expressions for calculating the change of allele frequency  (see Falconer and Mackay 4th ed., p.29, equations 2.7 and 2.8)
#      -case of either no dominance or complete dominance
#      -assumes: s.adv is small, or allele frequency is small.  Usually I use 0 < s.adv < 0.2, as in F&M page 29.

#  Usage: 
#     Recreate Fig 2.3 (upper panel) in F&M
#        evolve4(f.init=0.95, s.adv= -0.2, title="No dominance, neg. advantage", histogram=TRUE, color=TRUE) 
#       evolve4(f.init=0.05, s.adv=  0.2, title="No dominance, pos. advantage", histogram=TRUE, color=FALSE)  
#       evolve4(f.init=0.005, s.adv=  0.2, n.gen=150, title="No dominance, pos. advantage", histogram=TRUE, color=TRUE) 
#       Drift in 5 generations#     evolve4(n.pop1=100, n.pop2=100, n.gen=5, n.samp=1000, f.init=0.5, s.adv=0.0)
#       evolve4(f.init=0.5, s.adv=  0.1, n.pop1=200, n.pop2=200, n.gen=200, title="No dominance, pos. advantage", histogram=TRUE, color=TRUE, dom=FALSE) 
#      evolve4(f.init=0.5, s.adv=  -0.1, n.pop1=1000, n.pop2=1000, n.gen=200, title="No dominance, neg. advantage", histogram=TRUE, color=TRUE, dom=FALSE) 
#      evolve4(f.init=0.1, s.adv=  0.005, n.pop1=5000, n.pop2=5000, n.gen=200, title="", histogram=FALSE, color=TRUE, dom=FALSE) 





# n.pop1 = starting population size
# n.pop2 = ending population size
# n.gen = number of generations to run
# n.samp = number of individual lines (monte carlo realizations) to calculate
# s.adv = reproductive advantage (=0.0 for just random drift)
# f.init = initial allele frequency in population
#
    k = matrix(-999, nr=n.samp, nc=n.gen-1) #matrix to store results
#  graphics.off()
#  setwd("/Users/paulreasenberg/Desktop/evolution/plots")
#
for (i in 1:n.samp)
 {
#  Falcon & Mackay parameters
      q = f.init  # allele frequency
      p = 1-q
      s = s.adv
#
   for (j in 1:(n.gen-1) )   # n.gen generations require n.gen - 1 samples
{     
      dq = 0.5 * s * q * p                  #  No dominance
      if (dom) {dq = s * q * q * p}      #  Complete dominance
      prob= min((q + dq), 1.0)    #truncate prob from above
      prob= max(prob, 0.0)         #truncate prob from below
      sz = as.integer( 2* (n.pop1 + ( (n.pop2-n.pop1) *j / (n.gen-1) ) ) )  
      q = rbinom(1, size=sz, prob) / sz
      p = 1 - q
#      
      k[i, j] = q  #update matrix
}
}
#
k = cbind(rep(f.init, n.samp), k) #bind first gen. first column of matrix
#
#
#######
##       that’s it for sampling, code below is plotting
#
# Plot all n.samp random walks
dev.new(width=5,height=4)
#jpeg(“randomwalks.jpg”)
plot(1,1, xlim=c(1,n.gen), ylim=c(0,1), ty='n', xlab='generation', ylab='allele freq')
abline(h=0.5) #
mtext(line=0, cex=0.7, paste('pop=', n.pop1, n.pop2, 'gen=', n.gen, 'samp=', n.samp, 'f.init=', f.init, 'adv=', s.adv, 'dom=', dom ))
mtext(line=1, cex=1.0, title)
#
for (i in 1:n.samp){
ic=1
if (color) {ic = round(502 * i/n.samp, digits=0)}
  lines(1:n.gen, k[i, ], lwd=.3, col=ic)  #draw n.samp random walks as lines in color col=ic
  points(n.gen+0.1, k[i, n.gen-1], pch='-'    )  #add red rug on right
}

#
if (histogram) {
#  Histogram of final allele frequencies
dev.new(width=5,height=4)
hist (k[, n.gen], xlim=c(0,1),  main='', xlab="allele freq") 
mtext(line=0, cex=0.7, paste('Gen', n.gen, 'mean=', round(mean(k[, n.gen]),3), 'sigma=', round(sd(k[, n.gen]),3) ) )
mtext(line=1, cex=1, "Final allele frequency distribution" )
}

if (plot.mean)  {
# Plot the mean and std. dev. as function of generation
dev.new(width=5,height=4)
q.mean =  rep(-999, n.gen)
q.std =   rep(-999, n.gen)
q.sqmean = rep(-999, n.gen)
for (j in 1:(n.gen))
   {    # accumulate mean and sd of freq at each generation
          q.mean[j] = mean(k[,j])
          q.sqmean[j] = mean( (k[,j])*(k[,j]) )
          q.std[j] = sd(k[,j])
   }
plot(1,1, xlim=c(1,n.gen-1), ylim=c(0, 1 ), ty='n'  , xlab='Generation', ylab='Means of q and q^2')
lines (1:n.gen,  q.mean,  lwd=2)
lines (1:n.gen,  q.mean-q.std, lwd=0.5, lty=3)
lines (1:n.gen,  q.mean+q.std, lwd=0.5, lty=3)
lines (1:n.gen,  q.sqmean, lwd=2)
}

} ###### END 


###############################
#  Turn off all graphics windows
#  graphics.off()

#