Just to make sure I don't forget how to do this, here's a test page. The following are screen shots of Google Maps images of Saint Malo, Fr., showing where the book "All the Things We Cannot See" by Anthony Doerr.
A Beginner's Notebook
Monday, June 8, 2015
Friday, April 10, 2015
Saturday, December 20, 2014
Genetic drift: Monte Carlo simulations
The starting point for this blog is a simulation of genetic drift using a Monte Carlo model. The model is included in Lex Flagel's blog, which I recently stumbled upon. In the blog, Dr. Flegel has posted on many topics including genetics, math and statistics, art and nature, topics I'm quite interested in. In the genetic drift section, Professor Flegel kindly included some of his R-code. First, I took his code verbatum and for convenience repackaged it as an R function. Then I made a couple of additions to the model: I allowed the population size to vary over the course of the model; and I allowed for non-zero selective advantage in each generation. Both of these modifications are described below. But first, I wanted to make sure that the unmodified code worked as expected for plain vanilla genetic drift.
Function evolve4() was used to reproduce Flagel's model, in which the breeding population (of 100 individuals) remains constant and there is no selective advantage. The result below looks a lot like Flagel's figure labeled "1000 Random Walks".
evolve4(n.pop1=100, n.pop2=100, n.gen=5, n.samp=1000, f.init=0.5, s.adv=0.0)
Model assumptions. Mating is random in each generation, and there is no generation overlap in mating. The population size is forced to remain at 100 individuals. Every individual mates in each generation.
Varying population size. The size of the mating population in evolve4() can vary linearly from a specified starting size to a specified ending size. But, in fact, so far I've been keeping the population constant. It should be interesting to play around with varying of the population size. While the current code uses a linear function, an arbitrary population history could be imposed. For example, a bottleneck. More on this soon.
Selective Advantage.
Function evolve4() was used to reproduce Flagel's model, in which the breeding population (of 100 individuals) remains constant and there is no selective advantage. The result below looks a lot like Flagel's figure labeled "1000 Random Walks".
evolve4(n.pop1=100, n.pop2=100, n.gen=5, n.samp=1000, f.init=0.5, s.adv=0.0)
Figure 1. In this simulation, population size = 100 (200 alleles), initial allele frequency = 0.5, neutral survival advantage. 1000 random walks, 5 generations
Model assumptions. Mating is random in each generation, and there is no generation overlap in mating. The population size is forced to remain at 100 individuals. Every individual mates in each generation.
Varying population size. The size of the mating population in evolve4() can vary linearly from a specified starting size to a specified ending size. But, in fact, so far I've been keeping the population constant. It should be interesting to play around with varying of the population size. While the current code uses a linear function, an arbitrary population history could be imposed. For example, a bottleneck. More on this soon.
Selective Advantage.
Labels:
genetic drift
Sunday, June 1, 2014
Adding selection to Monte Carlo drift model
Function evolve4() can model a population under the pressure of a non-neutral selective advantage. Without getting into the survival details or even the phenotypes involved, we again just track the frequency of one allele in a pair of alleles carried in each individual in the population. I've done this by adding Hardy-Weinberg expressions for population change to the realized frequency in the n'th generation and using this sum as the input probability in the binomial function that calculates the (n+1)'th generation.
I used simplified forms of the Hardy-Weinberg expressions for calculating the change of allele frequency (which I took from Falconer and Mackay 4th ed., p.29):
Figure 2 shows the evolution of an allele in a population of 100 individuals. The initial allele frequency is 5%, on average, at the start (10 of the 200 alleles). The allele has a survival advantage of 0.2. Sixty generations are shown. Arbitrarily we specify that this allele is not dominant. This model corresponds to the one used in an example in Falconer and Mackay, p. 29, and Fig. 2.3.
I used simplified forms of the Hardy-Weinberg expressions for calculating the change of allele frequency (which I took from Falconer and Mackay 4th ed., p.29):
Figure 2 shows the evolution of an allele in a population of 100 individuals. The initial allele frequency is 5%, on average, at the start (10 of the 200 alleles). The allele has a survival advantage of 0.2. Sixty generations are shown. Arbitrarily we specify that this allele is not dominant. This model corresponds to the one used in an example in Falconer and Mackay, p. 29, and Fig. 2.3.
Figure 2.
Figure 3. Mean value of allele frequency (q) +/- 1 s.d. in the Monte Carlo model above.
Lower solid curve is mean of q-squared.
Here's the same model over 100 generations:
Figure 4. Same model as in Fig. 2, over 100 generations.
The approximations of the Hardy-Weinberg relations in Falconer and Makay, Equations 2.7 and 2.8 are the differential form of the logistic equation, a standard model for population growth with finite carrying capacity.
Figure 5. Initial allele frequency here is 0.005, corresponding to 10 alleles randomly distributed in the population of 1000 individuals. Each sample approximates a logistic function.
Labels:
selection
Friday, February 1, 2013
Varying the population size
Here's a nice demonstration of a population bottleneck. A new version of the function, named evolve5(), was written in which the population size starts off as n.pop1, and then suddenly changes to n.pop2 at a specified time.
Here is the evolution of an allele in which the starting frequency is 0.2 and the selective advantage is 0.01. The population remains constant at 5000 throughout the 400 generations. The final allele frequency distribution is wide, with a broad peak centered around 0.4.
Here is the evolution of an allele in which the starting frequency is 0.2 and the selective advantage is 0.01. The population remains constant at 5000 throughout the 400 generations. The final allele frequency distribution is wide, with a broad peak centered around 0.4.
Below is the corresponding case when the population hits a "bottleneck". Population size is suddenly reduced from 5000 to 100 at generation 160. (Think dinosaurs.) Everything else is the same as in the model above. The variance increases at this time and by generation 400 most of the realizations result in loss of the allele in the population. However, in almost all the cases in which the allele is not lost, it becomes fixed in the population. The 'control' model (large, unchanged population, above) would eventually (~1000 generations or more) end up with the allele becoming fixed in the population, but the bottleneck vastly speeded up the evolution.
Finally, I looked at a case in which an allele has negative selection advantage, which would eventually die out in a population with a continuously large population. The variance caused by the bottleneck at generation 160, when the population drops from 5000 to 100, results in a significant number of realizations (about 25%) in which the allele becomes fixed in the population, quite the opposite of its fate had there not been a bottleneck. How cool is that!
Labels:
population size
Sunday, January 20, 2013
Preditor-prey modeling
This could be fun. I wonder if the Monte Carlo code could be modified to make a preditor-prey model. Two populations, each subject to survival advantage forces. But here the survival advantage for one population/species is not fixed, it is a function of it's own population, the population of the other (preditor) species, it's own available food supply, and other environmental forces.
The basic equations are the Lotka–Volterra equations.
Random variable for each population, reproducing as in function evolve(4), with same assumptions about random mating, etc. But also with preditor rate term in the binomial step.
Of course this is a very well-studied problem, a standard student exercise. I've come across it in a variety of places, including numerous lovely interactive graphic simulations on the web. There are even Monte Carlo solutions online, so I'm certainly talking about a well-known problem. Still, as an exercise, it will be fun and instructive to do it myself!!
The basic equations are the Lotka–Volterra equations.
Random variable for each population, reproducing as in function evolve(4), with same assumptions about random mating, etc. But also with preditor rate term in the binomial step.
Of course this is a very well-studied problem, a standard student exercise. I've come across it in a variety of places, including numerous lovely interactive graphic simulations on the web. There are even Monte Carlo solutions online, so I'm certainly talking about a well-known problem. Still, as an exercise, it will be fun and instructive to do it myself!!
Sunday, January 6, 2013
function mutate1()
mutate1= function( n.gen=50000, n.variables=2, A.init=0.00, mu.init=0.0001, nu.init = 0.00001, title = "Mutation: equilibrating allele frequencies")
{
# n.samp = 1 means there is no sample population, we're computing the mean.
# mu = mutation rate per generation A to a.
# nu = mutation rate per generation a to A.
# p = allele frequency of A(for all individuals and for all generations)
# Parameters for environment
mu = mu.init # mutation rate per generation A to a.
nu = nu.init # mutation rate per generation a to A.
# Initialize matrix
k = matrix(-999, nr=n.variables, nc=n.gen) #matrix to store values of p
for (i in 1:n.variables)
{
k[1,1] = A.init # initial freq of A
k[2,1] = 1- A.init # initial freq of a
for (j in 2:(n.gen) ) # n.gen generations
{
k[1, j] = (1-mu) * k[1, j-1] + nu * (1 - k[1, j-1]) # freq of A
k[2, j] = (1-mu) * k[2, j-1] + nu * (1 - k[2, j-1]) # freq of a
}
}
#######
#
# Plot allele frequency vs. generation
graphics.off()
plot(1,1, xlim=c(1,n.gen), ylim=c(0,1), ty='n', xlab='Generation', ylab='Allele Frequency' )
mtext(line=0, cex=0.7, paste('gen=', n.gen, 'A.init=', A.init, 'a.init=', 1-A.init, 'mu.init=', mu.init, 'nu.init=', nu.init ))
mtext(line=1, cex=1.0, title)
#
lines(1:n.gen, k[1, ], lwd=2, col=1, lty=1) # allele A, black
lines(1:n.gen, k[2, ], lwd=2, col=4, lty=2) # allele a, blue
legend(45000, 0.9, c("A","a"), cex=0.8,
col=c(1,4), lty=c(1,1), lwd=2)
} ###### END
{
# n.samp = 1 means there is no sample population, we're computing the mean.
# mu = mutation rate per generation A to a.
# nu = mutation rate per generation a to A.
# p = allele frequency of A(for all individuals and for all generations)
# Parameters for environment
mu = mu.init # mutation rate per generation A to a.
nu = nu.init # mutation rate per generation a to A.
# Initialize matrix
k = matrix(-999, nr=n.variables, nc=n.gen) #matrix to store values of p
for (i in 1:n.variables)
{
k[1,1] = A.init # initial freq of A
k[2,1] = 1- A.init # initial freq of a
for (j in 2:(n.gen) ) # n.gen generations
{
k[1, j] = (1-mu) * k[1, j-1] + nu * (1 - k[1, j-1]) # freq of A
k[2, j] = (1-mu) * k[2, j-1] + nu * (1 - k[2, j-1]) # freq of a
}
}
#######
#
# Plot allele frequency vs. generation
graphics.off()
plot(1,1, xlim=c(1,n.gen), ylim=c(0,1), ty='n', xlab='Generation', ylab='Allele Frequency' )
mtext(line=0, cex=0.7, paste('gen=', n.gen, 'A.init=', A.init, 'a.init=', 1-A.init, 'mu.init=', mu.init, 'nu.init=', nu.init ))
mtext(line=1, cex=1.0, title)
#
lines(1:n.gen, k[1, ], lwd=2, col=1, lty=1) # allele A, black
lines(1:n.gen, k[2, ], lwd=2, col=4, lty=2) # allele a, blue
legend(45000, 0.9, c("A","a"), cex=0.8,
col=c(1,4), lty=c(1,1), lwd=2)
} ###### END
Labels:
allele frequency,
mutation,
mutation rate,
R
Subscribe to:
Comments (Atom)

















