Pages Menu

Saturday, December 20, 2014

Genetic drift: Monte Carlo simulations

The starting point for this blog is a simulation of genetic drift using a Monte Carlo model. The model is included in  Lex Flagel's blog, which I recently stumbled upon. In the blog, Dr. Flegel has posted on many topics including genetics, math and statistics, art and nature, topics I'm quite interested in. In the genetic drift section, Professor Flegel kindly included some of his R-code. First, I took his code verbatum and for convenience repackaged it as an R function. Then I made a couple of additions to the model:  I allowed the population size to vary over the course of the model; and I allowed for non-zero selective advantage in each generation. Both of these modifications are described below. But first, I wanted to make sure that the unmodified code worked as expected for plain vanilla genetic drift.

Function evolve4() was used to reproduce Flagel's model, in which the breeding population (of 100 individuals) remains constant and there is no selective advantage. The result below looks a lot like Flagel's figure labeled "1000 Random Walks".

           evolve4(n.pop1=100, n.pop2=100, n.gen=5, n.samp=1000, f.init=0.5, s.adv=0.0)

Figure 1. In this simulation, population size = 100 (200 alleles), initial allele frequency = 0.5, neutral survival advantage. 1000 random walks, 5 generations

Model assumptions. Mating is random in each generation, and there is no generation overlap in mating. The population size is forced to remain at 100 individuals.  Every individual mates in each generation.

Varying population size. The size of the mating population in evolve4() can vary linearly from a specified starting size to a specified ending size. But, in fact, so far I've been keeping the population constant. It should be interesting to play around with varying of the population size. While the current code uses a linear function, an arbitrary population history could be imposed. For example, a bottleneck. More on this soon.

Selective Advantage



Sunday, June 1, 2014

Adding selection to Monte Carlo drift model

Function evolve4() can model a population under the pressure of a non-neutral selective advantage. Without getting into the survival details or even the phenotypes involved, we again just track the frequency of one allele in a pair of alleles carried in each individual in the population. I've done this by adding Hardy-Weinberg expressions for population change to the realized frequency in the n'th generation and using this sum as the input probability in the binomial function that calculates the (n+1)'th generation.

I used simplified forms of the Hardy-Weinberg expressions for calculating the change of allele frequency (which I took from Falconer and Mackay 4th ed., p.29):


Figure 2 shows the evolution of an allele in a population of 100 individuals. The initial allele frequency is 5%, on average, at the start (10 of the 200 alleles). The allele has a survival advantage of 0.2. Sixty generations are shown. Arbitrarily we specify that this allele is not dominant. This model corresponds to the one used in an example in Falconer and Mackay, p. 29, and Fig. 2.3.
Figure 2.



Figure 3. Mean value of allele frequency (q) +/- 1 s.d. in the Monte Carlo model above.
 Lower solid curve is mean of q-squared.

Here's the same model over 100 generations:
Figure 4. Same model as in Fig. 2, over 100 generations.


The approximations of the Hardy-Weinberg relations in Falconer and Makay, Equations 2.7 and 2.8 are the differential form of the logistic equation, a standard model for population growth with finite carrying capacity. 

Figure 5. Initial allele frequency here is 0.005, corresponding to 10 alleles randomly distributed in the population of 1000 individuals. Each sample approximates a logistic function. 


Below is an example of negative survival advantage, with 1000 individuals, a starting allele frequency of 0.5,  and survival advantage -0.1.