A model of polygenic adaptation in an infinite population
How do allele frequencies change in response to selection? Answers to that question include “it depends”, “we don’t know”, “sometimes a lot, sometimes a little”, and “according to a nonlinear differential equation that actually doesn’t look too horrendous if you squint a little”. Let’s look at a model of the polygenic adaptation of an infinitely large population under stabilising selection after a shift in optimum. This model has been developed by different researchers over the years (reviewed in Jain & Stephan 2017).
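Here is the big equation for allele frequency change at one locus:
$$\dot{p}_i = -s \gamma_i p_i q_i (c_1 - z') - \frac{s \gamma_i^2}{2} p_i q_i (q_i - p_i) + \mu (q_i - p_i)$$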
That wasn’t so bad, was it? These are the symbols:
- the subscript $i$ indexes the loci,
- $\dot{p}_i$ is the change in allele frequency per time,
- $\gamma_i$ is the effect of the locus on the trait (twice the effect of the positive allele to be precise),
- $p_i$ is the frequency of the positive allele,
- $q_i = 1 - p_i$ is the frequency of the negative allele,
- $s$ is the strength of selection,
- $c_1 = \sum_i \gamma_i (2 p_i - 1)$ is the phenotypic mean of the population; it just depends on the effects and allele frequencies,
- $z'$ is the new optimum,
- $\mu$ is the mutation rate.
This breaks down into three terms that we will look at in order.
The directional selection term
$-s \gamma_i p_i q_i (c_1 - z')$ is the term that describes change due to directional selection.
Apart from the allele frequencies, it depends on the strength of directional selection $s$, the effect of the locus on the trait $\gamma_i$, and how far away the population is from the new optimum, $(c_1 - z')$. Stronger selection, larger effect or greater distance to the optimum means more allele frequency change.
It is negative because it describes the change in the allele with a positive effect on the trait, so if the mean phenotype is above the optimum, we would expect the allele frequency to decrease, and indeed: when $c_1 > z'$, this term becomes negative.
If you neglect the other two terms and keep this one, you get Jain & Stephan's “directional selection model”, which describes behaviour of allele frequencies in the early phase before the population has gotten close to the new optimum. This approximation does much of the heavy lifting in their analysis.
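To see the sign behaviour described above, here is a minimal sketch of the directional selection term on its own. The helper name `directional_term` and the example values are made up for illustration; they are not from Jain & Stephan or from the code on GitHub.

```r
# Directional selection term only (hypothetical helper, for illustration)
directional_term <- function(s, gamma, p, z_prime) {
  pheno_mean <- sum(gamma * (2 * p - 1))
  -s * gamma * p * (1 - p) * (pheno_mean - z_prime)
}

# Two loci at frequency 0.5, so the mean phenotype is 0
p <- c(0.5, 0.5)
gamma <- c(1, 0.1)

# Optimum above the mean: positive alleles should increase
below <- directional_term(s = 1, gamma = gamma, p = p, z_prime = 0.5)

# Optimum below the mean: positive alleles should decrease
above <- directional_term(s = 1, gamma = gamma, p = p, z_prime = -0.5)

print(rbind(below, above))
```

The locus with the larger effect changes faster in both cases, which is the "larger effect means more change" part of the term.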
The stabilising selection term
$-\frac{s \gamma_i^2}{2} p_i q_i (q_i - p_i)$ is the term that describes change due to stabilising selection. Apart from allele frequencies, it depends on the square of the effect of the locus on the trait. That means that, regardless of the sign of the effect, it penalises large changes. This appears to make sense, because stabilising selection strives to preserve traits at the optimum. The cubic influence of allele frequency is, frankly, not intuitive to me.
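One way to get a feel for the cubic dependence is to evaluate the term over a few allele frequencies. This is only a sketch; `stabilising_term` is a hypothetical helper name, and the strength of selection and effect size are arbitrary.

```r
# Stabilising selection term only (hypothetical helper, for illustration)
stabilising_term <- function(s, gamma, p) {
  -s * gamma^2 * 0.5 * p * (1 - p) * (1 - 2 * p)
}

p <- c(0.1, 0.25, 0.5, 0.75, 0.9)
change <- stabilising_term(s = 1, gamma = 1, p = p)

# Same values with a negative effect, because gamma enters squared
change_negative_effect <- stabilising_term(s = 1, gamma = -1, p = p)

print(data.frame(p = p, change = change))
```

With these values, the term is negative below a frequency of 0.5, zero at 0.5 and positive above it, so taken by itself it moves frequencies away from 0.5.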
The mutation term
Finally, $\mu (q_i - p_i)$ is the term that describes change due to new mutations. It depends on the allele frequencies, i.e. how many of the alleles there are around that can mutate into the other allele, and the mutation rate. To me, this is the one term one could sit down and write down, without much head-scratching.
Walking in allele frequency space
Jain & Stephan (2017) show a couple of examples of allele frequency change after the optimum shift. Let us try to draw similar figures. (Jain & Stephan don’t give the exact parameters for their figures, they just show one case with effects below their threshold value and one with effects above.)
First, here is the above equation in R code:
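    ## Phenotypic mean from effects and allele frequencies
    pheno_mean <- function(p, gamma) {
      sum(gamma * (2 * p - 1))
    }

    ## Allele frequency change: directional selection term,
    ## stabilising selection term and mutation term, in that order
    allele_frequency_change <- function(s, gamma, p, z_prime, mu) {
      -s * gamma * p * (1 - p) * (pheno_mean(p, gamma) - z_prime) +
        - s * gamma^2 * 0.5 * p * (1 - p) * (1 - p - p) +
        mu * (1 - p - p)
    }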
With this (and some extra packaging; code on Github), we can now plot allele frequency trajectories such as this one, which starts at some arbitrary point and approaches an optimum:
Animation of alleles at two loci approaching an equilibrium. Here, we have two loci with starting frequencies 0.2 and 0.1 and effect sizes 1 and 0.01, and the optimum is at 0. The mutation rate is $10^{-4}$ and the strength of selection is 1. Animation made with gganimate.
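The "extra packaging" is not shown here, so the following is only a guess at what a minimal version could look like: it repeatedly adds the allele frequency change to the current frequencies, one unit of time per step. The helper `simulate_trajectory` and the number of steps are assumptions; the other parameters are the ones from the animation caption.

```r
pheno_mean <- function(p, gamma) {
  sum(gamma * (2 * p - 1))
}

allele_frequency_change <- function(s, gamma, p, z_prime, mu) {
  -s * gamma * p * (1 - p) * (pheno_mean(p, gamma) - z_prime) -
    s * gamma^2 * 0.5 * p * (1 - p) * (1 - 2 * p) +
    mu * (1 - 2 * p)
}

# Hypothetical helper: step the equation forward with a fixed step size.
# Returns a matrix with one row per time point and one column per locus.
simulate_trajectory <- function(p0, s, gamma, z_prime, mu, n_steps, dt = 1) {
  p <- matrix(NA_real_, nrow = n_steps + 1, ncol = length(p0))
  p[1, ] <- p0
  for (t in seq_len(n_steps)) {
    change <- allele_frequency_change(s, gamma, p[t, ], z_prime, mu)
    p[t + 1, ] <- p[t, ] + dt * change
  }
  p
}

gamma <- c(1, 0.01)
trajectory <- simulate_trajectory(p0 = c(0.2, 0.1), s = 1, gamma = gamma,
                                  z_prime = 0, mu = 1e-4, n_steps = 200)

# Frequencies of both loci at the start and at the end
print(trajectory[c(1, nrow(trajectory)), ])
```

A fixed step size is the crudest possible choice; a proper ODE solver would be the more careful option if the trajectories need to be accurate rather than just illustrative.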
Resting in allele frequency space
The model describes a shift from one optimum to another, so we want to start at equilibrium. Therefore, we need to know what the allele frequencies are at equilibrium, so we solve for 0 allele frequency change in the above equation. The first term will be zero, because
$$c_1 - z' = 0$$
when the mean phenotype is at the optimum. So, we can throw away that term, and factor the rest of the equation into:
$$(q_i - p_i) \left( \mu - \frac{s \gamma_i^2}{2} p_i q_i \right) = 0$$
Therefore, one root is $p_i = 1/2$. Depending on your constitution, this may or may not be intuitive to you. Imagine that you have all the loci, each with a positive and negative allele with the same effect, balanced so that half the population has one and the other half has the other. Then, there is this quadratic equation that gives two other equilibria:
$$\mu - \frac{s \gamma_i^2}{2} p_i (1 - p_i) = 0 \quad \Rightarrow \quad p_i = \frac{1}{2} \left( 1 \pm \sqrt{1 - \frac{8 \mu}{s \gamma_i^2}} \right)$$
These points correspond to mutation–selection balance with one or the other allele closer to being lost. Jain & Stephan (2017) show a figure of the three equilibria that looks like a semicircle (from the quadratic equation, presumably) attached to a horizontal line at 0.5 (their Figure 1). Given this information, we can start our loci out at equilibrium frequencies. Before we set them off, we need to attend to the effect size.
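As a check on the algebra, here is a small sketch that computes the equilibria for one locus and plugs them back into the stabilising selection and mutation terms. The helper names and the example parameters are hypothetical. When the expression under the square root is negative, the function returns only the root at 0.5.

```r
# Hypothetical helper: equilibrium frequencies for one locus,
# assuming the mean phenotype is at the optimum
equilibria <- function(s, gamma, mu) {
  inside_root <- 1 - 8 * mu / (s * gamma^2)
  if (inside_root < 0) {
    return(0.5)  # the quadratic has no real roots
  }
  c(0.5 * (1 - sqrt(inside_root)), 0.5, 0.5 * (1 + sqrt(inside_root)))
}

# Allele frequency change when the directional term is zero
change_at_optimum <- function(s, gamma, p, mu) {
  -s * gamma^2 * 0.5 * p * (1 - p) * (1 - 2 * p) + mu * (1 - 2 * p)
}

p_eq <- equilibria(s = 1, gamma = 0.1, mu = 1e-4)
residual <- change_at_optimum(s = 1, gamma = 0.1, p = p_eq, mu = 1e-4)

print(p_eq)
print(residual)
```

For these parameters, the two outer equilibria sit close to 0 and 1, and the change at all three points is zero up to floating-point error.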
How big is a big effect? How long is a piece of string?
In this model, there are big and small effects with qualitatively different behaviours. The cutoff is at:
$$\hat{\gamma} = \sqrt{\frac{8 \mu}{s}}$$
If we look again at the roots to the quadratic equation above, they can only exist as real roots if
$$\frac{8 \mu}{s \gamma_i^2} \leq 1$$
because otherwise the expression inside the square root will be negative. This inequality can be rearranged into:
$$\gamma_i^2 \geq \frac{8 \mu}{s}$$