[This article was first published on Gregor Gorjanc (gg), and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
While reading UseR conference abstracts I came across this sentence: “Sugarcane is polyploid, i.e., has 8 to 14 copies of every chromosome, with individual alleles in varying numbers.” Wow! This generates a really complex genotype system. Say we have a biallelic gene with alleles A and B. In diploids the possible genotypes are AA, AB, and BB. Given the above sentence, the possible genotypes in sugarcane are any permutation of A’s and B’s in a series of 8 to 14 alleles. I am not sure if 9, 11, and 13 are also allowed, that is, having an odd number of chromosome copies. In any case, such permutations result in really large numbers!
Thinking about this a bit further, it appears that the whole system is not that complex once we realize that genotyping does not tell us about the order of alleles (we cannot distinguish between AB and BA). This simplifies the problem from all possible permutations to all possible combinations, e.g., for a biallelic gene in tetraploids there are 5 combinations instead of 16 permutations.
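The two counts quoted for tetraploids follow from standard counting formulas: with n alleles and ploidy r there are choose(n + r - 1, r) unordered genotypes (combinations with repetition) and n^r ordered ones. Here is a minimal base R sketch of that arithmetic. The helper names `nCombinations` and `nPermutations` are made up for illustration and are not part of any package.

```r
## Number of unordered genotypes (combinations with repetition)
## nCombinations is an illustrative helper, not a library function
nCombinations <- function(nAlleles, ploidy) {
  choose(nAlleles + ploidy - 1, ploidy)
}

## Number of ordered genotypes (permutations with repetition)
## nPermutations is likewise an illustrative helper
nPermutations <- function(nAlleles, ploidy) {
  nAlleles^ploidy
}

## Biallelic gene in tetraploids
nCombinations(nAlleles=2, ploidy=4)
## [1] 5
nPermutations(nAlleles=2, ploidy=4)
## [1] 16
```

For a biallelic gene the first formula reduces to ploidy + 1, so the number of distinguishable genotypes grows only linearly with ploidy, while the number of orderings doubles with every added chromosome copy.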
Below is an R snippet that shows how to enumerate all possible combinations or permutations:
```r
## Load package having nice combinatorial functions
library(package="gtools")

## Specify alleles - just two for simplicity
alleles <- c("A", "B")

## Possible genotypes for diploids
combinations(n=length(alleles), r=2, v=alleles, repeats.allowed=TRUE)
##      [,1] [,2]
## [1,] "A"  "A"
## [2,] "A"  "B"
## [3,] "B"  "B"

## Possible genotypes for tetraploids
combinations(n=length(alleles), r=4, v=alleles, repeats.allowed=TRUE)
##      [,1] [,2] [,3] [,4]
## [1,] "A"  "A"  "A"  "A"
## [2,] "A"  "A"  "A"  "B"
## [3,] "A"  "A"  "B"  "B"
## [4,] "A"  "B"  "B"  "B"
## [5,] "B"  "B"  "B"  "B"

permutations(n=length(alleles), r=4, v=alleles, repeats.allowed=TRUE)
##       [,1] [,2] [,3] [,4]
##  [1,] "A"  "A"  "A"  "A"
##  [2,] "A"  "A"  "A"  "B"
##  [3,] "A"  "A"  "B"  "A"
##  [4,] "A"  "A"  "B"  "B"
##  [5,] "A"  "B"  "A"  "A"
##  [6,] "A"  "B"  "A"  "B"
##  [7,] "A"  "B"  "B"  "A"
##  [8,] "A"  "B"  "B"  "B"
##  [9,] "B"  "A"  "A"  "A"
## [10,] "B"  "A"  "A"  "B"
## [11,] "B"  "A"  "B"  "A"
## [12,] "B"  "A"  "B"  "B"
## [13,] "B"  "B"  "A"  "A"
## [14,] "B"  "B"  "A"  "B"
## [15,] "B"  "B"  "B"  "A"
## [16,] "B"  "B"  "B"  "B"

## Possible genotypes for 8-14 ploids
spectrum <- seq(from=8, to=14, by=2)
nS <- length(spectrum)
retC <- vector(mode="list", length=nS)
retP <- vector(mode="list", length=nS)
for(i in 1:nS) {
  retC[[i]] <- combinations(n=length(alleles), r=spectrum[i], v=alleles, repeats.allowed=TRUE)
  retP[[i]] <- permutations(n=length(alleles), r=spectrum[i], v=alleles, repeats.allowed=TRUE)
}
combC <- sapply(retC, nrow)
combP <- sapply(retP, nrow)
cbind(spectrum, combC, combP)
##      spectrum combC combP
## [1,]        8     9   256
## [2,]       10    11  1024
## [3,]       12    13  4096
## [4,]       14    15 16384
```
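Since only the allele counts matter, each unordered genotype of a biallelic gene can be summarised by its dosage, i.e., the number of B alleles (0 to ploidy). The number of orderings that collapse onto one such genotype is then a binomial coefficient. The sketch below uses base R only and assumes the same even ploidies of 8 to 14 as above. The names `ploidy`, `dosage`, `nOrderings`, and `counts` are illustrative. It should give the same counts as the enumeration, without building the large permutation matrices.

```r
## Tetraploid example: one row per unordered genotype
ploidy <- 4
dosage <- 0:ploidy                    # number of B alleles in the genotype
nOrderings <- choose(ploidy, dosage)  # orderings collapsing onto each genotype
data.frame(dosage, nOrderings)

## The orderings add up to all permutations (2^ploidy for two alleles)
sum(nOrderings) == 2^ploidy

## Counts for 8-14 ploids without enumerating the genotypes
spectrum <- seq(from=8, to=14, by=2)
counts <- cbind(spectrum, combC=spectrum + 1, combP=2^spectrum)
counts
```

For tetraploids this shows, for example, that the genotype with two A’s and two B’s stands for 6 of the 16 orderings, while AAAA and BBBB each stand for just one.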