Determining Ancestral Flow Across Populations

In my paper, “A New Model of Computational Genomics” [1], I introduced an algorithm that allows you to test whether genome A is the ancestor of genomes B and C. You can read [1] to understand how it works, but for intuition, if both B and C begin identical to A, and then evolve independently in different environments, the number of bases they have in common going forward has to be the result of chance. Therefore, the genomes A and B, and A and C, should have more bases in common than B and C (i.e., both B and C deviate away from A independently of each other).
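To make the inequality concrete, here is a minimal sketch of the three-genome test as described in the paragraph above. It is not the implementation from [1]: the function names `matches` and `is_ancestor` are hypothetical, the sequences are made-up toy strings, and it assumes the genomes are already aligned to the same length, so that bases can be compared position by position.

```python
def matches(x: str, y: str) -> int:
    """Count the positions at which two aligned, equal-length sequences agree."""
    if len(x) != len(y):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a == b for a, b in zip(x, y))


def is_ancestor(a: str, b: str, c: str) -> bool:
    """Return True if A shares more bases with B, and with C, than B shares with C.

    Assumption: both inequalities are strict, following the wording
    "more bases in common" in the text.
    """
    bc = matches(b, c)
    return matches(a, b) > bc and matches(a, c) > bc


# Toy data (made up for illustration): B and C each mutate A at different positions.
A = "ACGTACGTAC"
B = "ACGTACGTTT"  # differs from A at the last two positions
C = "TTGTACGTAC"  # differs from A at the first two positions

print(is_ancestor(A, B, C))  # A as the purported ancestor of B and C
print(is_ancestor(B, A, C))  # B as the purported ancestor of A and C
```

In this toy case A passes the test and B does not, because B and C have drifted away from A at different positions and so agree with each other less than either agrees with A.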

If we apply this test to three populations of genomes, instead of three individual genomes, we can count how many times population A satisfies the inequality above, over all combinations of genomes selected from populations A, B, and C. Every time it is satisfied, we increment a counter associated with population A. We can then treat B and C as the ancestor in turn, again counting how many times the inequality is satisfied, producing a total of three counters. This allows us to compare A, B, and C, and select the population that produced the greatest number of satisfied tests as the most likely ancestor of the other two. This is exactly what I did yesterday.

Each population is again associated with a counter that tells you how many times the inequality above is satisfied. If we divide by the total number of comparisons (i.e., the number of times we tested the inequality), we get a percentage. For example, if we assume that Norway is the ancestor of Nigeria and Ancient Egypt, it turns out that the test is satisfied 39.722% of the time. We can represent this as a graph with three vertices, one for each population, and a labelled edge running from the purported ancestor population to each of the two dependent populations. This is crudely represented below in my diagram, with an edge from Norway to both Nigeria and Ancient Egypt, labelled with the percentage of successful ancestry tests.
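The counting and normalization described above can be sketched as follows. This is only an illustration under stated assumptions, not the code from [1]: `ancestry_rate` is a hypothetical name, "all combinations" is read as one genome drawn from each population (a Cartesian product), the genomes are assumed to be aligned to equal length, and the toy populations are made up.

```python
from itertools import product


def matches(x: str, y: str) -> int:
    """Count the positions at which two aligned, equal-length sequences agree."""
    return sum(a == b for a, b in zip(x, y))


def is_ancestor(a: str, b: str, c: str) -> bool:
    """True if A shares more bases with B, and with C, than B shares with C."""
    bc = matches(b, c)
    return matches(a, b) > bc and matches(a, c) > bc


def ancestry_rate(pop_a, pop_b, pop_c) -> float:
    """Fraction of (a, b, c) triples, one genome per population, that pass the test."""
    passed = 0  # the counter associated with the purported ancestor population
    total = 0   # the number of times the inequality was tested
    for a, b, c in product(pop_a, pop_b, pop_c):
        total += 1
        passed += is_ancestor(a, b, c)
    return passed / total if total else 0.0


# Toy populations (made up for illustration only).
pop_a = ["ACGTACGTAC", "TTGTACGTAC"]
pop_b = ["ACGTACGTTT"]
pop_c = ["TTGTACGTAC"]

rate = ancestry_rate(pop_a, pop_b, pop_c)
print(f"{100 * rate:.3f}% of tests satisfied with A as the ancestor")
```

Calling `ancestry_rate` three times, rotating which population is passed first, gives the three figures that are compared to pick the most likely ancestor.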

If we run the same tests again, this time treating each of Nigeria and Ancient Egypt as the ancestor population, we will produce an additional four edges, together producing a complete di-graph on three vertices in the form below. I nixed the labels for clarity, but I did include a new label “f”, representing the net percentage, and therefore the net flow between the populations. This is easy to calculate: you just take the difference between, e.g., the edge connecting A to B and the edge connecting B to A, producing a single net figure from A to B. If the figure is positive, it means that A is more likely to be the ancestor of B, and if negative, that B is more likely to be the ancestor of A.
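As a rough sketch of the net flow calculation just described (again, not the code from [1]), the snippet below builds the six directed edges and takes the pairwise differences. The names `net_flows` and `rates` are hypothetical, and the percentages are placeholder values chosen purely for illustration; they are not results from the dataset. It assumes, following the diagram description, that both edges leaving a purported ancestor carry that ancestor's percentage.

```python
from itertools import permutations


def net_flows(edges: dict) -> dict:
    """Net flow f(u, v) = weight(u -> v) - weight(v -> u) for every directed edge."""
    return {(u, v): edges[(u, v)] - edges[(v, u)] for (u, v) in edges}


# Placeholder percentages (NOT real results): the share of ancestry tests
# satisfied when each population is treated as the ancestor of the other two.
rates = {"A": 40.0, "B": 25.0, "C": 35.0}

# Complete di-graph on three vertices: each ancestor's two outgoing edges
# are labelled with that ancestor's percentage.
edges = {(u, v): rates[u] for u, v in permutations(rates, 2)}

flows = net_flows(edges)
for (u, v), f in flows.items():
    if f > 0:
        print(f"{u} -> {v}: net flow {f:.3f}")

# A population with positive net flow to every other population is the
# best-fit root among the three.
roots = [u for u in rates if all(flows[(u, v)] > 0 for v in rates if v != u)]
print("root candidates:", roots)
```

Only the positive figures are printed, since f(B, A) is just the negative of f(A, B) and carries no extra information.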

Below is some code that calculates the net flows for three populations, but I haven’t written any graph software yet (hence the “artwork” above). Astonishingly, Norway seems to be the root population for Nigeria and Ancient Egypt. Now, this is subject to falsification, and there could be some other triplet of genomes that implies otherwise. That said, preliminary testing so far suggests that Northern Europeans, in particular Norwegians and Swedes, really might be the source population for a simply enormous number of people. And again, this could explain why South East Asians are literally white, genetically, which otherwise makes no sense, because it’s an extremely hot climate.

Interpreting the output below, we see that, in the context of the three populations, Norway is the best-fit ancestor of the Nigerians and Ancient Egyptians, and the Nigerians fit as a descendant of the Ancient Egyptians. What’s amazing about this is that you can use mtDNA to literally impute geographic, directional flow, in this case pointing mostly south, from Norway to West Africa and North East Africa. Thinking through all of this, it’s astonishing, because it’s just mtDNA, yet you can literally project it onto a map and determine geographic flows among populations.

This sounds crazy: Ancient Egyptians are African! Sure, geographically, but when you look at their mtDNA, they appear to be South East Asian, and when you look at their artwork prior to Ancient Rome, they also look South East Asian. One sensible explanation is that after the last Ice Age, about 10,000 years ago, populations that were otherwise stuck in Scandinavia started migrating. They don’t have to conquer to spread; they just have to survive with a higher probability than other populations. Given that they likely lived through Hell in a frozen tundra, I think it’s perfectly sensible to assume that, once conditions improved, they survived with a higher probability than many other populations.

This would cause them to spread without conquest, which seems to be the case, and I think what happened is that they ended up in South East Asia, migrating by foot (note that Norwegians also test as the ancestor of Mongolian, Chinese, and Indian people, consistent with this hypothesis). Then, I think a group of South East Asians came back to the Middle East and North Africa (specifically the Pre-Roman Egyptians) by boat, which kicked off the ferocious competition of the Classical World. This is also consistent with the fact that modern-day Sri Lankans and Phoenicians are literally identical on the maternal line, which makes no sense in the absence of interactions between the Classical World (or prior) and Asia. Also, there are churches in Norway that look obviously Asian, and at least one Buddha statuette has been found at a Viking-era site (Helgö, in Sweden).

I don’t know for sure, because I’m only looking at mtDNA, but common sense suggests it’s something like this, otherwise you don’t have white people in a tropical climate like Thailand. I’ve been there, and I got a horrible sunburn; it doesn’t make sense absent an explanation like this.

Here’s the code; the dataset and any missing code can be found in [1].
