Saturday, July 18, 2015
One of the toughest nuts to crack in population genetics has proved to be the story of the people of the Hindu Kush. However, using TreeMix and ancient genomes from the recent Allentoft et al. and Haak et al. papers, I'm seeing most of the Kalash and Pathan individuals from the HGDP modeled as ~65% Late Neolithic/Early Bronze Age (LN/EBA) European and ~35% Central Asian. This, to me at least, makes a lot of sense. For instance:
The Kalash and Pathan samples that can't be modeled in this way, at least with the reference populations that I'm using, are fitted within a framework that closely resembles the old two-way Ancestral South Indian/Ancestral North Indian model (ASI/ANI). They usually score ~12% admixture from the branch leading to the Dai of southern China, which is obviously the proxy for ASI.
Both of these models are correct; they just show the same thing in different ways. So if we mesh them together, the Kalash and Pathans come out ~65% LN/EBA European (which includes substantial Caucasus or Caucasus-related ancestry), ~12% ASI, and ~23% something as yet undefined.
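The arithmetic behind that combined estimate can be sketched in a few lines of Python. This is only an illustration of how the rounded figures fit together, not part of the TreeMix analysis. The function name `mesh_models` and the dictionary labels are hypothetical, and the sketch assumes that the ~12% ASI signal from the second model sits inside the ~35% that the first model labels Central Asian.

```python
def mesh_models(ln_eba_pct, asi_pct):
    """Combine the two rounded estimates into one three-way breakdown.

    Assumption (not stated by the analysis itself): whatever is neither
    LN/EBA European nor ASI is treated as a single unassigned remainder.
    """
    remainder = 100.0 - ln_eba_pct - asi_pct
    if remainder < 0:
        raise ValueError("the two estimates add up to more than 100%")
    return {
        "LN/EBA European": ln_eba_pct,
        "ASI": asi_pct,
        "undefined": remainder,
    }


# Rounded figures quoted in the post: ~65% LN/EBA European, ~12% ASI.
combined = mesh_models(65.0, 12.0)
for label, pct in combined.items():
    print(f"{label}: ~{pct:.0f}%")
```

With these inputs the remainder works out to 23, which is the same as the first model's ~35% Central Asian share minus the ~12% ASI share.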
If I had to guess, I'd say the mystery ~23% was Neolithic admixture from what is now Iran. But ancient DNA has thrown plenty of curveballs at us already, so that's a low-confidence prediction, even though it does make good sense.
It's also interesting to see the migration edges running from the Ulchi of east Siberia to the LN/EBA Europeans. This might be a signal of minor Eastern non-African (ENA), in other words East Eurasian, admixture. Then again, it might just be the algorithm trying to compensate for something, like excess Eastern Hunter-Gatherer (EHG) ancestry.
The full output from my analysis can be downloaded here. The reference samples and markers are listed here and here.
The Poltavka outlier
The real thing
The enigma of the Kalash