Thursday, July 31, 2014

Turks probably came from south Siberia


The good people from the Estonian Biocentre have just put out a preprint at bioRxiv focusing on the genetic origins of Turkic-speaking nomads. It's a solid effort based on a wide range of samples and several standard analyses, including a massive fastIBD run. The authors' conclusions are very sensible and probably correct:

Most of the Turkic peoples studied, except those in Central Asia, genetically resembled their geographic neighbors, in agreement with the elite dominance model of language expansion. However, western Turkic peoples sampled across West Eurasia shared an excess of long chromosomal tracts that are identical by descent (IBD) with populations from present-day South Siberia and Mongolia (SSM), an area where historians center a series of early Turkic and non-Turkic steppe polities. The observed excess of long chromosomal tracts IBD (> 1cM) between populations from SSM and Turkic peoples across West Eurasia was statistically significant. Finally, we used the ALDER method and inferred admixture dates (~9th–17th centuries) that overlap with the Turkic migrations of the 5th–16th centuries. Thus, our results indicate historical admixture among Turkic peoples, and the recent shared ancestry with modern populations in SSM supports one of the hypothesized homelands for their nomadic Turkic and related Mongolic ancestors.


However, even though the paper includes a lot of detail, I still find it somewhat underwhelming. The blame lies with Lazaridis et al. 2013/2014, which really raised the bar for papers of this sort, using several ancient genomes and very sophisticated techniques to try and unravel the deep ancestry of Europeans (see here and here). It's probably unreasonable of me to expect most population genetics papers to be so thorough, but it's still disappointing when they're not.

Also, thanks to Lazaridis et al. as well as a few other recent ancient DNA studies, we now know that the prehistory of Eurasia was probably more complex than anyone had imagined only a few years ago. Once upon a time it was OK to blame any sort of seemingly eastern genetic signals on Genghis Khan or Attila the Hun. These days you'd look like a bit of an idiot trying that sort of thing.

So yes, in this case the authors probably got it right, and they probably did pick up signals of Turkic migrations from south Siberia and surrounds. But let's wait and see what a good number of ancient genomes reveal about the origins, direction and time frames of population movements across the Eurasian steppe and Taiga belt.

Citation...

Bayazit Yunusbayev, Mait Metspalu, Ene Metspalu, et al., The Genetic Legacy of the Expansion of Turkic-Speaking Nomads Across Eurasia, bioRxiv posted online July 30, 2014

Monday, July 28, 2014

Shared drift between seven ancient genomes and 163 present-day populations


I've just figured out a more effective way of running lots of f3-statistics, using the 3-Population (qp3Pop) Test offered as part of the Admixtools package. I'll be updating this post as new ancient genomes are published, but here's what I've got so far:

Ajvide58 (Late Neolithic Gotland hunter-gatherer)

Ajvide52 (Late Neolithic Gotland hunter-gatherer)

StoraFörvar11 (Mesolithic Gotland hunter-gatherer)

La Brana-1 (Mesolithic Iberian hunter-gatherer)

Gokhem2 (Late Neolithic Swedish farmer)

MA-1 (Upper Paleolithic Siberian hunter-gatherer)

AG-2 (Upper Paleolithic Siberian hunter-gatherer)

I'll also throw in the results for Karitiana Indians and Dai from southern China: see here and here. These should prove useful to anyone wanting to analyze the MA-1 output in more detail.

Below are a couple of graphs based on the shared drift stats for MA-1, Gokhem2 and La Brana-1. The datasheets with the full keys can be downloaded here and here. You can open them with any text editor, but they're best viewed with Past3, which is freely available here.


See also...

f3-stats: 100 present-day populations plus MA-1

The Gokhem2 factor

Saturday, July 26, 2014

The Gokhem2 factor


Gokhem2 is a late Neolithic genome from Sweden published by Skoglund et al. earlier this year. It's a very important sample because it probably represents the typical Western and Central European of its time; mostly of ancient Near Eastern origin but with substantial (perhaps as much as 25%) indigenous Western European Hunter-Gatherer (WHG) ancestry. Moreover, in all likelihood it belonged to one of the last people alive just before much of Europe experienced large-scale shifts in material culture and DNA during the early metal ages.

I ran an f3 analysis of 65 present-day populations plus Gokhem2 to see whether the descendants and/or close relatives of this 5,000 year-old individual made a significant impact on the modern European gene pool. The results suggest that indeed they did.

f3-statistics of the form f3(Test;Reference1,Reference2) are used to test for admixture; if the f3 statistic is significantly negative, then the test group is considered to be admixed.
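For readers who want to see what such a test boils down to, here's a minimal sketch in Python. It is not what threepop or qp3Pop do internally: it assumes plain allele frequencies with no sample-size correction, uses a simple equal-weight block jackknife for the standard error, and runs on simulated frequencies. The function name and all inputs are made up for illustration.

```python
import math
import random

def f3_with_jackknife(test, ref1, ref2, n_blocks=20):
    """Return (f3, standard error, Z) for f3(Test;Ref1,Ref2).

    f3 is estimated as the mean over SNPs of (test - ref1) * (test - ref2).
    The standard error comes from a delete-one-block jackknife over
    contiguous blocks of SNPs (an assumed, simplified scheme).
    """
    per_snp = [(t - a) * (t - b) for t, a, b in zip(test, ref1, ref2)]
    n = len(per_snp)
    total = sum(per_snp)
    estimate = total / n

    size = n // n_blocks  # SNPs per block; the last block takes any remainder
    leave_one_out = []
    for i in range(n_blocks):
        start = i * size
        end = n if i == n_blocks - 1 else start + size
        block_sum = sum(per_snp[start:end])
        leave_one_out.append((total - block_sum) / (n - (end - start)))

    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (x - mean_loo) ** 2 for x in leave_one_out
    )
    se = math.sqrt(variance)
    return estimate, se, estimate / se

# Simulated data: the test population is an exact 50/50 mix of the references.
random.seed(1)
ref1 = [random.random() for _ in range(2000)]
ref2 = [random.random() for _ in range(2000)]
admixed = [0.5 * a + 0.5 * b for a, b in zip(ref1, ref2)]

est, se, z = f3_with_jackknife(admixed, ref1, ref2)
print(f"f3={est:.5f} SE={se:.5f} Z={z:.1f}")
```

In this toy case every per-SNP term is zero or negative, so the statistic comes out strongly negative. With real data, drift in the test population after admixture pushes f3 upwards, which is why a non-negative result doesn't rule out admixture.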

In my analysis the lowest f3 means for almost all West Eurasians, except some Northeast Europeans, involve the Mbuti Pygmies and Gokhem2. This pairing seems to represent something very basal, and if we ignore it, we find that Northern Europeans are best characterized as Gokhem2 plus Amerindians or Siberians, and most Southern Europeans as Gokhem2 plus North Indians or Pakistanis.

Below are the five lowest f3 means (along with the standard errors and Z-scores) for several present-day European groups, after ignoring the Mbuti/Gokhem2 pairing. The full output from this test can be downloaded here.

Belorussian;Karitiana,Gokhem2 -0.002573 0.000774451 -3.32236
Belorussian;Gokhem2,Chukchi -0.00225896 0.000646711 -3.493
Belorussian;Gokhem2,Pima -0.00216931 0.000670227 -3.23669
Belorussian;Selkup,Gokhem2 -0.00203724 0.000592401 -3.43896
Belorussian;Shors,Gokhem2 -0.00192983 0.0005887 -3.27813

Bulgarian;Gokhem2,Chukchi -0.00354553 0.000601482 -5.89465
Bulgarian;Karitiana,Gokhem2 -0.00348264 0.000702038 -4.96076
Bulgarian;Gujarati3,Gokhem2 -0.00334141 0.000350863 -9.52341
Bulgarian;Gokhem2,Koryak -0.00332069 0.000647444 -5.12892
Bulgarian;Gujarati2,Gokhem2 -0.00331262 0.000331569 -9.99073

Chuvash;Gokhem2,Chukchi -0.00686169 0.000556132 -12.3383
Chuvash;Gokhem2,Koryak -0.00669996 0.000567184 -11.8127
Chuvash;Sardinian,Koryak -0.00604931 0.000186753 -32.392
Chuvash;Sardinian,Chukchi -0.0059954 0.000175401 -34.1812
Chuvash;French_Basque,Koryak -0.00598742 0.000179266 -33.3996

East_Sicilian;Gujarati3,Gokhem2 -0.0027439 0.000414587 -6.6184
East_Sicilian;Gujarati2,Gokhem2 -0.00268618 0.000403019 -6.66514
East_Sicilian;Sindhi,Gokhem2 -0.00262624 0.000353489 -7.42948
East_Sicilian;Balochi,Gokhem2 -0.00244354 0.000355967 -6.86452
East_Sicilian;Gujarati1,Gokhem2 -0.00235444 0.000412289 -5.71065

French_Basque;Shors,Gokhem2 -9.07978E-005 0.000580612 -0.156383
French_Basque;Gujarati3,Gokhem2 -4.85325E-005 0.000398641 -0.121745
French_Basque;Gujarati2,Gokhem2 -3.56909E-005 0.000386198 -0.092416
French_Basque;Sindhi,Gokhem2 4.45144E-006 0.000364349 0.0122175
French_Basque;Karitiana,Gokhem2 0.00010138 0.000748248 0.13549

Orcadian;Karitiana,Gokhem2 -0.00203301 0.000747568 -2.7195
Orcadian;Gokhem2,Pima -0.00178257 0.000658361 -2.70759
Orcadian;Gokhem2,Chukchi -0.00158611 0.000640651 -2.47578
Orcadian;Shors,Gokhem2 -0.00156046 0.000566159 -2.75623
Orcadian;Selkup,Gokhem2 -0.00151986 0.00058037 -2.61877

Portuguese;Karitiana,Gokhem2 -0.00355338 0.000769315 -4.61889
Portuguese;Gujarati3,Gokhem2 -0.00349678 0.000497986 -7.02185
Portuguese;Gujarati2,Gokhem2 -0.00335917 0.00048259 -6.96072
Portuguese;Shors,Gokhem2 -0.00322054 0.000638617 -5.04299
Portuguese;Sindhi,Gokhem2 -0.00320858 0.000469164 -6.83894

Swedish;Karitiana,Gokhem2 -0.00262634 0.000716148 -3.66732
Swedish;Gokhem2,Pima -0.00243545 0.000623376 -3.90688
Swedish;Gokhem2,Chukchi -0.0023131 0.000623257 -3.71131
Swedish;Shors,Gokhem2 -0.00226576 0.000577058 -3.9264
Swedish;Selkup,Gokhem2 -0.00223314 0.000570344 -3.91542
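If you want to work with output like this yourself, the rows are easy to parse. The sketch below assumes the layout shown above (Test;Ref1,Ref2 followed by f3, standard error and Z-score, separated by whitespace) and uses a cut-off of Z < -3 for "significantly negative", which is my assumption here rather than a fixed rule. The helper names are made up for illustration.

```python
from typing import NamedTuple

class F3Row(NamedTuple):
    test: str
    ref1: str
    ref2: str
    f3: float
    se: float
    z: float

def parse_f3_line(line):
    """Parse one row of the form 'Test;Ref1,Ref2 f3 SE Z'."""
    label, f3, se, z = line.split()
    test, refs = label.split(";")
    ref1, ref2 = refs.split(",")
    return F3Row(test, ref1, ref2, float(f3), float(se), float(z))

# Three rows copied from the output above.
lines = [
    "Belorussian;Karitiana,Gokhem2 -0.002573 0.000774451 -3.32236",
    "French_Basque;Shors,Gokhem2 -9.07978E-005 0.000580612 -0.156383",
    "Chuvash;Sardinian,Chukchi -0.0059954 0.000175401 -34.1812",
]
rows = [parse_f3_line(line) for line in lines]

Z_CUTOFF = -3.0  # assumed significance threshold

for row in sorted(rows, key=lambda r: r.f3):
    recomputed_z = row.f3 / row.se  # the Z-score is just f3 divided by its SE
    flag = "significant" if row.z < Z_CUTOFF else "not significant"
    print(f"{row.test}: {row.ref1}+{row.ref2} "
          f"f3={row.f3:.6f} Z={row.z:.2f} (recomputed {recomputed_z:.2f}) {flag}")
```

Sorting by f3 and then checking the Z-score is the same reading I apply to the tables in this post: a low f3 only counts if the Z-score backs it up.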

The reasons for the strong showing by the Karitiana should be obvious by now; Amazon Indians carry high levels of Ancient North Eurasian (ANE) ancestry, so they're simply the best proxies for the ANE-rich people who apparently pushed deep into Europe during the early metal ages. Pakistanis and North Indians usually produce lower f3 means for Southern Europeans probably because they carry lower levels of ANE than the Karitiana, and also harbor significant Neolithic ancestry from the Near East, which is much more important in Southern Europe than Northern Europe.

Unfortunately, this doesn't get us any closer to knowing the precise genetic composition of the aforementioned post-Neolithic invaders of Europe. They certainly weren't like the Karitiana, who derive almost 60% of their genetic structure from East Asia, because East Asian admixture is basically lacking in most of Europe apart from some current and former Turkic and Uralic-speaking regions (see here). They also couldn't have been exactly like Pakistanis and Indians, who carry South Asian admixture which, again, is essentially missing from Europe.

Hopefully the ancient genomes from the Samara Valley currently being analyzed at the Reich lab are as useful in helping to solve this riddle as they're shaping up to be (see here).

Interestingly, the f3-statistics from my latest experiment correlate very nicely with this Principal Component Analysis (PCA) of West Eurasia that I ran a few weeks ago to test the present-day genetic affinities of Gokhem2. Note that most Europeans on this plot can basically be described as a two-way mixture between Gokhem2 and something from the east.


Also worth noting is that Western Europeans are involved in the lowest f3 means for a few groups from deep in Asia. Again, the underlying cause for this is the correct balance of ancient components among the reference samples. At this stage, we can only speculate why in these instances that balance is the correct one.

Gujarati1;Dai,Orcadian -0.00268514 0.00020487 -13.1066
Gujarati1;Dai,Cyprian -0.00267843 0.000199391 -13.4331
Gujarati1;Dai,North_Italian -0.0026674 0.000203923 -13.0804
Gujarati1;Dai,Tuscan -0.00265071 0.000223236 -11.874
Gujarati1;Dai,Greek -0.00264462 0.000182486 -14.4922

Hakas;Sardinian,Koryak -0.00506239 0.000211893 -23.8912
Hakas;German,Yukaghir -0.00497669 0.000269175 -18.4887
Hakas;Irish,Yukaghir -0.00496753 0.000259862 -19.116
Hakas;French_Basque,Koryak -0.00492849 0.000215496 -22.8704
Hakas;French_Basque,Yukaghir -0.00490055 0.000247992 -19.7609

Shors;French_Basque,Koryak -0.00197323 0.00041609 -4.74232
Shors;Sardinian,Koryak -0.00193602 0.000408341 -4.74119
Shors;French_Basque,Yukaghir -0.00185561 0.000425466 -4.36137
Shors;Irish,Yukaghir -0.00173416 0.00044614 -3.88703
Shors;French,Yukaghir -0.00168648 0.000416114 -4.05293

By the way, shared drift statistics with Gokhem2 using f3(Mbuti;Gokhem2,Test) are available in a spreadsheet here. As expected, Sardinians easily top the list followed by the French Basques.

See also...

f3-stats: 100 present-day populations plus MA-1

More ancient genomes from Sweden: Pitted Ware forager Ajvide58 and TRB farm girl Gokhem2

Tuesday, July 22, 2014

f3-stats: 100 present-day populations plus MA-1


Here's a list of f3-statistics featuring 100 present-day populations, mostly from Eurasia, and MA-1, the now famous 24,000 year-old genome from south Siberia. The test was based on 125K SNPs and run with the threepop program, which is part of the TreeMix package.

Threepop stats (7.5MB download)

f3-statistics of the form f3(Test;Reference1,Reference2) are used to test for admixture; if the f3 statistic is significantly negative, then the test group is considered to be admixed.

Interestingly, the lowest f3-statistic in most cases involves MA-1. That's probably because it's the only ancient genome in this dataset, but very likely also because Ancient North Eurasian (ANE) groups related to MA-1 contributed considerable gene flow to the vast majority of modern populations across Eurasia and the Americas.

Unfortunately, the thing about f3 tests is that they're limited to just two reference groups, when often three or four references would be more helpful. Nevertheless, the results are still very useful if you actually know what to do with them. For instance, below is a list of shared drift statistics between the aforementioned present-day populations and MA-1, using f3(Mbuti;MA-1,Test). The Karitiana Indians of the Amazon easily top this list, followed by various Siberian groups and then Northeast Europeans. This makes good sense based on everything we've seen to date.
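The shared drift version of the statistic puts an outgroup in the first slot, so that larger values mean more shared history between the other two populations. Here's a rough sketch on simulated allele frequencies. Everything in it is hypothetical (the drift model, the population names, the numbers), and it skips the corrections that real software applies.

```python
import random

def outgroup_f3(outgroup, pop_a, pop_b):
    """Mean over SNPs of (outgroup - pop_a) * (outgroup - pop_b)."""
    products = [(o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b)]
    return sum(products) / len(products)

def drift(freqs, sd):
    """Crude stand-in for genetic drift: add Gaussian noise to each frequency."""
    return [f + random.gauss(0, sd) for f in freqs]

random.seed(7)
n_snps = 5000
root = [random.uniform(0.2, 0.8) for _ in range(n_snps)]

outgroup = drift(root, 0.05)   # plays the role of the Mbuti
shared = drift(root, 0.05)     # branch shared by the ancient sample and one test group
ancient = drift(shared, 0.03)  # plays the role of MA-1

tests = {
    "close_pop": drift(shared, 0.03),   # descends from the shared branch
    "distant_pop": drift(root, 0.05),   # splits off at the root
}

ranking = sorted(
    tests,
    key=lambda name: outgroup_f3(outgroup, ancient, tests[name]),
    reverse=True,
)
for name in ranking:
    print(name, round(outgroup_f3(outgroup, ancient, tests[name]), 5))
```

The population that shares a branch with the ancient sample ranks first, which is the logic behind sorting the spreadsheet by f3(Mbuti;MA-1,Test).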

Shared drift with MA-1 (spreadsheet)

Below is a graph of f3(Mbuti;MA-1,Test) versus f3(Mbuti;Dai,Test), which I think is a useful way to compare ANE against ENA (eastern non-African or just East Eurasian) influence across Eurasia. The datasheet, which is in PAST format and includes the full key, can be downloaded here.


Note how the West Eurasians form a sloping cline that runs from the Arabian Peninsula to the Eastern Baltic. That's because the MA-1 statistic inflates the Dai statistic, so only the samples that clearly deviate from this cline towards the right harbor ENA admixture. Moreover, Sub-Saharan (SSA) and the so called Basal Eurasian admixtures depress both statistics, which is probably why, for instance, the Makrani, who are documented to be of partly Sub-Saharan origin, are outliers within the Central Asian sample set.

Despite the fact that Northeast Europeans are at the top of the graph, they don't actually carry the most significant levels of ANE. That's because their affinity to MA-1 is in large part mediated via so called Western European Hunter-Gatherer (WHG) ancestry. So if we assume that WHG is not found in Asia, as per Lazaridis et al. 2013, then it's the Kalash and Lezgins of the Hindu Kush and Northeast Caucasus, respectively, who come out the most ANE here.

Update 24/07/2014: I've added another 32 populations to the list of shared drift with MA-1 (see spreadsheet here) and five extra populations to the ANE vs. ENA f3 graph (see datasheet here). Below is the updated graph, with the above mentioned sloping cline marked by a line intersecting the Estonians at the top and the Samaritans at the bottom. I chose these samples because they show only trivial ENA and SSA admixtures, respectively.
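Reading deviations from the cline off the graph can also be done numerically. The sketch below assumes the MA-1 statistic is on the vertical axis and the Dai statistic on the horizontal axis, draws the line through two anchor points (the Estonians and Samaritans in my graph), and reports how far to the right of that line a sample sits. The coordinates are placeholders invented for the example, not values from the datasheet.

```python
def rightward_deviation(point, top_anchor, bottom_anchor):
    """Horizontal distance of a point from the line through the two anchors.

    Points are (dai_stat, ma1_stat) pairs. A positive result means the
    sample lies to the right of the cline at its own MA-1 value.
    """
    x, y = point
    x_top, y_top = top_anchor
    x_bot, y_bot = bottom_anchor
    slope = (x_top - x_bot) / (y_top - y_bot)  # change in x per unit of y
    x_on_line = x_bot + (y - y_bot) * slope
    return x - x_on_line

# Placeholder coordinates, for illustration only.
top_anchor = (0.20, 0.26)      # stands in for the Estonians
bottom_anchor = (0.17, 0.21)   # stands in for the Samaritans
samples = {
    "on_cline": (0.185, 0.235),
    "shifted_right": (0.21, 0.235),
}

for name, point in samples.items():
    deviation = rightward_deviation(point, top_anchor, bottom_anchor)
    print(name, round(deviation, 4))
```

A clearly positive deviation is what I'd read as a hint of ENA admixture. How large it needs to be before it means anything is a judgment call that this sketch doesn't settle.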

See also...

TreeMix stuff: looking for MA-1 related ancestry in the Hindu Kush

Saturday, July 12, 2014

TreeMix stuff: looking for MA-1 related ancestry in the Hindu Kush


I finally got around to installing TreeMix. It's a cool little program for drawing up phylogenetic trees and inferring migration edges or admixture events from genome-wide allele frequency data. It seems to be popular with scientists working with ancient DNA and has featured recently in several important papers on ancient genomes.

I'm still learning how to use it and interpret the output, but with that in mind, here are some TreeMix graphs from tests designed to look for MA-1 or Ancient North Eurasian (ANE) related ancestry among the Kalash of the Hindu Kush. This is an interesting issue, considering the isolated nature of the Kalash, the archaic form of Indo-Iranian spoken by them, and the recent talk of ANE potentially being a genetic signal of the Proto-Indo-Europeans (see here).








The values for the migration edges shown in each tree can be downloaded here. The trees are unsupervised except for the San bushmen being specified each time as the outgroup. The first tree was produced with 130K SNPs and the rest with a higher quality subset of 30K SNPs (at read depth of 3x or above in the MA-1 genome). I found that using the higher quality markers kept the MA-1 branch at a respectable length and also made for more sensible trees when running with five or more migration edges.

Based on these results I'd say that the Kalash do harbor significant MA-1 related ancestry. The first tree, featuring 18 populations and four migration edges, shows that it might be as high as 43%.

However, after the extra populations and migration edges are added only the Karitiana Indians of the Amazon and Evens of Siberia show direct signals of admixture from the branch leading to MA-1. The Kalash, on the other hand, receive admixture from points located near the root of the MA-1 branch. I can't say with any great certainty what this means, but I suspect it shows that ANE arrived in the Hindu Kush in different waves and in a mixed form.

Interestingly, when the Turkic-speaking Chuvash are added they're shown to be partly East Eurasian, with 20-24% admixture from the ancestors of the Evens. This makes sense, because the early Turks are thought to have originated somewhere east of the Altai, possibly in south Siberia.

By the way, I have already looked at the issue of ANE ancestry among present-day Asians using the ADMIXTURE software, and came up with a figure of 33.5% for a Kalash individual (see here). At the moment, I'm not sure whether my TreeMix results invalidate this figure. Any thoughts?

Update 20/07/2014: The TreeMix package also offers a threepop test, which computes f3 statistics of the form f3(Test;Reference1,Reference2). The test population is considered to be admixed if the f3 statistic is significantly negative. I took advantage of this option to look for admixture in 42 Eurasian populations, including the Kalash and Chamar, plus MA-1, using just over 127K SNPs.

None of the f3 stats for the Kalash proved to be negative, let alone significant, while the Chamar produced only one negative f3, for the Sakilli of South India and MA-1, and its Z-score (-0.68) falls well short of significance. However, that's probably because the threepop test has a hard time dealing with genetic isolates and unusually endogamous groups, like the Kalash and Chamar, respectively. Below are their ten lowest f3 stats, along with the standard errors and Z-scores.

Kalash;MA-1,Samaritan 0.00487064 0.000664486 7.32994
Kalash;MA-1,Armenian 0.00547845 0.000401906 13.6312
Kalash;MA-1,Georgian 0.00553588 0.000434261 12.7478
Kalash;Sardinian,MA-1 0.00557093 0.000490111 11.3667
Kalash;Abhkasian,MA-1 0.00621502 0.000421821 14.7338
Kalash;MA-1,Kurdish 0.00631716 0.000473973 13.3281
Kalash;MA-1,Iranian 0.0065704 0.000427014 15.3868
Kalash;Georgian,Dai 0.00702268 0.000239 29.3836
Kalash;North_Ossetian,MA-1 0.0070735 0.000460335 15.366
Kalash;Dai,Armenian 0.00723076 0.000233725 30.937

Chamar;Sakilli,MA-1 -0.000333833 0.000487363 -0.684979
Chamar;Dai,Armenian 0.000146969 0.000240917 0.610039
Chamar;Makrani,Dai 0.00020635 0.000211436 0.975946
Chamar;Punjabi_Jat,Dai 0.000217682 0.000260224 0.836519
Chamar;Balochi,Dai 0.000229282 0.00021638 1.05963
Chamar;Georgian,Dai 0.000260187 0.000251537 1.03439
Chamar;Dai,Samaritan 0.000304204 0.000340771 0.892692
Chamar;Brahui,Dai 0.000326537 0.000210435 1.55172
Chamar;Gujarati,Dai 0.000346848 0.000162122 2.13942
Chamar;Dai,Kurdish 0.000404303 0.000267694 1.51032

As expected, all of the Europeans in this run were clearly best characterized as mixtures between Sardinians and MA-1, apart from the Turkic-speaking Chuvash from far Eastern Europe, who were best fitted as a mixture of Lithuanians and the Dai from southern China, and the Sardinians, who were found to be unmixed. The full output from the test can be downloaded at the link below.

Threepop test: 42 populations plus MA-1 (127K SNPs)

See also...

f3-stats: 100 present-day populations plus MA-1

Wednesday, July 9, 2014

More ancient genomes from Sweden: Pitted Ware forager Ajvide58 and TRB farm girl Gokhem2


Ajvide58 is a male Neolithic forager from Gotland, dated to 4,900-4,600 cal. years B.P., belonging to the Pitted Ware culture, and carrying Y-chromosome haplogroup I2 (most likely I2a1) and mitochondrial (mtDNA) haplogroup U4d. Gokhem2 is a female Neolithic farmer from mainland southern Sweden, dated to 5,050-4,750 cal. B.P., belonging to the Trichterbecher Kultur (TRB culture), and carrying mtDNA haplogroup H1c. Both of these genomes were published earlier this year by Skoglund et al. 2014 (see here).

My analysis shows that Ajvide58 is very similar to Mesolithic Swedish forager StoraFörvar11 (see here), and also in part Ancient North Eurasian (ANE). This can be seen in the 4 Ancestors Oracle results in which one of the best fits for Ajvide58 is 3/4 Iberian Mesolithic forager La Brana-1 and 1/4 Upper Paleolithic Siberian forager and ANE proxy MA-1.

However, the Eurogenes K15 ancestry proportions suggest to me that the level of ANE in this sample is lower than in StoraFörvar11. That's because Ajvide58 shows less of the Eastern European component (16.87% vs. 23.23%), and none of the South Asian component. These two components, along with the Amerindian component, dominate MA-1's K15 results (see here).

On the other hand, Gokhem2 appears not to harbor any ANE ancestry; note the complete lack of the Eastern Euro, Amerindian and South Asian components in her K15 proportions, and absence of MA-1 in the Oracle results. This is in line with all scientific literature to date, which indicates that ANE was basically missing from Western and Central Europe during the Mesolithic and Neolithic. Indeed, this sample's best matching population in the Oracle are the Sardinians, one of the few present-day European groups without any detectable ANE admixture.

The absence of ANE in Gokhem2 and all other ancient European genomes from a farming context, like Stuttgart and Oetzi, is a very important point. That's because Neolithic farmers largely replaced indigenous hunter-gatherers across most of Europe, including in Scandinavia. As a result, it's probably safe to assume that this process reduced the amount of ANE in Scandinavia to much less than what was carried there by the indigenous foragers (15-19%). However, present-day Scandinavians carry around 17% of ANE, which must mean that there was another migration wave into Northern Europe after the Neolithic, coming from an area rich in ANE. This was probably the Indo-European expansion from the middle Volga region (see here).
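The dilution argument can be put as simple arithmetic. This is a minimal sketch, assuming a single mixture event in which farmers with 0% ANE replace some fraction of a forager gene pool carrying 15-19% ANE. The replacement fractions are hypothetical values picked for illustration, not estimates.

```python
def expected_ane(forager_ane, farmer_ane, farmer_fraction):
    """ANE share after mixing foragers with farmers in the given proportion."""
    return farmer_fraction * farmer_ane + (1 - farmer_fraction) * forager_ane

PRESENT_DAY_ANE = 0.17           # present-day Scandinavians, as quoted above
FORAGER_ANE_RANGE = (0.15, 0.19) # Scandinavian foragers, as quoted above
FARMER_ANE = 0.0                 # Gokhem2, Stuttgart and Oetzi show none

# Hypothetical farmer replacement fractions, for illustration only.
for farmer_fraction in (0.25, 0.50, 0.75):
    low = expected_ane(FORAGER_ANE_RANGE[0], FARMER_ANE, farmer_fraction)
    high = expected_ane(FORAGER_ANE_RANGE[1], FARMER_ANE, farmer_fraction)
    shortfall = PRESENT_DAY_ANE - high
    print(f"farmers {farmer_fraction:.0%}: expected ANE {low:.1%}-{high:.1%}, "
          f"at least {shortfall:.1%} short of today's level")
```

Under each of these assumed fractions the expected ANE level falls below today's 17%, which is the gap that a later ANE-rich migration would have to fill.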

Nevertheless, Gokhem2 does have forager admixture, which can be seen in her non-trivial levels of Eurogenes K15 components associated with indigenous European forager ancestry: North Sea 12.65%, Southeast Asian 5.22%, Baltic 5.06%, Oceanian 4% and Siberian 2.3%. What this suggests is that the admixture event between the Near Eastern and European ancestors of the TRB farmers didn't take place in Scandinavia, but rather somewhere on the European mainland where ANE wasn't present at the time. Again, the Oracle results are in agreement, because they feature La Brana-1 well ahead of Ajvide58.

Eurogenes K15 results for Ajvide58

North_Sea 31.02
Atlantic 13.58
Baltic 23.11
Eastern_Euro 16.87
West_Med 0.38
West_Asian 0
East_Med 0
Red_Sea 0
South_Asian 0
Southeast_Asian 4.44
Siberian 2.1
Amerindian 5.8
Oceanian 2.39
Northeast_African 0.31
Sub-Saharan 0

4 Ancestors Oracle results (with StoraFörvar11)

4 Ancestors Oracle results (without StoraFörvar11)

Principal Component Analyses (PCA) featuring West Eurasian, Eurasian and global reference sets, respectively, show that Ajvide58 is outside the range of modern West Eurasian genetic variation, which is in line with the results of all other ancient European foragers sequenced to date. The cross marks the spot (click on the images to download high resolution PDFs of the plots):




Eurogenes K15 results for Gokhem2

North_Sea 12.65
Atlantic 21.49
Baltic 5.06
Eastern_Euro 0
West_Med 38.42
West_Asian 0
East_Med 8.19
Red_Sea 2.47
South_Asian 0
Southeast_Asian 5.22
Siberian 2.3
Amerindian 0
Oceanian 4
Northeast_African 0.21
Sub-Saharan 0

4 Ancestors Oracle results for Gokhem2

The PCA basically show the same outcomes, with the TRB farm girl positioned just north of present-day Sardinians on the West Eurasian plot, between the Near East and Northern Europe on the Eurasian plot, and with Lithuanians on the global plot.




The Eurogenes K15 and Alexandr Burnashev's 4 Ancestors Oracle are available for use free of charge at GEDmatch for anyone with genotype data from 23andMe and similar personal genomics companies. Look for the Ad-mix option and then the Eurogenes tab.

See also...

PCA of five ancient genomes

4 Ancestors Oracle results for Anzick-1, La Brana-1 and MA-1


Saturday, July 5, 2014

Analysis of Mesolithic Swedish forager StoraFörvar11


StoraFörvar11, or SfF11, is a late Mesolithic genome from a cave on the small island of Stora Karlsö, just off the coast of Gotland. It was published earlier this year by Skoglund et al. along with several other ancient genomes dating to the Neolithic from Gotland and mainland Sweden (see here). Belonging to Northeast European-specific mitochondrial haplogroup U5a1, SfF11 appears to be the archetypal Scandinavian forager, with no detectable Neolithic farmer admixture but considerable Ancient North Eurasian (ANE) ancestry related to Upper Paleolithic hunter-gatherers from Siberia, such as MA-1 and AG-2 (see here).

Please note, SfF11 was superimposed onto the first Principal Component Analysis (PCA) plot below, which initially only included La Brana-1, the ancient Mesolithic genome from northern Spain, and present-day West Eurasians. I did this to avoid creating a cluster with the two ancient genomes based not on genuine genetic affinities between them but on their relatively poor quality. I obtained the PC coordinates for SfF11 from an almost identical 13K SNP PCA plot which can be seen here.

Note also the clear eastern affinity shown by SfF11 relative to La Brana-1, which in all likelihood is the result of the above mentioned shared ANE ancestry with MA-1, featured on the second PCA. To date, all ancient genomes from Western and Central Europe have basically lacked this admixture, while Scandinavian hunter-gatherers carried it at levels of 15-19%. As hypothesized by Lazaridis et al. 2013, it's likely that Eastern European hunter-gatherers harbored even greater levels of ANE, and it's probably a good bet that they introduced it into Scandinavia during and/or before the Mesolithic.


I also ran a couple of PCA with reference samples from across North Eurasia and the globe. Both were based on 14K SNPs.



Below are the Eurogenes K15 ancestry proportions for SfF11, and below that the 4 Ancestors Oracle results. Even though the K15 test was based on just 8K SNPs, the outcome appears robust, and correlates closely with results from more sophisticated formal mixture tests in scientific literature, in which European hunter-gatherers show a strong relationship to present-day East Baltic populations, especially Lithuanians. Moreover, among the best 4-way Oracle fits for SfF11 is 3/4 La Brana-1 and 1/4 MA-1, which is extremely close to the actual genetic structure of Scandinavian foragers: around 80% Western European Hunter-Gatherer (WHG) and around 20% ANE.

The unusually high South and Southeast Asian scores can probably be explained by shared ANE ancestry with South Asians and lack of the so called Basal Eurasian admixture, respectively. Indeed, the latter is a very good bet considering the complete absence of any sort of Mediterranean and Near Eastern signals in these results.

Eurogenes K15

Baltic 29.24
North_Sea 23.97
Eastern_Euro 23.23
Southeast_Asian 5.97
Atlantic 5.62
Amerindian 4.52
South_Asian 4.36
Oceanian 2.17
Northeast_African 0.58
Siberian 0.34
West_Med 0
West_Asian 0
East_Med 0
Red_Sea 0
Sub-Saharan 0
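To give a feel for what an Oracle-style fit involves, here's a brute-force sketch. It is not the 4 Ancestors Oracle itself, whose internals I'm not describing here. It assumes that a fit is the Euclidean distance between the target's K15 percentages and the average of four reference slots, and it only uses the three K15 profiles posted on this blog (SfF11 as the target, Ajvide58 and Gokhem2 as references), so the reference panel is far smaller than the real one.

```python
import math
from itertools import combinations_with_replacement

# Component order: North_Sea, Atlantic, Baltic, Eastern_Euro, West_Med,
# West_Asian, East_Med, Red_Sea, South_Asian, Southeast_Asian, Siberian,
# Amerindian, Oceanian, Northeast_African, Sub-Saharan
K15 = {
    "SfF11":    [23.97, 5.62, 29.24, 23.23, 0, 0, 0, 0, 4.36, 5.97, 0.34, 4.52, 2.17, 0.58, 0],
    "Ajvide58": [31.02, 13.58, 23.11, 16.87, 0.38, 0, 0, 0, 0, 4.44, 2.1, 5.8, 2.39, 0.31, 0],
    "Gokhem2":  [12.65, 21.49, 5.06, 0, 38.42, 0, 8.19, 2.47, 0, 5.22, 2.3, 0, 4, 0.21, 0],
}

def distance(u, v):
    """Euclidean distance between two component vectors (in percentage points)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def best_mixtures(target, references, parts=4, top=3):
    """Try every way of filling `parts` equal slots with reference populations."""
    results = []
    for combo in combinations_with_replacement(sorted(references), parts):
        mix = [
            sum(references[name][i] for name in combo) / parts
            for i in range(len(target))
        ]
        results.append((distance(target, mix), combo))
    return sorted(results)[:top]

references = {name: K15[name] for name in ("Ajvide58", "Gokhem2")}
fits = best_mixtures(K15["SfF11"], references)
for dist, combo in fits:
    print("+".join(combo), "@", round(dist, 3))
```

With only two references the search space is tiny, but the same loop scales to the 3/4 plus 1/4 style of fit reported below once more reference profiles are added.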

4 Ancestors Oracle

Least-squares method.

Using 1 population approximation:
1 Estonian @ 14.153281
2 Erzya @ 14.620788
3 Kargopol_Russian @ 14.700492
4 Southwest_Russian @ 15.448751
5 Ukrainian @ 15.825631
6 Lithuanian @ 15.842059
7 Ukrainian_Belgorod @ 16.110345
8 East_Finnish @ 16.435534
9 Belorussian @ 16.531115
10 Ukrainian_Lviv @ 16.638975
11 Estonian_Polish @ 16.671571
12 Polish @ 17.379799
13 South_Polish @ 17.805012
14 Russian_Smolensk @ 17.812963
15 Finnish @ 18.279374
16 La_Brana-1 @ 19.903407
17 Southwest_Finnish @ 21.942936
18 Moldavian @ 23.158096
19 Croatian @ 23.266324
20 Hungarian @ 24.020402

Using 2 populations approximation:
1 Erzya+Estonian @ 12.292066
2 Estonian+Kargopol_Russian @ 13.190123
3 Erzya+La_Brana-1 @ 13.192429
4 Erzya+Lithuanian @ 13.414829
5 Erzya+Ukrainian @ 13.440955
6 Erzya+Ukrainian_Lviv @ 13.540859
7 Erzya+Finnish @ 13.602815
8 East_Finnish+Lithuanian @ 13.693698
9 Kargopol_Russian+Lithuanian @ 13.735122
10 Estonian+Southwest_Russian @ 13.994994
11 East_Finnish+Erzya @ 14.077424
12 Estonian+Ukrainian_Belgorod @ 14.113102
13 Kargopol_Russian+Ukrainian @ 14.126683
14 Estonian+Estonian @ 14.153281
15 Belorussian+Erzya @ 14.180946
16 Erzya+Southwest_Russian @ 14.186181
17 Kargopol_Russian+Ukrainian_Lviv @ 14.247527
18 Estonian+Ukrainian @ 14.247854
19 Erzya+Polish @ 14.291491
20 Estonian+Lithuanian @ 14.31161

Using 3 populations approximation:
1 50% Estonian +25% Lithuanian +25% MA-1 @ 11.982448
2 50% Lithuanian +25% Estonian +25% MA-1 @ 12.169832
3 50% Estonian +25% Estonian +25% MA-1 @ 12.225538
4 50% Erzya +25% Estonian +25% La_Brana-1 @ 12.250755
5 50% Erzya +25% Estonian +25% Estonian @ 12.292066
6 50% Lithuanian +25% La_Brana-1 +25% MA-1 @ 12.473574
7 50% Erzya +25% La_Brana-1 +25% Lithuanian @ 12.480595
8 50% Lithuanian +25% Finnish +25% MA-1 @ 12.547096
9 50% Erzya +25% Estonian +25% Ukrainian_Lviv @ 12.657215
10 50% Erzya +25% Estonian +25% Ukrainian @ 12.660239
11 50% Erzya +25% Estonian +25% Lithuanian @ 12.661794
12 50% Estonian +25% Erzya +25% Kargopol_Russian @ 12.679962
13 50% Erzya +25% Erzya +25% La_Brana-1 @ 12.695461
14 50% Erzya +25% La_Brana-1 +25% Ukrainian @ 12.707643
15 50% Estonian +25% Erzya +25% Estonian @ 12.716859
16 50% Erzya +25% Finnish +25% Lithuanian @ 12.72455
17 50% Erzya +25% Estonian +25% Finnish @ 12.737834
18 50% Erzya +25% La_Brana-1 +25% Ukrainian_Lviv @ 12.753404
19 50% Lithuanian +25% Lithuanian +25% MA-1 @ 12.768751
20 50% Estonian +25% Belorussian +25% MA-1 @ 12.780747

Using 4 populations approximation:
1 Estonian+Estonian+Lithuanian+MA-1 @ 11.982448
2 Estonian+Lithuanian+Lithuanian+MA-1 @ 12.169832
3 Estonian+Estonian+Estonian+MA-1 @ 12.225538
4 Erzya+Erzya+Estonian+La_Brana-1 @ 12.250755
5 Erzya+Erzya+Estonian+Estonian @ 12.292066
6 Estonian+La_Brana-1+Lithuanian+MA-1 @ 12.434074
7 La_Brana-1+Lithuanian+Lithuanian+MA-1 @ 12.473574
8 Erzya+Erzya+La_Brana-1+Lithuanian @ 12.480595
9 Finnish+Lithuanian+Lithuanian+MA-1 @ 12.547096
10 Erzya+Erzya+Estonian+Ukrainian_Lviv @ 12.657215
11 Erzya+Erzya+Estonian+Ukrainian @ 12.660239
12 Estonian+Lithuanian+MA-1+Ukrainian @ 12.66118
13 Erzya+Erzya+Estonian+Lithuanian @ 12.661794
14 Erzya+Estonian+Estonian+Kargopol_Russian @ 12.679962
15 Erzya+Erzya+Erzya+La_Brana-1 @ 12.695461
16 Estonian+Lithuanian+MA-1+Ukrainian_Lviv @ 12.697136
17 Erzya+Erzya+La_Brana-1+Ukrainian @ 12.707643
18 Erzya+Estonian+Estonian+Estonian @ 12.716859
19 Erzya+Erzya+Finnish+Lithuanian @ 12.72455
20 Erzya+Erzya+Estonian+Finnish @ 12.737834
21 Estonian+Finnish+Lithuanian+MA-1 @ 12.746305
22 Erzya+Erzya+La_Brana-1+Ukrainian_Lviv @ 12.753404
23 Lithuanian+Lithuanian+Lithuanian+MA-1 @ 12.768751
24 Belorussian+Estonian+Estonian+MA-1 @ 12.780747
25 Estonian+Estonian+MA-1+Ukrainian @ 12.797031
26 Estonian+Estonian+La_Brana-1+MA-1 @ 12.807529
27 Erzya+Estonian+Estonian+Ukrainian @ 12.813496
28 Estonian+Estonian+MA-1+Ukrainian_Lviv @ 12.822931
29 Erzya+Estonian+Kargopol_Russian+La_Brana-1 @ 12.831473
30 Erzya+Estonian+Estonian+Lithuanian @ 12.839613
31 Chuvash+Estonian+Estonian+Lithuanian @ 12.851803
32 Belorussian+Estonian+Lithuanian+MA-1 @ 12.855733
33 Erzya+Estonian+Estonian+Ukrainian_Lviv @ 12.857349
34 East_Finnish+Erzya+Estonian+Lithuanian @ 12.875013
35 Erzya+Estonian+Kargopol_Russian+Lithuanian @ 12.901956
36 Erzya+Estonian+La_Brana-1+Lithuanian @ 12.90565
37 Erzya+Kargopol_Russian+La_Brana-1+Lithuanian @ 12.914481
38 Erzya+Estonian+Estonian+La_Brana-1 @ 12.921321
39 Erzya+Estonian+Estonian+Southwest_Russian @ 12.931952
40 Lithuanian+Lithuanian+MA-1+Ukrainian @ 12.932804

Gaussian method.

Using 1 population approximation:
1 East_Finnish @ 12.111642
2 Finnish @ 12.136433
3 Tatar @ 12.260871
4 Chuvash @ 12.287812
5 Kargopol_Russian @ 13.238854
6 Erzya @ 13.290701
7 Ukrainian @ 14.224517
8 North_Swedish @ 14.501487
9 Mari @ 14.582022
10 La_Brana-1 @ 15.102585
11 Ukrainian_Lviv @ 15.466692
12 Moldavian @ 16.561361
13 Ukrainian_Belgorod @ 16.829215
14 Southwest_Finnish @ 17.044556
15 Southwest_Russian @ 17.644306
16 Estonian_Polish @ 17.912619
17 Swedish @ 18.055712
18 Estonian @ 18.417704
19 Hungarian @ 18.442869
20 Lithuanian @ 18.500045

Using 2 populations approximation:
1 La_Brana-1+Mari @ 9.086839
2 Kargopol_Russian+La_Brana-1 @ 9.216681
3 La_Brana-1+MA-1 @ 9.529079
4 Chuvash+La_Brana-1 @ 9.628936
5 Erzya+La_Brana-1 @ 9.741056
6 La_Brana-1+Tatar @ 10.312023
7 East_Finnish+La_Brana-1 @ 10.369729
8 Chuvash+Estonian @ 10.38245
9 Estonian+La_Brana-1 @ 10.698394
10 Chuvash+Finnish @ 10.701826
11 Estonian+Tatar @ 10.72273
12 Estonian+Shors @ 10.734028
13 Chuvash+Lithuanian @ 10.781409
14 Chuvash+East_Finnish @ 10.832523
15 Chuvash+Kargopol_Russian @ 11.058841
16 Finnish+Tatar @ 11.078731
17 East_Finnish+Tatar @ 11.104768
18 Lithuanian+Shors @ 11.131471
19 Chuvash+Ukrainian @ 11.241182
20 Estonian+Hakas @ 11.257456

Using 3 populations approximation:
1 50% La_Brana-1 +25% Estonian +25% MA-1 @ 6.880967
2 50% La_Brana-1 +25% La_Brana-1 +25% MA-1 @ 7.035486
3 50% La_Brana-1 +25% Lithuanian +25% MA-1 @ 7.1341
4 50% Estonian +25% La_Brana-1 +25% MA-1 @ 7.18973
5 50% La_Brana-1 +25% East_Finnish +25% MA-1 @ 7.57191
6 50% La_Brana-1 +25% Finnish +25% MA-1 @ 7.600389
7 50% Lithuanian +25% La_Brana-1 +25% MA-1 @ 7.628929
8 50% La_Brana-1 +25% Estonian_Polish +25% MA-1 @ 7.697983
9 50% La_Brana-1 +25% Belorussian +25% MA-1 @ 7.70291
10 50% La_Brana-1 +25% Kargopol_Russian +25% MA-1 @ 7.781779
11 50% La_Brana-1 +25% MA-1 +25% Southwest_Finnish @ 7.798672
12 50% La_Brana-1 +25% Erzya +25% MA-1 @ 7.80171
13 50% La_Brana-1 +25% MA-1 +25% Polish @ 7.929863
14 50% La_Brana-1 +25% MA-1 +25% Southwest_Russian @ 7.935151
15 50% La_Brana-1 +25% MA-1 +25% Russian_Smolensk @ 8.031297
16 50% La_Brana-1 +25% MA-1 +25% North_Swedish @ 8.049602
17 50% La_Brana-1 +25% MA-1 +25% Ukrainian_Belgorod @ 8.049701
18 50% La_Brana-1 +25% MA-1 +25% Ukrainian @ 8.06409
19 50% La_Brana-1 +25% MA-1 +25% South_Polish @ 8.188305
20 50% Finnish +25% La_Brana-1 +25% MA-1 @ 8.237496

Using 4 populations approximation:
1 Estonian+La_Brana-1+La_Brana-1+MA-1 @ 6.880967
2 La_Brana-1+La_Brana-1+La_Brana-1+MA-1 @ 7.035486
3 La_Brana-1+La_Brana-1+Lithuanian+MA-1 @ 7.1341
4 Estonian+Estonian+La_Brana-1+MA-1 @ 7.18973
5 Estonian+La_Brana-1+Lithuanian+MA-1 @ 7.414412
6 East_Finnish+La_Brana-1+La_Brana-1+MA-1 @ 7.57191
7 Finnish+La_Brana-1+La_Brana-1+MA-1 @ 7.600389
8 La_Brana-1+Lithuanian+Lithuanian+MA-1 @ 7.628929
9 Estonian+Finnish+La_Brana-1+MA-1 @ 7.689347
10 Estonian_Polish+La_Brana-1+La_Brana-1+MA-1 @ 7.697983
11 Belorussian+La_Brana-1+La_Brana-1+MA-1 @ 7.70291
12 East_Finnish+Estonian+La_Brana-1+MA-1 @ 7.712903
13 Finnish+La_Brana-1+Lithuanian+MA-1 @ 7.779771
14 Kargopol_Russian+La_Brana-1+La_Brana-1+MA-1 @ 7.781779
15 La_Brana-1+La_Brana-1+MA-1+Southwest_Finnish @ 7.798672
16 Erzya+La_Brana-1+La_Brana-1+MA-1 @ 7.80171
17 East_Finnish+La_Brana-1+Lithuanian+MA-1 @ 7.850763
18 Estonian+Estonian_Polish+La_Brana-1+MA-1 @ 7.890161
19 Belorussian+Estonian+La_Brana-1+MA-1 @ 7.906509
20 Estonian+La_Brana-1+MA-1+Southwest_Finnish @ 7.927839
21 La_Brana-1+La_Brana-1+MA-1+Polish @ 7.929863
22 La_Brana-1+La_Brana-1+MA-1+Southwest_Russian @ 7.935151
23 Estonian+Kargopol_Russian+La_Brana-1+MA-1 @ 7.940811
24 Erzya+Estonian+La_Brana-1+MA-1 @ 7.965223
25 La_Brana-1+Lithuanian+MA-1+Southwest_Finnish @ 7.991558
26 La_Brana-1+Lithuanian+MA-1+North_Swedish @ 8.029449
27 La_Brana-1+La_Brana-1+MA-1+Russian_Smolensk @ 8.031297
28 Belorussian+La_Brana-1+Lithuanian+MA-1 @ 8.038993
29 Estonian_Polish+La_Brana-1+Lithuanian+MA-1 @ 8.046271
30 La_Brana-1+La_Brana-1+MA-1+North_Swedish @ 8.049602
31 La_Brana-1+La_Brana-1+MA-1+Ukrainian_Belgorod @ 8.049701
32 La_Brana-1+La_Brana-1+MA-1+Ukrainian @ 8.06409
33 Estonian+La_Brana-1+MA-1+North_Swedish @ 8.075392
34 Estonian+La_Brana-1+MA-1+Polish @ 8.08945
35 Kargopol_Russian+La_Brana-1+Lithuanian+MA-1 @ 8.100132
36 Estonian+La_Brana-1+MA-1+Southwest_Russian @ 8.108852
37 Erzya+La_Brana-1+Lithuanian+MA-1 @ 8.127814
38 Estonian+La_Brana-1+MA-1+Ukrainian @ 8.153751
39 La_Brana-1+Lithuanian+MA-1+Polish @ 8.17359
40 La_Brana-1+La_Brana-1+MA-1+South_Polish @ 8.188305
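
Note that the 3 populations and 4 populations lists overlap: a 50%/25%/25% mix is the same thing as a four-way mix in which one population appears twice, so the top 3 populations result reappears at the top of the 4 populations list with an identical distance. Below is a minimal Python sketch that checks this. It assumes only the two line formats shown in the output above; the helper names (strip_rank, parse_weighted, parse_plus) are my own and not part of the Oracle.

```python
import re
from collections import Counter


def strip_rank(line):
    """Drop the leading rank number (e.g. '1 ') from an output line."""
    return re.sub(r"^\s*\d+\s+", "", line.strip())


def parse_weighted(line):
    """Parse '50% A +25% B +25% C @ d' into (sorted 4-tuple of names, distance)."""
    mix, dist = strip_rank(line).rsplit("@", 1)
    counts = Counter()
    for pct, name in re.findall(r"(\d+)%\s+(\S+)", mix):
        # Each 25% share corresponds to one of four ancestor slots.
        counts[name] += int(pct) // 25
    return tuple(sorted(counts.elements())), float(dist)


def parse_plus(line):
    """Parse 'A+B+C+D @ d' into (sorted 4-tuple of names, distance)."""
    mix, dist = strip_rank(line).rsplit("@", 1)
    return tuple(sorted(mix.strip().split("+"))), float(dist)


# Top entries copied from the Gaussian method lists above.
three = "1 50% La_Brana-1 +25% Estonian +25% MA-1 @ 6.880967"
four = "1 Estonian+La_Brana-1+La_Brana-1+MA-1 @ 6.880967"

print(parse_weighted(three))
print(parse_weighted(three) == parse_plus(four))
```

The same comparison can be run on any pair of lines from the two lists. Entries in the 4 populations list with no repeated population, such as Estonian+La_Brana-1+Lithuanian+MA-1, have no counterpart in the 3 populations list.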

The Eurogenes K15 and Alexandr Burnashev's 4 Ancestors Oracle are available free of charge at GEDmatch to anyone with genotype data from 23andMe or a similar personal genomics company. Look for the Ad-mix option and then the Eurogenes tab.

See also...

PCA of five ancient genomes

4 Ancestors Oracle results for Anzick-1, La Brana-1 and MA-1