In the debate over the location of the Proto-Indo-European urheimat, Colin Renfrew's Anatolian hypothesis is usually mentioned as the most viable alternative to the steppe or Kurgan hypothesis. But probably not for very much longer.
Below is a Principal Component Analysis (PCA) featuring extant Indo-European and non-Indo-European groups from West Eurasia, a couple of typical early Neolithic farmers from Central Europe, a typical Western Hunter-Gatherer, also from Central Europe, and the Iceman from the Copper Age Tyrolean Alps, again typical of his time and place.*
It's just a taste of the ancient genomic data we have available from prehistoric Europe, but it has almost everything that is pertinent to the issue at hand.
You don't need to be familiar with PCA methodology to be able to read the plot. Basically, it shows that the present-day European population structure is the result of two main events:
- the arrival of early farmers from Anatolia during the Neolithic transition, which eventually led to the extinction of people like the Western Hunter-Gatherer, who is the most obvious outlier on the plot
- the expansion of Kurgan groups such as the Yamnaya, who may have been the ancestors or perhaps cousins of the Corded Ware people, from the western steppe during the Late Neolithic/Early Bronze Age, which shifted the genetic structure of almost all Europeans to the east, away from the Neolithic and Copper Age samples.
These were massive population turnovers, and, as a rule, massive population turnovers are accompanied by language change. So it's highly unlikely that any Europeans today are speaking languages derived from the languages of the Western Hunter-Gatherers or early Neolithic farmers from Central Europe. Moreover, consider this:
- most present-day Indo-European speaking Europeans form an elongated cluster between the Neolithic farmers and the Corded Ware sample, pointing to the steppe-derived Corded Ware Culture as the proximate agent of the Indo-European expansion in much of Europe
- the only present-day Europeans who closely resemble Neolithic farmers are some Sardinians (the small Romance cluster just above the two Neolithic samples), but Sardinians spoke Paleo-Sardinian or Nuragic languages until they adopted Indo-European speech, in the form of Latin, from the Romans.
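For anyone curious why a mixed population shows up as a cluster stretched between its sources on a PCA, here's a minimal Python sketch. It is not the analysis behind the plot above: the allele frequencies, sample sizes and the 50/50 mixture are made up for illustration, and pca_scores and draw_genotypes are hypothetical helper names. It simply simulates two source groups plus a group drawn from their averaged allele frequencies, and projects all of them onto the first principal component.

```python
import numpy as np

def pca_scores(genotypes, n_components=2):
    """Project samples (rows) onto the top principal components."""
    centered = genotypes - genotypes.mean(axis=0)
    # SVD of the centered matrix; u * s gives the per-sample PC scores.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

rng = np.random.default_rng(0)
n_snps, n_per_group = 5000, 20

# Made-up allele frequencies for two source populations (assumption).
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = np.clip(freq_a + rng.normal(0, 0.2, n_snps), 0.05, 0.95)

def draw_genotypes(freqs, n):
    # Diploid genotypes coded as 0, 1 or 2 copies of an allele.
    return rng.binomial(2, freqs, size=(n, freqs.size)).astype(float)

source_a = draw_genotypes(freq_a, n_per_group)
source_b = draw_genotypes(freq_b, n_per_group)
# Admixed group: allele frequencies are a 50/50 blend of the two sources.
mixed = draw_genotypes(0.5 * freq_a + 0.5 * freq_b, n_per_group)

scores = pca_scores(np.vstack([source_a, source_b, mixed]))
pc1_a = scores[:n_per_group, 0].mean()
pc1_b = scores[n_per_group:2 * n_per_group, 0].mean()
pc1_mixed = scores[2 * n_per_group:, 0].mean()
print(pc1_a, pc1_mixed, pc1_b)
```

In this toy setup the admixed group's average position on PC1 falls between the two sources. Changing the blend from 0.5 towards one source should slide the admixed group towards that source, which is the basic logic behind reading a cline on a PCA plot as a mixture.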
Also, this isn't shown on the plot, but the dominant Y-chromosome haplogroup of early Neolithic farmers is G2a, which is a low-frequency marker in Europe today. The two most common Y-chromosome haplogroups among present-day Europeans are R-M198 and R-M269, which are also typical of Corded Ware and Yamnaya males, respectively, and probably originally from the steppe.
All this raises the question: is there any way to rework the Anatolian hypothesis so that it can be salvaged? I doubt it. Even making the steppe a homeland for all of the main Indo-European language groups apart from Anatolian and Armenian doesn't appear to be a viable option.
It is true that the Yamnaya nomads carried Near Eastern-related ancestry which may represent Proto-Indo-European admixture from outside of the steppe. But there's no evidence that it came from Anatolia.
In fact, if Neolithic Anatolians were basically identical to early Neolithic European farmers, which seems to be the case (see here), then it's unlikely that it did, because the latter carried a peculiar genome-wide signal that is missing in Yamnaya samples (orange cluster in the ADMIXTURE bar graph below). Heck, even the early Corded Ware genomes from Germany barely show any of it.
On a related note, there's a book on this topic: The Indo-European Controversy: Facts and Fallacies in Historical Linguistics. I haven't read it yet, so I welcome the opinions here of those who have. I did, however, read a lot of the online articles on which the book is based. As far as I know, most of them are still available here and here.
*Another version of the same PCA, with the samples labeled individually, is available here. The samples are listed here. All of the samples are from Haak et al. and Allentoft et al. The PCA was run using ~56K high-confidence SNPs listed here.
The Corded Ware sample is a composite of Corded Ware sequences from present-day Germany, Scandinavia, Estonia and Poland. The Yamnaya sample is a composite of Yamnaya sequences from the Samara and Rostov regions of present-day Russia.
I chose to use these composites instead of individual sequences because I didn't want to include any samples in the analysis with genotype rates of less than 98%.
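To make the genotype rate idea concrete, here's a small Python sketch. The genotype rate is just the fraction of SNPs at which a sample has a call. The merging rule below (keep the first available call at each SNP) is only one hypothetical way of building a composite, not necessarily how the composites in this post were made, and the -1 missing code, the toy data and the function names are all assumptions for illustration.

```python
import numpy as np

MISSING = -1  # hypothetical code for a missing genotype call

def genotype_rate(calls):
    """Fraction of SNPs with a non-missing call."""
    calls = np.asarray(calls)
    return float(np.mean(calls != MISSING))

def composite(samples):
    """Merge several sequences: at each SNP keep the first available call."""
    samples = np.asarray(samples)
    merged = np.full(samples.shape[1], MISSING)
    for row in samples:
        gaps = merged == MISSING      # SNPs still lacking a call
        merged[gaps] = row[gaps]      # fill them from this sequence
    return merged

# Toy data: three sparse sequences over ten SNPs (values are made up).
seqs = np.array([
    [0, -1, 2, -1, 1, -1, -1, 0, -1, 2],
    [-1, 1, -1, -1, 1, 0, -1, -1, 2, -1],
    [-1, -1, 2, 1, -1, -1, 0, -1, -1, -1],
])

rates = [genotype_rate(s) for s in seqs]   # 0.5, 0.4, 0.3
merged = composite(seqs)
merged_rate = genotype_rate(merged)        # 1.0 for this toy data
passes_cutoff = merged_rate >= 0.98        # the 98% threshold used above
print(rates, merged.tolist(), merged_rate, passes_cutoff)
```

None of the three toy sequences would pass a 98% cutoff on its own, but the merged one does, which is the practical appeal of a composite when individual ancient sequences are sparse.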