Missed part one? Here’s the link: Exploring the Indo-European Roots (Part 1)
Bones and pots found in archaeological digs do not talk. Yet, as discussed in detail in our book, The Indo-European Controversy: Facts and Fallacies in Historical Linguistics, we can use the tools of paleo-linguistics to search for the PIE homeland. The general idea is simple: the reconstructed vocabulary of the ancestral language is examined for clues as to its speakers’ physical environment and modes of subsistence. Thus, speakers of a language that has words for ‘snow’, ‘sleigh’, ‘reindeer’, and ‘seal’ must live in a very different place from those of a language with words for ‘palm’, ‘coconut’, ‘rice’, and ‘elephant’. Based on the consensus reconstructions of PIE, its speakers must have lived in a temperate environment, where snow, birch trees, beech trees, and wolves were common features, but salt-water bodies were not. Reconstructions of words for ‘rye’, ‘barley’, ‘sickle’, and ‘to plough’ tell us that PIE speakers had agriculture, while words for ‘sheep’, ‘goat’, ‘pig’, and ‘cattle’ mean that they raised animals. But perhaps most revealing, and at the same time most controversial, are the reconstructed roots *ek’wos- ‘horse’ and *kwekwlo- ‘wheel’ (which survived in English in equestrian and wheel). Since the earliest archaeological evidence of wheels and horses dates from about 3500 BCE, the logic of the paleo-linguistic argument tells us that PIE could not have been spoken earlier than that, a timeframe compatible with the Steppe theory but not with the Anatolian theory. The steppe zone is also the most likely place in which humans first came into close contact with wild horses and eventually domesticated them. Other clues, which likewise strengthen the Steppe theory, can be found among loanwords from neighboring languages such as Proto-Uralic, the ancestor of today’s Finnish and Hungarian as well as of the Samoyedic languages spoken in northwestern Siberia.
But words alone, Martin Lewis and I argue, cannot tell the whole story and sometimes can be highly misleading. Approaches to the Homeland Problem relying exclusively on lexical data (from glottochronology, which was first explored in the 1950s and has since been discredited, to the Bayesian phylogenetic methods employed by Russell D. Gray and his colleagues in recent work) produce notoriously unreliable results because words are subject to speakers’ conscious choices and are easily and frequently borrowed from one language into another. Grammatical structures offer more reliable evidence of family relationships, but they are harder to convert into workable binary input for Bayesian calculations. For example, models that rely on lexical data usually show Romani, the language of the Gypsies, as much more distinctive within the Indo-Aryan branch than it actually is, dating its divergence to 2,500-3,500 years ago. In reality, Romani gained a distinctive lexicon not because it diverged from its “sibling languages” a long time ago but rather because it was in contact with, and picked up numerous words from, other languages on its path from northern India to Europe, such as Persian, Armenian, and Greek. A look at its structural properties, such as its gender and case systems, indicates that Romani must have split off from the other Indo-Aryan languages only about 1,000 years ago. This more recent date of the Roma exodus from northern India is now confirmed by genetic studies.
Rapid migrations, such as the trek that the Roma made at the turn of the second millennium CE, are key to understanding both population distribution and the spread of languages. In the historical record of the Indo-European language family, such swift population movements, almost instantaneous at the relevant time scale, happened many times: Latin spread with the growth of the Roman Empire, Russian advanced east with the colonization of Siberia, and Norse speakers settled the previously uninhabited Iceland (and for a while also Greenland), to give just a few examples. Yet, recently proposed computational models often take into account only one mechanism of language spread: demic diffusion, a slow and random population movement in all directions, impeded only by water. Such models cannot handle quick migrations, and hence necessarily postulate a much slower spread of Indo-European languages and, as a result, a much earlier date for PIE.
The preceding discussion of the importance of migration, however, should not obscure another well-known fact: although languages often spread through the movement of the people who speak them, they do not always travel with genes. Consider, for example, English, Spanish, Portuguese, and Russian. In addition to the physical descendants of the Anglo-Saxon invaders, Roman soldiers stationed in Iberia, and East Slavs from the Kievan Rus’, these languages are spoken today by millions of genetically unrelated individuals, and by entire indigenous groups, found in such regions as Alaska, the Andes, the Amazonian rainforest, Australia, the Caribbean, and Siberia. Consequently, genetic studies that reveal patterns of migration and admixture of various groups sometimes help us figure out certain pieces of the Indo-European puzzle, but they cannot provide conclusive evidence of the PIE homeland.
As the book unfolds, Martin Lewis and I take the reader through a maze of findings from historical linguistics, archaeology, historical geography, and genetics, helping the reader interpret and reconcile these findings within a coherent narrative. Thus, the book is as much about methodology and epistemological issues (how we acquire or fail to acquire knowledge of the human past) as it is about the location of the Indo-European homeland itself. At a time when scientific research is becoming increasingly collaborative and interdisciplinary, and when the general public increasingly needs to be able to assess scientific findings on a broad range of issues, from genetic history to climate change and genetically modified foods, rethinking such epistemological issues becomes ever more critical.