Thursday, January 24, 2019

Indo-European phylogeny

Well, in light of the last post I made, I went and got a hold of the preprint of the Chang et al. 2015 paper that was referred to in the comments, and read through it (although admittedly, I started skimming when we got to the specific methods, about halfway through, and came back in for the discussion and conclusions).

The gist of the paper is that it is a new linguistic cladogram that purports to eliminate errors that have dogged past cladograms, especially by introducing ancestry constraints.  Without such constraints, an analysis can't reconcile the fact that, for example, Old Irish is an ancestor of modern Irish and Scots Gaelic, and not a sister language of them, and so it stretches the time frame required.  This not only strongly reduces the (already very low) probability that anything other than the steppe hypothesis works, but it also puts constraints on the splitting of various branches from the whole, and shows more nodal relationships, like Indo-Iranian being a node between PIE and the separate Indic and Iranian nodes, but going further than that most obvious one.  Almost all linguists recognize Indo-Iranian, and most recognize the slightly more controversial Balto-Slavic and Italo-Celtic intermediate nodes.  This paper suggests many such intermediate nodes, which in turn gives archaeologists and archaeogeneticists material to go look for to corroborate.

Anyway, it's one of the most interesting linguistic papers I've read in a long time, even though it's strictly speaking a statistics paper rather than linguistics per se.  It's obvious that it has informed the hypothesis that Davidski mentioned and which I referred to in my last post.  Let's see how I can make it work, if I can.  Because the family tree has time stamps on the nodes, let's start with the oldest and work our way forward.  Keep in mind that this can't analyze languages that are too poorly attested to be useful in the cladistic analysis, so if you want to speculate on where they fit in, you'll have to rely on evidence other than linguistic.  I'm unfortunately not really familiar enough with the archaeological literature to posit where and how archaeological cultures can correspond to this linguistic phylogeny, but presumably it wouldn't be terribly difficult to do.  It does fit, albeit quite broadly, with David Anthony's revised steppe hypothesis in many ways.
  • By 6,500 years ago, 4,500 BC, the Anatolian languages had split out from PNIE, or proto-Nuclear Indo-European (i.e. everything else besides Anatolian.)
  • Between 4,000 BC and 3,500 BC, Tocharian had split off from PNIE.
  • By 3,500 BC, a node that includes Greek, Albanian and Armenian (and presumably Phrygian) branches off, but remains in close geographic adjacency for some time, so it can absorb some isoglosses via borrowing.  Presumably this is the first movement of Usatovo culture into the Balkans from the steppe.
  • By 3,000 BC the rest of the IE tree's node was starting to break up and the Indo-Iranian languages branched off.  Maybe this corresponds to their very early history as the languages of very northerly Eastern Corded Ware, before they had really developed enough traits to truly be called Indo-Iranian.
  • Between 2,500 and 2,000 BC the Balto-Slavic languages split off, again representing a northerly Corded Ware dialect, no doubt.
  • By 2,500 BC, Albanian splits off from the Graeco-Armenian node, by whatever name this proto-Albanian is known (Illyrian?).
  • Between 2,500 and 2,000 BC a Germanic node breaks off.
  • Shortly after 2,000 BC Italic and Celtic separate.  Iranian and Indic do right around here as well.  Not long after this, Greek and Armenian separate from each other too.
  • By about 500 BC, Baltic and Slavic start to differentiate themselves from proto-Balto-Slavic.  Not long after this, Brythonic and Goidelic Celtic split (no idea how Continental Celtic and Celtiberian fit into this at this point.)
  • At the Meridian of Time (around the turn of the era), Eastern Germanic splits from a node that contains Western and Northern Germanic still combined.  They will themselves separate before 500 AD.
  • By 700-800 AD or so, Slavic is starting to break up from Common Slavic and Vulgar Latin is starting to actually produce the Romance languages.  By 1,000 AD, both have sufficiently broken apart that even Late Common Slavic (or Common Romance) can no longer be spoken of and the specific daughter languages are recognizable.
Now, this seems to imply that Italo-Celtic is the last "core" of nuclear Indo-European, but of course, that's absurd; whatever joint innovations it has are shared peripheral changes combined with some shared conservative features (which, for example, led people to chase after a Celtic-Tocharian link decades ago).  Italo-Celtic developed, wherever and however exactly it developed and as the language of whatever material culture spoke it, far to the western periphery.  The chart could be drawn otherwise, but the dates would still be the same and the same lines would still connect even if you changed the order in which they were presented.  It also seems to imply that some steppe language core split off early, and while it's probably true that they were differentiating themselves early, they also maintained a degree of contact that allowed the sharing of innovative isoglosses well after their split (such as satemization.) 
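To make that point about the drawing order concrete, here's a minimal Python sketch.  It is not anything from the paper; the `Node` class and the helper functions are made up for illustration, and the two split dates are just rough placeholders read off the list above (Germanic breaking off between 2,500 and 2,000 BC, Italic and Celtic separating shortly after 2,000 BC).  It flips the left-to-right order of the branches and checks that the groupings and their dates come out the same.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional


@dataclass
class Node:
    """Hypothetical tree node; split_bc is the rough date its children diverge."""
    name: str
    split_bc: Optional[int] = None
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> FrozenSet[str]:
    """Set of language names found under this node."""
    if not node.children:
        return frozenset([node.name])
    return frozenset().union(*(leaves(c) for c in node.children))


def clades(node: Node) -> Dict[FrozenSet[str], Optional[int]]:
    """Map each internal node's set of leaves to its split date."""
    result: Dict[FrozenSet[str], Optional[int]] = {}
    if node.children:
        result[leaves(node)] = node.split_bc
        for child in node.children:
            result.update(clades(child))
    return result


def leaf_order(node: Node) -> List[str]:
    """Left-to-right order of the leaves, i.e. how the chart would be drawn."""
    if not node.children:
        return [node.name]
    return [name for c in node.children for name in leaf_order(c)]


def flipped(node: Node) -> Node:
    """Same tree with the children of every node drawn in reverse order."""
    return Node(node.name, node.split_bc,
                [flipped(c) for c in reversed(node.children)])


# Placeholder dates (assumptions): ~2250 BC and ~1900 BC, read loosely
# from the bullet list above, not taken from the paper's figure.
italo_celtic = Node("Italo-Celtic", 1900, [Node("Italic"), Node("Celtic")])
western = Node("Germanic + Italo-Celtic", 2250, [Node("Germanic"), italo_celtic])

mirror = flipped(western)
print(leaf_order(western))
print(leaf_order(mirror))
print(clades(western) == clades(mirror))
```

The drawn order of the leaves changes, so Italo-Celtic no longer has to sit at the end of the chart looking like the "core," but every grouping and every date is identical.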

But it really puts it out there that we can look for material cultures that represent some of these nodes.  For example, if Italic and Celtic didn't break up until after 2,000 BC, we can look for a material culture that's in the right time and place and has the right traits to represent it.  Maybe the Tumulus culture breaking up into Urnfield (early Celtic) and Terramare culture (Italic).  Which has the interesting effect of suggesting that maybe Venetic was a third branch of that group, originating in the Polada culture?  And if Germanic had separated from Italo-Celtic less than half a millennium before Italic and Celtic themselves split, then we can look for a material culture that's in the right time and place and has the right traits to represent the three of them still in a state of some unity (Unetice culture leading to Italo-Celtic Tumulus and early Germanic Nordic Bronze Age?  This even potentially leaves room for that Nordwestblock, although we shouldn't assume that the archaeological culture boundaries we've devised are really boundaries that were meaningful back then.)  And the early Germanic should have a contact border with the material culture of proto-Balto-Slavic (Trzciniec-Komarov culture), because the two do not show a particularly close genetic relationship, but we know that there was a long period of contact between them.

Anyway, it's not like that wasn't being done before this paper came out, but this gives a significantly improved roadmap of what to look for and when.

Anyway, here's the phylogeny from the article:

Another discussion about Celtic and Germanic:

https://www.eupedia.com/forum/threads/26447-Celtic-and-Pre-Germanic
