As more samples come in, more likely possibilities are coming to light. However, I do note some hinkiness in the dates. As we look for the origin of the Indo-European languages and who spoke this proto-Indo-European, we've had a lot of handwaviness in the past between "the steppe," which was often generalized, and the post-steppe expansion into Europe and Central Asia and beyond. We're getting less handwavy over time, and that's a good (and interesting) thing. But I do notice that some of the dates of the cultures that are supposedly in a linear sequence are a little bit hinky.
Let's try to lay this out, and I'll show where the dates are a little funny: either overlapping in a way that doesn't quite make sense, or not leaving enough room between cultures for one to be a normal successor to the other. Probably this is simply a lack of sampling; if we have too few samples, of too poor quality, then the dates we have for archaeological material cultures may be a little rough. First off, I think it's important to note that the Yamnaya culture is no longer seen as the likely ancestor of Indo-European, although sources like Wikipedia (and by extension, Infogalactic) are still out of date. The Yamnaya are now more likely seen as neighbors and probably close cousins of the proto-Indo-Europeans, who interacted with them closely and probably spoke very similar, related dialects, but the specific y-DNA haplogroups associated with the Yamnaya culture are not the y-DNA haplogroups associated with the later spread of Indo-European languages. Because language spread, especially in early historic and prehistoric settings, is often associated with demic changes of some kind, having the Yamnaya spread their languages but not their genetics seems unlikely, especially when a closely related set of genetic cousins did spread their DNA through the Indo-European region, and almost certainly brought their Indo-European languages with them.
Anyway, the first step in this process is to walk back to the earliest stage where we can say something useful. The people of the Dnieper-Donets culture of the western steppe region from the Mesolithic were a robustly built, Swiderian-derived late Cro-Magnon population that by genetics can be associated mostly with the Eastern Hunter Gatherer (EHG) lineages of R1a and even more so R1b, plus I2a, mixed somewhat with a Western Hunter Gatherer (WHG) component. This WHG component increased over time with the transition from Mesolithic to Neolithic, possibly as a result of population pressure from expanding populations of Early European Farmer (EEF) ancestry, whose source was ultimately Anatolia and who would have resembled (genetically, at least, but probably physically as well) the modern day Sardinians. At this point, these archaic culturo-linguistic groups were on some kind of glide-path towards becoming Indo-European, but can't yet be called even early/archaic proto-Indo-European. They don't yet have any Caucasian Hunter Gatherer (CHG) admixture (the event that putatively could have moved them into archaic proto-Indo-European territory, assuming that Bomhard's Caucasian substrate hypothesis, which is certainly gathering momentum in terms of expert approval, is true). Nor do they have any EEF admixture yet, although as we'll see, that's an important element of later Indo-European groups that can confidently be pinned to a specific Indo-European language group.
Dnieper-Donets coexisted with some later groups, of which it was partially the ancestor; probably it got absorbed into additional groups there. The dates are 5,000-4,200 BC. It seems related genetically and culturally to, and probably had relatively intense contacts with, the Samara culture to the east. It was finally absorbed into the Sredni Stog culture, which was probably an outgrowth of this culture itself, admixed with another to the east. Curiously, however, as David Anthony points out, the CHG ancestry was limited to the Volga-Caspian steppes, and the Dnieper-Donets had, instead, the WHG admixture, and later populations that followed Dnieper-Donets (Sredni Stog) had EEF ancestry. But Samara and Dnieper-Donets, while both being primarily EHG-based populations, seem not to have been included in a mutual mating network. Khvalynsk, which followed Samara, had no EEF but rich CHG admixture, and Sredni Stog, which followed Dnieper-Donets, had both EEF and CHG admixture. The relationship between Eastern and Western steppe is still unclear, and how much one contributed to the other is unknown.
This actually makes me wonder if it is possible that the Western Steppe Herder (WSH) ancestry, somewhat different from the EHG with a bit of WHG and EEF that made up the earlier Western steppe, is not the source of PIE. Archaic PIE may well have come from the east, but then late PIE comes from the west, while the PIE groups that remained (i.e. Yamnaya) ended up being linguistic dead-ends. This is where the Caucasian substrate hypothesis comes from: the language of the EHGs, if you will, took on some significant structural and vocabulary borrowings from the language of the CHGs, whose women it appears that they took and married; whether in a friendly fashion or not is impossible to say at this point. Finally, at this point, we can say that archaic proto-Indo-European forms, and it has an east-to-west character across the steppes. The Dnieper-Donets culture picks this up by first becoming Sredni Stog on its eastern edge, a culture that eventually absorbs the remainder of the Dnieper-Donets culture altogether. The dates for Sredni Stog are 4,500-3,500 BC, and for Khvalynsk are 4,900-3,500 BC. Again, the genesis of these EHG (with some WHG on the western edge) populations into an archaic proto-Indo-European group, i.e., with the significant admixture of CHG, especially on the female lines, seems to have been something that spread from the east. This doesn't necessarily mean that literal CHG people spread into the Dnieper-Donets; more likely Sredni Stog was formed when already admixed populations from the east became more socially dominant and spread their dialects, social customs and genetics through intermarriage and other interactions further west. This mixture of (primarily) EHG plus CHG, with male lines almost exclusively from the former and female lines growing significantly from the latter, created the WSH group, which is the genetic profile of the Yamnaya in particular, but also of the Khvalynsk culture which preceded them.
The Sredni Stog is 80% WSH itself; very closely related to the Khvalynsk and Yamnaya groups, but as mentioned above, there was a bit more WHG already present and because of the cultural horizon between the steppes and the Balkan farming communities, the Sredni Stog also picked up some EEF admixture, again, almost exclusively in female lineages.
To this point, although more sampling would be good to confirm our direction, we feel like we understand in broad strokes what happened. It's what comes next that is a little bit confused. The Sredni Stog and Khvalynsk cultures end, and over the same geographical area, the much broader Yamnaya horizon replaces them. Yamnaya's timing is 3,300-2,600 BC, and its ancestry appears to come from the eastern portion of Khvalynsk. The Afanasevo culture seems to be genetically identical to the pre-Yamnaya Repin, and Repin seems to be the source of Yamnaya as well. Yamnaya paternal lineages are closely related to but different from those found in later Indo-European populations, indicating that at least genetically (and therefore almost certainly) they cannot actually be the source of Indo-European languages. They seem to be a closely related sister group. What seems to have happened, although there is a gap in archaeology and genetic sampling that needs to be closed, is that Sredni Stog peoples, especially male lineages, moved further west into the Balkans and the surrounding areas. From somewhere in this area, probably the North Carpathian and Lesser Poland type area, this group expanded to become the Corded Ware horizon. The Corded Ware, however, didn't come around until about 2,900 BC, so how Sredni Stog turned into Corded Ware across that gap (through Usatove?) is unknown.
And here's where we see some hinky dates as well, although a lack of samples or bad radiocarbon results, either or both, may be the source. A derived Corded Ware variant is the Fatyanovo-Balanovo culture, whose dates are 3,200-2,300 BC. How a derivative culture is earlier than the culture from which it derived is, of course, a problem. As Davidski pointed out recently, genetic samples from the Fatyanovo-Balanovo culture cluster extremely closely with both the much later Unetice culture, and even with modern day Poles in an extremely tight spread, indicating how closely related they all were (and still are). However, the F-B culture is seen as the primary source of Sintashta (especially in terms of y-DNA lineages), which led to Andronovo, which led to the entire spread of Indo-Iranian languages: everyone from the Aryan conquerors of the Middle East and northern India, to the Persians, to the Scythians and their related peoples who spread deep into what is today China.
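The date problems so far can be laid out mechanically. Below is a rough sketch in Python that uses only the dates quoted in this post; the successor links are the ones described above, and the names `Culture`, `CULTURES`, `LINKS` and `check_link` are just made up for illustration. Years are BC, so a bigger number is earlier, and the Corded Ware end date is left empty because none is given here.

```python
from typing import NamedTuple, Optional, Tuple


class Culture(NamedTuple):
    name: str
    start_bc: int            # years BC: larger number = earlier
    end_bc: Optional[int]    # None when no end date is quoted in the post


# Dates exactly as quoted in the post; nothing here is independently sourced.
CULTURES = {
    "Dnieper-Donets": Culture("Dnieper-Donets", 5000, 4200),
    "Khvalynsk": Culture("Khvalynsk", 4900, 3500),
    "Sredni Stog": Culture("Sredni Stog", 4500, 3500),
    "Yamnaya": Culture("Yamnaya", 3300, 2600),
    "Corded Ware": Culture("Corded Ware", 2900, None),
    "Fatyanovo-Balanovo": Culture("Fatyanovo-Balanovo", 3200, 2300),
}

# (ancestor, descendant) pairs as the post describes them.
LINKS = [
    ("Dnieper-Donets", "Sredni Stog"),
    ("Khvalynsk", "Yamnaya"),
    ("Sredni Stog", "Corded Ware"),
    ("Corded Ware", "Fatyanovo-Balanovo"),
]


def check_link(ancestor: str, descendant: str) -> Tuple[str, int]:
    """Classify one ancestor -> descendant link by its quoted dates.

    Returns ("inverted", n) if the descendant starts n years before the
    ancestor, ("gap", n) if n years separate the ancestor's end from the
    descendant's start, and ("ok", 0) otherwise.
    """
    a, d = CULTURES[ancestor], CULTURES[descendant]
    if d.start_bc > a.start_bc:
        return ("inverted", d.start_bc - a.start_bc)
    if a.end_bc is not None and a.end_bc > d.start_bc:
        return ("gap", a.end_bc - d.start_bc)
    return ("ok", 0)


if __name__ == "__main__":
    for ancestor, descendant in LINKS:
        kind, years = check_link(ancestor, descendant)
        print(f"{ancestor} -> {descendant}: {kind} ({years} years)")
```

On these numbers, Fatyanovo-Balanovo comes out starting 300 years before Corded Ware, and there is a 600 year gap between the end of Sredni Stog and the start of Corded Ware. The check is only as good as the quoted date ranges, which is rather the point.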
The Unetice sample is a typically Balto-Slavic subclade of R1a. That isn't to suggest that Unetice was Balto-Slavic; they seem to be a very broad pan-European phenomenon, not unlike their own ancestors, the earlier Beaker culture. They were followed by the Tumulus culture, which led to the Urnfield culture. Unetice, while still strongly resembling farther eastern archaic proto-Indo-Iranian (supposedly) Corded Ware variants, and probably maintaining close contact with them still, is too early to belong to a single stock, and geographically probably was ancestral to cultures who later developed into all of the well-known European branches of the Indo-European family: Balto-Slavic, Italo-Celtic, Germanic, and maybe even the more southern paleo-Balkan Indo-European languages about which we know very little, like Thracian, Dacian, Illyrian, Greco-Armenian, Phrygian, etc.
The Beaker culture also seems, by genetics, to obviously be an outgrowth of the Single Grave variant of the Corded Ware, but the horizon starts about the same time that Corded Ware does, according to published dates, anyway, so that's a bit of a challenge. Corded Ware and most of its clear descendants are dominated by R1a clades, but the Beaker culture is dominated by an R1b clade. Not, however, the clade which dominated the Yamnaya culture, and at an autosomal level, in spite of the y-DNA clade, the Beakers seem to be almost exactly like the Single Grave culture. So where did this R1b-L51 and its subclade P312 come from? It's not the Yamnaya subclade, and the Corded Ware was dominated by R1a-M417. At an autosomal level, the Beakers may be identical to the Single Grave, but a totally different y-DNA lineage spreads with them, which comes to dominate western and northern Europe in its wake. (Don't get me wrong; R1a clades are still common. This is especially true in the east, but R1a clades were associated with the Germanic peoples too, for instance.) Did it travel in relative obscurity from the Sredni Stog, into the Corded Ware, and somehow, suddenly come to social dominance with the spread of the Bell Beakers? It seems to have, but more sampling of Sredni Stog and Corded Ware to confirm the presence of R1b-L51 would certainly be nice.
Also, can anything be done to sort out the origins of the Balkan Indo-European languages? Older models suggest that they came from Yamnaya derivative peoples, like the Catacomb, etc. but newer studies suggest that they too may well have a Corded Ware genesis and have actually moved more directly south from the North Carpathians into the Balkans and from there into Greece, Anatolia, and the rest of the places that we know historically that they spread to. All in all, this is still a very handwavy proposition, and more data would be really well received to see if we can't sort this out better.
In a similar fashion, it'd be nice to have better samples from earlier in the Balkans, because it is supposed, and there is tantalizing evidence hinting that this is probably the case, that the Anatolian languages spread from the same population group that led to the Corded Ware in an earlier migration, picking up much more EEF ancestry on the way. The Usatove culture is usually considered to be the likely vector here. And in similar fashion, we still don't really know exactly how the Sredni Stog became the Corded Ware; were there hybrid cultures, or geographical extensions like Usatove further to the north, that could be likely candidates for proto-Corded Ware cultures? Again, more sampling is needed.
To be fair, there are a lot of samples that have been done but not yet published. But we need to start seeing the results of some of these if we are to 1) untangle the origins of some of these peoples, and 2) convince those that are still clinging to models that are starting to look more and more unlikely that they need to be abandoned and a non-Yamnaya model accepted instead.
And while we're at it, Wang's 2020 preprint on Tocharian genetics seems to have raised more confusion and questions than it answered. There's even a proposal to link Tocharian to Corded Ware rather than Afanasevo/Yamnaya, and it's been difficult to sort out because of all of the Saka/Scythian presence in the area as well. More sampling with better dating is required here to make sure that we're actually getting the right population groups. I do kind of like the idea of Tocharian representing the one known descendant of a Yamnaya-derived version of Indo-European, but I can't say with any confidence that I know that it does so anymore.