DNA history of the Kazakh clan Argyn - Zhabagin, Sabitov, Agdzhoyan
- Kyrgyz American Foundation
- Apr 8

The Genesis of the Largest Kazakh Tribal Group – the Argyns – in the Context of Population Genetics
The Argyns are the largest tribal association among the Kazakhs, yet their genesis remains a subject of ongoing debate. The debate turns on two oppositions. The first is whether the Argyns stem from Turkic-speaking or Mongolic-speaking peoples. The second sets traditional Kazakh genealogy (shezhire), which claims a single biological ancestor for the Argyns, against the view that they are a union of tribes of diverse origins.
The goal of our study is to present the genetic profile of the Argyns based on Y-chromosome polymorphism data and to examine the various hypotheses of their genesis from the standpoint of their gene pool.
Genetic profiles of nine Argyn clans (N=384) were constructed using Y-chromosome markers (17 STRs and 44 SNPs). Their gene pool was analyzed in the context of Eurasian genetics, and various historical-ethnographic hypotheses regarding the origin of the Argyns were examined in relation to their gene pool.
It was shown that the gene pool of the proto-Argyns is marked by Y-haplogroup G1, and on the paternal line, it traces back to the heritage of Indo-Iranian-speaking peoples. In terms of genetic distances, the Argyns are closest to the peoples of Iran (Assyrians, Baloch, Iranians, Mazandaranis, Kurds), and the oldest known carrier of haplogroup G1 was found in Western Iran.
Origin theories based on the analysis of ethnonymic similarities do not find genetic support. A comprehensive study of the genealogy and gene pool of the Argyns suggests that their primary ancestor was the Golden Horde emir Karakhoja (14th century) or one of his immediate forebears.
Phylogenetic dating of the sub-branches of haplogroup G1 suggests that G1 has been present in the Eurasian steppe since the early Iron Age (among ancestors of the Kazakhs, Bashkirs, and Mongols). The expansion of G1 in the Kazakh gene pool dates to between 750 and 470 years ago, and the G1 marker specific to both Kazakhs and Argyns is L1323.
The Tribal Structure as a Key Social and Political Institution in Nomadic Societies
The tribal structure is the main social and political institution in societies with a nomadic economic system. It is an essential element of nomadic civilization and serves as the primary unit for building larger political systems.
A highly flexible tribal structure allowed for the formation of state-like entities by large numbers of tribal communities, which could then dissolve when necessary—while still maintaining their tribal identity.
The resilience of this institution is consistently reflected in the demographic history of the region and in the architecture of its gene pool.
The tribal structure represented a hierarchy of clans, with the clan (rod) at its foundation. The functioning of this structure was based on the idea of a common origin (“shezhire”) among the clan’s members. However, shezhire is a social concept, and the genealogical chain it describes may or may not be reflected in the clan’s genetic profile (on the paternal line) [Chaix et al., 2004].
The tradition of passing down a clan name across generations follows the same inheritance pattern as the transmission of Y-chromosomal genetic information—from father to son. This makes the integrated study of tribal and genetic population structures one of the most effective tools for exploring gene pool structures and migration histories [Abilev et al., 2012; Zhabagin et al., 2014; Bogunov et al., 2015; Skhalyakho et al., 2016].
This work presents an interdisciplinary study (involving both geneticists and ethnographers) of one of the key issues in the formation of the Kazakh ethnos: the genesis of the largest tribal association of the Kazakhs—the Argyns.
The territorial range of the Argyns spans a vast area from the Turgai Plateau to Eastern Kazakhstan. By the end of the 19th century, the Argyn population had reached 450,000–500,000 people, making up 15% of the total Kazakh population (3,055,000–3,340,000) [Masanov, 2011].
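The 15% share can be checked directly against the two ranges quoted from [Masanov, 2011]. The short sketch below pairs the low estimates and the high estimates; the pairing is an assumption made here for illustration, since the text does not say how the figure was derived.

```python
# Population figures as quoted in the text from [Masanov, 2011].
argyn_low, argyn_high = 450_000, 500_000
kazakh_low, kazakh_high = 3_055_000, 3_340_000


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Assumption: low is paired with low and high with high.
low_pair = share(argyn_low, kazakh_low)
high_pair = share(argyn_high, kazakh_high)
print(low_pair, high_pair)
```

Under this pairing both ends of the range land at roughly 15%, which matches the rounded figure in the text.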
Modern Kazakhstan’s population census does not record tribal affiliation, so no precise data exist, but estimates place the Argyns at about 19% of the total Kazakh population of around 11 million [Rakishev, 2013].
Although the Argyns are one of the ethnogenesis-forming components of the Kazakhs, the ethnonym “Argyn” is not mentioned in any ancient historical sources [Isayeva, 2013].
Various hypotheses about the origin of the proto-Argyns have been proposed based on comparisons with ancient and medieval ethnonyms, which ultimately come down to two opposing views: one advocating for a Turkic-speaking origin, the other for a Mongolic-speaking one.
Similarly contradictory are the traditional Kazakh genealogical version (shezhire), which traces all descendants back to a single ancestor, and the theory that the Argyns are a “union of tribes” of diverse origins [Zhanuzakov, 1982].
There are two studies in the scientific literature focused on the Argyn gene pool: the first examines one Argyn clan—the Madjar—in connection with its potential genetic affinity with the Hungarians (Magyars) [Biro et al., 2009]; the second addresses the major component of the Argyn gene pool—haplogroup G1-M285 [Zhabagin et al., 2013; Balanovsky et al., 2015].
However, a complete genetic portrait of the Argyns has yet to be presented, and the question of their genesis remains unresolved. This study aims to shed light on the issue using Y-chromosome polymorphism data.
Materials and Methods
The material for this study consisted of venous blood samples from representatives of the Argyn tribal association (N=384), collected by the authors during field expeditions as part of the formation of the Biobank of the Population of Northern Eurasia (regional section: Central Asia) [Balanovskaya et al., 2016], as well as data from the Kazakh Genealogical Project [Sabitov, 2015], the FTDNA-G1 genealogical project [FTDNA], and the literature-based database “Y-base” created under the supervision of O.P. Balanovsky [Balanovsky, 2015].
The collection of biological material (venous blood) was carried out with written informed consent from each participant and under the oversight of the Ethics Committee of the Federal State Budgetary Scientific Institution “Research Center for Medical Genetics” (FGBNU “MGNC”) and the Center for Life Sciences at Nazarbayev University.
Fragment analysis of 17 STR loci was performed using an ABI 3130xl genetic analyzer (Applied Biosystems) with the Y-filer PCR Amplification Kit (Life Technologies). Genotyping of 44 SNP markers was carried out on a 7900HT Real-Time PCR System (Applied Biosystems) using TaqMan probes. Haplogroup classification followed the ISOGG guidelines (Version: 11.177; Date: 27 June 2016) [ISOGG].
Phylogenetic analysis, calculation of Nei’s genetic distances, and their visualization via multidimensional scaling were conducted in the same manner as in our previous works [Zhabagin et al., 2014; Skhalyakho et al., 2016].
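As an illustration of the distance step, the sketch below computes Nei's standard genetic distance, D = -ln(I), from haplogroup frequencies treated as alleles of a single locus. This is an assumption about the variant used; the text only says "Nei's genetic distances" and refers to earlier work for details. The function name `nei_distance` and all frequencies are hypothetical toy values, not data from the study.

```python
import math


def nei_distance(p, q):
    """Nei's standard genetic distance D = -ln(I) for a single locus.

    p and q map haplogroup name -> frequency in each population.
    I is the normalised identity: sum(p*q) / sqrt(sum(p^2) * sum(q^2)).
    """
    groups = set(p) | set(q)
    jxy = sum(p.get(g, 0.0) * q.get(g, 0.0) for g in groups)
    jx = sum(v * v for v in p.values())
    jy = sum(v * v for v in q.values())
    if jxy == 0.0:
        return math.inf  # the populations share no haplogroups
    return -math.log(jxy / math.sqrt(jx * jy))


# Hypothetical toy frequencies, for illustration only.
pop_a = {"G1": 0.6, "C2": 0.3, "R1a": 0.1}
pop_b = {"G1": 0.1, "C2": 0.3, "R1a": 0.6}
pop_c = {"J1": 1.0}

d_ab = nei_distance(pop_a, pop_b)
d_ac = nei_distance(pop_a, pop_c)
print(d_ab, d_ac)
```

Identical frequency profiles give a distance of zero, and the distance grows without bound as the shared haplogroups disappear.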
For population comparison, materials from the “Y-base” database, developed under the direction of O.P. Balanovsky, were used (Y-base DB).
Results. Genetic Profiles of Argyn Clans
For the first time, a complete genetic profile of the Argyns is presented: haplogroup frequencies and phylogenetic clusters of STR haplotypes were identified. Among the 11 detected clusters, the haplotype cluster corresponding to haplogroup G1-M285 is overwhelmingly dominant, comprising two-thirds (67%) of the Argyn gene pool.
Representatives from all nine studied Argyn clans were found within the G1-M285 cluster. The estimated age of this cluster is between 600 and 200 years (based on 15 Y-STR markers). Figure 2 provides a detailed spectrum of all Y-chromosome lineages according to Argyn genealogy.
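The text does not state which estimator produced the cluster age. One common approach for Y-STR data is the rho statistic: the mean number of repeat-count steps separating each haplotype from the presumed founder, divided by the mutation rate summed over loci. The sketch below assumes that approach; the function `rho_age_years`, the haplotypes, the mutation rate of 0.002 per locus per generation, and the 30-year generation time are all hypothetical placeholders, not values from the study.

```python
def rho_age_years(haplotypes, founder, mutation_rate, generation_years):
    """Estimate a cluster's age from mean stepwise distance to the founder.

    haplotypes: list of tuples of STR repeat counts, one entry per locus.
    founder: tuple of repeat counts for the presumed founder haplotype.
    mutation_rate: assumed mutations per locus per generation.
    generation_years: assumed length of one generation in years.
    """
    n_loci = len(founder)
    steps = [sum(abs(a - b) for a, b in zip(h, founder)) for h in haplotypes]
    rho = sum(steps) / len(haplotypes)
    generations = rho / (mutation_rate * n_loci)
    return generations * generation_years


# Hypothetical four-locus toy data: three founder copies, one one-step variant.
founder = (14, 12, 23, 10)
sample = [founder, founder, founder, (14, 13, 23, 10)]

age = rho_age_years(sample, founder, mutation_rate=0.002, generation_years=30)
print(age)
```

Estimates of this kind depend strongly on the chosen mutation rate and generation time, which is one reason cluster ages are reported as ranges.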
The high frequency of haplogroup G1 across almost all genealogical lines of the Argyns indicates the likely existence of a shared biological ancestor.
The age of the common G1 cluster among Kazakhs and Mongols was estimated at 3,000 years [Balanovsky et al., 2015]. This suggests that haplogroup G1 has existed in the Eurasian steppe since the early Iron Age.
Its expansion within the Kazakh population gene pool dates to between 470 and 750 years ago (according to full Y-chromosome sequencing data) and coincides with the lifetime of the genealogical ancestor of the Argyns mentioned in historical sources—the Golden Horde emir Karakhoja (14th century) [Balanovsky et al., 2015]. This expansion may have been driven by selection based on social prestige [Zerjal et al., 2003].
The G1 cluster also includes the genealogical sub-lineage Madzhar (Tokal Argyn group). In [Biro et al., 2009], a hypothesis was proposed suggesting genetic relatedness between the Madzhar and the Magyars (Hungarians).
However, the G1-M285 lineage is absent in the Magyar population [Volgyi et al., 2008]. The mistaken perception of genetic similarity between the Madzhar and the Magyars in [Biro et al., 2009] arose from combining data from haplogroup G1 and the related haplogroup G2 in their analysis.
Yet these two lineages diverged about 20,000 years ago (19,000 ± 6,000 years) [Rootsi et al., 2009], which is many millennia earlier than any plausible time frame for a shared ancestry between the Hungarians (Magyars) and the Kazakhs (Argyns).
In the spectrum of ancestral lineages, two clans stand apart: Tobyqty and Tarakty. According to genealogical traditions, the founder of the Tarakty clan was not a biological son but an “adopted son of Argyn”; his descendants are linked to the Argyns only through the maternal line [History of Kazakh Tribal Associations, 2007]. The genetic data are consistent with this version: haplogroup G1 is comparatively infrequent among the Tarakty.
An even lower frequency of haplogroup G1 was found in another clan—Tobyqty. They predominantly carry the sub-haplogroup J1-M267(xP58), which is extremely rare among other Argyn clans.
The sub-haplogroup J1-M267(xP58) is characteristic of East Caucasus populations (peaking at 99% among the Kubachins of Dagestan) [Balanovsky et al., 2011], and is also found among Assyrians in Iraq (18%), Turkey (16%), and Iran (10%) [Chiaroni et al., 2010], pointing to Near Eastern roots of this clan’s paternal lineage.
The Gene Pool of the Argyns in the Eurasian Context
The phylogenetic network of haplogroup G1 reveals a sub-branch characteristic of Kazakhs, defined by the marker L1323 (validated in four Kazakh samples analyzed by the company FTDNA). Closely related sub-branches are found among Mongols (marker GG1), Ashkenazi Jews (marker L201), and the populations of Kuwait and Syria (marker Y14914) [ISOGG].
The phylogenetic network in Figure 3 is presented in two versions. Figure 3A reflects data from scientific population studies and represents an update of the phylogenetic network from [Balanovsky et al., 2015], which identified four clusters with distinct ethnogeographic characteristics—those of Kazakhs, Mongols, Bashkirs, and Armenians. All new G1 samples of the Argyns obtained in this study fall into the “Kazakh” cluster.
Figure 3B, which reflects data from commercial analyses (FTDNA genealogical projects), reveals at least three additional clusters: a European cluster (Ashkenazi Jews, L201), a Kuwaiti Arab cluster (Y14914), and a Saudi Arab cluster (CTS11562, not further differentiated).
One cluster combines the FTDNA sample from Turkey and a population sample of Armenians represented by the Hemshins (Amshens), currently residing in Russia but historically originating from Trabzon (modern-day Turkey).
All FTDNA samples of the Argyns were part of the Kazakh cluster identified from population data. These results, first, demonstrate that careful inclusion of data from commercial and genealogical projects (when cross-checked with scientific population studies) can enhance the scope of gene-geographic analysis.
Second, they underscore the need for further research into this G1 cluster, which is crucial for reconstructing migration routes from the Near East into the Eurasian steppe.
While the highest frequency peak of G1 is found in the steppe zone of Central Asia (predominantly among the Argyns), a second peak lies in the Iranian-Armenian Highlands [Balanovsky et al., 2015].
A connection between these two peaks can be traced back 8,000 years and is accompanied by a decline in haplotypic diversity from western Iran eastward across Southwest Asia and then northward into the Eurasian steppes.
This makes the western part of the Iranian-Armenian Highlands the most likely candidate for the origin of haplogroup G1 [Balanovsky et al., 2015].
New data from paleogenetics confirm this hypothesis, which was previously proposed by our team: the oldest known carrier of haplogroup G1 to date was discovered in western Iran (Seh Gabi, sample I1674), dating to the Chalcolithic period (4500–3500 BCE) [Lazaridis et al., 2016].
The next two most frequent haplogroups in the Argyn gene pool, C2 (9%) and R1a1a (7%), are roughly an order of magnitude less common than G1 (67%).
While the appearance of C2 is associated with the Mongol expansion [Zerjal et al., 2003], the presence of R1a1a may derive from at least two sources previously described in [Underhill et al., 2015; Karmin et al., 2015].
The first lineage (marked Z2125) is found among the Kyrgyz and Pashtuns of Afghanistan (>40%), several populations of Iran and the Caucasus (>10%), and among Kazakhs (1.5%) [Underhill et al., 2015].
(KAF Note: According to various academic sources, the frequency of haplogroup R1a1 among the Kyrgyz reaches 50–65%.)
The second lineage (marked M780) is found in South Asia (India, Pakistan, Afghanistan, the Himalayas) and West Asia (Iran), and in the Kazakh clan Sarysopy (genealogical sub-lineage Babasan) [Sabitov, 2012; Karmin et al., 2015].
Genetic Verification of the Hypotheses on the Origin of the Argyns
Each Y-chromosome lineage has its own history of origin and spread. However, the “chronicle” of any single Y-chromosome lineage cannot be used to reconstruct the entire history of a tribal association or of a whole population.
Typically, several Y-chromosome lineages migrate together as an ensemble from one regional center and contribute to the existing “melting pot” of local genetic elements.
Therefore, it is crucial to study the paternal (Y-chromosomal) gene pool as a whole and to reconstruct population origins based on a synthesis of historical, ethnographic, archaeological, anthropological, and genetic data.
To that end, we calculated and visualized genetic distances between the Argyns and the populations whose relatedness to them has been proposed in various historical and ethnographic hypotheses regarding Argyn origins.
On the multidimensional scaling plot, four clusters emerge, whose positions correspond to geography:
• The “Western” cluster includes the peoples of the Caucasus and Iran
• The “Central” cluster contains the peoples of Central Asia
• The “Southeastern” cluster comprises the peoples of the Altai and Siberia
• The “Northeastern” cluster includes the peoples of the Altai and Mongolia
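A plot of this kind is obtained by turning a matrix of pairwise genetic distances into two-dimensional coordinates. The sketch below shows classical (Torgerson) multidimensional scaling in plain Python, using power iteration for the two leading eigenvectors. It is only an assumption that a comparable variant was used; the text does not name the software or the MDS flavour. The function `classical_mds_2d` and the 3-4-5 toy distances are hypothetical.

```python
import math


def classical_mds_2d(dist, iterations=500):
    """Embed a symmetric distance matrix in 2-D by classical (Torgerson) MDS."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0, 0.0] for _ in range(n)]
    for axis in range(2):
        # Power iteration from a fixed start vector (deterministic output).
        vec = [math.sin(i + 1 + axis) for i in range(n)]
        for _ in range(iterations):
            nxt = [sum(b[i][j] * vec[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm < 1e-12:
                break
            vec = [x / norm for x in nxt]
        length = math.sqrt(sum(x * x for x in vec))
        vec = [x / length for x in vec]
        bv = [sum(b[i][j] * vec[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(vec[i] * bv[i] for i in range(n))  # Rayleigh quotient
        # Only positive eigenvalues yield a real coordinate axis.
        if eigenvalue > 0:
            scale = math.sqrt(eigenvalue)
            for i in range(n):
                coords[i][axis] = scale * vec[i]
        # Deflate so the next pass converges to the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigenvalue * vec[i] * vec[j]
    return coords


# Hypothetical toy distances (a 3-4-5 triangle), not values from the study.
toy = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
points = classical_mds_2d(toy)
print(points)
```

For distances that are exactly Euclidean, as in the toy triangle, the embedding reproduces them; for real genetic distances the two axes only approximate the matrix, so cluster positions should be read as a summary rather than an exact map.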
The Argyns are located among neighboring Kazakh populations (most closely related to the Kazakhs of the Altai, d = 1.19), underscoring a shared historical trajectory in the formation of the Kazakh gene pool.
The populations genetically closest to the Argyns are from Iran: Assyrians (d = 1.45), Baloch (d = 1.67), Iranians (Bandari) (d = 1.69), Mazandaranis (d = 1.69), and Kurds (d = 1.75).
Mongols are also genetically close to the Argyns (d = 1.57), reflecting the genetic influence of Mongol expansion in the 12th–15th centuries.
This picture does not confirm any one ethnographic version of the Argyns’ origin. The genetic proximity of this Kazakh group to the peoples of the Iranian Plateau points to a significant shared component (“substrate”), which may have entered the proto-Argyn gene pool via migration from the southwest by Indo-Iranian peoples or their descendants.
The genetic similarity between the Argyns and the Kazakhs of the Altai and the Mongols reflects a later genetic component (“superstrate”) that was added to the Argyn gene pool through migrations of Turkic- and Mongolic-speaking peoples.
However, by the time the descendants of the proto-Argyns had formed a tribal-social identity and began to identify as descendants of a single ancestor, Argyn, they were most likely already a Turkic-speaking group—just like their genealogical founder, Karakhoja. This is supported by historical sources from the era of the Golden Horde [Sultanov, 1982].
Thus, the paternal gene pool of the Argyns carries a primary legacy from Indo-Iranian-speaking peoples or their descendants, and only in later periods incorporated elements from the gene pools of other Turkic- and Mongolic-speaking populations.
Conclusions
As part of an interdisciplinary approach, historical and ethnographic information about the Argyns has been compiled, the results of genetic studies of their gene pool have been analyzed, and—for the first time—a comprehensive genetic portrait of the Argyn clans has been created.
This set of findings allows us to formulate several conclusions:
None of the ethnographic hypotheses about the origin of the proto-Argyns (whether from Mongolic- or Turkic-speaking communities) finds full genetic support based on Y-chromosome polymorphism data. These hypotheses require further comprehensive investigation using other genetic systems.
The strongest genetic affinity of the Argyns on the paternal line is with the peoples of the Iranian Plateau. This is due to the major component of their gene pool—haplogroup G1—likely inherited from Indo-Iranian-speaking peoples (or more plausibly, their descendants who later adopted Turkic languages).
The genetic similarity of the Argyn gene pool to that of the Altai Kazakhs and the Mongols reflects a later genetic legacy from Turkic- and Mongolic-speaking populations.
The genetic unity of the Argyn clans is clearly expressed in the overwhelming dominance of haplogroup G1-M285 in most of their clans, comprising two-thirds of the total Argyn gene pool.
A comprehensive study of the genealogy and gene pool of the Argyns suggests that their primary progenitor (carrier of haplogroup G1) was the Golden Horde emir Karakhoja (14th century) or his immediate ancestors.
Therefore, the theory that the Argyns are a union of tribes of diverse origins does not find support in the genetic data.
The hypothesis that the Iranian-Armenian Highlands are the ancestral homeland of haplogroup G1 receives new support. The appearance of this haplogroup in the Eurasian steppe is dated to the Early Iron Age (approximately 3,000 years ago), and its expansion within the Kazakh gene pool occurred between 470 and 750 years ago.
The marker G1-L1323 is specific to both Kazakhs and Argyns. The closest related Y-chromosome lineages are found among Mongols (marker GG1), Ashkenazi Jews (L201), and the populations of Kuwait and Syria (Y14914).
Authors:
M.K. Zhabagin, Zh.M. Sabitov, A.A. Agdzhoyan, Yu.M. Yusupov, Yu.V. Bogunov, M.B. Lavryashina, I.M. Tazhigulova, A.R. Akilzhanova, Zh.Sh. Zhumadilov, O.P. Balanovsky, E.V. Balanovskaya
Institutions:
• National Laboratory Astana, Nazarbayev University, Astana, Kazakhstan
• L.N. Gumilyov Eurasian National University, Astana, Kazakhstan
• Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
• Research Center for Medical Genetics, Moscow, Russia
• Institute for Strategic Studies of the Republic of Bashkortostan, Ufa, Russia
• Kemerovo State University, Kemerovo, Russia
• Center for Forensic Expertise, Ministry of Justice of the Republic of Kazakhstan, Astana, Kazakhstan