Linking the Sino-Tibetan fallen leaves

Andrew Hsiu

May 2018

Please cite as: Hsiu, Andrew. 2018. Linking the Sino-Tibetan fallen leaves. <https://sites.google.com/site/msealangs/home/blog/fallen-leaves>.

Please note that this is a working draft that will be periodically updated.

1. List of "fallen leaves"

Below are some Sino-Tibetan branches that are best left unclassified as "fallen leaves," the concept of which is further discussed in van Driem (2014).

  1. Cai-Long languages: Longjia, Caijia, Luren

  2. Kathu

  3. Gong

  4. Taman

George van Driem (2014) lists 42 "fallen leaves".

  1. Bodish

  2. Tshangla

  3. West Himalayish

  4. Tamangic

  5. Newar

  6. Kiranti

  7. Lepcha

  8. Magaric

  9. Chepangic

  10. Raji–Raute

  11. Dura

  12. 'Ole

  13. Gongduk

  14. Lhokpu

  15. Siangic

  16. Kho-Bwa

  17. Hruso

  18. Digaro

  19. Midzu

  20. Tani

  21. Dhimal

  22. Sal: Bodo–Koch + Konyak

  23. Ao

  24. Angami–Pochuri

  25. Tangkhul

  26. Zeme

  27. Meitei

  28. Karbi

  29. Sinitic

  30. Bai

  31. Tujia

  32. Lolo-Burmese

  33. Qiangic

  34. Ersuish

  35. Naic

  36. rGyalrongic

  37. Kachin–Luic

  38. Nungish

  39. Karenic

  40. Pyu

  41. Mru

  42. Kukish

Even once coherent branches are identified, it can be difficult to put them together into higher-level groups, as we have seen with the Indo-European branches, whose relationships to one another continue to defy classification attempts despite more than a century of rigorous research.

STEDT is a wonderful resource that contains data for most but not all of these fallen branches. Unfortunately, it is no longer being updated; its last release was in 2015.

  1. Proto-Luish [Matisoff 2013]

  2. Proto-Ersuic [Yu 2012]

  3. Proto-Bai [Wang 2006]

  4. Cai-Long

  5. Kathu

  6. Dura

  7. Gongduk

  8. Pyu

  9. Hruso

  10. Siangic

  11. Kho-Bwa

2. Some notes on selected Tibeto-Burman "fallen leaves"

Below are my personal speculations on the classifications of some Tibeto-Burman "fallen leaves."

Pyu appears to be an independent branch that shares some similarities with neighboring Tibeto-Burman branches such as Meithei, Kuki-Chin, and Sal. Like Meithei, Pyu has PTB *s- > h-. It may or may not be a Luish language related to Sak. It looks like Luish languages were in contact with Pyu, but are not directly related to it. I have compiled <Pyu.xlsx>, which has Pyu data from Luce (1985) and comparisons from the STEDT database.

Mruic, a branch consisting of Mru, Anu, and Hkongso, is definitely not Lolo-Burmese, contrary to Löffler (1966). Its autonym is reminiscent of certain Austroasiatic autonyms such as "Bru." Austroasiatic words that look similar to "Mru" include Proto-Khasic *brəəw 'person', Proto-Khmuic *-brɔʔ 'man, person', and Proto-Khmuic *kmraʔ 'person' (MKED database). Mru does not appear to have many Austroasiatic loanwords, but its phonology is highly Mon-Khmer-like with sesquisyllables, final liquids, and initial consonant clusters with liquids. A similar parallel is Kerinci Malay, which has a Mon-Khmer-like phonology with sesquisyllables and Mon-Khmer-like diphthongs, but no recognizable Austroasiatic loanwords. However, there has not yet been any solid argument for Mru having an Austroasiatic substratum.

Gong is certainly not Lolo-Burmese, and may be a remnant of what was once a more widespread Tibeto-Burman branch in the Myanmar-Thai borderlands, located to the south of the Karenic-speaking area. It shows an unusual development of PTB *s- > ʔ- or zero, which had developed from ʔl- according to Bradley. Like Karenic speakers, the ancestors of the Gong would have migrated down the Salween watershed from the Tibeto-Burman heartland further north. However, unlike Karenic, Gong word order is SOV rather than SVO (Mayuree 2006).

Kathu shows similarities to Tibeto-Burman languages in the Myanmar-China-India border triangle, such as Nungic, Tani, Karenic, Taraon, Nusu, Naic, Qiangic, and others. My hypothesis is that Kathu (or "pre-Kathu") had split off independently from Proto-Tibeto-Burman and was in contact with other eastern branches of Tibeto-Burman in its former range, which would have been located much further to the northwest of its current distribution.

Cai-Long languages: Caijia, Longjia, and Luren of western Guizhou have no immediately recognizable close relatives, but Bai has been suggested as a close relative (Zhengzhang 2010). Waxiang Chinese is related according to Sagart (2011). However, while the superstrata of Bai and the Cai-Long languages belong to similar forms of western Old Chinese dialects, the substratum languages of Bai and Cai-Long constitute separate, independent southern branches of Burmo-Qiangic. The superstratum of Bai and Cai-Long was the first to split off from Old Chinese, even before Min (following Sagart 2011 on Caijia-Waxiang as the first branch to split from Old Chinese). These languages all have Burmo-Qiangic substrata, but the substrata would have belonged to different branches of Burmo-Qiangic. Hence, the superficial similarities between Bai and Cai-Long-Waxiang are due to (1) shared features from the same Old Southwestern Chinese superstratum that is common to Bai, Cai-Long, and Waxiang, and (2) shared retentions from Proto-Burmo-Qiangic in the substratum layer. Thus, pre-Bai and pre-Cai-Long-Waxiang were each separate branches of Burmo-Qiangic. Pre-Cai-Long-Waxiang would have had a much wider geographical distribution in the past, and would have been spread across much of northern Guizhou.

3. Linked leaves

I believe that historically, many Tibeto-Burman branches formed linkages with each other, with transmission probably taking place both vertically and horizontally. They may or may not have had the same common ancestors, but they definitely influenced each other through contact. The following list is partially based on my unpublished computational phylogenetic analysis of Tibeto-Burman branches, as well as previous proposals of Tibeto-Burman groupings by George van Driem, David Bradley, Guillaume Jacques, Scott DeLancey, Nicolas Schorer, and others. Please note that these are untested preliminary impressions, and that I have not yet provided any well thought-out evidence for this classification.

  1. Western Tibeto-Burman ("Nepal") linkage

    1. West Himalayish

    2. Newaric (Newar + Baram-Thangmi) [van Driem]

    3. Greater Magaric (Chepangic-Raji + Magaric + Dura) [Schorer]

    4. Tamangic

    5. Bodic (?)

  2. Central Tibeto-Burman linkage [DeLancey 2015]

    1. Kuki-Chin-Naga linkage

      1. Kuki-Chin

      2. Zeme

      3. Tangkhul

      4. Meithei

      5. Angami-Pochuri

      6. Ao

      7. Karbi

    2. Sal linkage

      1. Bodo-Koch

      2. Konyak

      3. Kachin

      4. Luish

      5. Dhimal (?)

    3. Mruic

    4. Pyu

    5. Taman [Huziwara 2016]

    6. Miju (Kaman-Zakhring) [Central TB according to DeLancey 2015]

  3. Eastern Tibeto-Burman linkage [Bradley; van Driem]

    1. Burmo-Qiangic (Lolo-Burmese + Naic + "Qiangic" + Ersuic) [Jacques & Michaud]; Bai, Cai-Long-Waxiang [included by A. Hsiu (2018)]

    2. Kathu

    3. Nungic

    4. Karenic

    5. Gong

    6. "Donor Miao-Yao" [see Ratliff 2010; Paul Benedict]

    7. Tujia

    8. "Donor Kra" [see Ostapirat 1995, etc.]

    9. Sinitic

  4. Kiranti

  5. Gongduk (+ substratum of East Bodish [van Driem])

  6. Lhokpu (+ substratum of Dzongkha [van Driem])

  7. 'Ole

  8. Tshangla

  9. Lepcha

  10. Tani [which has a Greater Siangic substratum (Blench 2014)]

  11. Greater Siangic [Blench]

    1. Siangic (Koro-Milang)

    2. Digaro (Idu–Taraon)

  12. Puroik (Sulung)

  13. Kho-Bwa

    1. Bugun

    2. Mey, Sartang

    3. Chug, Lish

Likely isolates

  1. Hrusish (Aka + Miji)
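
For readers who want to work with the grouping above programmatically, it can be held as a nested structure and queried for the linkage a branch falls under. The block below is only a sketch: it encodes the Central Tibeto-Burman linkage exactly as listed above (query marks and bracketed sources dropped), and the names `LINKAGES` and `find_path` are hypothetical, introduced here for illustration.

```python
# Excerpt of the list above: only the Central Tibeto-Burman linkage is encoded.
# Query marks such as "Dhimal (?)" and bracketed sources are omitted (a simplification).
LINKAGES = {
    "Central Tibeto-Burman linkage": {
        "Kuki-Chin-Naga linkage": {
            "Kuki-Chin": {}, "Zeme": {}, "Tangkhul": {}, "Meithei": {},
            "Angami-Pochuri": {}, "Ao": {}, "Karbi": {},
        },
        "Sal linkage": {
            "Bodo-Koch": {}, "Konyak": {}, "Kachin": {}, "Luish": {}, "Dhimal": {},
        },
        "Mruic": {}, "Pyu": {}, "Taman": {}, "Miju": {},
    },
}


def find_path(tree, target, trail=()):
    """Return the chain of groupings leading to `target`, or None if it is absent."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == target:
            return list(here)
        found = find_path(children, target, here)
        if found:
            return found
    return None


print(find_path(LINKAGES, "Konyak"))
```

Branches that are not encoded in the excerpt, such as Sinitic, simply return `None`.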

Some of my personal observations (new proposals):

  • Tibeto-Burman branches that may have received Austroasiatic influence include those in the Kuki-Chin-Naga linkage, Tani, Mru, and perhaps some Lolo-Burmese languages.

  • Idu-Taraon, Miji, Miju, Puroik, and Kho-Bwa are all early splits from Proto-Tibeto-Burman rather than isolates. All branches clearly show Tibeto-Burman forms in their basic vocabulary. Only Hrusish and Siangic look to me as if they are likely non-Sino-Tibetan languages. Hruso, with its long strings of consonant clusters reminiscent of Salish and Berber, has a very non-Sino-Tibetan look to it.

  • Digaro looks like it has Siangic + Central TB + Eastern TB elements.

  • Tamangic is similar to Bodish and may form a branch with it.

  • Gongduk has some Central TB forms.

  • Burmo-Qiangic is a linkage that has the internal diversity of Sal or perhaps even Central Tibeto-Burman. "Qiangic" is a linkage consisting of rGyalrongic, Rma, Pumi, Muya (Minyak), Zhaba / Queyu, Guiqiong, etc. that has the internal diversity of Kuki-Chin-Naga. It does not include Naic and Ersuic, which are divergent and are separate branches within the Burmo-Qiangic linkage.

In my view, based on lexical evidence, languages in the Eastern Sino-Tibetan linkage roughly diversified as follows, from earliest to latest.

Eastern Sino-Tibetan linkage

- Sinitic

-- Tujia

--- Karenic

---- Gong

----- Nungish

------ pre-Jiamao

------- Kathu

["Burmo-Qiangic" branches below]

--------- Bai, Cai-Long

---------- Ersuish

----------- Qiangic

------------ Naic

------------- Burmish

-------------- Loloish

This is where the wave model and tree model both come into play. Combinations of vertical and horizontal transmission have complicated phylogenetic trees since the earliest life forms arose, with the best-known examples being those of multiple cross-breeding events among early human species, and constant gene transfers among Bacteria, Archaea, and Eukarya.

Other than the four major linkages listed above, the other languages would be best left as primary branches of Tibeto-Burman. Certain Tibeto-Burman languages of Arunachal Pradesh may have non-Tibeto-Burman substrata, resulting in divergent lexical items.

The NeighborNet results of my 2014 trial run of Sino-Tibetan fallen leaves using SplitsTree 4.0 are shown in the screenshot below. Cognate sets of 18 vocabulary items, mainly from STEDT, were included.
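
Distance-based methods such as NeighborNet start from a matrix of pairwise distances between taxa. The exact coding and distance measure used in the trial run are not stated here, so the block below is only a sketch under one common assumption: each vocabulary item is coded with a cognate class per language, and the distance between two languages is the proportion of comparable items on which their cognate classes differ. The function name `cognate_distances`, the placeholder taxa, and the toy coding are all hypothetical and are not drawn from STEDT.

```python
from itertools import combinations


def cognate_distances(coding):
    """Pairwise distances from cognate-class codings.

    coding: {language: [cognate class per concept, or None if data are missing]}
    Returns {(lang_a, lang_b): 1 - shared / compared}, or None if nothing is comparable.
    """
    dist = {}
    for a, b in combinations(sorted(coding), 2):
        # Compare only concepts attested in both languages.
        pairs = [(x, y) for x, y in zip(coding[a], coding[b])
                 if x is not None and y is not None]
        shared = sum(x == y for x, y in pairs)
        dist[(a, b)] = 1 - shared / len(pairs) if pairs else None
    return dist


# Hypothetical toy coding: four placeholder taxa, five concepts (not real data).
toy = {
    "TaxonA": [1, 1, 1, 1, 1],
    "TaxonB": [1, 1, 2, 1, None],
    "TaxonC": [2, 1, 2, 2, 1],
    "TaxonD": [2, 2, 2, 2, 2],
}
d = cognate_distances(toy)
for pair, value in d.items():
    print(pair, value)
```

With only 18 items, a single changed cognate judgement shifts a pairwise distance by more than five percentage points, which is one reason to treat results from such a small sample as exploratory.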

Phylogram using the Neighbor-Joining (NJ) algorithm:
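
For reference, the Neighbor-Joining procedure itself can be sketched in a few lines. The block below is a minimal generic implementation, not the SplitsTree code and not the analysis behind the phylogram. The function name `neighbor_joining`, the internal node labels, and the five-taxon distance matrix (arbitrary numbers for placeholder taxa a to e, not language data) are assumptions made for illustration.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Build an unrooted NJ tree.

    names: list of taxon labels
    dist:  {frozenset({a, b}): distance}
    Returns a list of (node, neighbour, branch_length) edges.
    """
    d = dict(dist)
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # r[i]: total distance from i to every other active node.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Join the pair minimising Q(i, j) = (n - 2) * d(i, j) - r[i] - r[j].
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]])
        dij = d[frozenset((i, j))]
        li = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))  # branch length i -> new node
        lj = dij - li                                 # branch length j -> new node
        u = f"node{next_id}"
        next_id += 1
        edges += [(i, u, li), (j, u, lj)]
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = (d[frozenset((i, k))] + d[frozenset((j, k))] - dij) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))  # connect the last two nodes
    return edges


# Arbitrary illustrative distances for placeholder taxa (not language data).
names = ["a", "b", "c", "d", "e"]
raw = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
       ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
       ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
dist = {frozenset(p): v for p, v in raw.items()}
edges = neighbor_joining(names, dist)
for edge in edges:
    print(edge)
```

Unlike NeighborNet, NJ always returns a single tree, so conflicting signal from borrowing or contact is forced into one branching order rather than displayed as a network.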

Additional maps are shown below.

Map 1: Proposed dispersal of Tibeto-Burman branches. Branch names (otherwise known as George van Driem's "Trans-Himalayan fallen leaves") are labeled in different colors according to the linkage or geographic area that they are in. Some of my proposed new "fallen leaves" in the Eastern Tibeto-Burman area are labeled in orange.

Map 2: Single-crop rice regions and Sino-Tibetan branches in China.

3.1. Burmo-Qiangic fallen leaves

Chirkova (2012) believes that Qiangic may be a linguistic area rather than an actual unified branch. In other words, Qiangic is likely to be paraphyletic. The comparative lexical data in ZMYYC (1991) and STEDT show that "Qiangic" has high internal lexical diversity. Based on evidence from comparative lexical data and geographical distribution, I believe that Burmo-Qiangic is best divided into multiple primary branches that had radiated out from the Sichuan Basin via the Upper Yangtze drainage basin before the Qin and Han conquests of Sichuan occurred. The Three Gorges served as a natural geographical barrier between Burmo-Qiangic and the non-Burmo-Qiangic languages to the east such as Tujia, Hmong-Mien, and the missing "Donor Miao-Yao" (or rather "Donor Hmong-Mien") branch of Tibeto-Burman proposed by Benedict (1988).

The Tibeto-Burman form *syam 'iron', reconstructed by Matisoff in STEDT, has an almost exclusively Burmo-Qiangic distribution, with sporadic loanwords in non-Burmo-Qiangic languages to the west such as Nungic languages, Apatani (Tani branch), and Deori (Bodo-Koch branch). Hence, *syam 'iron' is actually a Proto-Burmo-Qiangic lexical innovation. This suggests that Proto-Burmo-Qiangic speakers were an economically influential group that had already developed metalworking. Hence, the Sanxingdui culture of Sichuan that had existed over 3,000 years ago was very likely to have been associated with speakers of early forms of Burmo-Qiangic languages.

Sun Hongkai (2013) also notes that the geographical distributions of Qiangic subgroups correspond with specific watersheds.

George van Driem's "fallen leaves" model can be applied to Burmo-Qiangic as well. Further comparative work will be needed to figure out the relationships of these branches to each other, and whether Nungic, Karenic, and Gong (or perhaps even Sinitic) are sister branches of Burmo-Qiangic as part of a larger "Eastern Tibeto-Burman" group.

  1. Baima [with Tibetic superstratum]

  2. rGyalrongic

  3. Rma (Qiang)

  4. Muya (Minyak)

  5. Pumi (Prinmi)

  6. Tangut

  7. Zhaba / Queyu (Choyo)

  8. Guiqiong

  9. Ersuic

  10. Naic

  11. Lolo-Burmese

  12. Bai [with Old Chinese superstratum]

  13. Cai-Long [with Old Chinese superstratum]

  14. Waxiang [with Old Chinese superstratum]

The following map shows the Burmo-Qiangic branches and their probable dispersal routes. My Burmo-Qiangic dispersal hypothesis is synthesized from earlier work by Jacques & Michaud (2011), Sun (2013), and Chamberlain (2015). Bai has been included based on Lee & Sagart (2008), Cai-Long and Waxiang based on Zhengzhang (2010) and Sagart (2011), and Baima based on Chirkova (2008).

Map 3

Map legend:

Red = Burmo-Qiangic branches

Blue = Hmong-Mien branches

Green = Kra-Dai (Austro-Tai) branches

Pink = Austroasiatic branches

Purple = non-Burmo-Qiangic Tibeto-Burman branches

Brown = Old Chinese

(Note: She of Jiangxi, Fujian, and Zhejiang is closely related to Hakka, and is completely distinct from the She of Guangdong, which is a Hmongic language. The She of eastern China may have been "Para-Hmong-Mien" speakers who had shifted to Hakka as Hakka speakers from the north moved into the hills of eastern China.)

Two additional maps are shown below.

Map 4: Diversity of Burmo-Qiangic branches in the Upper Yangtze watershed

Map 5: Dispersal of Lolo-Burmese branches from the Lolo-Burmese homeland in Dian Lake

Many Qiangic subgroups are located within the Min River watershed of Sichuan, while the remaining Burmo-Qiangic diversity is concentrated in the Jinsha River (Upper Yangtze) watershed. Thus, it is not unreasonable to assume that early forms of Qiangic (or Burmo-Qiangic) were spoken in the Sichuan Basin during the late Neolithic.

Likewise, Kra-Dai dispersed via the Pearl River drainage basin, and Hmong-Mien dispersed via the Yuan and Xiang drainage basins in Hunan. Austroasiatic dispersed upstream via Mekong tributaries (Blench & Sidwell 2010), and also by coastal routes.

The internal diversity of each branch or phylum roughly correlates with the geographic size of the drainage basin in which that branch or phylum primarily dispersed. Chamberlain (2015) notes that languages in Bhutan also tend to disperse via drainage basins (watersheds), that the tributaries in a river system are analogous to subway lines, and that watersheds correspond closely to linguistic groupings. I believe that following river valleys upstream would have been the primary means of agriculturally motivated population dispersal during the East Asian Neolithic, when overland travel via roads was much more difficult than riverine transportation in sparsely populated mountainous frontier regions. Coasts and flat plains were other routes for population dispersal. However, starting from the Han Dynasty and especially during the Ming and Qing Dynasties, population movements often followed roads rather than rivers, and usually occurred as a result of military operations and forced population displacements, as noted by Holm (2010).

Additionally, I have noticed that Tibeto-Burman loanwords abound in Kra and Jiamao, but it is unclear which branch of Tibeto-Burman these words are from. Today, only Southern Loloish, Northern Loloish, Southeastern Loloish, and Mondzish languages are found in the region, and the Tibeto-Burman loanwords in Kra and Jiamao are ostensibly not from these branches. These Lolo-Burmese branches are all recent arrivals in northern Vietnam, Wenshan, and Guangxi within the past 1,000 years. Thus, a "missing" Burmo-Qiangic branch may have been in southern Guangxi and northern Vietnam, which was later absorbed by Tai and Vietic languages. This "missing" Burmo-Qiangic branch would have dispersed downstream via the Red River valley.

3.2. A parallel to the south: Austroasiatic linked leaves

Similarly, the Austroasiatic language family has northern (Khmuic + Palaungic + Khasic), eastern (Mangic + Vietic + Katuic + Bahnaric), and southern (Monic + Aslian + Nicobaric) linkages. Blench & Sidwell (2010) argue that this is due to mutual contact, while Gérard Diffloth argues that these are coherent branches. There are 13 extant Austroasiatic branches, but there may have been as many as 22 or more branches in the past. Again, please note that these are preliminary impressions, and that I have not yet provided any well thought-out evidence for this classification.

  1. Eastern linkage

    1. Mangic [+ source(s) of Austroasiatic words in Kra languages]

    2. pre-Jiamao [proposed substratum of Jiamao (Thurgood)]

    3. pre-Be [source(s) of Austroasiatic words in Be and Jizhao]

    4. Vietic

    5. Katuic

    6. Bahnaric

    7. pre-Chamic (Sidwell)

    8. Khmer-Pearic linkage: Khmeric and Pearic

  2. Northern linkage

    1. Khmuic

    2. Palaungic

    3. Khasic

    4. Rongic [proposed substratum of Lepcha (Blench)]

    5. pre-Mru / pre-Kuki-Chin-Naga (?) [+ source(s) of Austroasiatic words in Mruic, Kuki-Chin-Naga, and perhaps also Tani languages]

    6. pre-Hmong-Mien (?) [source(s) of Austroasiatic words in Hmong-Mien languages (Ratliff 2010)]

    7. "Donor Kra" [source(s) of early Austroasiatic loanwords in Proto-Kra; see Ostapirat (1995)]

    8. Munda (?)

  3. Southern linkage

    1. Monic

    2. Aslian

    3. Nicobaric

    4. pre-Acehnese (?) (Sidwell)

    5. pre-Kerinci [proposed substratum of Kerinci Malay (van Reijn 1974)]

    6. Bornean (?) [proposed substratum of Greater Bornean (Austronesian) languages (Blench)]

Note that Paul Sidwell (2017) has since classified Shompen within Southern Nicobaric rather than as a separate branch of Austroasiatic.

SplitsTree 4.0 NeighborNet test of Austroasiatic, based on branch reconstructions for over 50 words.

All below were drawn by Andrew Hsiu (2015).

Map 6: Dispersal of Austroasiatic primary branches from the Mekong River drainage basin, including possible extinct branches.

Map 7: Mutual influence among Austroasiatic branches in Indochina (contact shown by black lines)

References

Chamberlain, Brad. 2015. Watersheds and language mapping. Presented at SEALS 25, Payap University, Chiang Mai, Thailand.

Chirkova, Katia. 2012. "The Qiangic Subgroup from an Areal Perspective: A Case Study of Languages of Muli." Language and Linguistics 13(1):133-170. Taipei: Academia Sinica.

DeLancey, Scott. 2015. "Morphological Evidence for a Central Branch of Trans-Himalayan (Sino-Tibetan)." Cahiers de linguistique - Asie orientale 44(2):122-149. December 2015. doi:10.1163/19606028-00442p02

Holm, David. 2010. "Linguistic Diversity along the China-Vietnam Border." In Linguistics of the Tibeto-Burman Area, Volume 33.2, October 2010.

Luce, George. 1985. Phases of Pre-Pagan Burma, volume 2. Oxford University Press.

Sun Hongkai, et al. 1991. Zangmianyu yuyin he cihui (ZMYYC) 藏缅语语音和词汇 [Tibeto-Burman phonology and lexicon]. Beijing: Chinese Social Sciences Press.

Sun Hongkai. 2013. Tibeto-Burman languages of eight watersheds [八江流域的藏缅语]. Beijing: China Social Sciences Academy Press.

van Driem, George. 2014. "Trans-Himalayan", in Owen-Smith, Thomas; Hill, Nathan W., Trans-Himalayan Linguistics: Historical and Descriptive Linguistics of the Himalayan Area, Berlin: de Gruyter, pp. 11–40, ISBN 978-3-11-031083-2.

Zhèngzhāng Shàngfāng [郑张尚芳]. 2010. Càijiāhuà Báiyǔ guānxì jí cígēn bǐjiào [蔡家话白语关系及词根比较]. In Pān Wǔyún and Shěn Zhōngwěi [潘悟云、沈钟伟] (eds.). Yánjūzhī Lè, The Joy of Research [研究之乐-庆祝王士元先生七十五寿辰学术论文集], II, 389–400. Shanghai: Shanghai Educational Publishing House.