Ian Joo

Building EuraPhon, a phonological database of Eurasia

Ian Joo, work with Yu-Yin Hsu (Hong Kong Polytechnic University)


We present EuraPhon, a database in the making that aims to contain basic phonological information on around 700 lects spoken in the Eurasian macroarea (as defined by Hammarström and Donohue 2014). For each lect, the database includes the following information: (i) segmental phonemes; (ii) number of tonemes; (iii) word-initially forbidden consonants; (iv) word-finally permitted consonants; and (v) syllable structure (minimal and maximal number of onset consonants, minimal and maximal number of vowels, and maximal number of coda consonants). The phonological information on each lect is derived from the reference grammar classified as its Most Extensive Description in Glottolog 4.4 (Hammarström, Forkel, et al. 2021), provided that the description is accessible and appropriate.
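To make the five data types concrete, one lect's entry could be modelled roughly as follows. This is only an illustrative sketch: the class names, field names, the `check` helper, and the toy lect are all assumptions made here, not the actual EuraPhon schema or data.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class SyllableStructure:
    """Data type (v): hypothetical field names, counts per syllable."""
    min_onset: int
    max_onset: int
    min_vowels: int
    max_vowels: int
    max_coda: int


@dataclass(frozen=True)
class LectRecord:
    """One lect with the five data types listed above (assumed layout)."""
    name: str
    phonemes: FrozenSet[str]           # (i) segmental phonemes
    n_tonemes: int                     # (ii) number of tonemes
    initial_forbidden: FrozenSet[str]  # (iii) consonants barred word-initially
    final_permitted: FrozenSet[str]    # (iv) consonants allowed word-finally
    syllable: SyllableStructure        # (v) syllable structure

    def check(self) -> List[str]:
        """Return a list of internal inconsistencies (empty if none)."""
        problems = []
        if not self.initial_forbidden <= self.phonemes:
            problems.append("initial_forbidden contains non-phonemes")
        if not self.final_permitted <= self.phonemes:
            problems.append("final_permitted contains non-phonemes")
        if self.n_tonemes < 0:
            problems.append("negative toneme count")
        s = self.syllable
        if s.min_onset > s.max_onset or s.min_vowels > s.max_vowels:
            problems.append("syllable minimum exceeds maximum")
        return problems


# A made-up lect for illustration only; it describes no real language.
toy = LectRecord(
    name="ToyLect",
    phonemes=frozenset({"p", "t", "k", "m", "n", "ŋ", "a", "i", "u"}),
    n_tonemes=0,
    initial_forbidden=frozenset({"ŋ"}),
    final_permitted=frozenset({"m", "n", "ŋ"}),
    syllable=SyllableStructure(0, 1, 1, 1, 1),
)
print(toy.check())
```

Storing the positional restrictions as subsets of the phoneme inventory makes simple consistency checks possible when entering data from a grammar.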


Based on these five types of data, we measure the phonological distance between each pair of lects whose geographical coordinates lie within 1,000 km of each other. Figure 1 shows the distances between geographically adjacent lects: blue dotted lines represent distances between lects of the same family, whereas red solid lines represent cross-family distances. The thicker the line, the smaller the distance between the connected lects. Figure 2 zooms in on Mainland Southeast Asia and shows that lects in this linguistic area display a high degree of cross-family resemblance, as described in previous studies (Enfield 2005; Comrie 2007; Enfield 2011; De Sousa 2015; Enfield 2018; Enfield and Comrie 2021).
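The pairwise comparison described above can be sketched as below. The abstract does not specify the distance measure, so this sketch substitutes a plain Jaccard distance over segment inventories as a stand-in; the great-circle (haversine) formula, the dictionary keys, the function names, and the three toy lects are likewise assumptions made for illustration.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; an assumed stand-in, not the authors' metric."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def neighbour_edges(lects, max_km=1000.0):
    """Compare every pair of lects located within max_km of each other."""
    edges = []
    for x, y in combinations(lects, 2):
        geo = haversine_km(x["lat"], x["lon"], y["lat"], y["lon"])
        if geo <= max_km:
            edges.append({
                "pair": (x["name"], y["name"]),
                # would select blue dotted vs. red solid lines in a plot
                "same_family": x["family"] == y["family"],
                "distance": jaccard_distance(x["phonemes"], y["phonemes"]),
            })
    return edges


# Made-up lects for illustration only.
lects = [
    {"name": "A", "family": "F1", "lat": 0.0, "lon": 0.0,
     "phonemes": {"p", "t", "k", "a", "i"}},
    {"name": "B", "family": "F2", "lat": 0.0, "lon": 5.0,
     "phonemes": {"p", "t", "a", "i", "u"}},
    {"name": "C", "family": "F1", "lat": 0.0, "lon": 20.0,
     "phonemes": {"p", "a"}},
]
edges = neighbour_edges(lects)
for edge in edges:
    print(edge)
```

In this toy data only A and B lie within 1,000 km of each other, so only that pair is compared. A fuller measure would also need to weigh in the other four data types (tonemes, positional restrictions, syllable structure), which this sketch leaves out.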

References

Comrie, Bernard (2007). “Areal typology of mainland Southeast Asia: what we learn from the WALS maps”. In: MANUSYA: Journal of Humanities 10.3, pp. 18–47.

De Sousa, Hilário (2015). “The Far Southern Sinitic languages as part of Mainland Southeast Asia”. In: Languages of Mainland Southeast Asia. Ed. by Nick J. Enfield and Bernard Comrie. Berlin; Boston: De Gruyter Mouton, pp. 356–440. DOI: 10.1515/9781501501685-009.

Enfield, Nick J. (2005). “Areal Linguistics and Mainland Southeast Asia”. In: Annual Review of Anthropology 34.1, pp. 181–206. DOI: 10.1146/annurev.anthro.34.081804.120406.

Enfield, Nick J. (2011). “Linguistic Diversity in Mainland Southeast Asia”. In: Dynamics of human diversity: The case of mainland Southeast Asia. Pacific Linguistics 627. Ed. by Nick J. Enfield, pp. 63–79.

Enfield, Nick J. (2018). Mainland Southeast Asian Languages: A Concise Typological Introduction. Cambridge, UK; New York, NY: Cambridge University Press.

Enfield, Nick J. and Bernard Comrie (2021). The Languages of Mainland Southeast Asia. Berlin, München, Boston: Cambridge University Press.

Hammarström, Harald and Mark Donohue (2014). “Some principles on the use of macro-areas in typological comparison”. In: Language Dynamics and Change 4.1, pp. 167–187.

Hammarström, Harald, Robert Forkel, et al. (2021). Glottolog 4.4. Leipzig: Max Planck Institute for Evolutionary Anthropology. DOI: 10.5281/zenodo.4761960.