The language tree shows 12 language families with at least 50 million speakers. The distribution of Indo-European—the largest—was described on the previous page. This page summarizes the distribution of the other 11 language families with at least 50 million speakers.
Most people in East Asia use languages belonging to the Sino-Tibetan family. The region’s two other widely used languages are Japanese and Korean.
Language Families of Southwest, East, and Southeast Asia Other Than Indo-European
Sino-Tibetan is the world’s second-most widely used language family. The most commonly used language in the family is Mandarin, which the Chinese call Putonghua (“common speech”). Spoken by approximately three-fourths of the Chinese people, Mandarin is by a wide margin the world’s most-used language. Once the language of emperors in Beijing, Mandarin is now the official language of both the People’s Republic of China and Taiwan, and it is one of the six official languages of the United Nations. Seven other Sino-Tibetan languages are used by at least 20 million people each in China, mostly in the southern and eastern parts of the country: Gan, Hakka, Jinyu, Min Nan, Xiang, Wu, and Yue (also known as Cantonese).
Written in part with Chinese characters, Japanese also uses two systems of phonetic symbols, used either in place of Chinese characters or alongside them. Foreign terms may be written with one of these sets of phonetic symbols. Chinese cultural traits have diffused into Japanese society, including the original form of writing Japanese, but the structures of the two languages differ.
Unlike Sino-Tibetan languages and Japanese, Korean is written in a system known as hankul (also called hangul or onmun). In this system, each letter represents a sound, as in Western languages. More than half the Korean vocabulary derives from Chinese words. In fact, Chinese and Japanese words are the principal sources for creating new Korean words to describe new technology and concepts.
The three largest language families of Southeast Asia are Austronesian, Austro-Asiatic, and Tai-Kadai.
Austronesian languages are used by about 5 percent of the world’s people, who are mostly in Indonesia, which is the world’s fourth-most-populous country. With its inhabitants dispersed among thousands of islands, Indonesia has many distinct languages and dialects; Ethnologue identifies 706 living languages in Indonesia. Indonesia’s most widely used first language is Javanese, spoken by 84 million people, mostly on the island of Java, where two-thirds of the country’s population is clustered.
The Austro-Asiatic family is used by about 2 percent of the world’s population. Vietnamese, the most spoken language of the family, is written with our familiar Latin alphabet, with the addition of a large number of diacritical marks above the vowels. The Vietnamese alphabet was devised in the seventeenth century by Roman Catholic missionaries.
The Tai-Kadai family was once classified as a branch of Sino-Tibetan. The principal languages of this family are spoken in Thailand and neighboring portions of China. Similarities with the Austronesian family have led some linguistic scholars to speculate that people speaking these languages may have migrated from the Philippines.
Most language families are named for regions or countries. Based on their names, how would you expect the distributions of the three largest Southeast Asian language families to differ?
Dravidian and Turkic are the two most widely used language families in Asia not yet discussed (refer to Figure 5-14).
Dravidian is the second-most widely used language family in South Asia, following Indo-European, and is the principal family in southern India. The two most widely used Dravidian languages are Telugu and Tamil. The origin of Dravidian is unknown, and it has been studied less than other widely used language families. When speakers of Indo-European languages reached India, speakers of Dravidian languages were already present.
The Turkic language family is thought to have originated in the steppes bordering the Qilian Shan and Altai mountains between Tibet and China. Present distribution covers an 8,000-kilometer band of Asia. The Turkic language that has by far the most users is Turkish.
When the Soviet Union governed most of the Turkic-speaking region of Central Asia, use of the languages was suppressed. With the dissolution of the Soviet Union in the early 1990s, Turkic languages became official in several new countries, including Azerbaijan, Kazakhstan, Kyrgyzstan, Turkmenistan, and Uzbekistan.
The Turkic family was formerly called “Altaic,” but that name is no longer used. The much smaller Uralic family was once considered closely linked to Turkic, but it is now considered to have originated 7,000 years ago in the Ural Mountains of present-day Russia.
No one knows the precise number of languages in Africa, and scholars disagree on classifying them into families. In the 1800s, European missionaries and colonial officers recorded African languages using the Latin or Arabic alphabet. Ethnologue lists 2,146 languages in Africa; only 699 have a literary tradition. The world’s third- and fourth-largest language families are based in Africa: Afro-Asiatic in North Africa and Niger-Congo in sub-Saharan Africa (Figure 5-15).
African Languages
The great number of languages results from at least 5,000 years of minimal interaction among the thousands of cultural groups inhabiting the African continent.
Arabic is the major language of the Afro-Asiatic family, an official language in two dozen countries of Southwest Asia & North Africa, and one of six official languages of the United Nations. According to Ethnologue, 206 million people speak and write the official language Arabic. Most also use a second language that is distinct from official Arabic. For example, 65 million people use Egyptian Spoken Arabic. Ethnologue identifies 34 distinct Arabic languages in addition to the official one.
A large percentage of the world’s 1.9 billion Muslims have at least some knowledge of Arabic because Islam’s holiest book, the Quran (Koran), was written in that language in the seventh century. The Afro-Asiatic family also includes Hebrew, the original language of Judaism’s Bible (Tanakh) and Christianity’s Old Testament.
More than 95 percent of the people in sub-Saharan Africa use languages of the Niger-Congo family. The three most widely spoken Niger-Congo languages are Yoruba, Igbo, and Swahili. Yoruba and Igbo are among the many languages of Nigeria (refer ahead to Figure 5-35). Swahili is an official language in Kenya, Rwanda, Tanzania, and Uganda. It is the first language of 16 million people and is spoken as a second language by more than 50 million Africans. Especially in rural areas, the local language is used to communicate with others from the same village, and Swahili is used to communicate with outsiders. Swahili originally developed through interaction among African groups and Arab traders, so its vocabulary has strong Arabic influences. It is one of the few African languages with an extensive literature.
Languages of the Nilo-Saharan family are used by 53 million people in north-central Africa, immediately north of the Niger-Congo language region. Divisions within the Nilo-Saharan family exemplify the problem of classifying African languages. Despite having relatively few speakers, the Nilo-Saharan family is divided into six branches, plus numerous groups and subgroups. The total number of speakers of each individual Nilo-Saharan language is extremely small.