The Dispersal of Language in China, and the Creation of Standard Mandarin

The story of the Chinese language stretches far back into prehistory and is, therefore, deeply shrouded in the opacity of antiquity. Simply put, any attempt at understanding the Chinese language from a historical perspective is tantamount to entering a vast museum at night with only a small flashlight to guide your way: only a small portion of the riches is visible, and to proceed requires a large amount of guesswork and speculation. The modern distribution of, and the variation within, the many forms and dialects of spoken Chinese is an equally complicated tale to tell, and is the result of over five thousand years of political maneuvering, migration, and geographical constraints. This investigation is but a meager attempt at shedding some light on the hidden riches behind the modern Chinese language.

All variants of Chinese are categorized into the Sino-Tibetan language family, whose own origins are likewise locked within the deep vaults of antiquity. Modern linguists are confident that all Sino-Tibetan languages stem from a single common root language referred to as proto-Sino-Tibetan. This claim is based upon similarities among the modern languages and dialects of the family, as well as on the analysis of surviving fragments of historical evidence of earlier Sino-Tibetan languages. "The relations between Chinese and other Sino-Tibetan languages are an area of active research, as is the attempt to reconstruct Sino-Tibetan."1 The point at which Chinese diverged from its Sino-Tibetan root is also being actively pursued by linguists. Although there is no written evidence to assist them in their research, and every possible road is currently blocked by insufficient documentation or understanding, exploration continues unabated.2 The modern languages of the Sino-Tibetan family (including Tibeto-Burman, which falls under the same heading) consist of all variants of Chinese, Kamarupan, Himalayish, Qiangic, Jingpho-Nungish-Luish, Lolo-Burmese-Naxi, Karenic, and Baic.3

Chinese, meaning all of the dialects and ‘languages’ carried under the Chinese heading, is currently the most widely spoken language in the world. One out of every five people on the planet speaks some form of Chinese as their mother tongue. Current estimates put this number at over 1.3 billion. There is also an ever-increasing number of foreigners studying the language all over the world. "China’s Ministry of Education estimates the worldwide learners to be 30 million people. . ."4 Mandarin Chinese is the official language of the People’s Republic of China (mainland China) and Taiwan, and is one of the recognized languages of Singapore, whereas Cantonese is one of the official languages of Hong Kong (with English) and Macau (with Portuguese). Chinese is also widely spoken in Southeast Asia, and there are sizable Chinese communities in Thailand, Vietnam, Malaysia, the Philippines, and Myanmar.5 But in spite of these great numbers and wide distribution, the criteria for what separates a language from a dialect and, consequently, whether all dialects in China should continue to be classified under the ‘Chinese’ heading are still vigorously debated.

Although all of the languages in China are a part of the Sino-Tibetan family, there is still a great amount of internal diversity among the variant forms. There are currently between six and twelve main language groups in China, spread throughout the country in a pattern that closely follows geographical barriers (such as mountains and rivers). The ten main Chinese language variants (I use the word variant because what distinguishes a separate language from a dialect is still hotly debated), their respective numbers of speakers, and a brief description of each follow:6


Mandarin - 800 million speakers. The most widely spoken and distributed language in China; over two thirds of the Chinese population speaks some form of Mandarin, and the localities where it is spoken natively cover three quarters of the Chinese mainland.7 It is known variously as "the language of the officials or Mandarins"8 (Guanhua), National Language (Guoyu), General speech (Putonghua), or Northern Chinese (Beifang fangyan).


Wu - 90 million speakers. This group includes the Shanghai, Yugkia, and Hsiang dialects, and extends over the Shanghai, Zhejiang, and Hunan areas. It is characterized by its retention of "ancient voiced or sonant initials like b-, d-, g-, v-, z-, etc."9


Cantonese - 80 million speakers. This dialect or language (depending upon classification criteria) is "characterized by . . . [the] . . . preservation of ancient consonantal endings -m, -p, -t, -k."10


Min (including Taiwanese) - 50 million speakers. This is the main Chinese language spoken in Taiwan as well as on the mainland Chinese coast around Fuzhou.


Xiang - 35 million speakers. This dialect is found in the area of central China where the plains meet the mountains. The region is surrounded on three sides by Mandarin-speaking areas.


Hakka - 35 million speakers. Like Cantonese, it still retains the -m, -p, -t, -k consonant endings.11


Gan - 20 million speakers. Spoken in Eastern China between the Wu and Xiang groups.


Jin - Derived from Mandarin and spoken in the north of China on the border of Inner Mongolia.12


Hui - A derivative of the Wu dialect, spoken just to the west of the main Wu areas of Zhejiang province.


Pinghua - Influenced by the Cantonese language and prevalent to the west of the main Cantonese area in the southern arch of China.

The spatial distribution of these language variants has been greatly shaped by political conquest, population shifts, and the geographic layout of China. The most widely spoken form of Chinese, Mandarin (which is itself divided into three further dialect groups), spreads from the Siberian and North Korean borders in the far northeastern frontier of Chinese Manchuria south to the mountainous areas of Hunan and Kiangsi provinces, and west to the Tibetan borderlands of Yunnan and Szechwan, continuing through the oddly shaped Silk Road province of Gansu and ending in the multi-cultural region of Xinjiang in the far northwest. Mandarin has by far the largest geographic span of any Chinese variant. Most of the other language groups are huddled together in the southeast of China. Wu encompasses the Shanghai area as well as Zhejiang province; the Foochow group is found in Fukien province on the coast immediately across from Taiwan; the Hakka group is wedged between Mandarin regions to the north and Cantonese to the south; and Cantonese is spoken around the horseshoe arch that makes up the coastal southeast of China.13 The greater language variation in the southeast was evidently caused by the seclusion offered by the mountains which cover the region, to the extent that, "in parts of South China, a major city’s dialect may be marginally intelligible to close neighbors."14 This mountainous region also served as a place of refuge for communities that were uprooted for political reasons. One example is the Hakka, who had to flee their ancestral home in Northern China before invading Mongolian armies. The exiled Hakka traveled through China until they were able to find a safe abode in the isolated southern mountains.15

The divisions between the variant forms of the spoken Chinese language are very distinct. Often, speakers of different dialects are not able to communicate verbally with each other at even the most rudimentary level. As the missionary Alessandro Valignano wrote in the sixteenth century, "The Chinese have different languages in different provinces, to such an extent that they cannot understand each other. . ."16 Yuen Ren Chao adds that, "The mutual intelligibility between speakers of these different dialects depends, as is the case of other languages, both upon their dialects and upon the education of the speakers," and he continues by writing that, ". . .we can say that the other groups of dialects are about as far from Mandarin as say Dutch or Low German from English."17 Although the divisions between the language groups are very distinct, communication among the dialects within a single group flows much more smoothly, as, "A speaker of one group of Mandarin, say a native of Harbin, can converse freely with a speaker of another group, say a native of Chungking, without misunderstanding each other."18

In my own experience, I have found that the relationship between dialect (or language) differences and physical geography conforms to the findings of the research cited above. While traveling in the mountainous areas of Southeastern China, I discovered that the differences in language were grossly disproportionate to the distance traveled. Even the very sounds of the languages spoken in relatively nearby villages are so different that the contrast can be perceived by the untrained ear. To travel an hour south of the Southeastern Chinese city of Hangzhou into the Tiantai Mountains is to enter a completely incompatible linguistic territory. The fledgling Mandarin which I used to communicate in Hangzhou was of absolutely no use in Tiantai, and the dialect of Tiantai would be of no use in Hangzhou. In contrast to this experience, I could travel from the far northeastern reaches of Manchuria to the southwestern borderlands in Yunnan province, a distance of thousands of kilometers, and still be able to communicate using Mandarin. The linguistic uniformity of the latter example is due to the fact that the area mentioned is primarily open plains with few geographical impediments, which allowed for the free expansion of the language of the dominant Northern populace. As previously stated, the mountains of the southeast provided a habitat in which various cultural groups could shelter from the linguistic influence of the politically stronger North.

People from areas of China that do not natively speak Standard Mandarin (more on this later) are usually able to speak multiple forms of the Chinese language. Because of this, China is a striking example of diglossia, and most educated Chinese are able to speak their local dialect, a regional common language, and Standard Mandarin. They are thus able to speak to people from neighboring parts of their region and from the rest of the country, as well as to read national publications and understand other language-based forms of information and entertainment, such as television and movies. "Such polyglots frequently code switch between Standard Mandarin and the local dialect(s), depending on the situation."19 To walk the streets of almost any Chinese city is to hear many mutually incompatible languages spoken and to see the kaleidoscope of Chinese culture openly revealed. This is nothing new in China, as the great language variation of the country has always required the Chinese to be able to speak multiple dialects. W. South Coblin cites Matteo Ricci, the sixteenth century missionary, as writing that:

With all the varieties of languages, there is also one that we call cuonhoa, that is to say, the language of the law courts; it is used in audiences and tribunals; and, if one learns this, he can use it in all the provinces; in addition, even the children and women know enough of it to be able to communicate with all the people of another province.


W. South Coblin made another note on the extensive language diversity of China when he reproduced Joseph Edkins’s 1864 passage, which states that:

Many men from Kiangnan reside in Peking, especially of the class of scholars. They retain many peculiarities of the southern pronunciation, even after the lapse of three or four generations. In such cases, the tones of Peking are sometimes used in conjunction with the initials and finals of Nanking.

The debate over whether the diversity found within Chinese constitutes a single language with many dialects or a language family with multiple independent languages still rages on. This debate is not confined to the somewhat sterile bounds of linguistic academia but has the weight of politics and national identity behind it. "From a purely descriptive point of view, ‘languages’ and ‘dialects’ are simply arbitrary groups of similar idiolects, and distinction is irrelevant. . ."20 Intellectually, this statement is appropriate but, in practice, there are many cultural issues that go into further defining the Chinese language.

The nationalistic tendency of the modern Chinese state propagates the notion of a single Han race, and if some variants of Chinese were to be regarded as separate languages this claim would be partially diluted. Language distinction is a major socio-political factor in forming concepts of what constitutes Chinese nationalism, history, and culture, as well as regional diversity within China. Any political authority would naturally wish to state that it presides over a single united body of people with a common identity, and this is so of the modern Chinese state. But this view of Chinese culture is horribly over-simplified, as "just as the Roman Empire was composed of different ethnic groups, there were once different Chinese and non-Chinese nations before they were united by conquest into the Chinese empire."21 This point is further strengthened by the fact that many language variants in China still retain portions of their former languages, which directly implies that the people who spoke them were of a different culture than the Han, who are currently politically and culturally dominant. To publicize this fact would work to dissolve the picture of China as a single homogeneous nation and would imply that, "the notion of a single Chinese language and a single Chinese state is artificial."22 Government officials in the PRC feel that this claim would, in turn, vindicate the secession movements that are currently brewing throughout the country. But, in a clever compromise, it is widely believed in China that language and culture are not mutually inclusive and that people can speak dialects other than Mandarin and still be of the same race. The official position on this issue is that the, "Han Chinese are an entity of great internal diversity."23

Although there are many variant forms of spoken Chinese, there is only one traditional form of the written language. All dialects (or languages) throughout the whole of China are written with the same character system. "A newspaper published in Peiping, in Chungking, in Shanghai, or in Canton, will use the identical characters for the same news . . . But when a news item or an article is read aloud by different readers, who do not speak the same dialect, the text serves as a many-sided diamond which gives out all sorts of colour-tones according to the direction of approach."24 This point was also made in 1592 by Matteo Ricci when he wrote that, "The letters are common in all fifteen provinces of China. However, the language in each of the provinces is different."25 This common written language is one of the main unifiers of Chinese cultural identity and, consequently, is one of the main arguments for the view that all dialects of spoken Chinese are, in fact, one single language.

The common written language also allows literate people who speak different dialects to communicate with each other without having to go through the rigors of learning an entire cornucopia of languages. In my experience, whenever my Mandarin was misunderstood or I did not understand what was said, the person with whom I was attempting to communicate would often resort to writing Chinese characters. I was at first a little taken aback by this, as I assumed that if I did not understand what was being said then I definitely would not be able to comprehend the complex Chinese characters. But I eventually realized that this was the way that Chinese people have, for thousands of years, instinctively communicated with people whose speech they did not understand. Subsequent observation and reading confirmed my suspicion: the Chinese characters are the lingua franca of China.

The Chinese system of writing makes use of ideographs that are commonly referred to as characters or, as they are called in Mandarin Chinese, hanzi (writing of the Han). These characters are monosyllabic and each represents one morpheme of speech.26 There are currently around fifty thousand characters in the modern lexicon, although only about six thousand are commonly used.27 Current estimates suggest that one must be able to recognize around three thousand characters to read a newspaper in mainland China, and slightly more to read one in Taiwan. This manner of character writing is unique and indigenously Chinese. As put by Tao-Tai Hsia, "The present-day Chinese complex system of ideographs is the only major non-phonetic language in a world of phonetic languages."28

The Chinese characters are constructed from one to twenty-seven strokes and follow a system of radicals (components) which lend meaning and the occasional slight indication of sound. There are 240 different radicals that are the building blocks of the characters. Most ideographs contain multiple radicals, assembled side by side and/or one above the other so that the resulting character is squarish and evenly balanced. The radicals are assembled in such a way that some components provide a hint as to the ideograph’s meaning while other portions provide slight phonetic clues. The phonetic elements of characters indicate how the words that they represent once sounded in ancient Chinese; therefore, they do not provide much assistance in their contemporary form. To the untrained learner of Chinese, the semantic portions, which at one time were somewhat visual signposts as to a character’s meaning, are now nearly as opaque. Of his initial attempts at character recognition, L.G. Wooley writes that, "I was disappointed as I realized how little help the pictorial element was to me in learning Chinese characters. I found that most characters. . .were to me pictorially about as vague as a cubist drawing by Picasso. It was only after I had spent some time on the study of the character etymologies that the picture element in the characters became of real help."29 During the Han Dynasty, around A.D. 100, the scholar-official Xu Shen divided the most widely used characters into six categories, of which only four percent were directly pictorial and over eighty percent were compounds that consisted of multiple semantic and (now archaic) sound elements.30

The use and development of Chinese characters (hanzi) stretches far back into the annals of history. This writing system has continuously evolved and has been regularly adapted and regulated throughout its existence. The first evidence of the use of Chinese characters dates back to the Shang-Yin dynasty (1766-1122 B.C.), when they were primarily written on oracle bones and tortoise shells.31 But these specimens show a very mature script, which indicates that the origin of the writing system is much older.32 These characters were, for the most part, pictorial, and were simplified representations of objects and basic denotations of abstract ideas (such as numbers and direction). They were already written in the block-shaped style that has survived to this day. These characters were greatly adapted in the eighth century B.C., during the Chou dynasty. A more complex system of writing was created which is now referred to as ‘Great Seal’ and is predominantly found etched into bronze ware. The Chinese writing of this era was far more complex than its shell and bone predecessor. Then, between 246 and 210 B.C., at the direction of Qin Shi Huangdi, the emperor who first conquered and pacified China, the Chinese writing system was again overhauled. The Prime Minister, Li Szu, greatly refined the characters by eliminating variation and creating a standard form, which came to be known as ‘Small Seal.’ This was the first recorded imposition of a standard writing system over China, and it was enforced through the burning of books that evidenced the older writing systems. After this massive standardization initiative, the Chinese writing system continued to go through many changes, ". . .from Small Seal to Official style and then to the present-day Grass, Cursive, and Regular styles."33 The main drive of most character reform movements has been to create a writing system that is easier to learn and recognize.34

The May 4th movement of 1919 marked a time in recent Chinese history when the writing system again underwent a great upheaval. Up until this point, all writing was done in Literary Chinese (wenyan) or, as it is often called, Classical Chinese. This system of writing differed very much from all spoken types of the language, and its use was reserved for the literary elite, who had the time and money to be educated. During the May 4th movement, the popular writing style was changed to accord with the common Chinese vernacular (baihua), which was to become the standard for all written Chinese.35

When the Communist party came to power and formed the People’s Republic of China (PRC) in the mid-twentieth century, character reform was again pursued with added vigor. At this time, the illiteracy rate in China was an astonishing ninety percent and, in accordance with Communist ideology and for propaganda purposes, needed to be decreased. As Mao Zedong said in 1940, "The Chinese written language must be reformed when conditions permit. . .Language must be close to the masses. We must realize that the masses are an inexhaustible and abundant resource of the revolutionary culture."36 As Mao believed that the complexity of the Chinese characters was the main cause of illiteracy, and that some of the underlying implications within the characters themselves portrayed counter-revolutionary ideas, he set out to reform and, ultimately, eliminate them. On October 10, 1949, immediately following the inauguration of the PRC, the Chinese Written Language Reform Association was formed.37 Its main purpose was to simplify the existing Chinese characters so that they would be accessible to the entire population. It began this mission by simplifying some 1,200 characters, reducing the number of strokes needed to construct them. The reformers predominantly adopted simplified characters that had existed colloquially for centuries while discarding the ancient ‘proper’ characters. This was the first time that any simplified characters had been deemed acceptable by the authorities of the state; even as late as the Manchu dynasty (1644-1912), an applicant in a civil service exam would be automatically failed if it was discovered that he had used even one simplified character.38 This initial group of officially recognized simplified characters was followed by the simplification of thousands more, which, as a result, increased literacy.

Taiwan, Hong Kong, and Macau did not follow the PRC’s example of simplifying the ancient Chinese writing system. "This new Communist initiative was bitterly denounced by Chinese nationalists as ‘a declaration of war on China’s cultural heritage.’"39 These Chinese-populated territories kept the traditional characters and still use them to this day, so there are now two different standards by which Chinese is written.

On the coattails of the character simplification movement came an attempt on the part of the Communist party to eradicate the use of Chinese characters altogether. The motive was, again, to increase literacy and spread Communist ideology by disposing of the "cumbersome" characters. The party pursued this by devising a phonetic system that used a thirty-letter Latin alphabet in which Chinese could be written. This example plainly shows the "Communists. . .ability to alter Chinese cultural heritage so that more efficient indoctrination of the people, tighter control of mass organizations, and a higher level of industrialization can be assured."40 But this initial attempt never gained enough steam and, fifty years after the Latinization process seriously began, characters are still the main form of writing in mainland China. As of now the reign of the Chinese character remains triumphant, which probably has to do with the fact that, "the Chinese have poured their life and soul into these written symbols; . . ."41

But this is not to say that the Latinization process did not make any permanent changes to the Chinese writing system. The Latinized system that prevailed, Pinyin, now has a very prominent presence throughout China and regularly accompanies Chinese characters on street signs, on storefronts, and in television programs. Ironically, the Pinyin system is now also used to teach children Chinese characters, as it is a phonetic way of representing the characters that are being taught. Hence, nearly all children in China learn the Latinized system before they learn the characters, and they subsequently use Pinyin as a supplementary script throughout their lives. This is exemplified in the way that Chinese characters are written on a computer: one first types in the Pinyin for the desired word, and then a box appears on the screen from which the proper character is chosen. In this way, Pinyin has come to complement Chinese characters rather than eradicate them.

Over the long course of its history, the spoken Mandarin language has continuously evolved and has been adapted and used for various purposes. Due to its ever-varying forms and incarnations, it is not possible to say how old the Mandarin Chinese language is. Linguists now divide the Mandarin language into three temporal stages: Old (Archaic), Middle, and contemporary Chinese. This manner of division is based on the great Swedish linguist Bernhard Karlgren’s initial categorization system, which he worked out at the beginning of the twentieth century.42 Karlgren arrived at his theories by reconstructing the Qieyun rhyme dictionary, as well as the rhymes of the Shijing, and, ". . .for the first time put the study of Chinese historical phonology on a rigorous scientific basis."43

The reconstruction of Old, or Archaic, Chinese (Shanggu Hanyu) was first attempted by Chinese scholars in the Qing dynasty and has continued to this day. Modern linguists now place the first inception of Old Chinese in the Zhou dynasty (1122-256 B.C.). The evidence for this phase of the Chinese language has been taken from archaeological excavations, in which inscriptions on bronze implements were collected and analyzed. The Shijing, Shujing, and Yijing are also texts on which the modern reconstruction of Old Chinese has heavily relied.44 The phonetic elements of modern Chinese characters also provide hints as to their archaic pronunciation. The reevaluation of Chinese characters that were borrowed thousands of years ago, and are still used in their original form by the Japanese and Koreans, also provides evidence as to what Old Chinese sounded like. The usual linguistic methods of, ". . .the neogrammarian principle of the regularity of phonetic change. . .,"45 are also part of the evidence base for the reconstruction of Old Chinese. Taken all together, this evidence has led modern linguists to conclude that Old Chinese had a very diversified sound system, "in which aspiration or rough breathing differentiated the consonants, but probably was still without tones."46

Middle Chinese was spoken during the Sui, Tang, and Song dynasties (between the 7th and 10th centuries). Linguists are now fairly confident that they have successfully reconstructed the form and sounds of the language through the combined analysis of, ". . .modern dialect variations, rhyming dictionaries, foreign translations, "rhyming tables" constructed by ancient Chinese philologists. . .and Chinese phonetic translations of foreign words."47 This Middle Chinese reconstruction is now employed to delve further into the deep historic trenches of Old Chinese.

The evolution and creation of the Standard Mandarin that we know today has been a very complex process. It is known that the Mandarin language was first spoken in Northern China and that it initially spread as a result of its speakers’ military conquests, which eventually covered nearly the whole of the country (or what we now define as the boundaries of China). Once control was established over all of the main territories of China, the implementation of a standard form of speech became imperative. But these early attempts were limited to governmental offices and other high-society affairs, and the general populace continued to speak their historic dialects unimpeded. From 1356 to 1421 the capital of China was based in Nanking (Nanjing), whose dialect was taken to be standard. In 1421, the capital of China moved from Nanjing to Beijing, although the cultural hub of the country undeniably remained in Nanjing. The Nanjing dialect continued to be the national Mandarin standard for the next three hundred years or so.48 At that point the Beijing-based pronunciation began making headway in the court and among the general populace, which caused a great debate over whether the then-standard Nanjing dialect would give way to that of Beijing:49

. . .by the 1790s Barrow clearly hears them [Beijing dialect speakers] competing with standard (i.e. Nanking-like) pronunciations in the streets of Peking [Beijing]. A decade or so later Morrison grudgingly admits that the imperial court prefers this "Tartar Chinese" [Beijing dialect], which he predicts may eventually become the national standard. By about 1850 this prediction has been realized, and a wholesale shift to a Pekinese-like phonological base has occurred. The result remains with us to this day.

Throughout this time the government consistently attempted to promote the successive (first Nanjing, then Beijing) standard forms of Mandarin. In the seventeenth century the government set up orthoepy academies around the country to try to regulate the pronunciation of Mandarin.50 But, outside the immediate vicinity of Beijing, the established colloquial language variants continued on in spite of these governmental measures. It was not until the Communist takeover in the mid-twentieth century that any form of language standardization was successful. Upon gaining control of China, the PRC promptly put in place a state-run educational system which used Standard (Beijing) Mandarin as the main medium of instruction. One of the main intents of this nation-wide education program was also to wipe out "typical" local accents.51 As Chang Hsi-jo, who was the minister of education at the time, said, "The language reform is a serious political mission which must be carried out in order to strengthen the solidarity of the people. . ."52 This "serious political mission" was undertaken with full vigor and was, for the most part, effectively implemented, and, "As a result, Mandarin is now spoken by virtually all people in mainland China. . ."53

As has been shown, the Chinese language has gone through many dips, turns, incarnations, and adaptations throughout its long history. There will probably be many more changes and modifications on the horizon but, to be sure, the depth and basic feel of the language shall ever remain intact. "There is a directness and earthiness in the speech style peculiar to all parts of the People’s Republic which I visited,"54 wrote the linguist Beverly Hong Fincher; and it is this peculiar linguistic depth that will forever bind the Chinese, their culture, and the very landscape of China into a tight, coalesced whole. Language is an extension of a people’s culture and outlook on life; so as long as the Chinese people stay true to their long-trodden history, it can be said with confidence that the Mandarin Chinese language will continue to shine on with the proud vigor of the people who speak it.

