Contrastive Discourse Markers in Native and Non-Native English Academic Writing

Reyhan Ağçam (Kahramanmaras, Turkey)


This study investigates contrastive discourse markers (CDMs) in doctoral dissertations produced by native and non-native academic authors of English. Three subcorpora were analysed to see whether they differ significantly in the use of these markers. The findings showed that all groups used paratactic discourse markers much more frequently than hypotactic discourse markers to convey contrastive relations between statements. Specifically, hypotactic CDMs were slightly overused by the Turkish-speaking group and significantly underused by the Spanish-speaking group in comparison to the native group, whereas paratactic CDMs were significantly underused by the former and significantly overused by the latter. Finally, certain items in both categories were overwhelmingly favoured by all three groups. The study concludes with practical implications for academic writing and suggestions for further research.

Keywords: Academic writing, discourse, contrastive discourse marker, hypotaxis, parataxis

1 Introduction

Discourse is a unit of language that is larger than a sentence and firmly rooted in a specific context (Halliday 1990: 41). Cook (1992) defines it as “the use of language for communication”, and states that it may be composed of one or more well-formed grammatical sentences. In a similar vein, Hinkel & Fotos (2002) postulate that discourse in context may consist of only one or two words, as in stop or no smoking; alternatively, a piece of discourse can be hundreds of thousands of words in length, as some novels are. They note that a typical piece of discourse lies somewhere between these two extremes. Discourse markers, on the other hand, are defined by Schiffrin (1987: 31) as “sequentially dependent elements which bracket units of talk”. According to Redeker (1991), a discourse marker is a word or phrase (for instance, a conjunction, adverbial, comment clause or interjection) that is uttered with the primary function of bringing to the listener’s attention a particular kind of linkage of the upcoming utterance with the immediate discourse context. Grange (1996: 80) argues that they can be used to make speech or written text coherent, consistent, easy to follow and understandable. These markers are also known as discourse particles (Schourup 1983, Fisher & Gruyter 2000), connectives (Salkie 1995, Cele & Huart 2007), insert / discourse markers (Biber, Stig, Leech, Conrad & Finegan 1999), connectors (Copage 1999, Stephens 1999), discourse markers / utterance indicators / fillers (Pridham 2001), pragmatic markers / discourse markers (Aijmer 2004, Carter & McCarthy 2006), and discourse markers / discourse connectors (Biber 2006).

Gerard (2010) depicts them as the ‘glue’ that binds together a piece of writing, making the different parts of the text stick together. According to Brown & Levinson (1987), they are an important feature of both formal and informal native speaker language. In academic writing, they form part of the ways of thinking and using language which exist in the academy (Hyland 2009: 1). Hyland (2004) proposes that academic discourse markers like connectives and lexical cue phrases (e.g. although, in other words, etc.) constitute prevalent signalling devices that explicitly mark intra-sentential conceptual relations and text transitions in academic discourse. The frequent choice of DMs in academic written discourse is regarded as a reflection of the writer’s need to present and support his or her arguments to an academic audience in a straightforward and comprehensive way (Povolná 2012: 133), and this reflects the characteristic choice of academic register to overtly mark the links between ideas (Biber et al. 1999: 880).

Hyland & Tse (2004: 156) classify these markers into two groups as interpersonal discourse markers (reflecting the writer’s stance towards both the content of the text and the potential reader) and textual markers (referring to organization of discourse). They sub-classify textual discourse markers as logical markers, sequencers, reminders, topicalisers, code glosses, illocutionary markers, and announcements, and interpersonal discourse markers as hedges, certainty markers, attributors, attitude markers, and commentaries. As cited in Fraser (2009: 87), on the other hand, discourse markers are classified by most scholars into three groups as contrastive discourse markers (e.g. but, however, instead), elaborative discourse markers (e.g. and, furthermore, in addition), and inferential discourse markers (e.g. so, thus, as a result).

Contrastive discourse markers, on which the present study focuses, are defined by Volkova (2010) as the items that correlate two utterances by means of rendering the semantic meaning of contrast at the discourse level. The researcher goes on to state that the explicit proposition with a discourse marker is opposed to the implicit proposition that is not formally expressed in the discourse. The following statements are presented by the researcher in order to get a better understanding of their function:

a. Sarah is 37. In comparison, her husband is 39.

b. It was not a long trip. Nevertheless, he felt exhausted at the end of the day.

c. She listened to the five-year old boy. But she could not understand what the problem was.

These statements would undoubtedly remain intelligible even if the italicised components were not used to connect them. Without them, however, the second statement in each pair would most probably confuse readers or listeners and lead them to reconsider or reread the first one to check their comprehension. Hence, it might be proposed that CDMs are used to clarify meaning and convey the intended message as accurately as possible while saving the audience’s time.

According to Malá (2006), these markers serve as text-organizing devices at the level of academic discourse, and they may extend their scope to function as markers of intertextuality and dialogicality, introducing other ‘voices’ in the written as well as spoken monologue.

Taking into consideration the classification of transitions made by Redeker (2006: 344) into hypotactic transitions (involving interruption or suspension of an incomplete unit with parenthetical material) and paratactic transitions (between segments that follow each other at the same level), the current study dealt with contrastive discourse markers in two groups, hypotactic CDMs and paratactic CDMs, across native and non-native academic writing in English. This classification is outlined in Table 1:





Hypotactic CDMs: albeit, although, at any rate, by comparison, despite the fact that, even if, even though, except, in comparison, in spite of the fact that, or else, though, whereas, while

Paratactic CDMs: actually, after all, all the same, alternatively, anyhow, anyway, at the same time, besides, but, by contrast, conversely, however, in any case, in contrast, in spite of that, instead, nevertheless, nonetheless, notwithstanding, on the other hand, on the other side, oppositely, still, yet

Table 1: Hypotactic and Paratactic CDMs

A total number of 38 CDMs proposed by Fraser (2010: 29) were investigated across doctoral dissertations of native and non-native academic authors of English in order to find out whether they significantly differed in conveying contrastive relations in their writing. Accordingly, the following two research questions were posed:

1. Do Turkish-speaking academic authors of English and native academic authors of English significantly differ in the use of contrastive discourse markers?

2. Do Spanish-speaking academic authors of English and native academic authors of English significantly differ in the use of contrastive discourse markers?

The subsequent section provides information about the theoretical background of the study and the findings of the research previously conducted on discourse markers in academic writing.

2 Background of the Study

Since the late 1970s, discourse markers, especially those in spoken language, have been extensively probed across various languages such as English, Finnish, French, German, and Japanese (e.g. Öztman 1981, Wierzbicka 1986, Holmes 1986). A decade later, they began to constitute the scope of a considerable amount of research conducted on academic writing (e.g. Schiffrin 1987, Blass 1990, Kortmann 1991, Unger 1996, Fraser 1999, Blakemore 2002, Hyland 2004, Hyland & Tse 2004, and Taboada 2006).

Kortmann (1991: 160) suggests that causal and contrastive relations are the most informative and most complex semantic relations between segments of discourse. These relations are mostly marked explicitly, especially in academic written discourse (Biber et al. 1999: 880), and they are viewed as a class of commentary pragmatic markers that contributes both to cohesion and coherence by signalling relationships between the segments of a discourse (Povolná 2012). Taboada reports that semantic relations of cause and concession are “typically expressed through subordination” (Taboada 2006: 576). According to Kourilova-Urbanczik (2012), the use or absence of subordination carries important signals for interpreting and integrating messages. Based on this, the researcher distinguishes between hypotactic and paratactic styles, asserting that “hypotaxis (syntactic subordination, the use of main and subordinate clauses) allows for hierarchical arrangement of themes and foci to foreground and background information in clauses” (Kourilova-Urbanczik 2012: 107), and that “parataxis, a sequence of main clauses without subordination and connectors, allows for multiple themes and foci in a sequence of clauses where no one clause dominates another” (Kourilova-Urbanczik 2012: 107). The researcher goes on to state:

An appropriate use of syntactic subordination provides focus and prominence to important discourse entities and creates a differential hierarchical discourse structure signalling important communicative values by backgrounding presupposed or known information through the themes and foregrounding new, unpredictable information through end-focus, facilitating comprehension (...) Parataxis leaves more to infer and is sometimes used in peer writing where producer of information and recipient share a large amount of background knowledge and possess inferencing capacities. For non-native speakers, who often lack this competence, parataxis may cause incoherence and loss of important implications. And yet, as producers of information, NNS present a too scanty use of subordination, which may create an undifferentiated, non-hierarchical structure, failing native speaker expectations. The difference between degrees of subordination in native and NNS discourse has repeatedly been found significant. (Kourilova-Urbanczik 2012: 108)

In a study of 1992, Kourilova reported that hypotactic structures occurred 25 times per 1000 words in native speaker scientific communication, and only four times in non-native speaker communication. Emphasizing this and similar findings reported in various research, Kourilova (1992: 108) inferred that native speakers tend to indicate more readily the distribution of functional weight of their messages by using hypotactic structures, while NNS leave readers to their own resources in the selection of information mandatory for discourse comprehension.

As cited in Whitley, Ayora suggests that the major difference between Spanish and English is the way that ideas are commonly joined to form sentences and paragraphs. According to him,

English speakers tend to state a point directly and then develop it with parataxis; that is, loose joining of simple clauses with coordinating conjunctions and sentence adverbials. Spanish speakers, however, tend toward a less abrupt start and prefer to link ideas through hypotaxis, that is, embedded clauses with subordinating conjunctions. (Ayora 1977: 196, cited in Whitley 2002: 341)

In a similar vein, López-Guix & Wilkinson (2001) maintain that English favors parataxis while Spanish favors hypotaxis.

Claiming that coordination is the main key for coherence in Arabic, and subordination in English, Othman (2004) scrutinized syntactic relations across texts originally written in English, Arabic, and those translated from English into Arabic. His survey revealed that subordination is more common than coordination in English and seen as a sign of maturity and sophistication in writing. It also indicated that the reverse is the case in Arabic in terms of syntactic relations, and that subordination is more frequent than coordination in the texts translated from English to Arabic, implying that Arabic-speaking academic writers of English follow the norms of their source language rather than those of their target language.

Malá (2006) analysed three corpora of academic discourse: spoken monological lectures and dialogues from the MICASE corpus, and her own corpus of articles from American academic journals. Each corpus contained 30,000 words and was examined for well-established markers of contrastive and concessive relations. She found a marked difference between the proportion of paratactic and hypotactic relations in spoken and written discourse, with parataxis being characteristic of spoken dialogue, and hypotaxis being characteristic of written texts. While hypotactic relations are always marked by a conjunction (e.g. although, while), paratactic links may be marked by a conjunction (e.g. and, but), a conjunct (e.g. however, by contrast) or remain unmarked.

Linares & Whittaker (2007) analysed spoken and written productions in English by Spanish students attending a secondary school where CLIL (Content and Language Integrated Learning) was used in teaching different disciplines. They reported that paratactic extension was considerably more frequent than hypotactic extension both in textbooks and in students’ oral and written performances.

Yu (2010) explored the authenticity of oral exercises in a textbook series designed for levels from basic to advanced Chinese in terms of grammar and discourse structures, and concluded that discourse authenticity was mainly affected by the limited types of adjacency pairs and the lack of discourse markers. His findings also showed that the proportion of hypotaxis was much higher than that of parataxis in comparison to real situations in daily communication.

Povolná (2010) studied contrastive discourse markers in non-native academic writers’ theses, and found that contrastive relations are conveyed through a richer repertoire of hypotactic markers than paratactic markers although the frequency of the hypotactic markers was measured approximately four times higher than that of the paratactic markers. Comparing this specific finding with those reported in Biber et al. (1999), the researcher concluded that native speakers of English tend to use most of the markers less frequently than non-native speakers, which she attributes to the fact that their repertoire of contrastive discourse markers is much broader than that of non-native writers. In a further study of 2012, she analysed causal and contrastive discourse markers in MA theses of Czech students studying English Language and Literature, and found that hypotactic markers are less frequently used than paratactic markers so as to express contrastive relations. An interesting finding of her study was that most of the types (25 out of 38) were not found in the theses, at all. Lastly, she compared research articles written by native and non-native writers of English. She revealed that the former expressed contrast through paratactic relations in more cases than the latter even though both groups employed hypotactic CDMs much less frequently than paratactic ones.

In a corpus-based study, Bisiada (2013) investigated a frequency shift from hypotactic to paratactic constructions in concessive and causal clauses in German management and business writing, assuming that the influence of the English SVO word order makes German language users prefer verb-second, paratactic constructions to verb-final, hypotactic ones. His findings indicated that the translation corpus displayed relatively more paratactic rather than hypotactic features in the construction of concessive and causal clause complexes.

Sitthirak (2013) compared Thai university students and native speakers of English with respect to the use of discourse markers and explored the use of contrastive discourse markers by Thai students. He found that Thai students were able to distinguish between the contrast and non-contrast relation between two utterances more successfully than the English speakers could for the given contexts, and that they tended to form a set of rules to deal with the ‘appropriate’ answers while English speakers considered the authentic use rather than the semantic use in general. He noted that the contrastive discourse markers although and while are specifically problematic for these students as they believe that the distinction between them should be clarified since their Thai counterparts (Tæ̀thẁā for whereas and Mæ̂ẁā for although) are not interchangeably used (Garner 2009), and their use can vary across public and private domains (Weingarten 2003). Finally, he reported that Thai students tended to use these two discourse markers more interchangeably than English speakers did in a general context, producing their own rules when they felt the ambiguity whereas English speakers did so by intuition.

In a recent study, Rodríguez-Vergara (2015) analysed different clause nexus types in introductions and conclusions of research articles written by Mexican authors in Spanish and American authors in English, and found a statistically significant preference for hypotaxis in both the Spanish and the English corpora. He also reported that hypotaxis served at least three significant functions. Firstly, hypotaxis allows information condensing, i.e. the linguistic process of conveying as much information as possible with the fewest possible signs (Perrin & Ehrensberger-Dow 2008: 299). Secondly, manipulating the order of the clauses is easier in hypotactic clause nexuses than in paratactic clause complexes due to the fact that a secondary clause in a hypotactic nexus can precede or follow the primary clause while it always follows the primary clause in a paratactic nexus (Rodríguez-Vergara 2015: 480). Lastly, hypotaxis is used as a persuasion strategy considering that the content of the secondary clause usually presents given information whose veracity is not subject to negotiation (especially in non-finite clauses) (Baklouti 2011: 517).

3 Methodology

A total of 136 doctoral dissertations produced by Turkish-speaking, Spanish-speaking and native academic authors of English during a period of seven years (2005-2012)* were examined in the current study, using Contrastive Interlanguage Analysis, which was introduced in 1996 (Granger 1996). Three subcorpora were compiled from dissertations produced by native academic authors of English (Native Academic Corpus of English, NACE), Turkish-speaking academic authors of English (Turkish Academic Corpus of English, TACE), and Spanish-speaking academic authors of English (Spanish Academic Corpus of English, SACE). The dissertations in question were selected from among those written in the fields of English language teaching, applied linguistics, English language and literature and modern languages, and submitted to various institutions of higher education in Turkey, Spain, the UK and the USA from 2005 to 2012. The sections abstract, introduction, review of literature, methodology, references, and appendices were excluded from the subcorpora. Hence, the subcorpora were confined to the sections findings, discussion, conclusion, pedagogical implications (implications for English language teaching) and suggestions for further research. Finally, all titles, subtitles, tables, figures, quotations, and paraphrases were removed from the sections retained in the subcorpora. Table 2 shows the number of dissertations and the size of each corpus analysed in this study:

Table 2: Size of the Subcorpora

More than two million words were analysed with Wmatrix (Rayson 2009) for CDMs across native and non-native academic writing, and a log-likelihood calculator (Rayson & Garside 2000) was used to reveal whether the groups differ significantly from each other in expressing contrastive relations in academic writing. The six steps identified for Contrastive Interlanguage Analysis (CIA) (Granger 1996) were followed in analysing the subcorpora: the first three steps required calculating the frequencies of CDMs in each corpus. In the fourth and fifth steps, the non-native subcorpora and the native subcorpus were compared to find out whether there was a statistically meaningful difference between them regarding the subclasses of CDMs identified in Table 1. The following section presents the results of our data analysis and a related discussion.
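
The corpus comparison described here can be illustrated with a minimal sketch of the two-corpus log-likelihood statistic in the form commonly attributed to Rayson & Garside (2000). The function names, the sign convention (positive for overuse in the first corpus) and the example counts are assumptions made for illustration only; they are not taken from the study's data or from the calculator actually used.

```python
import math

# Critical value of chi-square with 1 degree of freedom at p < 0.05.
CRITICAL_P05 = 3.84


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Signed log-likelihood for one item observed in two corpora.

    freq_a, freq_b: raw frequencies of the item in corpus A and corpus B.
    size_a, size_b: total word counts of corpus A and corpus B.
    The sign is an illustrative convention: positive means the item is
    relatively more frequent in corpus A, negative means less frequent.
    """
    total = freq_a + freq_b
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # the term 0 * ln(0) is treated as 0
            ll += observed * math.log(observed / expected)
    ll *= 2
    if freq_a / size_a < freq_b / size_b:
        ll = -ll
    return ll


def is_significant(ll, critical=CRITICAL_P05):
    """True if the absolute log-likelihood reaches the critical value."""
    return abs(ll) >= critical


# Hypothetical counts: 700 hits in a 100,000-word learner corpus versus
# 800 hits in a 100,000-word reference corpus.
example = log_likelihood(700, 100_000, 800, 100_000)
print(round(example, 2), is_significant(example))
```

Under this convention, a negative value that exceeds the critical value in magnitude would be read as significant underuse in the first corpus relative to the second.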

4 Results and Discussion

The preliminary findings of the study have revealed that 28 out of 38 CDM types were found in each set of the corpus. In other words, ten types were not found in any of the corpora (five hypotactic CDMs and five paratactic CDMs), thus indicating a richer distribution in comparison to those found in Povolná (2010; 2012a). The markers in question are exemplified in Table 3:

Table 3: CDMs not Found in the Three Corpora (Sample sentences taken from COCA)

The finding that the above-tabulated CDMs appeared neither across native nor across non-native academic writing was not surprising, considering the fact that all except three (despite the fact that, in spite of the fact that, and in any case) were not found in The Louvain Corpus of Native English Essays (LOCNESS), either**. Besides, the item despite the fact that appeared seven times only, and the items in spite of the fact that and by contrast occurred only once, respectively, across the corpus in question. So, this finding may reflect that these items do not appear frequently in academic writing in general. Another reason for this finding may be that these markers are rather complex, which was likely to reduce their preferability especially against the single component items (e.g. although, but). Indeed, the frequencies of CDMs clearly show that although was extensively used by all groups of academic authors, whereas its counterparts like despite the fact that, in spite of the fact that and in spite of that were not used by them, at all.

In general, it has been revealed that the groups significantly differed from each other in that CDMs were underused by the Turkish-speaking academic authors of English (TAE, henceforth), and overused by the Spanish-speaking academic authors of English (SAE, hereafter) as opposed to the native academic authors of English (NAE, henceforth). The distribution in question is illustrated in Table 4:

Table 4: Frequency distribution of CDMs in three corpora

As stated above, 28 out of 38 types of CDMs appeared in each corpus. They appeared 90 times per 10,000 words in SACE and 70 times per 10,000 words in TACE, which means that they were significantly overused by SAE and significantly underused by TAE in comparison to NAE.
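
The normalised frequencies quoted above follow from dividing raw counts by corpus size and scaling to 10,000 words. The sketch below shows that arithmetic together with a simple surface-form count. The helper names and the sample sentence are hypothetical, and plain string matching is an assumption for illustration: it is not the procedure Wmatrix applies, and it cannot separate contrastive from temporal uses of an item such as while.

```python
import re


def count_marker(text, marker):
    """Count whole-word, case-insensitive occurrences of a marker.

    Works for multi-word markers such as 'on the other hand'.
    """
    pattern = r"\b" + re.escape(marker) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def per_10k(raw_count, corpus_size):
    """Normalise a raw count to occurrences per 10,000 words."""
    return raw_count * 10_000 / corpus_size


# Hypothetical sample text, not drawn from any of the subcorpora.
sample = ("Although the task was short, it was hard. "
          "However, they finished it, but only just.")
size = len(sample.split())  # crude whitespace token count

for marker in ("although", "however", "but"):
    hits = count_marker(sample, marker)
    print(marker, hits, round(per_10k(hits, size), 1))
```

Normalising in this way is what allows subcorpora of different sizes to be compared on a common scale before any significance test is applied.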

The three corpora displayed great similarity in that paratactic CDMs were found much more frequently than hypotactic CDMs, confirming the findings previously reported by Povolná (2010, 2012a, 2012b), and contradicting Malá (2006), who concluded that parataxis is characteristic of spoken discourse while hypotaxis is characteristic of written discourse. In order to see whether the differences between the groups are statistically significant, a log-likelihood test was administered between the native and non-native corpora. The first test was conducted between TACE and NACE, and its results are provided in Table 5:

Table 5: Log-likelihood results for CDMs in TACE and NACE

As listed in Table 5, the statistical findings confirmed both the overuse of hypotactic CDMs and the underuse of paratactic CDMs by TAE against NAE, with log-likelihood values of +4.88 and -105.84, respectively. This finding also demonstrated that CDMs overall were significantly underused by TAE as opposed to the native group. The results of the test administered to SACE and NACE are displayed in Table 6:

Table 6: Log-likelihood results for CDMs in SACE and NACE

In contrast to the comparison between TACE and NACE, the statistics confirm that hypotactic CDMs were significantly underused while paratactic CDMs and CDMs in general were significantly overused by SAE against NAE, contradicting the finding reported in Rodríguez-Vergara (2015). This finding also contradicts Ayora (1977) and López-Guix & Wilkinson (2001), who imply that English displays a strong paratactic tendency while Spanish has a more hypotactic structure. In the case of the Spanish-speaking group, this finding may reflect some, possibly unconscious, influence of the target language on the authors’ mother tongue. Nonetheless, the difference in frequency between the two categories of markers was lower in TACE than in NACE and in SACE.

Overall, all the three corpora could be identified as similar with regards to the frequency distribution of individual CDMs: both native and non-native academic authors tended to rely on certain CDMs in their writing and ignored the others regardless of their categories. Thus, the hypotactic CDMs found in the three corpora comprised albeit, although, even if, even though, except, in comparison, though, whereas and while, most of which were already reported in Malá (2006). However, six of them were almost never employed by the authors. The related distribution of the items in question is displayed in Figure 1:

Figure 1. Hypotactic Discourse Markers in Three Corpora

As can be seen in Figure 1, a couple of hypotactic CDMs were predominantly used by all groups. In particular, while and although were the most frequent CDMs in this regard, confirming Povolná's (2012b) findings. Though not as frequent as while and although, whereas was found to be slightly more frequent than the other items in this group. The following are statements extracted from each corpus to exemplify the most commonly employed items:

[Although spelling out words may occur in classrooms, particularly language classrooms, it would appear to be very uncommon in other interactional settings.] Extracted from <NACE-NU-2011-AB>

[While cooperating teachers adopted a traditional grammar-based language teaching, the student teachers employed more communicative activities in the practicum classrooms.] Extracted from <TACE-AU-2010-MA>

[Iconical texts emphasise the visual nature of the written word, while, at the same time, they connect their meaning to a precise diegesis.] Extracted from <SACE-UZ-2012-ARJ>

Paratactic CDMs, on the other hand, were found more than twice as frequently as hypotactic CDMs in both NACE and SACE, and approximately twice as frequently in TACE. As in the case of hypotactic CDMs, a limited number of paratactic CDMs were extensively documented in the three corpora (e.g. but and however). It is noteworthy that but was significantly underused by TAE in comparison with NAE and SAE. The individual distribution of paratactic CDMs across the three corpora is illustrated in Figure 2:

Figure 2: Paratactic discourse markers in three corpora

Similar to the case reported in Povolná (2012b), the most typical paratactic marker but was found more than twice as frequently as its hypotactic counterpart while. The following statements are drawn from each corpus to exemplify the most frequently used items in this category:

[Jackie was not teaching LEP students at the time of the study, but in past years, LEP students had enrolled in her classes.] Extracted from <NACE-UI-2010-AH>

[However, she was able to cover this due to the alternatives she had during her teaching practice.] Extracted from <TACE-GU-2010-CB>

[As my parents reiterated to my brother and I, we were Americans but came from a Mexican background.] Extracted from <SACE-SU-2010-DMM>

Another finding of the current study is that the paratactic CDM on the other hand was infrequently used by the native group, while it appeared more than five and four times per 10,000 words in TACE and SACE, respectively. This particular finding is also in line with the one previously reported by Povolná (2012b).

5 Conclusions

The current study was designed to scrutinize the use of contrastive discourse markers by native and non-native academic authors of English. Analysis of three subcorpora of doctoral dissertations written by native, Turkish-speaking and Spanish-speaking academic authors has indicated that CDMs were, in general, underused by TAE and overused by SAE as opposed to NAE. As for the categories of CDMs, both the native group and the non-native groups used paratactic CDMs much more frequently than their hypotactic counterparts in expressing contrastive relations. This finding was essentially determined by the predominant use of the paratactic CDMs but and however, which constituted over 65% of all CDMs falling into this category. Neither the native nor the non-native groups favoured their hypotactic counterparts although and even though, possibly striving for a certain simplicity in their wording, which is commonly advised in academic writing. Specifically, these findings have revealed that TAE tend to slightly overuse hypotactic CDMs and to underuse paratactic CDMs against their native counterparts. The overuse of hypotactic CDMs was at least partly determined by the extensive use of the hypotactic CDM while by the Turkish-speaking group. This specific finding could be considered unexpected in view of the fact that Turkish is a parataxis-oriented language, especially in spoken registers. However, considering the possibility that the Turkish-speaking authors preferred subordination, which sounds more formal and sophisticated and is commonly advised in academic writing courses, to coordination, we may just as well interpret this finding as an expected one.

Contrary to the case in TACE, and in opposition to Ayora (1977) and López-Guix & Wilkinson (2001), hypotactic CDMs were significantly underused and paratactic CDMs were overused by SAE against NAE, indicating that the Spanish-speaking group developed new writing habits in the target language, which favours parataxis.

Finally, different tendencies were observed among the groups in the choice of individual CDMs, which may be attributed to individual differences in knowledge, preferences in writing habits, overt instruction and frequency of exposure to the language items. Namely, both native and non-native groups may tend to use the markers they hear or read most frequently. This explanation is supported by an analysis of the British National Corpus (BNC) and the Corpus of Contemporary American English (COCA): the hypotactic CDMs although and while, for instance, occurred only 42,242 and 54,067 times, respectively, while their paratactic counterparts but and however appeared as often as 440,000 and 59,003 times, respectively, across the BNC. That means that the paratactic pair was found approximately five times as frequently as the hypotactic CDMs in question. Likewise, although and while were found only 137,436 and 349,779 times, respectively, in COCA, whereas but and however were found 2,400,943 and 180,568 times, respectively, in the same corpus, indicating that the paratactic pair was used approximately 5.3 times as frequently as their hypotactic counterparts. Non-native speakers in particular are likely to be exposed to certain markers much more frequently than native speakers, both during classes and in the written materials they encounter. As for practical implications for academic writing, following Povolná (2012: 146), it is suggested that CDMs and their correct use should be given sufficient attention in students’ education, notably at advanced levels of language learning.
Kourilova-Urbanczik (2012: 114), on the other hand, suggests that a proper clause structure reflects the academic quality of written expository discourse and that shortcomings are frequently caused by non-native speakers’ pragmalinguistic and socio-cultural failure to master the organizational conventions of discourse processing and the broad repertoire of devices of the English modality system. She states that these phenomena fall within the flexible area of hidden grammar, subconsciously absorbed by native speakers, yet very difficult to teach and learn by non-native users of the language. So, in order to overcome this challenge especially in academic writing, it is here suggested that research papers written by both native and non-native academic authors and published in qualified journals should be used as teaching materials particularly for graduate students who are expected to produce report papers or theses as part of their education. These students should also be trained to analyze those works in terms of certain textual criteria in order to acquire a better understanding of academic writing and revise their own papers accordingly.
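
The paratactic-to-hypotactic ratios reported for the BNC and COCA earlier in this section can be re-derived from the quoted counts. The short sketch below does only that arithmetic; the function name is a hypothetical helper, and the figures are the ones cited in the text rather than fresh corpus queries.

```python
def pair_ratio(paratactic_counts, hypotactic_counts):
    """Ratio of summed paratactic counts to summed hypotactic counts."""
    return sum(paratactic_counts) / sum(hypotactic_counts)


# Counts as quoted in the text (but, however) vs. the two hypotactic items.
bnc_ratio = pair_ratio([440_000, 59_003], [42_242, 54_067])
coca_ratio = pair_ratio([2_400_943, 180_568], [137_436, 349_779])

print(round(bnc_ratio, 2), round(coca_ratio, 2))
```

Both ratios come out slightly above five, in line with the approximate figures given for the two reference corpora.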

The current study is limited to the investigation of contrastive discourse markers across doctoral dissertations written by Turkish-speaking, Spanish-speaking and native academic authors of English between 2005 and 2012. It might be extended to explore other types of discourse markers (e.g. elaborative and inferential discourse markers) across spoken or written productions of non-native academic authors from L1 backgrounds other than Turkish and Spanish. It is also limited to the fields of English language teaching, English language and literature, applied linguistics and modern languages, in which the dissertations were produced; hence, further studies might analyse broader corpora comprising dissertations or research papers submitted or published in other disciplines. Another limitation of the present study is that it does not specifically deal with the reasons why certain contrastive discourse markers were overwhelmingly preferred by both native and non-native academic authors of English while other discourse markers rarely appeared in their works. So, it would be worthwhile to identify the factors that may cause the overuse or underuse of specific contrastive discourse markers by non-native academic authors of English and the reasons why native academic authors of English tend to use certain CDMs more frequently than others.


* It is noteworthy that for establishing our corpus, the authors of the doctoral dissertations in question needed to be asked for their permission before their studies could be included in our corpus. It took a relatively long while to receive their positive replies and to build the three sets of subcorpora, which explains the aforementioned coverage period of our corpus.

** LOCNESS is a corpus of native English argumentative essays that comprises British pupils' A-Level essays, British university students' essays and American university students' essays. It was compiled by scholars at the Centre for English Corpus Linguistics (CECL), Université Catholique de Louvain, Belgium, in 1995, and was used in this particular study, with official permission obtained from CECL, in order to investigate CDMs which were not found in the native and non-native corpora used for our study.


