Comparing Alternative Scoring Methods for a Language Disorder Screening Task
Researcher: Madison Burton (Major: Speech Pathology & Audiology)
Research Mentor: Gerard H. Poll
Developmental Language Disorder (DLD) is a communication disorder that impacts learning and the use of language. DLD can cause problems with spoken language and understanding.
Approximately 7.6% of all children have DLD.
Sentence repetition (SR) is a promising task for screening people with DLD, but it needs to be made more accurate for adults.
Sentence repetition tasks can be diagnostically helpful for screening adults, and there are multiple ways a sentence repetition task can be scored.
Scoring methods can directly impact diagnostic accuracy.
Diagnostic accuracy can directly influence factors like reliability and validity of scoring systems.
Diagnostic accuracy reflects how well scores capture the true condition being evaluated in an SR task.
The choice of a scoring method in a study can also affect the diagnostic accuracy when assessing SR tasks.
Very few studies have compared scoring methods for advanced sentences for adults with DLD.
Ward et al. (2024) conducted a systematic review that compared scoring across different studies, rather than within a single study.
Ward et al. (2024) found little difference in the effects of scoring techniques.
In this study, the scoring approaches examined are % of words, 0/1 scoring, and CELF scoring.
Scoring methods have a variety of practical and analytical consequences.
Scoring methods can vary in terms of their efficiency and practicality, which can impact reliability.
Example: For very challenging sentences, the 0/1 scoring would mean that everyone gets a 0 on the test, but % accuracy or CELF scoring might demonstrate a big difference.
Prior studies have not evaluated the effect of scoring method on sentence conditions from easy to hard.
Do different scoring approaches (% of words, 0/1, CELF) differentiate between at-risk and neurotypical groups?
% of words: Calculated by dividing the number of words correctly repeated by the total number of words presented.
0/1: Score of 1 is given for complete accuracy and 0 points for at least one error.
CELF (0123): Score of 3 points is given for complete accuracy, 2 points for one error, 1 point for two/three errors, and 0 points for four or more errors.
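The three scoring rules defined above can be sketched in a few lines of Python. This is a minimal illustration, not the scoring procedure used in the study: the function names are hypothetical, and the sketch assumes a rater has already counted the words repeated correctly and the number of errors for each sentence (how errors are identified is not specified here).

```python
def percent_words_score(words_correct: int, words_presented: int) -> float:
    """% of words: words repeated correctly divided by words presented, times 100."""
    if words_presented <= 0:
        raise ValueError("words_presented must be positive")
    return 100.0 * words_correct / words_presented


def binary_score(errors: int) -> int:
    """0/1 scoring: 1 for a fully accurate repetition, 0 for one or more errors."""
    return 1 if errors == 0 else 0


def celf_score(errors: int) -> int:
    """CELF (0123) scoring: 3 = no errors, 2 = one error,
    1 = two or three errors, 0 = four or more errors."""
    if errors == 0:
        return 3
    if errors == 1:
        return 2
    if errors <= 3:
        return 1
    return 0


# Hypothetical response to a 16-word sentence: 14 words correct, 2 errors.
print(percent_words_score(14, 16))  # 87.5
print(binary_score(2))              # 0
print(celf_score(2))                # 1
```

In this made-up response, 0/1 scoring records only a failure, while % of words and CELF scoring both retain partial credit, which is the contrast the scoring comparison depends on.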
45 total participants.
25 neurotypical young adults and 20 at risk young adults participated in this study.
Utilized three scoring methods (% of words, 0/1 scoring, and CELF scoring) to analyze group differences between the neurotypical and at-risk groups.
Analyzed effect sizes for 3 different conditions.
Examined floor and ceiling effects.
Analyzed group differences between neurotypical young adults (group 0) and at-risk young adults (group 1).
At-risk status was defined by either scores on a DLD test battery or self-reported functional effects of DLD (e.g., problems reading, problems in school).
Analyzed group differences and effect sizes for the conditions 15, 35 & 45.
All passive sentences.
Condition 15: Passive sentence, 8 words total, “easiest.”
Example: “The disorderly cupboards were rearranged by the crews.”
Condition 35: Passive sentence, 11 words total, most discriminating with % scoring.
Example: “Columns were stared at by sinners and sparrows adored by starters.”
Condition 45: Passive sentence, 16 words total, “hardest.”
Example: “The vineyard was supplied with profitable antiques and profits were reviewed for feedback by irrational champions.”
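The group comparison for each condition can be illustrated with a standardized mean difference. The sketch below computes Cohen's d with a pooled standard deviation. This is an assumption made for illustration: the specific effect size statistic used in the study is not stated here, and the function name and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: difference in group means divided by the pooled standard deviation.

    Assumed effect size measure; the study's actual statistic may differ.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two scores")
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; d is undefined")
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Made-up scores for one condition under one scoring method.
neurotypical = [2, 4, 6]
at_risk = [1, 3, 5]
print(cohens_d(neurotypical, at_risk))  # 0.5
```

Running the same calculation once per scoring method and condition would give a grid of effect sizes that can be compared directly, since d is expressed in standard deviation units rather than in the units of any one scoring scale.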
% of words scoring method was the best for the most difficult sentences only (condition 45).
0/1 scoring method was worst in all three conditions (the least amount of spread), but is a faster way to score compared to the other methods.
CELF scoring method: Best scoring method, a happy medium compared to the other methods.
Able to be scored faster than % of words.
Better validity than 0/1 scoring.
Condition 15 had ceiling effects for all 3 scoring methods.
Condition 45 had floor effects for 0/1 scoring method and CELF scoring method.
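One simple way to check for floor and ceiling effects is to compute the share of participants who receive the lowest or highest possible score under a given scoring method. The sketch below is an illustration under that assumption. The function name and data are hypothetical, and the criterion the study used to decide that an effect was present is not stated here.

```python
def floor_ceiling_proportions(scores, min_score, max_score):
    """Return (proportion at the minimum score, proportion at the maximum score)."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    at_floor = sum(1 for s in scores if s == min_score) / n
    at_ceiling = sum(1 for s in scores if s == max_score) / n
    return at_floor, at_ceiling


# Made-up CELF scores (range 0 to 3) for a single hard sentence.
celf_scores = [0, 0, 1, 3]
print(floor_ceiling_proportions(celf_scores, min_score=0, max_score=3))  # (0.5, 0.25)
```

A large proportion at the minimum suggests a floor effect and a large proportion at the maximum suggests a ceiling effect. In either case the scores have little spread left to separate the groups.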
Facilitates the use of analytical techniques for rating scales.
Enables us to systematically identify sentences that are well targeted to participants.
This poster was presented at the 2025 Undergraduate Research Forum (all of its content is reproduced on this page).
Haug, T., Batty, A. O., Venetz, M., Notter, C., Girard-Groeber, S., Knoch, U., & Audeoud, M. (2020). Validity evidence for a sentence repetition test of Swiss German Sign Language. Language Testing, 37(3), 412-434. https://doi.org/10.1177/0265532219898382
Marinis, T. & Armon-Lotem, S. (2015). Sentence Repetition. 1-33. https://www.researchgate.net/publication/287252052
Stokes, S. F., Wong, A. M., Fletcher, P., & Leonard, L. B. (2006). Nonword Repetition and Sentence Repetition as Clinical Markers of Specific Language Impairment: The Case of Cantonese. Journal of Speech, Language, and Hearing Research, 49, 219-236
Ward, L., Polišenská, K., & Bannard, C. (2024). Sentence Repetition as a Diagnostic Tool for Developmental Language Disorder: A Systematic Review and Meta-Analysis. Journal of Speech, Language, and Hearing Research, 1-31. https://doi.org/10.1044/2024_JSLHR-23-00490