Articulates the need for and the purpose of an assessment.
Selects and applies an assessment approach appropriate for the assessment need.
Articulates a standard setting approach for the proposed assessment.
Articulates likely sources of threats to validity and proposes measures to mitigate threats.
Gathers validity evidence for the results of the assessment based on its intended use.
"Assessment" refers to measuring some characteristic of an individual. These are often categorized as knowledge, attitudes (values) and behaviors. Each characteristic may require a different assessment method, such as a multiple choice test, an attitude questionnaire, a standardized patient, a case analysis, an EPA and so on.
An important component of competence in Assessment is selecting or creating an appropriate assessment method for the outcome variables you care about. However, because there is seldom a perfect way to assess anything, you will need to select among multiple assessment options and DEFEND that choice in your EPA write-up.
Why did you choose one assessment method over another?
What are the strengths and weaknesses of the alternative methods?
MHPE Alumna
Clinical Professor
Department of Internal Medicine, Hospital Medicine
University of Michigan Medical School
Jen Stojan is co-lead for the AAMC Group on Educational Affairs (GEA) project on Clinical Skills Assessment and Standardization (CLASS), which examines the future of assessment now that USMLE Step 2 CS has been discontinued: how will people respond to not having these data when making judgments about clinical skills?
The session walks us through the sequential logic of defining and developing a novel assessment of a given outcome. Scan the ACGME document beforehand for an overview of established assessment methods.
Reconsidering the focus on "outcomes research" in medical education: A cautionary note. Cook DA, West CP. Acad Med 2013;88(2):162-67. doi: 10.1097/ACM.0b013e31827c3d78.
Steps in developing an assessment of an educational outcome. Larry Gruppen
A model for programmatic assessment fit for purpose. Van Der Vleuten CPM, Schuwirth LWT, et al. Med Teach 2012;34:205-14.
Narrative Assessment Online Discussion. Facilitator: Gurjit Sandhu
Cognitive, social and environmental sources of bias in clinical performance ratings. Williams RG, Klamen DA, McGaghie WC. Teach Learn Med 2003;15(4):270-92.
Assessment in health professions education. Downing SM, Yudkowsky R. 2009. New York: Taylor and Francis. (ISBN 10:0-8058-6128-9)
How to assess doctors and health professionals. 2013. Davis M, Forrest K, McKimm J. West Sussex, England. Wiley-Blackwell Publishing. (ISBN 978-1-4443-3056-4)
Assessment in higher education: issues of access, quality, student development, and public policy. 2016. Messick SJ (Ed). Routledge. (ISBN: 1138987611)
Effective Grading: A Tool for Learning and Assessment in College, 2nd ed. Walvoord BE and Anderson VJ. San Francisco: Jossey-Bass, 2010, p. 13. (ISBN: 0470502150)
Advancing resident assessment in graduate medical education. Swing SR, Clyman SG, et al. J Grad Med Ed 2009;278-286
How to develop a competency-based examination blueprint for longitudinal standardized patient clinical skills assessments. Mookherjee S, Chang A, et al. Med Teach 2013;35:883-90. doi: 10.3109/0142159X
A blueprint to measure professionalism. Wilkinson TJ, et al. Acad Med 2009;84(5):551-58
Blueprinting for the assessment of health care professionals. Hamdy H. Clin Teach 2006;3:175-79
Workplace-based assessment for general practitioners: using stakeholder perception to aid blueprinting of an assessment battery. Murphy DJ, Bruce D, Eva K. Med Educ 2008;42:96-103
Developing the blueprint for a general surgery technical skills certification examination: A validation study. de Montbrun S, Louridas M, et al. J Surg Educ 2017;75(2):344-350
NEW! NBME Item Writing Guide: Constructing Written Test Questions for the Health Sciences. NBME 2021 Philadelphia
Item Writing Slides by Chris Orem, Jerusha Gerstner, Christine DeMars
Psychometrics is the field of study concerned with the theory and technique of psychological measurement, which includes the measurement of knowledge, abilities, attitudes, and personality traits. The field is primarily concerned with the study of differences between individuals. It involves two major research tasks: (i) the construction of instruments and procedures for measurement; and (ii) the development and refinement of theoretical approaches to measurement.
Generalizability theory for the perplexed: A practical introduction and guide: AMEE guide No. 68. Bloch R, Norman G. Med Teach 2012;34:960-92.
Reliability: on the reproducibility of assessment data. Downing SM. Med Educ 2004;38:1006-12. doi: 10.1046/j.1365-2929.2004.01932.x
Online Discussion. Facilitator: Deb Rooney (December 7, 2016)
Online Discussion. Facilitator: Steve Kasten (April 4, 2018)
Slide presentation - Validity by Steve Kasten
Consequences validity evidence: Evaluating the impact of educational assessments. Cook DA, Lineberry M. Acad Med 2016;91(6):785.
When I say...validity. Cook DA. Med Educ 2014;48:948-49.
Validity evidence sources Table 2.2. Downing SM, Haladyna TM. Validity and its threats. In: Downing SM, Yudkowsky R (eds) Assessment in health professions education. New York: Routledge, 2009 p.30.
Validity: on the meaningful interpretation of assessment data. Downing SM. Med Educ 2003;37:830-37.
Validity threats: overcoming interference with proposed interpretations of assessment data. Downing SM, Haladyna TM. Med Educ 2004;38:327-33.
Validity in work-based assessment: expanding our horizons. Govaerts M, van der Vleuten CPM. Med Educ 2013;47:1164-74.
Facilitator: Larry Gruppen
Date: January 10, 2018
Developing questionnaires for educational research: AMEE guide No. 87. Artino AR, La Rochelle JS, et al. Med Teach 2014;36:463-474
MHPE Summer Retreat 2021
Director of Education and Research, Clinical Simulation Center
Associate Professor
Department of Learning Health Sciences
University of Michigan
Participants will be able to:
Become comfortable with general standard setting practices
Identify differences between two common standard setting practices
Make informed decisions about selecting the right standard setting practice for your setting
Why do we avoid standard setting?
When might you perform standard setting?
How do you choose a standard setting process?
In most instances, it doesn’t matter which standard setting method you use. What matters is that you are transparent in your process to ensure defensibility of the results/decision
Standard setting doesn’t have to be onerous
Standard setting is important to ensure defensibility of the results/decision
Standard setting is a continuous/ongoing process
Research Methodology: Procedures for Establishing Defensible Absolute Passing Scores on Performance Examinations in Health Professions Education. Downing SM, et al. Teaching and Learning in Medicine. https://doi.org/10.1207/s15328015tlm1801_11
Mckinley DW, Norcini JJ: How to set standards on performance-based examinations: AMEE Guide 85. Med Teach 2014; 36(2):97-110
Yudkowsky R, Downing S, Tekian A: Standard Setting. In Downing SM and Yudkowsky R (eds): Assessment in Health Professions Education, New York and London: Routledge 2009
Cizek G (ed): Setting Performance Standards: Concepts, Methods and Perspectives. New Jersey and London: Lawrence Erlbaum Associates, 2001
Standard setting is the methodology used to define different levels of achievement or proficiency
Formal standard setting is not a common opportunity; it is typically embedded in institutional policy or an institutional 'judgment' rather than left to individual judgments.
If assessment is important, then it is worthwhile to make it defensible and reproducible, not arbitrary.
Standard setting is a part of the competencies of the MHPE and figures into several of the EPAs.
Standard setting requires qualified judges and a method for summarizing judgments across those judges. Selecting judges is critical: they should represent the stakeholders and be familiar with the educational/assessment process.
Judges will still need to be trained in the method of making and capturing their judgments.
There are two types of standards: relative, referenced to the performance of other examinees, and absolute, referenced to a fixed score. A relative standard can move up or down depending on the performance of the group; an absolute standard is always the same number.
Various methods for setting standards: Angoff, Hofstee, contrasting/borderline groups (and many others)
Angoff method: each judge estimates the probability that a borderline student would answer each item in the assessment correctly; averaging these estimates across items and judges yields the cut score (a worked sketch follows these notes).
Hofstee method: combines judgments about the acceptable range of passing scores and the acceptable range of failure rates, aggregated across multiple judges; the final cut score is derived from these judgments together with actual student performance data.
Whatever method is used, the process is iterative and should be updated periodically with new judge estimates.
More resources are available on the MHPE web page under the Assessment competency and EPAs 04, 05, 06, and 07.
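To make the Angoff arithmetic above concrete, here is a minimal sketch in Python. The judge ratings (five judges rating six items), the number of items, and the percent scale are all hypothetical; a real standard setting exercise would also include judge training, discussion between rating rounds, and a reality check against actual examinee performance.

```python
# Minimal sketch of an Angoff cut-score calculation (hypothetical data).
# Each judge estimates the probability that a *borderline* examinee would
# answer each item correctly; the recommended cut score is the average of
# those estimates, expressed on the test's raw-score and percent scales.

# Rows = judges, columns = items (5 judges x 6 items; all values invented).
ratings = [
    [0.60, 0.75, 0.40, 0.80, 0.55, 0.70],
    [0.65, 0.70, 0.50, 0.85, 0.60, 0.65],
    [0.55, 0.80, 0.45, 0.75, 0.50, 0.70],
    [0.70, 0.70, 0.40, 0.80, 0.65, 0.60],
    [0.60, 0.75, 0.50, 0.90, 0.55, 0.65],
]

n_items = len(ratings[0])

# Each judge's expected raw score for a borderline examinee.
judge_cuts = [sum(row) for row in ratings]

# Averaging across judges gives the recommended cut score.
cut_raw = sum(judge_cuts) / len(judge_cuts)
cut_percent = 100 * cut_raw / n_items

print(f"Recommended Angoff cut score: {cut_raw:.2f} of {n_items} items "
      f"({cut_percent:.0f}%)")
```

For contrast, a relative standard (for example, one standard deviation below the cohort mean) would be recomputed for each group of examinees, which is why it can move up or down while an absolute standard stays fixed.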
Evidence based standard setting: establishing cut scores by integrating research evidence with expert content judgments. Beimers JN, Way WD, et al. Pearson Bulletin January 2012: issue 21 .
A comparison of Angoff and Bookmark standard setting methods. Buckendahl CW, Smith RW, et al. J Ed Measurement 2002;39(3):253-63.
Standard-setting guidelines. Cizek GJ. Educ Measure: Issues and Practice 1996;13-22
Setting performance standards on complex educational assessments. Hambleton RK, Jaeger RM, et al. Applied Psychol Measurement 2000;24(4):355-66.
A comparative study of standard-setting methods. Livingston SA, Zieky MJ. Applied Measurement in Educ 1989;2(2):121-141
Standard setting: Does using mixed methods help? Slides by Monica Lypson
Setting standards on educational tests. Norcini JJ. Med Educ 2003;37:464-69
Use of the Rasch IRT model in standard setting: an item-mapping method. Wang N. J Educ Measure 2003;40(3):231-53.
Standard Setting, Chapter 6. Yudkowsky R, Downing SM, Tekian A. pp. 119-148. In: Assessment in health professions education. Downing SM, Yudkowsky R. 2009. New York: Taylor and Francis. (ISBN 10:0-8058-6128-9)
Setting Performance Standards: Foundations, Methods, and Innovations, 2nd ed. Cizek GJ (Ed). Routledge, 2012. (ISBN: 041588148X)
Standards for Educational and Psychological Testing. American Psychological Association, American Educational Research Association, and National Council on Measurement in Education: Joint Committee on Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association, 2014. (ISBN: 0935302352)
Current perspectives in assessment: the assessment of performance at work. Norcini JJ. Med Educ 2005;39:880-89. doi: 10.1111/j.1365-2929.2005.02182.x
Workplace assessment. Norcini JJ. In: Understanding Medical Education: Evidence, Theory and Practice. Tim Swanwick (ed). 2010. 232-245. ISBN: 978-1-405-19680-2.