Articulates the need for and the purpose of an assessment.
Selects and applies an assessment approach appropriate for the assessment need.
Articulates a standard setting approach for the proposed assessment.
Articulates likely sources of threats to validity and proposes measures to mitigate threats.
Gathers validity evidence for the results of the assessment based on its intended use.
"Assessment" refers to measuring some characteristic of an individual. These are often categorized as knowledge, attitudes (values) and behaviors. Each characteristic may require a different assessment method, such as a multiple choice test, an attitude questionnaire, a standardized patient, a case analysis, an EPA and so on.
An important component of competence in Assessment is selecting or creating an appropriate assessment method for the outcome variables you care about. However, because there is seldom a perfect way to assess anything, you will need to select among multiple assessment options and DEFEND that choice in your EPA write-up.
Why did you choose one assessment method over another?
What are the strengths and weaknesses of the alternative methods?
MHPE Alumna
Clinical Professor
Department of Internal Medicine, Hospital Medicine
University of Michigan Medical School
Jen Stojan is co-lead for the AAMC Group on Educational Affairs (GEA) project on Clinical Skills Assessment and Standardization (CLASS), which examines the future of assessment now that USMLE Step 2 CS has been discontinued: how will people respond to not having these data when making judgments about clinical skills?
The session walks us through the sequential logic of defining and developing a novel assessment of a given outcome. Scan the ACGME document beforehand for an overview of established assessment methods.
Reconsidering the focus on "outcomes research" in medical education: A cautionary note. Cook DA, West CP. Acad Med 2013;88(2):162-67. doi: 10.1097/ACM.0b013e31827c3d78.
Steps in developing an assessment of an educational outcome. Larry Gruppen
A model for programmatic assessment fit for purpose. Van Der Vleuten CPM, Schuwirth LWT, et al. Med Teach 2012;34:205-14.
Narrative Assessment Online Discussion. Facilitator: Gurjit Sandhu
Cognitive, social and environmental sources of bias in clinical performance ratings. Williams RG, Klamen DA, McGaghie WC. Teach Learn Med 2003;15(4):270-92.
Assessment in health professions education. Downing SM, Yudkowsky R. 2009. New York: Taylor and Francis. (ISBN 10:0-8058-6128-9)
How to assess doctors and health professionals. 2013. Davis M, Forrest K, McKimm J. West Sussex, England. Wiley-Blackwell Publishing. (ISBN 978-1-4443-3056-4)
Assessment in higher education: issues of access, quality, student development, and public policy. 2016. Messick SJ (Ed). Routledge. (ISBN: 1138987611)
Effective Grading: A Tool for Learning and Assessment in College, 2nd ed. Walvoord BE and Anderson VJ. San Francisco: Jossey-Bass, 2010, p. 13. (ISBN: 0470502150)
Advancing resident assessment in graduate medical education. Swing SR, Clyman SG, et al. J Grad Med Ed 2009;278-286
How to develop a competency-based examination blueprint for longitudinal standardized patient clinical skills assessments. Mookherjee S, Chang A, et al. Med Teach 2013;35:883-90. doi: 10.3109/0142159X
A blueprint to measure professionalism. Wilkinson TJ, et al. Acad Med 2009;84(5):551-58
Blueprinting for the assessment of health care professionals. Hamdy H. Clin Teach 2006;3:175-79
Workplace-based assessment for general practitioners: using stakeholder perception to aid blueprinting of an assessment battery. Murphy DJ, Bruce D, Eva K. Med Educ 2008;42:96-103
Developing the blueprint for a general surgery technical skills certification examination: A validation study. de Montbrun S, Louridas M, et al. J Surg Educ 2017;75(2):344-350
NEW! NBME Item Writing Guide: Constructing Written Test Questions for the Health Sciences. NBME 2021 Philadelphia
Item Writing Slides by Chris Orem, Jerusha Gerstner, Christine DeMars
Psychometrics is the field of study concerned with the theory and technique of psychological measurement, which includes the measurement of knowledge, abilities, attitudes, and personality traits. The field is primarily concerned with the study of differences between individuals. It involves two major research tasks: (i) the construction of instruments and procedures for measurement; and (ii) the development and refinement of theoretical approaches to measurement.
Generalizability theory for the perplexed: A practical introduction and guide: AMEE guide No. 68. Bloch R, Norman G. Med Teach 2012;34:960-92.
Reliability: on the reproducibility of assessment data. Downing SM. Med Educ 2004;38:1006-12. doi: 10.1046/j.1365-2929.2004.01932.x
Online Discussion. Facilitator: Deb Rooney (December 7, 2016)
Online Discussion. Facilitator: Steve Kasten (April 4, 2018)
Slide presentation - Validity by Steve Kasten
Consequences validity evidence: Evaluating the impact of educational assessments. Cook DA, Lineberry M. Acad Med 2016;91(6):785.
When I say...validity. Cook DA. Med Educ 2014;48:948-49.
Validity evidence sources Table 2.2. Downing SM, Haladyna TM. Validity and its threats. In: Downing SM, Yudkowsky R (eds) Assessment in health professions education. New York: Routledge, 2009 p.30.
Validity: on the meaningful interpretation of assessment data. Downing SM. Med Educ 2003;37:830-37.
Validity threats: overcoming interference with proposed interpretations of assessment data. Downing SM, Haladyna TM. Med Educ 2004;38:327-33.
Validity in work-based assessment: expanding our horizons. Govaerts M, van der Vleuten CPM. Med Educ 2013;47:1164-74.
Facilitator: Larry Gruppen
Date: January 10, 2018
Developing questionnaires for educational research: AMEE guide No. 87. Artino AR, La Rochelle JS, et al. Med Teach 2014;36:463-474
MHPE Summer Retreat 2021
Director of Education and Research, Clinical Simulation Center
Associate Professor
Department of Learning Health Sciences
University of Michigan
Participants will be able to:
Become comfortable with general standard setting practices
Identify differences between two common standard setting practices
Make informed decisions about selecting the right standard setting practice for your setting
Why do we avoid standard setting?
When might you perform standard setting?
How do you choose a standard setting process?
In most instances, it doesn’t matter which standard setting method you use. What matters is that you are transparent in your process to ensure defensibility of the results/decision
Standard setting doesn’t have to be onerous
Standard setting is important to ensure defensibility of the results/decision
Standard setting is a continuous/ongoing process
Research Methodology: Procedures for Establishing Defensible Absolute Passing Scores on Performance Examinations in Health Professions Education. Downing SM, et al. Teaching and Learning in Medicine. https://doi.org/10.1207/s15328015tlm1801_11
Mckinley DW, Norcini JJ: How to set standards on performance-based examinations: AMEE Guide 85. Med Teach 2014; 36(2):97-110
Yudkowsky R, Downing S, Tekian A: Standard Setting. In Downing SM and Yudkowsky R (eds): Assessment in Health Professions Education, New York and London: Routledge 2009
Cizek G (ed): Setting Performance Standards: Concepts, Methods and Perspectives. New Jersey and London: Lawrence Erlbaum Associates, 2001
Standard setting is the methodology used to define different levels of achievement or proficiency
Formal standard setting is not a common opportunity; it is typically embedded in institutional policy or an institutional 'judgment' rather than left to individual judgments.
If assessment is important, then it is worthwhile to make it defensible and reproducible, not arbitrary.
Standard setting is a part of the competencies of the MHPE and figures into several of the EPAs.
Standard setting requires qualified judges and a method for summarizing judgments across those judges. Selecting judges is critical: they should represent the stakeholders and be familiar with the educational/assessment process.
Judges will still need to be trained in the method of making and capturing their judgments.
There are two types of standards: relative, referenced to the performance of other examinees, and absolute, referenced to a fixed score. A relative standard can move up or down depending on the performance of the group; an absolute standard is always the same number.
Various methods for setting standards: Angoff, Hofstee, contrasting/borderline groups (and many others)
Angoff method: each judge estimates the probability that a borderline student would answer each item in the assessment correctly; averaging these estimates across items and judges yields the cut score (a worked sketch follows these notes).
Hofstee method: combines judgments about the acceptable range of passing scores and the acceptable range of failure rates, aggregated across multiple judges; the final cut score is derived from these judgments together with actual student performance data.
Whatever method is used, the process is iterative and should be updated periodically with new judge estimates.
More resources are available on the MHPE web page under the Assessment competency and EPAs 04, 05, 06, and 07.
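To make the Angoff arithmetic above concrete, here is a minimal sketch in Python. The judge ratings (five judges rating six items), the number of items, and the percent scale are all hypothetical; a real standard setting exercise would also include judge training, discussion between rating rounds, and a reality check against actual examinee performance.

```python
# Minimal sketch of an Angoff cut-score calculation (hypothetical data).
# Each judge estimates the probability that a *borderline* examinee would
# answer each item correctly; the recommended cut score is the average of
# those estimates, expressed on the test's raw-score and percent scales.

# Rows = judges, columns = items (5 judges x 6 items; all values invented).
ratings = [
    [0.60, 0.75, 0.40, 0.80, 0.55, 0.70],
    [0.65, 0.70, 0.50, 0.85, 0.60, 0.65],
    [0.55, 0.80, 0.45, 0.75, 0.50, 0.70],
    [0.70, 0.70, 0.40, 0.80, 0.65, 0.60],
    [0.60, 0.75, 0.50, 0.90, 0.55, 0.65],
]

n_items = len(ratings[0])

# Each judge's expected raw score for a borderline examinee.
judge_cuts = [sum(row) for row in ratings]

# Averaging across judges gives the recommended cut score.
cut_raw = sum(judge_cuts) / len(judge_cuts)
cut_percent = 100 * cut_raw / n_items

print(f"Recommended Angoff cut score: {cut_raw:.2f} of {n_items} items "
      f"({cut_percent:.0f}%)")
```

For contrast, a relative standard (for example, one standard deviation below the cohort mean) would be recomputed for each group of examinees, which is why it can move up or down while an absolute standard stays fixed.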
Evidence based standard setting: establishing cut scores by integrating research evidence with expert content judgments. Beimers JN, Way WD, et al. Pearson Bulletin January 2012: issue 21 .
A comparison of Angoff and Bookmark standard setting methods. Buckendahl CW, Smith RW, et al. J Ed Measurement 2002;39(3):253-63.
Standard-setting guidelines. Cizek GJ. Educ Measure: Issues and Practice 1996;13-22
Setting performance standards on complex educational assessments. Hambleton RK, Jaeger RM, et al. Applied Psychol Measurement 2000;24(4):355-66.
A comparative study of standard-setting methods. Livingston SA, Zieky MJ. Applied Measurement in Educ 1989;2(2):121-141
Standard setting: Does using mixed methods help? Slides by Monica Lypson
Setting standards on educational tests. Norcini JJ. Med Educ 2003;37:464-69
Use of the Rasch IRT model in standard setting: an item-mapping method. Wang N. J Educ Measure 2003;40(3):231-53.
Standard Setting, Chapter 6. Yudkowsky R, Downing SM, Tekian A. pp. 119-148. In: Assessment in health professions education. Downing SM, Yudkowsky R. 2009. New York: Taylor and Francis. (ISBN 10:0-8058-6128-9)
Setting Performance Standards: Foundations, Methods, and Innovations, 2nd ed. Cizek GJ (Ed). Routledge, 2012. (ISBN: 041588148X)
Standards for Educational and Psychological Testing. American Psychological Association, American Educational Research Association, and National Council on Measurement in Education: Joint Committee on Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association, 2014. (ISBN: 0935302352)
Current perspectives in assessment: the assessment of performance at work. Norcini JJ. Med Educ 2005;39:880-89. doi: 10.1111/j.1365-2929.2005.02182.x
Workplace assessment. Norcini JJ. In: Understanding Medical Education: Evidence, Theory and Practice. Tim Swanwick (ed). 2010. 232-245. ISBN: 978-1-405-19680-2.