While incentives to improve quality of care have focused on providing more care, more is often not better. There is a growing consensus that overtreatment and overtesting expose some patients to services they may not need, may not prefer, or that may harm them, while also decreasing the availability of those services for patients who could benefit from them. Our work has shown that some patients with diabetes and hypertension are overtreated (Kerr 2012, Sussman 2015) and that guidelines currently focus much more on intensification than on deintensification (Markovitz 2017). Our current work focuses on ways to restore balance to existing incentives and norms that push clinicians to always do more, by fostering deintensification of routine medical services (Kerr 2016). A more balanced approach can actually improve quality of care by decreasing the risk of harm to patients while maintaining incentives to extend evidence-based treatments to those who will benefit from them (Kerr 2020).
Kerr 2012, Sussman 2015, Kerr 2016, Markovitz 2017, Kerr 2020
I have published a substantial body of work dating back to the mid-1990s on the use and misuse of hospital mortality rates. My contribution has been to show how measurement error can lead to significant misclassification when mortality rates are used as hospital performance measures (Hofer 1996) and when physician peer assessment is used to measure causal mechanisms, such as estimating the number of hospital deaths due to medical errors (Hayward 2001). Along with Hayward 2007, these studies are among several I have published showing that the measurement underlying the often-quoted estimates of large numbers of deaths from medical errors in US hospitals, which attributes deaths to poor-quality care, is so imprecise that the estimates of deaths attributable to errors are likely to be dramatically overstated. I continue to publish on the methodological issues in using mortality rates as performance measures in work that now spans more than 15 years (Mohammed 2012), and I have a related body of work that generalizes these methodological challenges, and potential solutions, to other outcomes used for performance measurement. The work illustrates the need to use appropriate analytic methods to measure important health outcomes in order to draw correct conclusions about more intangible concepts, such as death attributable to medical error, and ultimately about the appropriate quality improvement responses. We are one of several groups whose cumulative contributions in this area have been influential in leading large organizations, such as CMS and the VA, to adopt multilevel models in operational systems that use mortality rates for performance measurement. My related work (see below) on physician implicit review has characterized the difficulty of measuring preventable deaths, the presence and variation of which are a fundamental assumption of using mortality rates as a quality measure.
Hofer 1996, Hayward 2001, Hayward 2007, Girling 2012, Mohammed 2012
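To make the misclassification argument concrete, the sketch below is a minimal simulation of the underlying statistical point; the hospital counts, mortality rates, and variance parameters are illustrative assumptions rather than figures from the cited studies. It shows how, when true between-hospital differences are small relative to sampling noise, ranking hospitals on raw observed mortality flags many of the wrong hospitals, which is the problem the multilevel-modeling approach is designed to quantify.

    # Illustrative simulation (all parameters are assumptions, not figures from the
    # cited studies): when true between-hospital differences are small relative to
    # sampling noise, ranking hospitals on raw observed mortality misclassifies many.
    import numpy as np

    rng = np.random.default_rng(0)
    n_hosp, n_patients = 100, 200                 # hypothetical hospitals and cases each
    base_logit = np.log(0.1 / 0.9)                # ~10% average mortality
    true_effect = rng.normal(0, 0.15, n_hosp)     # modest true hospital effects (log-odds)
    p_true = 1 / (1 + np.exp(-(base_logit + true_effect)))
    observed_rate = rng.binomial(n_patients, p_true) / n_patients

    # Variance-components view of a single hospital's rate: signal vs sampling noise.
    noise_var = (observed_rate * (1 - observed_rate) / n_patients).mean()
    signal_var = max(observed_rate.var() - noise_var, 0.0)
    reliability = signal_var / (signal_var + noise_var)

    # How many of the truly worst-decile hospitals does the raw ranking actually flag?
    true_worst = set(np.argsort(p_true)[-10:])
    flagged = set(np.argsort(observed_rate)[-10:])
    print(f"reliability of one hospital's observed rate: {reliability:.2f}")
    print(f"truly worst-decile hospitals flagged by raw rates: {len(true_worst & flagged)}/10")

With assumed parameters in this range, the reliability of a single hospital's observed rate falls well below 0.5, so much of the apparent spread between hospitals, and many of the flagged outliers, reflect noise rather than true differences in quality.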
I have worked extensively in cancer health services research, most of it as a founding member of the University of Michigan Cancer Surveillance and Outcomes Center, in an almost 15-year close collaboration with investigators at the SEER registries in Georgia and California. I serve as a co-investigator and the design and analytic lead on four currently funded NCI and ACS grants, including (1) an NCI-funded, registry-based study of patterns of use of targeted therapies in metastatic cancers in diverse populations; (2) an NCI-funded randomized trial to improve patient-centered communication in breast cancer treatment decision-making; (3) a population-based randomized trial of a virtual solution to reduce gaps in genetic risk evaluation and management in families at high risk for hereditary cancer syndromes; and (4) a large ACS grant focusing on gaps in genetic risk prevention in breast cancer patients and their families. Our CANSORT research group negotiated a novel collaboration among all of the major private genetic testing companies, NCI, and IMS (NCI's honest broker) to link all genetic testing results to the SEER registries. We developed and piloted this as part of the Georgia-California SEER Genetic Testing Linkage Initiative (GeneLINK). Previously I served as the Director of the measurement and methodology core for an NCI Program Project (P01) grant focusing on the challenge of individualizing treatments for patients with breast cancer. I was also a co-investigator on an NCI grant on health system factors and patient outcomes in breast cancer, which examined the effect of organizational factors, including physician and practice attributes, on patient decisions about treatment for breast cancer and their satisfaction with care, as well as on several other grants examining racial and ethnic disparities in breast cancer treatment. We are also in the process of designing a large study to examine cancer care quality in the VA, with comparisons to non-VA community care. A selection of senior-authored publications relevant to my cancer health services experience appears below.
Katz 2010, Li 2017, Katz 2017, Katz 2018, Katz 2018
An additional contribution has been my work on the methodology and potential role of physician implicit review in quality measurement. Implicit review has fallen from favor in recent times, given perceptions that it is expensive and subjective, as well as hopes of using big data to do all performance measurement on the cheap. However, implicit review was one of the principal methods used in every foundational study of quality of care in the US healthcare system, and it was the method on which all of the findings of the IOM report about deaths in hospitals due to medical errors were based. In related forms it remains the core method of expert testimony in medical malpractice cases and of the review services provided by peer review organizations to federal, state, and local health agencies. Understanding the limitations of the method, as well as its potential niche, is important.
While my work has shown some of the problems with implicit review (Hofer 2000, Lilford 2007, and Hayward 2001 [also cited above]), it has also illustrated the analytic methods that can address some of those problems and the potential role for implicit review in quality measurement (Hofer 2004, Manseki-Holland 2017). In Hofer 2004, we were able to assess the degree of measurement error in these assessments of quality. Using analytic approaches that I had previously established were essential for separating true quality (signal) from measurement error (noise), we described the magnitude of signal versus noise in the quality measurements from this new tool and how they varied between conditions with a highly developed evidence base (diabetes and heart disease) and conditions with a smaller evidence base (COPD and acute short-course illnesses). The paper argues that structured implicit review could complement or substitute for explicit indicator-based methods (such as HEDIS) for quality assessment in conditions with a good evidence base. However, the largest need for alternatives to explicit indicator-based quality assessment is for conditions without a well-developed evidence base, and our findings suggest that peer assessment methods such as structured implicit review cannot be counted on to fill that gap. Most recently, I have been involved in this method's use to evaluate major shifts in staffing in the NHS (Bion 2021), and I have advised the NHS in England on its plans to use the method to search for preventable mortality in British hospitals.
Hofer 2000, Hayward 2001, Hofer 2004, Lilford 2007, Manseki-Holland 2019, Bion 2021
Much of my work has focused on how to measure health care quality and utilization in the multilevel organizational structures typical of our delivery system, where physicians care for panels of patients and are themselves clustered within hospitals or clinics. In a widely cited JAMA paper on physician profiling, we illustrated how provider reports can overstate the variability in provider practice and the ability to distinguish meaningfully between providers (Hofer 1999). The study was widely reported in the media, including the Associated Press, USA Today, NPR, and the CBC, and is part of a body of work that emphasizes the need to identify the organizational level at which variation occurs, both to assess the ability of performance monitoring systems to distinguish between providers and to identify the best targets for interventions to remove unwanted variability.
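As a concrete, hypothetical illustration of the profiling argument (the variance components and panel sizes below are assumed for exposition, not taken from Hofer 1999), the reliability of a physician's profile can be expressed as the share of between-physician (signal) variance in a panel mean; with plausible values, small panels leave most of the observed provider-to-provider spread attributable to noise.

    # Hypothetical variance components for a continuous quality score; the numbers
    # are chosen for illustration and are not taken from the cited profiling study.
    between_physician_var = 0.05   # true physician-to-physician (signal) variance
    within_physician_var = 1.00    # patient-to-patient (noise) variance within panels

    def profile_reliability(panel_size: int) -> float:
        """Reliability of a physician's panel mean: signal / (signal + noise / n)."""
        return between_physician_var / (between_physician_var + within_physician_var / panel_size)

    for n in (10, 25, 50, 100, 400):
        # 1 - reliability is roughly the share of the observed spread between panel
        # means that reflects sampling noise rather than true practice differences.
        print(f"panel size {n:4d}: reliability {profile_reliability(n):.2f}")

Under these assumed components, panels of a few dozen patients give reliabilities of roughly 0.3 to 0.6, which is the sense in which naive comparisons of panel means can overstate how different providers really are.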
In a more applied vein, I was a senior member of a group that developed and validated an automated APACHE-like severity measure and mortality monitoring system that extracts data elements from the clinical electronic databases that make up the VA electronic medical record (Render 2005). Based on this work, the VA established the Inpatient Evaluation Center (IPEC), and the approach we developed is now applied to every ICU patient in the VA system, spanning over 160 ICUs and well over 100,000 discharges per year. The goal is to facilitate research on the utility of profiling ICU outcomes on a system-wide scale, as well as the analysis of natural experiments to understand links between processes and outcomes and to guide quality improvement efforts. Some of my more recent work has focused on variability in ICU utilization (Chen 2012, Chen 2013).
Hofer 1999, Render 2005, Chen 2012, Chen 2013
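The monitoring logic itself is simple to sketch. The following is a minimal, hypothetical version of the observed-versus-expected calculation behind risk-adjusted mortality monitoring; the predictors, coefficients, and simulated data are placeholders and are not the variables or model used in the actual IPEC system.

    # Minimal, hypothetical sketch of the observed-vs-expected logic behind an
    # automated risk-adjusted ICU mortality monitor; predictors and data are
    # invented placeholders, not the VA IPEC variables or model.
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    n = 5000
    df = pd.DataFrame({
        "icu": rng.integers(0, 20, n),                    # 20 hypothetical ICUs
        "age": rng.normal(65, 12, n),
        "worst_creatinine": rng.lognormal(0.2, 0.4, n),   # lab value pulled from the EMR
        "mech_ventilation": rng.integers(0, 2, n),
    })
    # Simulated mortality driven by severity-related predictors only.
    logit = -6 + 0.04 * df["age"] + 0.6 * df["worst_creatinine"] + 1.2 * df["mech_ventilation"]
    df["died"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

    # Patient-level severity model (stand-in for an APACHE-like score).
    X = df[["age", "worst_creatinine", "mech_ventilation"]]
    severity_model = LogisticRegression(max_iter=1000).fit(X, df["died"])
    df["expected"] = severity_model.predict_proba(X)[:, 1]

    # Risk-adjusted monitoring: observed vs expected deaths per ICU (O/E ratio).
    per_icu = df.groupby("icu").agg(observed=("died", "sum"), expected=("expected", "sum"))
    per_icu["oe_ratio"] = per_icu["observed"] / per_icu["expected"]
    print(per_icu.sort_values("oe_ratio").round(2))

In the deployed system the expected probabilities come from a validated severity model estimated on VA data; the sketch only conveys the structure of the observed-to-expected comparison used for ICU-level monitoring.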
While several of my other contributions identify significant flaws in popular existing ways of measuring health care quality, I have also worked on the development of better performance indicators, especially for chronic diseases like diabetes. We have published extensively on better ways to use clinical evidence to develop performance metrics in areas such as diabetic retinopathy screening (Vijan 2000), as well as on better ways for clinical trials to report results so that performance measures can be optimally designed (Hayward 2005). I have also contributed a body of more foundational work, in one case exploring the 2003 finding by a RAND group that less than half of recommended processes of care are actually provided to patients. We investigated the hypothesis that there is simply too much to do, and that providers are not prioritizing optimally when faced during a visit with acute problems and other competing demands for attention, suggesting a need for more explicit prioritization of performance measures (Hofer 2004).
We found that although several high-priority aspects of diabetes care were clearly identified, physicians often prioritized care in a way that was clearly inconsistent with the epidemiological literature. Practicing physicians substantially underrated recommendations for care processes based on more recent evidence. Furthermore, some areas that are stressed in performance measurement systems like HEDIS were overrated relative to their impact on outcomes as supported by the clinical trial and epidemiological literature. This illustrates the potential distortions that performance measurement systems can introduce into clinicians' prioritization of tasks, and we argued that the priority of performance monitoring measures should be explicitly considered when selecting and disseminating the measures.
In the most recent examples of this line of work, we are focusing on the extent to which existing measures have incentivized care that involves as much overtreatment as undertreatment and on how performance measures can be redesigned to ameliorate this problem (Kerr 2012, Sussman 2015), as well as on tracking management and control during the COVID-19 pandemic (Aubert 2022).
Vijan 2000, Hayward 2005, Hofer 2004, Kerr 2012, Sussman 2015, Aubert 2022