In Conversation With… Gordon Schiff, MD
Editor's note: Dr. Schiff is Associate Director of Brigham and Women's Center for Patient Safety Research and Practice, Associate Professor of Medicine at Harvard Medical School, and Quality and Safety Director for the Harvard Medical School Center for Primary Care. He was an invited expert and reviewer for the Improving Diagnosis in Health Care report of the National Academy of Medicine. We spoke with him about his work and experience with understanding and preventing diagnostic errors.
Dr. Robert M. Wachter: What got you interested in diagnostic errors?
Dr. Gordon Schiff: It started with an interest in medication safety. When I was at Cook County Health and Hospital System, I chaired the Pharmacy and Therapeutics Committee and the Drug Utilization Review Committees. It struck me that we should be applying some of the same thinking and approaches to diagnosis as we were beginning to do with medication safety. As I looked around, nobody was doing that. We put in for an AHRQ grant in the early 2000s. AHRQ funded 93 patient safety grants in that period, and only one was related to diagnosis, and that was us.
We looked at the history and physical, lab testing, assessment, and follow-up. We tried to create a taxonomy looking at the types of things that could go wrong in the diagnostic process. At the same time, an emerging group of people were interested in this, which ultimately coalesced into the Annual Diagnostic Error in Medicine International Conferences. All of this work then led to the 2015 National Academy of Medicine report, which grew out of our efforts to try to better put a spotlight on diagnostic errors, understand them, and prevent them. So it started with my interest in medication safety and realizing that as an internist I mainly did two things: I made diagnoses and prescribed drugs. And these were areas for improvement for me individually and us collectively in both realms.
RW: As you look back, why didn't diagnosis get the attention it deserved for the first 10 years of the safety movement?
GS: I think in some ways diagnostic errors are more threatening for people to talk about. Most doctors consider themselves good diagnosticians, and to think more self-critically requires a little bit more vulnerability, which is threatening to our self-image about being mostly certain and right when we make diagnoses.
Also, a lot of diagnostic errors fly under the radar. Sometimes I won't hear about a patient with back pain whose diagnosis I missed and who ended up at another hospital with a spinal epidural abscess. That is especially true of a patient you may have misdiagnosed who is angry and upset or didn't come back. Maybe you would only hear about it from a malpractice lawsuit, which is not the best method for learning. Thus, I became interested in trying to hardwire follow-up and feedback because there aren't good feedback systems currently. Moreover, many conditions are self-limited. Even though you have the wrong diagnosis, it doesn't seem to matter that much. Medical problems can also be masked by drugs, so symptoms seem better.
It's hard to get our hands on this, and there is even a problem with the word "error." What is an error? Should I have felt that spleen when the person with lymphoma came in 3 weeks ago? How do you know it was enlarged then? This whole idea about error posits some specific event, and that often is different from the nature of diagnosis, which is dynamic and tends to unfold over time. To better conceptualize this, I've tried to promote a simple Venn diagram model, included as an appendix in the 2015 report, which I find helpful when trying to explain diagnostic errors. One big circle covers things that go wrong in the diagnostic process. Another circle refers to wrong, missed, or delayed diagnosis, where again there may or may not be an error. The third overlapping circle is adverse outcomes. There are times when people did not make the correct diagnosis, but as you look at the case there doesn't seem to be anything that could have been done differently; as Hardeep Singh says, the case lacked "opportunities for improvement."
On the other hand, take the case of Linda McDougal, whose breast biopsy was switched between two patients. That's a clear diagnostic process error. She was diagnosed as having breast cancer, which she didn't have, and underwent a mastectomy. All three of those circles (diagnostic process error, wrong diagnosis, and patient harm) unfortunately came together. But every day, blood samples are switched between two patients. On the inpatient wards, somebody's sodium is too low one day because they drew the blood from the wrong site or the wrong patient. That's a process error. But fortunately, it doesn't lead to any harm or even embarking on a workup of hyponatremia. You just repeat the tests. So, all around us are diagnostic process errors, for example, test results that are misinterpreted or not followed up in a timely way. Even though there is no patient harm, these "near misses" represent processes that are pregnant with opportunities to improve and prevent future harm. You asked why this hasn't received the respect that it should have. Many of these are overlooked or felt not to be important because people think, "What's the difference? You know, we just repeated the sodium. It's fine." But it's quite important not to be switching blood specimens and pathology specimens; it can have severe adverse outcomes.
RW: What have we learned in the last 5 or 10 years about trying to measure diagnostic errors?
GS: The most important thing is for people to measure themselves. We may monitor a hospital's readmission rate and try to lower it to satisfy external requirements of an insurer or some third party… but I don't think that will work with diagnosis. Diagnosticians need to be intrinsically curious, motivated, and nondefensive about aggressively seeking feedback on their patients' diagnostic outcomes. We have been holding special diagnostic error morbidity and mortality conferences where we're trying to dig deeply to learn from each of the cases. I don't know that we have any measures that are quite there. Donabedian talked about immature quality metrics, meaning measures that are not quite ripe for prime time. Certainly we do not have good metrics for judging hospitals' or individual clinicians' rates of diagnostic errors. Thus, I'm still cautious about trying to say that we have ripe metrics here.
Probably more important than collecting (inherently flawed) rates would be finding better ways of uncovering these cases and getting people to report and share them. We've been looking at malpractice cases for lessons. We've just reviewed all the primary care malpractice cases for the last several years for the two big insurers in Massachusetts, trying to learn about pitfalls in those cases. There are roughly 100 primary care diagnostic cases a year between the two insurers. But I can't say they reveal any clear metrics that would be helpful for moving us forward.
RW: I guess that leads to a concern: if you think about what health systems have done to address other kinds of errors, will this measurement challenge get in the way of diagnostic errors getting the attention they deserve?
GS: I agree with you. One thing we've learned is that stories are very important. I'm certainly not hoping that some politician or movie star is the victim of a serious diagnostic error, but these things do get attention and can motivate change. Clinicians have an intrinsic interest in diagnosis; it isn't something that you have to convince doctors is important. We just have to develop and test ways to nurture some of that natural curiosity, desire, and intrinsic motivation to improve diagnosis.
RW: Let's shift to solutions. There has been a lot of discussion about thinking about how doctors think and learning and teaching about diagnostic biases and heuristics. What is your sense of that and how well that has worked?
GS: One thing we've tried to do is consider ways to create the infrastructure to support good cognition. The EHR [electronic health record], where clinicians are spending much of their day and clinical encounter time, is something that we have been thinking about a lot lately, mostly in terms of how, even though it should be supporting better diagnosis, it's getting in the way of good diagnosis. Records are filled with inaccurate information that is often out of date or copied and pasted. Instead of listening to the patient, you're looking at the computer and not paying attention to or thinking about the medicine and the patient's symptoms. You're just checking boxes. The EHR is not going away, so the question for us is how to turn this tool, which has great potential for supporting cognition and diagnosis, from its current state, which has so many limitations, into something that empowers us to think better.
In terms of cognitive support and visual display, think about what a profound display tool the lab flowsheet was, especially well-designed displays where you could see a patient's lab values over time and abnormal results were highlighted in red. Key results and trends became obvious. You could not easily miss that a patient had a falling hematocrit. The computer needs to support cognition both by taking away some of the distractions and by helping highlight, rather than bury, key information. Another way the computer should help is to minimize the need to rely on human memory, something we know from other industries as well as medicine is not a highly reliable practice. How can the computer help us? The computer has a way of not forgetting, for example, that somebody had a splenectomy, so when that person comes in maybe I'll be more likely to correctly diagnose pneumococcal sepsis by being reminded of that risk factor. How about helping remind us about common pitfalls for that type of patient or symptom or diagnosis? Thus, one of the things we were looking for in the malpractice cases is the pitfalls, the recurring problems that we may be seeing repeatedly in these cases.
I'll give you one prototypical example. A woman comes in with a breast lump, and the doctor confirms its presence on physical exam and appropriately orders a mammogram. The mammogram then comes back normal, and the physician tells the patient there is nothing to worry about. But we now know that if there is a palpable breast lump, further workup has to be done even with a normal mammogram. This is a known pitfall, because mammograms are not 100% sensitive, and a palpable mass needs further evaluation. How could we engineer the computer so a forcing function automatically ensures the woman goes on to have an ultrasound or a biopsy, or whatever the next step should be? Or so that it simply reminds me about the pitfall related to false negative mammograms in the face of a palpable breast lump?
We've created a broader taxonomy of how the computer could help in a New England Journal of Medicine piece and a BMJ Quality and Safety piece, in which we describe how the computer can help generate a differential diagnosis. In programs like Isabel, I can enter the history in free text and the program begins to suggest differential diagnoses. Now, this will have to be done in a smart way. If it lists hundreds of possible diagnoses that are not prioritized, then that will not be terribly helpful. The computer should be able to help me with intelligent test selection. What are the right screening tests, and the right sequence of follow-up tests if these are positive, to work up a patient for hemochromatosis or C. difficile infection? Or how do I interpret a urine toxicology screen? We are currently engaged in a study that suggests these results can be confusing and are often and easily misinterpreted. The computer should also facilitate our connections with specialists. We should be talking to the lab staff and we should be talking to radiologists, so maybe the computer could facilitate that: you push a button and there is an endocrinologist or dermatologist on your screen to answer a real-time diagnostic question.
We really have to think about supporting the infrastructure to make cognitive work in medicine easier and more reliable, and certainly less reliant on memory and less vulnerable to some of these biases. One area where the computer could help is Bayes calculations. We know that doctors do sensitivity, specificity, and revised probability calculations very poorly in their heads. Quick, prepopulated calculators (with test sensitivity and specificity or clinical prediction rule weightings) should be readily available so I can properly weigh the diagnostic probabilities. Again, the calculator cannot say a person definitely has this disease or that, but it helps us estimate the probabilities of different possible diseases much more accurately than I would with just my own biases and free-form calculations, or noncalculations, in my head.
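Editor's note: The revised probability calculation Dr. Schiff describes can be sketched in a few lines of Python. This is a minimal illustration of Bayes' rule applied to a single test result, not a tool from the interview; the function name `post_test_probability` and the numbers in the usage lines are hypothetical and chosen only to show the arithmetic.

```python
def post_test_probability(pretest, sensitivity, specificity, positive=True):
    """Revise a pretest probability of disease after one test result (Bayes' rule).

    All inputs are probabilities between 0 and 1. `positive` says whether the
    test came back positive (True) or negative (False).
    """
    for name, value in (("pretest", pretest),
                        ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    if positive:
        # P(test+ | disease) * P(disease) and P(test+ | no disease) * P(no disease)
        with_disease = sensitivity * pretest
        without_disease = (1.0 - specificity) * (1.0 - pretest)
    else:
        # P(test- | disease) * P(disease) and P(test- | no disease) * P(no disease)
        with_disease = (1.0 - sensitivity) * pretest
        without_disease = specificity * (1.0 - pretest)

    total = with_disease + without_disease
    if total == 0.0:
        raise ValueError("this result is impossible under the given inputs")
    return with_disease / total


# Illustrative numbers only (assumptions, not clinical data):
# 10% pretest probability, 90% sensitivity, 80% specificity.
print(f"{post_test_probability(0.10, 0.90, 0.80, positive=True):.3f}")   # 0.333
print(f"{post_test_probability(0.10, 0.90, 0.80, positive=False):.3f}")  # 0.014
```

With these made-up inputs, a positive result raises the probability only to about one in three, and a negative result leaves a small residual probability rather than zero, which is the kind of answer that is easy to misjudge without doing the calculation.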
RW: You've named a lot of exciting things. If you were going to design an electronic health record architecture starting from Gordy Schiff's brain, you probably would have designed it a different way than the ones you have. You mentioned Isabel as one tool. You've talked about a Bayesian calculator. You've talked about things that facilitate communication between generalists and specialists. There are all sorts of things, but they don't come out of the box from the EHR that any of us can buy.
GS: You're right, we will need to fundamentally redesign the EHR and its workflow. We are urging that this be done for the way we order medications, and likewise we are going to have to do serious redesigning of the EHR to support better, safer diagnosis. One of the things we need is a lot less clutter and a better ability to easily discern the patient's story and course. This entails improving the ability to record clinicians' assessments and to capture the differential diagnosis and degree of certainty. Such recorded "thinking out loud" assessments can facilitate communication among the health care team. I'm a fan of voice recognition, which I have been using for nearly 20 years. It is finally coming into its own in medical documentation, as a way of getting the patient's story and clinician's assessment recorded. Again, not endless amounts of text that nobody will read, but very succinct assessments that can help direct me and force me to think about it as I write this up. That will be a lot more meaningful than checking a bunch of boxes and not thinking.
And let's not forget about sharing with the patient. Increasingly with OpenNotes, patients are reading my notes (all of my patients are now able to). I welcome the opportunity to collaborate with my patients so they can say, "Dr. Schiff, you are way off course in what you wrote about me. You think this skin rash or dizziness is due to a certain drug reaction. But I need to correct your thinking and diagnosis here. It actually began weeks before I even started taking the drug." I see the EHR, with this OpenNotes capability, as a tool for engaging patients, for working together to coproduce these notes, or at least for thinking about the assessment and the diagnosis again.
We tend to treat diagnosis as a single static label. But actually it is a dynamic process that evolves over time. It includes uncertainties, calling for more language like "it's probably this," or "we're not sure what the diagnosis is and therefore we want to watch you carefully and see if your symptoms go away or get worse." These are underdeveloped tools for collaboration between the patient and physician. They need to be reconceptualized, and clinical diagnosis redesigned, in such a way that I am continuously working with my patients and colleagues to become a better diagnostician. Thus, a start would be writing my assessments together with the patient in the exam room, then letting the patient go home and further review my thinking, "assessing my assessment" and helping me track their course.
RW: Was there a diagnostic error that you made that influenced your thinking about this work?
GS: For sure there is more than one, and any experienced clinician who would tell you otherwise either is not being honest or has a blind spot related to patient outcomes. I can tell you about one that is in print. A very special patient of mine had a lung nodule. The lab called me about the chest x-ray showing an abnormal lung nodule. I immediately ordered a CT scan, but 6 months later, when I saw her in the clinic and asked whatever happened with following up on that nodule in her lung, I found the worrisome CT scan result. As she and I looked back over the record, I realized to my horror that I had followed up on the abnormal chest x-ray but not the subsequent chest CT. She ultimately died of lung cancer. The thoracic surgeon said I "shouldn't fall on my sword" because the delay didn't really change her stage. It was interesting when I dug further to understand what went wrong and why. The radiologist assumed that since I had been notified of the critical abnormal finding on the chest x-ray, I also knew about the critical abnormal finding on the CT, which of course I should have. I was trying to rely on my memory or yellow post-its, but those are unreliable methods. Certainly in this case I needed much more reliable forcing functions to make sure the result was followed up, and if I failed to do so, somebody should have reminded me again about this "unclosed loop." Perhaps the patient should have called me, as I encourage patients to do if I don't get back to them on key test results. Of course, she assumed that no news was good news. All those things conspired to produce an error that I felt very badly about. She was a very special patient and person, and even very forgiving when we disclosed and apologized for the error.
RW: That’s understandable. Could that error happen today?
GS: We have put systems in place because of this exact type of dropped ball. That's one place in which my hospital, Brigham, is a leader. Our hospital has created a closed loop system of acknowledgement, reminders, and tracking, which goes a long way toward making this sort of error much less likely to happen in the future. It can even help automatically order follow-up radiology studies. By making the system more reliable and more automated, instead of relying on me and my remembering to follow up, it prevents dropped balls in the handoffs between radiology and clinicians. But to answer your question, this error is happening every day in this country. Based on new screening recommendations for high-risk patients, we're ordering many imaging tests to screen for pulmonary nodules. Unless you have reliable follow-up systems in place, these kinds of errors will be inevitable.
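Editor's note: The idea of a closed loop of acknowledgement, reminders, and tracking can be sketched roughly as follows. This is a hypothetical illustration, not a description of the Brigham system; the `ImagingResult` record, the `open_loops` function, and the 7-day threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class ImagingResult:
    """Hypothetical record of one imaging result awaiting clinician action."""
    patient_id: str
    study: str
    resulted_on: date
    abnormal: bool
    acknowledged_on: Optional[date] = None    # clinician confirmed seeing it
    follow_up_done_on: Optional[date] = None  # next step ordered or completed


def open_loops(results: List[ImagingResult], today: date,
               max_days: int = 7) -> List[ImagingResult]:
    """Return abnormal results still unacknowledged or without follow-up
    more than `max_days` after they were reported."""
    overdue = []
    for result in results:
        if not result.abnormal:
            continue
        closed = (result.acknowledged_on is not None
                  and result.follow_up_done_on is not None)
        if not closed and today - result.resulted_on > timedelta(days=max_days):
            overdue.append(result)
    return overdue


# Usage: the x-ray loop was closed, but the later CT was never acknowledged.
results = [
    ImagingResult("pt-1", "chest x-ray", date(2024, 1, 2), True,
                  acknowledged_on=date(2024, 1, 2),
                  follow_up_done_on=date(2024, 1, 3)),
    ImagingResult("pt-1", "chest CT", date(2024, 1, 10), True),
]
for result in open_loops(results, today=date(2024, 2, 1)):
    print(f"Reminder: {result.study} for {result.patient_id} needs follow-up")
```

The point of the sketch is that the reminder is driven by the record itself rather than by anyone's memory: a result stays on the overdue list until both the acknowledgement and the follow-up are documented.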
RW: Let's finish up with the National Academy of Medicine report. You and others in this field pushed for a big national report to focus on the issue of diagnostic errors. It was a very important moment in the field. What has been the reaction to it and your sense of whether it has led to some important changes?
GS: We're beginning to see the flowering of some of the seeds that we've planted. Ironically, the item in the report that got the most attention was the least evidence-based piece, namely that everyone will experience a significant diagnostic error sometime in their lifetime. No systematic study has actually been done to provide data to support that assertion. Yet, I would say that was an underestimate. In my own personal history, I have half a dozen diagnostic errors that I talk about regularly: a missed pneumothorax, cryptogenic pneumonia that was misdiagnosed and mistreated, and misdiagnosis of salmonella food poisoning as appendicitis leading to an unnecessary operation. This idea that errors are frequent and will happen to everybody has stimulated a number of other studies. The Betsy Lehman Center and NPSF/IHI have recently done surveys of patients showing that diagnostic errors are the leading type of error that patients report. So diagnostic errors are getting a spotlight, attention, and respect that has not previously been afforded to this issue.
The report also helped energize the Society to Improve Diagnosis in Medicine, which has formed a Coalition that includes most of the major medical specialties and organizations. Each has committed itself to working on the issue in its respective specialty. We have annual conferences; the 11th is coming up this fall in New Orleans. More and more patients are getting involved. Again, I can't say it's directly a result of the NAM report, but it's all part of the spotlight that has been shone on this problem, and we very much need patients to help lead the way with their experiences and insights.
I'll add one more area we have been working on. This is the problem of overdiagnosis. I like to refer to what we need here as more conservative diagnosis. These are situations where we should be doing fewer tests and referrals. Some people might say the pendulum shouldn't swing too far in one direction or the other, whether underdiagnosis (i.e., missing diagnoses) or overdiagnosis (making too many unhelpful diagnoses). But I see these as two sides of the same coin. They're united in the need for better, more appropriate diagnosis overall. We're trying to develop principles of conservative diagnosis to underpin thinking about diagnosis in a much richer, more patient-centered way that is not just about ordering more or fewer tests. Instead, it involves better listening to patients, more meaningful continuity relationships, dealing with uncertainty, and tolerating the fact that often we don't know the answer. As we discussed earlier, patients will have to help us with follow-up to make sure that a symptom that is not explained either resolves or gets further workup.
RW: Anything else that you wanted to talk about?
GS: One thing I would just stress is how we need to better learn ways to work with uncertainty in diagnosis and use it to our advantage, making it something that supports us in feeling comfortable and being more modest about what we're saying. One of my teachers, Dr. Stuart Levin, former Chair of Medicine at Rush, said, "Diagnosis is a lot of lucky guesses and misses; the really good diagnosis is in the follow-up." That means getting better follow-up and feedback and creating the conditions that support such a culture of partnership with patients and learning. It means making it easier for patients to call me if they're concerned. Likewise, we need to support a health care team where other members of the team, such as nurses, clerks, and others, are not afraid to speak up to question a doctor's diagnosis. Those kinds of things will create a culture, foster the preconditions for better diagnosis, and allow me to practice more conservatively. Then I don't have to rule out every possible disease when I'm seeing somebody for a 2-day fever. I can watch them and trust these safety nets. Hopefully they will continue to have insurance and I, as their primary care provider, will be able to reliably follow them to best sort out their diagnoses.