ABC of Clinical Reasoning (ABC Series), 2nd Edition (Wiley-Blackwell, Dec 19, 2022)
Nicola Cooper
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Nicola Cooper and John Frain to be identified as the authors of the editorial material in this work has been asserted in accordance with law.
Registered Office(s)
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, 9600 Garsington Road, Oxford, OX4 2DQ, UK
Editorial Office
9600 Garsington Road, Oxford, OX4 2DQ, UK
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.
Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty
The contents of this work are intended to further general scientific research, understanding, and discussion only and are not intended and should not be relied upon as recommending or promoting scientific method, diagnosis, or treatment by physicians for any particular patient. In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of medicines, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each medicine, equipment, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. 
Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging-in-Publication Data
Names: Cooper, Nicola, editor. | Frain, John (John Patrick James), editor.
Title: ABC of clinical reasoning / edited by Nicola Cooper, Consultant Physician & Clinical Associate Professor in Medical Education, Medical Education Centre, University of Nottingham, UK, John Frain, General Practitioner & Clinical Associate Professor, Director of Clinical Skills, Division of Medical Sciences & Graduate Entry Medicine, University of Nottingham, UK.
Description: 2nd edition. | Hoboken, NJ, USA : John Wiley & Sons, Inc., 2023. | Includes bibliographical references and index.
Identifiers: LCCN 2022047017 (print) | LCCN 2022047018 (ebook) | ISBN 9781119871514 (paperback) | ISBN 9781119871521 (pdf) | ISBN 9781119871538 (epub)
Subjects: LCSH: Medical care--Decision making. | Medical care--Quality control. | Diagnostic errors. | Evidence-based medicine.
Set in 9.5/12pt Minion by Integra Software Services Pvt. Ltd, Pondicherry, India
Contents
Contributors
Preface (Second Edition)
1 Introduction to Clinical Reasoning (Nicola Cooper and John Frain)
2 Evidence-based History and Examination (John Frain)
3 Choosing and Interpreting Diagnostic Tests (Nicola Cooper)
4 Problem Identification and Management (Nicola Cooper and John Frain)
5 Shared Decision-making (Anna Hammond and Simon Gay)
6 Models of Clinical Reasoning (Nicola Cooper)
7 Cognitive Biases (Nicola Cooper)
8 Situativity and Human Factors (Nicola Cooper)
9 Metacognition and Cognitive Strategies (Pat Croskerry)
10 Teaching Clinical Reasoning (Nicola Cooper and Mini Singh)
Index
Contributors
Nicola Cooper
MBChB MMedSci FRCPE FRACP FAcadMEd SFHEA Consultant Physician and Clinical Associate Professor in Medical Education Medical Education Centre, University of Nottingham, UK
Pat Croskerry
MD PhD FRCP(Edin) Director, Critical Thinking Program Division of Medical Education, Dalhousie University, Canada
John Frain
MBChB MSc FRCGP DGM DCH DRCOG PGDipCard SFHEA General Practitioner and Clinical Associate Professor Director of Clinical Skills Division of Medical Sciences and Graduate Entry Medicine University of Nottingham, UK
Simon Gay
MBBS MSc MMedEd FRCGP SFHEA Professor of Medical Education (Primary Care) University of Leicester School of Medicine, UK
Anna Hammond
MBChB DRCOG FRCGP MClinEd SFHEA General Practitioner and Academic Lead for Clinical Skills and Reasoning Hull York Medical School, UK
Mini Singh
MBChB MMEd FRCP PFHEA Professor of Medical Education and Honorary Consultant Dermatologist University of Manchester, UK
Preface (Second Edition)
Excellence in clinical practice is not just about good knowledge, skills, and behaviours. As fellow author Pat Croskerry points out, how doctors think, reason, and make decisions is arguably their most critical skill. While medical schools and postgraduate training programmes teach and assess the knowledge and skills required to practice as a doctor, few currently offer comprehensive training in clinical reasoning and decision making. This matters because studies show that diagnostic error is common and results in significant harm to patients, and the majority of the root causes of diagnostic error involve errors in clinical reasoning.
Clinical reasoning is complex and takes years to learn. Most of the time it is learned implicitly and in an ad hoc fashion. In this book, we have made it explicit, broken down into its core components. This book is designed to be an introduction for individuals and also an up-to-date resource for teachers and curriculum planners. Each chapter describes a component of clinical reasoning and its applications for clinical practice, teaching, and learning. This second edition has been extensively re-written and updated, and key references and further resources have been included for readers who want to explore topics in more detail.
Clinical reasoning is relevant to every clinical specialty in every setting, and it is not confined to medical students and doctors – we have written this book with advanced clinical practitioners and other clinicians in mind as well. We hope you enjoy reading it as much as we enjoyed re-writing and editing it.
Nicola Cooper John Frain
CHAPTER 1
Introduction to Clinical Reasoning
Nicola Cooper and John Frain
OVERVIEW
• Clinical reasoning describes the application of knowledge to collect and integrate information from various sources to arrive at a diagnosis and/or management plan
• A lack of clinical reasoning ability has been shown to be a major cause of diagnostic error
• Several components of clinical reasoning have been identified
• Expertise in clinical reasoning develops as a result of different types of knowledge plus some other important factors
• Clinical reasoning can be viewed from different perspectives that each give insights into how it can be taught and learned and why it goes wrong
Introduction
Fellow author, Pat Croskerry, argues that although there are several qualities we would look for in a good clinician, the two absolute basic requirements for someone who is going to give you the best chance of being correctly diagnosed and appropriately managed are these: someone who is both knowledgeable and a good decision maker. At the time of writing, medical schools and postgraduate training programmes teach and assess the knowledge and skills required to practice as a doctor, but few offer a comprehensive curriculum in decision-making. This is a problem because how doctors think, reason, and make decisions is arguably their most critical skill [1].
This book covers the core components of clinical decision-making – or clinical reasoning. It is designed for individuals but also for teachers and learners as part of a curriculum in clinical reasoning. Chapter 10 specifically covers teaching clinical reasoning in undergraduate and postgraduate settings. In this chapter we define clinical reasoning, explain why it is important, and introduce some of the different components of clinical reasoning that are explored in this book. We will consider how expertise in clinical reasoning develops, and also look at clinical reasoning through different lenses.
Clinical reasoning describes the application of knowledge to collect and integrate information from various sources to arrive at a diagnosis and/or management plan for patients [2]. It is a complex cognitive process involving clinical skills, memory, problem-solving, and decision-making. A definition of clinical reasoning is given in Box 1.1.
As the definition in Box 1.1 states, clinical reasoning can be defined as a skill, process, or outcome and multiple components of clinical reasoning have been identified. However, for teachers and learners, it can be useful to think of clinical reasoning as a process made up of different components, each of which requires specific knowledge, skills, and behaviours. The UK Clinical Reasoning in Medical Education group has defined five broad areas of clinical reasoning education [3]:
1. History and physical examination
2. Choosing and interpreting diagnostic tests
3. Problem identification and management
4. Shared decision-making
5. Clinical reasoning concepts
Box 1.1 A definition of clinical reasoning
‘Clinical reasoning can be defined as a skill, process, or outcome wherein clinicians observe, collect, and interpret data to diagnose and treat patients. Clinical reasoning entails both conscious and unconscious cognitive operations interacting with contextual factors. Contextual factors include, but are not limited to, the patient’s unique circumstances and preferences and the characteristics of the practice environment. Multiple components of clinical reasoning can be identified: information gathering, hypothesis generation, forming a problem representation, generating a differential diagnosis, selecting a leading or working diagnosis, providing a diagnostic justification, and developing a management or treatment plan. A number of theories (e.g., script, dual process, and cognitive load theories) from diverse fields (e.g., cognitive psychology, sociology, education) inform research on clinical reasoning.’
From Daniel et al. (2019). Acad Med; 94(6): 902–12.
Figure 1.1 Five broad areas of clinical reasoning education. Clinical reasoning concepts include key theories (e.g., script, dual process), how clinical reasoning ability develops, the problem of diagnostic error, the role of clinical reasoning in safe and effective care for patients, cognitive errors, and other factors that may impair the clinical reasoning process or outcome.
In this second edition, we have used this framework and explore each of these areas (see Figure 1.1) in more detail.
Why Is Clinical Reasoning Important?
Diagnostic errors tend to occur in common diseases and are a significant cause of preventable harm to patients worldwide [4]. It has been estimated that diagnosis is wrong 10–15% of the time [5]. Post-mortem studies consistently find undiagnosed disease as the cause of death in 10–20% of patients, of which half could have been successfully treated [6]. Diagnostic error is by far the leading source of paid malpractice claims in the UK, and diagnostic error has been identified as a high-priority patient safety problem by the World Health Organization.
A lack of clinical reasoning ability has been shown to be a major cause of diagnostic errors which can result in unnecessary pain, treatments, or procedures, and increase the costs of healthcare [2]. There are many reasons why diagnostic errors occur. A comprehensive review of studies of misdiagnosis assigned three main categories, shown in Box 1.2. However, faulty synthesis of the available information was found to be the most common reason for diagnostic errors leading to death and serious harm [7]. In other words, the clinician had all the information to make the right diagnosis but made the wrong diagnosis. There is a growing consensus that medical schools and postgraduate training programmes need to do more to teach clinical reasoning in an explicit and systematic way. The National Academy of Medicine’s report Improving Diagnosis in Health Care [8] found that diagnosis and diagnostic errors have been largely unappreciated in efforts to improve the quality and safety of healthcare and called for curricula to explicitly address teaching the diagnostic process, using educational approaches that are aligned with evidence from the learning sciences.
Diagnostic error definitely causes harm, but increasing attention is also being paid to another problem which can be caused by faulty clinical reasoning – the harm caused by unnecessary tests and overdiagnosis. A study of over one million Medicare patients in the USA looked at how often people received one of 26 tests or treatments deemed by scientific and professional organisations to be of no benefit [9]. These included things like brain imaging in syncope, screening for carotid artery disease in asymptomatic patients, and imaging of the spine in low back pain with no red flags. In one year, at least 25% of patients received at least one of these tests or treatments. It has been estimated that at least 20% of healthcare spending is waste [10]. This waste has a huge impact on patients and the wider healthcare economy. Overdiagnosis occurs when people without relevant symptoms are diagnosed with a disease that ultimately will not cause them to experience symptoms or early death. There are many factors contributing to overdiagnosis (see Box 1.3), but one of them is the increasing availability of increasingly sensitive tests.
The growing recognition of the problem of diagnostic error, unnecessary tests, and overdiagnosis is why clinical reasoning is of such interest to researchers, medical educators, and policy makers. Improving clinical reasoning outcomes is a patient safety and healthcare economy priority.
Box 1.2 Categories of misdiagnosis
Error category: examples
No fault
• Unusual presentation of a disease
• Missing information
System errors
• Technical, e.g. unavailable tests/results
• Organisational, e.g. poor supervision of junior staff, error-prone processes, impossible workload
Human cognitive error
• Faulty data gathering
• Inadequate reasoning
Adapted from Graber ML et al., 2005 [7].
Box 1.3 Factors contributing to overdiagnosis
• Screening programmes that detect ‘pseudodisease’ – disease in a person without symptoms in a form that will never cause symptoms or early death
• Increasingly sensitive tests
• Greater access to scanning – diagnostic scanning of the head and body reveals incidental findings in up to 40% of those being scanned for other reasons, often leading to anxiety and further testing for an abnormality that would never have caused harm
• Widening definitions of disease and lower treatment thresholds, for example:
  • Chronic kidney disease
  • High cholesterol
  • Attention-deficit hyperactivity disorder
• Cultural considerations – medicalisation, commission bias (better to do something than nothing), fear of litigation
• Individual clinicians’ lack of understanding of statistics relevant to the disease, diagnostic test, and intervention in question
Adapted from Moynihan R. Preventing overdiagnosis: how to stop harming the healthy. BMJ 2012; 344: e3502.
Components of Clinical Reasoning
Several components of clinical reasoning have been identified. A fundamental one is the application of knowledge to gather and interpret data in the patient’s history and physical examination. The purpose is to establish the clinical probability of disease – a judgement based on the clinician’s knowledge of epidemiology and what we call evidence-based history and physical examination, a topic which we explore in Chapter 2. The clinical probability of disease is a prerequisite for choosing and interpreting diagnostic tests. Interpreting diagnostic tests is something even qualified health professionals find difficult [11]. This is because tests lie; very often, tests give us test probabilities, not real probabilities, which is why test results have to be interpreted by knowledgeable clinicians, a topic which we explore in Chapter 3.
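The relationship between the clinical probability of disease and a test result can be made concrete with a small sketch. The Python function below is illustrative only and is not taken from this book: it applies Bayes' theorem in its likelihood-ratio form, and the function name, its parameters, and the sensitivity and specificity figures are hypothetical values chosen for the arithmetic, not properties of any real test.

```python
def post_test_probability(pre_test, sensitivity, specificity, positive=True):
    """Probability of disease after a test result (Bayes' theorem, odds form).

    All arguments are proportions strictly between 0 and 1.
    `pre_test` stands for the clinical probability estimated from the
    history and examination (a hypothetical input, not real data).
    """
    for value in (pre_test, sensitivity, specificity):
        if not 0 < value < 1:
            raise ValueError("probabilities must lie strictly between 0 and 1")

    # The likelihood ratio says how much a result shifts the odds of disease.
    if positive:
        likelihood_ratio = sensitivity / (1 - specificity)
    else:
        likelihood_ratio = (1 - sensitivity) / specificity

    pre_test_odds = pre_test / (1 - pre_test)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Hypothetical test: 90% sensitive and 90% specific (assumed figures).
low = post_test_probability(0.01, 0.90, 0.90)    # positive result, 1% clinical probability
high = post_test_probability(0.50, 0.90, 0.90)   # positive result, 50% clinical probability
print(f"{low:.3f} {high:.3f}")
```

With the same assumed test, a positive result moves a 1% clinical probability to only about 8%, but a 50% clinical probability to 90%. This is one way of seeing why the clinical probability has to be estimated before a test result can be interpreted.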
Problem representation is something that is neglected in traditional ‘history–examination–differential diagnosis’ teaching methods, but studies show that being able to represent the problem before attempting to solve it (i.e., think of a diagnosis) is a key step in problem-solving, and dramatically increases diagnostic accuracy, especially in more complex cases [12]. This is a skill that can be learned, and a topic which we explore in Chapter 4.
Clinical reasoning often takes place within teams. Clinicians also make use of guidelines, scores and decision aids, and co-produce decisions with patients and carers. The important topic of shared decision-making is explored further in Chapter 5.
Simply knowing about clinical reasoning concepts does not help people reason better. But it is important for clinicians, teachers, and learners to have a shared definition, vocabulary and understanding of clinical reasoning in order to facilitate meaningful discussion and learning. Models of clinical reasoning can be useful to help us understand the processes underpinning our decision-making – as clinicians, teachers, and learners. Chapter 6 explores dual process theories which are widely accepted as a framework with which to understand diagnostic reasoning and diagnostic error. Some common misunderstandings are identified, and we explore critical thinking, rationality, the different types of knowledge used by Type 1 and Type 2 processing, and thinking about one’s own thinking (metacognition).
The topic of cognitive biases in clinical reasoning is controversial. This is partly because there are several key fallacies in the received view of dual process theories. There is definitely agreement that cognitive biases exist in medicine, but disagreement as to whether they are a significant source of diagnostic errors compared with knowledge deficits. Chapter 7 explores this topic further using a case history and analysis.
Clinical reasoning does not exist solely inside a clinician’s organised cognitive structures but is entangled in the activity of providing care for the patient [13]. Chapter 8 explores ‘situativity’ and human factors (the science of the limitations of human performance). ‘To err is human’, therefore in order to minimise errors, we need to focus on improving processes, systems, and technology, as well as education and training in cognitive strategies. Pat Croskerry explores metacognition and cognitive strategies further in Chapter 9.
Finally, we look at teaching clinical reasoning in Chapter 10. There is no evidence that teaching clinical reasoning concepts alone, or short courses, improves clinical reasoning ability. The most effective way to teach clinical reasoning is to use strategies that build knowledge and understanding, and to practice with as many different cases as possible in as many different contexts as possible with coaching and feedback. We explore key concepts in teaching clinical reasoning, specific evidence-based strategies that teachers can use, and describe one approach to introducing a clinical reasoning curriculum at undergraduate level.
How Does Expertise in Clinical Reasoning Develop?
If how clinicians think, reason, and make decisions is arguably their most critical skill, it is useful to consider how expertise in clinical reasoning develops. In the 1970s, expertise in medicine was thought to be related to superior general thinking skills. However, when researchers observed experts and novices, they found there was no difference in the processes or thinking strategies used – both quickly came up with one or more diagnostic hypotheses which guided the search for further information. Experts were more accurate because they knew more, and because the knowledge of experts varied from case to case, their performance varied from case to case as well [14]. This led to researchers changing direction and examining the role of knowledge in medical expertise.
One of the next questions for researchers was, do experts have bigger, better memories? The answer was no – given unlimited time, novices can remember as much as experts about a clinical case on paper. But experts appear to acquire information more efficiently and pay attention to more critical information (you have probably seen this in action). In a series of well-known experiments, Chase and Simon showed chess players of varying strength – from master to novice – chessboards set up as if in the middle of a game for only 5 seconds and then asked them to reconstruct the position of 28 pieces on a blank chessboard immediately afterwards. What they found was the chess masters showed a remarkable ability to reconstruct the board almost perfectly, whereas the novices could only recall the position of four or five pieces. However, when the experiment was repeated with the chess pieces arranged randomly, chess masters performed no better than anyone else [15]. Chase and Simon concluded that chess masters had stored in memory a large number of recognisable ‘chunks’, or meaningful patterns (see Box 1.4). Similar results have been found in other fields – experts can reconstruct a briefly examined scene provided it portrays a realistic (as opposed to random or meaningless) pattern. But pattern recognition by itself does not explain expertise. Non-chess players can be trained to memorise chess patterns. Experts recognise patterns of high significance because of their formal as well as experiential knowledge of chess – in other words, they study [16].
We know that knowledge is fundamental to expertise in clinical reasoning. (As we will see in Chapter 6, other things matter as well.) But by knowledge, we do not mean only facts. That is like saying the raw ingredients are the same as the cake. Figure 1.2 refers to different types of knowledge; all these types of knowledge matter in clinical reasoning.
With learning, the process of chunking and automating, as described in Box 1.4, frees up mental resources. The difference between normal learning and expert learning is what people do with those freed up resources. People who become experts reinvest their mental resources in further learning. They seek out more difficult problems. They tackle more complex representations of common problems. They continue to work at the edge of their competence [17].
Metacognitive
• Knowledge of cognition in general as well as awareness of one’s own cognition
Procedural
• How to do something, methods of inquiry, and criteria for using skills, algorithms, techniques, and methods
Conceptual
• The inter-relationships among the basic elements within a larger structure that enable them to function together
Factual
• The basic elements learners must know to be acquainted with a discipline or solve problems in it
Figure 1.2 Different types of knowledge. Adapted from: A model of learning objectives – based on A Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom’s Taxonomy of Educational Objectives, by Rex Heer, Center for Excellence in Learning and Teaching, Iowa State University. https://www.celt.iastate.edu/teaching/effective-teaching-practices/revised-blooms-taxonomy (accessed April 2022).
Box 1.4 Chunking and automating: how we can think more complex thoughts
Human working memory can only process a limited number of elements at a time. But not all elements are created equal. The elements of information stored in our long-term memory increase in complexity over time, with smaller elements combining to form larger ones. This process is called ‘chunking’ and it is what allows us to think more complex thoughts.
For example, a child who has not yet learned the alphabet will see the letter ‘H’ as three straight lines. Reproducing these three straight lines correctly could be a cognitively demanding task. Over time, they will chunk and automate drawing the letter ‘H’ until it becomes effortless. Once they have mastered individual letters, whole words will still be new to them. The word ‘H, o, u, s, e’ will consist of five separate elements. But over time, this will be chunked as the word ‘House’ in long-term memory which can then be read and written effortlessly.
Chunking and automation is a result of learning. Experts store knowledge in long-term memory as rich chunks called schemas which allows them to overcome the limitations of working memory when solving problems. (See Chapter 10 for information on how teachers can facilitate the process of schema formation.)
Adapted from Lovell O. Sweller’s cognitive load theory in action. A John Catt Publication, 2020. p. 20.
In summary, we know that expertise in clinical reasoning is highly dependent on knowledge, but that is not the whole story. We will explore this further in Chapters 6 and 10.
Clinical Reasoning through Different Lenses
At its most basic level, clinical reasoning is to do with knowledge, how knowledge is organised in long-term memory as mental representations, and the cognitive processes responsible for storing, transforming, and retrieving these. This view of clinical reasoning is important for learners, who must a) have a deep foundation of factual knowledge, b) understand facts and ideas in a conceptual framework, and c) organise their knowledge in a way that facilitates retrieval and application. (The latter two points are why we need expert teachers.)
However, clinical reasoning is also something that is ‘situated’ in the environment. Thinking and learning are context-dependent, the result of multiple dynamic interactions between individuals and the environment. Cognition is also distributed among team members and non-humans (e.g., computers). It can be adversely affected by poorly designed technology and systems, as well as sleep deprivation, fatigue, and excessive workload. Understanding clinical reasoning through this lens is important too.
But sometimes we need to be able to go beyond the ‘content’ and make sense of the patient’s illness (and our own response to it) and understand the practice of medicine in its wider socio-cultural context. This is what we sometimes call the ‘art’ of medicine, which involves crafting a wise and deliberate course of action appropriate for the circumstances and may not involve any technical decision-making at all. Clinical reasoning through this lens is often what postgraduates are able to start practicing once their mental resources are freed up from focusing on the technical aspects of medicine. Many studies demonstrate that effective clinician–patient communication and ‘whole person care’ correlate with improved health outcomes. An example of this is given in Box 1.5.
Listen to Me; I Am Telling You My Diagnosis
Whatever their level of health literacy, no patient is an empty vessel devoid of knowledge or insight about their own condition. By presenting to a clinician, patients have often engaged in their own reasoning and formulated ideas about a diagnosis or even several diagnoses. The patient’s perspective is not simply a psychosocial one but may include relevant insights into anatomical and physiological changes and pathology (see Box 1.6). This does not mean they are always correct – that is why they attend a clinician – but neither is their perspective irrelevant.
While the clinician arrives at a differential diagnosis (and/or problem list) following the history, or patient interview, the patient arrives with their own differential diagnosis (and/or problem list).
Box 1.5 The importance of whole person care
Two patients had similar symptoms. They were experiencing transient numbness of different parts of the body – one side of the face or the other, sometimes the arm or hand. These symptoms were causing a great deal of anxiety. The patients went to see two different physicians.
The first patient told his story. At the end of the consultation the physician said, ‘Well you’ve either got migraine or multiple sclerosis so we’ll do an MRI scan and I’ll let you know the results.’ He was not given a further appointment. While waiting for his MRI scan, his anxiety and symptoms increased significantly.
The second patient told her story. Recognising that these symptoms are common in stress and did not fit any neurological pattern, the physician said, ‘I see lots of people with these symptoms and very often it’s because they are working too hard, not sleeping, or under stress. Even though they might not realise they are stressed, their body is telling them they’re stressed. Tell me about your schedule and what’s going on in your life.’ The patient’s husband looked at her knowingly and sure enough there were lots of stressors related to work and home that had been an issue. An MRI scan was arranged, but the patient was advised to make changes to her lifestyle and her symptoms resolved.
Both patients had normal MRI scans. Explanation and good communication lead to better outcomes, greater compliance with recommended treatments, and fewer re-attendances.
Box 1.6 A patient-centred approach to differential diagnosis
Agreeing the differential diagnosis
Patient’s differential
• Leading hypothesis
• Alternative 1
• Alternative 2
• ‘Must-not-miss’
Doctor’s differential
• Leading hypothesis
• Alternative 1
• Alternative 2
• ‘Must-not-miss’
Where is the common understanding?
‘Let us think about this together’
Eventual diagnosis must explain both parties’ data
The history, or patient interview, is a discussion of the patient’s ideas and insights into their symptoms and diagnosis. The clinician’s role is crucial to the correct synthesis of this information, but this must be done with a complete dataset elicited from the patient.
Patients describe their own insight with differing levels of articulacy, but all patients do this. Clinical reasoning should from the outset be a collaborative and dynamic process between patient and clinician (see Box 1.7). The benefits include identifying the patient’s priorities on the way to the eventual goal of shared decision-making. Yet, how often do we hear, whether in a media story, complaint, or adverse event, that the clinician ‘wouldn’t listen to what we were saying’? This feeling among patients of dissonance between clinicians and themselves is reflected in the evidence base [18–20]. Assessing the patient’s own starting point by defining their own understanding and experience of their symptoms at the beginning of the interview is crucial to avoiding diagnostic error.
Box 1.7 The patient presents after reasoning through their symptoms
‘Thank you for speaking to me doctor. I’ve been feeling unwell for several days. My hay fever is usually bad at this time of year. I know the pollen count is high at the moment, but my usual medication is not working. My nose is blocked, and I’ve been sneezing. My chest is tight as well and I’ve been coughing more at night. My blue inhaler has helped but I’m using it more than usual and it’s not as effective. My peak flow is down. I was wondering if I need some steroids as well.
I was shielding during the first lockdown, and I know COVID is coming back again. Do I need another jab to boost my immunity?’
Patient’s problem list:
1. My hay fever is not controlled
2. My asthma is deteriorating
3. The prevalence of COVID-19 is increasing. Do I need further immunisation?
There is a lot of data to unpack here to make the diagnosis and answer the patient’s concerns.
The different lenses through which we can view clinical reasoning, whether from the clinician or the patient’s perspective, are not mutually exclusive. There are situations that call for rapid technical responses, for example, emergencies. Then there are others that call for time, wisdom, and care. Clinical reasoning is complex and takes years to learn. The challenge for educators is to provide clinical environments with multiple tasks and strategies in order to equip learners over time to be able to reason through a variety of clinical problems as effectively as possible [21].
Summary
At the time of writing, most medical schools and postgraduate training programmes provide instruction in the basic elements of the diagnostic process. However, students and trainees largely learn the knowledge, skills, and behaviours required for effective clinical reasoning implicitly and ad hoc, through experience and apprenticeship. Yet a lack of clinical reasoning ability has been shown to be a major cause of diagnostic errors resulting in significant preventable harm to patients worldwide. There is a growing consensus that medical schools and postgraduate training programmes should teach clinical reasoning in a way that is explicitly integrated into courses throughout each year of the programme, adopting a systematic approach consistent with current evidence.
For teachers and learners, it can be useful to think of clinical reasoning as a process made up of different components, each of which requires specific knowledge, skills, and behaviours. These components are what the following chapters explore in more detail. If we can start with an understanding of what clinical reasoning is, why it is important, what the key components are, and how it develops, we are in a better position to create clinicians who are good decision makers and who ultimately provide better care for patients.
References
1. Croskerry P. Clinical decision making. In: Barach P, Jacobs J, Lipshultz S, and Laussen P (Eds). Pediatric and Congenital Cardiac Care. London: Springer, 2015. https://doi.org/10.1007/978-1-4471-6566-8_33 (accessed April 2022).
2. DID-ACT Project. Developing, implementing, and disseminating an adaptive clinical reasoning curriculum for healthcare students and educators. An EU funded project, 2020. https://did-act.eu/home/project (accessed April 2022).
3. Cooper N, Bartlett M, Gay S et al. On behalf of the UK Clinical Reasoning in Medical Education (CReME) consensus statement group. (2021). Consensus statement on the content of clinical reasoning curricula in undergraduate medical education. Medical Teacher; 43(2): 152–159.
4. Tehrani ASS, Lee HW, Mathews SC et al. (2013). 25-year summary of US malpractice claims for diagnostic errors 1986-2010: an analysis from the National Practitioner Data Bank. BMJ Quality & Safety; 22(8): 672–680.
5. Graber ML. (2013). The incidence of diagnostic error in medicine. BMJ Quality & Safety; 22(Suppl 2): ii21–ii27.
6. Leape LL, Berwick DM and Bates DW. (2002). Counting deaths due to medical errors. [Letters]. JAMA; 288(19): 2405.
7. Graber ML, Franklin N and Gordon R. (2005). Diagnostic error in internal medicine. Archives of Internal Medicine; 165: 1493–1499.
8. National Academies of Sciences, Engineering, and Medicine. Improving diagnosis in health care. Washington (DC): The National Academies Press, 2015. doi:10.17226/21794 (accessed April 2022).
9. Shwarz A, Landon B, Elshaug A et al. (2014). Measuring low value care in Medicare. JAMA Internal Medicine; 174(7): 1067–1076.
10. Berwick D and Hackbarth A. (2012). Eliminating waste in US healthcare. JAMA; 307(14): 1513–1516.
11. Whiting PF, Davenport C, Jameson C et al. (2015). How well do health professionals interpret diagnostic information? A systematic review. BMJ Open; 5(7): e008155.
12. Bordage G. (1994). Elaborated knowledge: a key to successful diagnostic thinking. Academic Medicine; 69(11): 883–885.
13. Koufidis C, Manninen K, Nieminen J et al. (2022). Representation, interaction, and interpretation. Making sense of the context in clinical reasoning. Medical Education; 56(1): 98–109.
14. Elstein AS, Shulman LS and Sprafka SA. Medical problem solving. An analysis of clinical reasoning. Harvard University Press, 1978.
15. Chase WG and Simon HA. (1973). Perception in chess. Cognitive Psychology; 4: 55–81.
16. Bereiter C and Scardamalia M. Experts are different from us: they have more knowledge. In: Surpassing ourselves: an inquiry into the nature and implications of expertise. Open Court Publishing, 1993. pp. 25–42.
17. Bereiter C and Scardamalia M. Expertise as a process. In: Surpassing ourselves: an inquiry into the nature and implications of expertise. Open Court Publishing, 1993. pp. 77–120.
18. Fransen GAJ, Janssen MJR, Muris JWM et al. (2007). Measuring the severity of upper gastrointestinal complaints: does GP assessment correspond with patients’ self-assessment? Family Practice; 24: 252–258.
19. Dorval E, Rey JF, Soufflet C et al. (2011). Perspectives on gastroesophageal reflux disease in primary care: the REFLEX study of patient-physician agreement. BMC Gastroenterology; 11: 25–33.
20. Humphrys E, Walter FM, Rubin G et al. (2020). Patient symptom experience prior to a diagnosis of oesophageal or gastric cancer: a multi-methods study. BJGP Open; 4(1): bjgpopen20X101001.
21. Eva K. (2004). What every teacher needs to know about clinical reasoning. Medical Education; 39: 98–106.
Further Resource
1. Bereiter C and Scardamalia M. Surpassing ourselves: an inquiry into the nature and implications of expertise. Open Court Publishing, 1993.
CHAPTER 2
Evidence-based History and Examination
John Frain
OVERVIEW
• The history and physical examination are essential diagnostic tools
• To formulate a differential diagnosis, the history should identify the patient’s key presenting symptoms and associated symptoms
• The patient’s background and the natural history of a condition are essential contextual factors in making a correct diagnosis
• An incomplete physical examination increases the risk of diagnostic error
• If available, likelihood ratios are a measure of diagnostic accuracy which are easy to understand and apply at the bedside
• Clinical teachers should be familiar with evidence-based history and examination
Introduction
Evidence-based history and examination surveys all data from the clinical encounter, compares it to available evidence, including recognised diagnostic standards, and quickly identifies those variables with the greatest diagnostic accuracy.
Around 80% of diagnoses are made from the history alone [1]. Its importance is such that Hampton et al. recommended that more emphasis be placed on teaching students accurate history-taking and on research into communication between patients and clinicians [1]. The purpose of the history is the generation of a differential diagnosis broad enough to include the actual diagnosis but focussed enough to be tested by an appropriate physical examination, and investigations when necessary. The differential diagnosis consists of a leading hypothesis and two or three other hypotheses including any ‘must not miss’ diagnoses, all justifiable by the data gathered, and taking into account the patient’s own concerns. Generating a differential diagnosis guards against premature closure (see Chapter 9). For the learner, it aids in developing their clinical reasoning ability. Generating a differential diagnosis needs to be taught alongside the history and physical examination, not separately. This chapter will introduce the idea of evidence-based history and physical examination, an important component of clinical reasoning.
The Evidence Base
Most of our knowledge of history and physical examination predates the development of evidence-based medicine and comes from a time when patients presented later in the course of their illness, usually with more ‘classic’ features. Nowadays, especially in developed countries, clinical problems present earlier and are often less well differentiated from one another. We know that many single features in the history and physical examination are limited in their usefulness (e.g., whether someone’s chest discomfort is ‘central and crushing’ or not; or whether Kernig’s sign is present). Nonetheless, good quality evidence now exists, including systematic reviews of symptoms and diagnostic accuracy studies of physical signs. This is being assimilated into textbooks (see further resources) that provide epidemiological and qualitative data as well as diagnostic accuracy studies including sensitivities, specificities, and likelihood ratios. Where statistical evidence is lacking, authors provide diagnostic guides based on the best available epidemiological evidence. Evidence-based principles and knowledge can be reinforced for learners through reflective coursework (see Box 2.1), which can further enable reflection on clinical reasoning during workplace training.
Exploring Symptoms: Combining Process and Content
The process of history-taking cannot be separated from knowledge of epidemiology, anatomy, and physiology. Nor can it be separated from effective communication with patients [1]. Irrespective of the cause, each patient seen by a clinician is essentially saying: ‘I’ve noticed some changes in my anatomy and physiology. Can you tell me, is this normal, part of getting older or is it pathology? If it’s pathology, will I return to my previous health, will it leave me with some loss of function or could it even shorten my life?’ Clinical reasoning and decision-making are required to assess all these concerns.
Box 2.1 Reflective coursework: the presentation of chest pain in women
‘I encountered a female patient in her 50s presenting with a possible acute coronary syndrome (ACS). We had recently been discussing in clinical skills about possible differences in the presentation of acute coronary syndrome in women and men. I decided to examine the evidence for this.
I undertook a systematic search of the literature using the 6S evidence pyramid we had been taught. I identified 16 relevant studies using the SORT criteria.i There was heterogeneity of data recording and analysis across the studies. However, I was able to draw the following conclusions:
• Women are more likely to present with ACS atypically compared to men
• Men are more likely to present with chest pain than women
• Chest pain is the most common symptom presentation for both sexes
• Younger women are more likely than older women to present with typical symptoms
• No difference exists between sexes for prevalence of chest pain and/or other typical symptoms of ACS
• Women reported more associated (non-chest pain) ACS symptoms than men
Continued development as a clinician requires three things: reflective practice, unwavering curiosity, and maintaining an open mind to new evidence and ideas. Writing this essay has highlighted the importance of looking to the highest level of evidence for guidance, while maintaining a healthy level of scepticism for the recommendations by analysing the primary literature and systematic reviews behind them.’
i. Ebell MH, Siwek J, Weiss BD et al. (2004). Strength of recommendation taxonomy (SORT): a patient-centered approach to grading evidence in the medical literature. Am Fam Physician; 69(3): 548–56.
This is an excerpt from a patient-based piece of coursework undertaken by graduate entry medicine students at the University of Nottingham, UK. Students examine the evidence underlying a clinical feature encountered in practice.
Models for clinical communication have been developed, most notably the Calgary-Cambridge model [2] consisting of 70 skills which facilitate accurate history-taking (see Box 2.2). Teaching this model involves deliberate practice with detailed, specific feedback by observers. It is used either one-to-one with a tutor and patient, with patient actors, or at the bedside with real patients. An adapted form of the observation guide includes feedback on elements of clinical reasoning [3, 4].
When taking a history, key symptoms emerge within the relevant system (see Box 2.3). While symptoms may overlap different systems (e.g., chest pain could be cardiac, musculoskeletal, or respiratory in origin) or be challenging for both patient and doctor to define (e.g., dizziness), many diseases present with varying configurations of key symptoms within the relevant system. Provided the clinical setting of an individual symptom is clearly defined (e.g., nausea and vomiting in patients with suspected intestinal obstruction, or chest pain in patients with suspected myocardial infarction in the emergency department) it is possible to reason its significance and thus its usefulness as evidence of the presence of the target condition. Course content can be developed to teach students the evidence base for each symptom alongside the process of exploring symptoms with the patient (Box 2.4).
Box 2.2 The Calgary-Cambridge model: the process required for accurate history-taking
• The patient’s opening statement (usually 30–120 seconds)
• Identifying a problem list
• Agreeing an agenda for the interview
• Exploring each symptom experienced by the patient
• Patient and clinician agreeing on definition of each symptom present
• Gathering data for each symptom using open questions
• Completing details using closed questions
• Establishing a sequence of events
• Attentive listening
• Picking up cues
• Exploring the patient’s relevant background information
• Relevant systems review
• Ensuring all the patient’s concerns have been addressed
• Explanation and planning
• Shared decision-making
Adapted from Silverman J, Kurtz SM, Draper J. Skills for Communicating with Patients, 3rd edn. CRC Press, 2013.
Box 2.3 Summary of key symptoms by body system
General
• Fatigue/malaise
• Fevers/rigors/night sweats
• Weight/appetite
• Sleep disturbance
• Rashes/bruising
Cardiovascular
• Pain
• Breathlessness
• Palpitations
• Swelling
Respiratory
• Pain
• Breathlessness
• Wheeze
• Cough
• Sputum/haemoptysis
Alimentary
• Difficulty swallowing
• Nausea/vomiting/haematemesis
• Indigestion/heartburn
• Pain/distension
• Change in bowel habit
• Bleeding
Genitourinary
• Frequency
• Dysuria
• Incontinence
• Change in urinary volume
• Prostatic symptoms
• Menstrual symptoms
Musculoskeletal
• Pain
• Stiffness
• Swelling
• Loss of function or activities of daily living
Nervous system
• Headache
• Loss of consciousness
• Dizziness
• Visual disturbance
• Hearing
• Weakness
• Numbness/tingling
• Memory or personality change
• Anxiety/depression
Adapted from the Calgary-Cambridge Guide. In: Silverman J, Kurtz S and Draper J. Skills for Communicating with Patients, 3rd edn. CRC Press, 2013.
Box 2.4 Summary evidence box for cardiovascular pain
Definition
Impairment of the blood supply to the heart. Pain is precipitated by exertion and relieved by rest.
Anatomical site
Narrowing of the coronary arteries, usually by atherosclerosis but occasionally by spasm.
Pathophysiology
Primarily, smoking, high cholesterol, high blood pressure, and diabetes damage the endothelial lining of the coronary arteries.
What does the patient say?
‘I keep getting pains in my chest whenever I do things.’
‘My angina is getting worse.’
Evidence base
The table shows the European Society of Cardiology clinical classification of angina:i
Typical angina: meets all three of the following characteristics:
1. Characteristic retrosternal chest discomfort – typical quality and duration
2. Provoked by exertion and emotion
3. Relieved by rest or GTN (glyceryl trinitrate) or both
Atypical angina: meets two of the above characteristics
Non-cardiac chest pain: meets one or none of the above characteristics
Up to 95% of chest pain presents in primary care.ii Of patients presenting with chest pain at the emergency department, almost a third have serious cardiovascular disease. Only 11–44% of patients referred to the cardiology out-patients have coronary artery disease.iii
The presence of atypical symptoms such as pain worse on inspiration, stabbing pain, positional, reproducible on palpation of the chest wall and no radiation is more useful in identifying patients who do not require investigation (positive predictive value = 0.94) than are the features of typical angina in identifying patients who do require further investigation (positive predictive value = 0.21). In stable chest pain, the patient’s own description of the pain is the most important predictor of coronary heart disease, and in acute pain, the history of chest pain should be judged alongside the ECG findings.iv
Chest pain in a young person is unlikely to be coronary artery disease but you should ask about associated dizziness, syncope, and a family history of sudden death as well as screen for risk factors. Women present more often with atypical angina. They are more likely than men to have pain at rest, during sleep or precipitated by stress.
References
i. Talley NJ, O’Connor S. Clinical Examination, 7th edition. Sydney: Churchill Livingstone, 2014.
ii. Polmear A. Evidence-Based Diagnosis in Primary Care. Edinburgh: Butterworth Heinemann, 2008.
iii. Buntinx F, Knockaert D, Bruyninckx R et al. Chest pain in general practice or in the hospital emergency department: is it the same? Fam Pract (2001) 18:586–589.
iv. Chun AA, McGee S. Bedside diagnosis of coronary artery disease: a systematic review. Am J Med (2004) 117:334–343.
Adapted from Frain J. Module 3 Clinical Skills Workbook: Cardiovascular. Graduate Entry Medicine, University of Nottingham, UK.
Defining Symptoms
Patients and clinicians do not always agree on how a symptom is experienced. Learners benefit from realising and exploring this at the outset. Misunderstandings can occur more readily when symptoms appear to overlap body systems and/or when there are multiple causes. An example is dizziness, where one classification lists 33 possible causes. Students, in particular, worry that they ‘don’t know what questions to ask’. Providing knowledge and categorising the underlying causes of symptoms can enable students to explore and define the symptoms further:
Patient: ‘I feel dizzy.’
Student: ‘What do you mean by that?’
Patient: ‘I feel I’m going to pass out sometimes’ – pre-syncope
‘I feel as though I’m (or the room is) moving’ – vertigo
‘I feel as though I’m falling’ – disequilibrium (often neurological)
‘I feel light-headed all the time’ – possible anxiety or depression
Similarly, misunderstandings regarding ‘indigestion’ can be clarified by using the patient’s own definition as the starting point (see Box 2.5). In the case of upper alimentary symptoms, agreement between patient and clinician on the patient’s experience is critical to the diagnosis, including malignancy.
Associated Symptoms
Many students are proficient at obtaining a detailed description of a presenting symptom but fail to explore associated symptoms. By associated symptoms, we mean those symptoms which, though subsidiary to the patient’s main concern (e.g., pain), are nonetheless present and may contribute to the patient’s overall concern about the diagnosis. The main associated symptoms for any presenting symptom most commonly arise from the same organ system as the presenting one (see Box 2.3). A patient presenting with a cough may be concerned about the inconvenience of the cough and the impact on their ability to function. However, the diagnosis is not apparent from this symptom alone. For example, diagnosing pneumonia requires enquiry about sputum, fever, breathlessness, chest pain, and haemoptysis. Asking students to present the history at the end of the interview, list the problems, and formulate a differential diagnosis forces them to use associated symptoms in their reasoning [5] (see Box 2.6).
Interpreting Features in the History
Studies of the history have assessed which features might be more diagnostic than others for a particular condition. One study examined which features of the history in chest pain are most helpful to clinicians in differentiating cardiac from non-cardiac causes [6]. The authors found no single element of the chest pain history was a powerful enough predictor of non-cardiac pain to allow a clinician to decide on history alone, but researchers have attempted to combine features in the history which can be of use in clinical practice (see Box 2.7). Pain that is stabbing, pleuritic, positional, or reproducible by palpation has likelihood ratios near zero, meaning the likelihood of this kind of pain being cardiac is very low. Conversely, chest pain that radiates to one or both shoulders or arms or is precipitated by exertion has higher likelihood ratios (LR 2.3–4.7), meaning this kind of pain is more likely to be cardiac. Likelihood ratios are discussed in more detail later.
Box 2.5 Patient perspectives on upper alimentary tract symptoms
Under CC BY 4.0. Humphrys E, Walter FM, Rubin G et al. (2020). Patient symptom experience prior to a diagnosis of oesophageal or gastric cancer: a multi-methods study. BJGP Open; 4(1): bjgpopen20X101001. https://doi.org/10.3399/bjgpopen20X101001.
Natural History and Context
The natural history of a disease is the sequence of changes occurring within the body from the beginning of the illness until its resolution. The disease resolves to complete restoration of health, to loss of function (which may be progressive), or to the patient’s death. Familiarity with the symptoms and signs of a condition facilitates its diagnosis. The classical presentation of disease as described in textbooks is quite uncommon. Some symptoms and signs occur early in the disease while others occur later. For example, in early heart failure, the patient may become breathless only on severe exertion. Later, breathlessness occurs on more moderate exertion (e.g., climbing the stairs), while later still the patient has a problem lying flat in bed (orthopnoea) and may be breathless at rest.
If a learner interviewing a patient with breathlessness believes the symptoms of orthopnoea must be present for heart failure to be diagnosed, they will miss patients in the earlier stages of disease who may benefit from proven treatment. This combination of the patient’s symptoms and the natural history of the condition is reflected in the New York Heart Association’s functional classification of heart failure (see Box 2.8). The correlation of each class with echocardiographic findings can be used to guide evidence-based therapy.
Box 2.6 The importance of associated symptoms in formulating a reasoned diagnosis
‘Mr. Jackson had a car crash yesterday when his car was hit from behind. The pain (presenting symptom) is worse today and he has been taking paracetamol and ibuprofen. A hot water bottle has helped to relieve his symptoms, but he has been unable to use his keyboard at work. I think he has whiplash.’
Exploring the mechanism of injury, the radiation of pain and the presence of sensorimotor deficit clarifies the diagnosis: ‘Mr. Jackson was hit from the left yesterday just after he had got into his car. Unfortunately, he had not put his seatbelt on. He was thrown against the door and then rebounded back again into his passenger. He felt a sudden pain on the right side of his neck. His pain is worse this morning. Mr. Jackson describes pain radiating to his right shoulder. His arm feels heavy and weak (associated symptom). He has tingling down his arm to his right thumb (associated symptom), in the C6 dermatome. I am concerned he has a hyperextension injury and right C6 nerve root irritation.’
Adapted from Bokshan SL, DePasse JM, Eltorai AE et al. (2016). An evidence-based approach to differentiating the cause of shoulder and cervical spine pain. Am J Med; 129(9): 913–8.
Box 2.7 Differentiating cardiac from non-cardiac chest pain by history
• Low risk of cardiac pain: the 3 P’s – pain that is pleuritic or stabbing, positional, and reproduced by palpation
• Probably low risk: pain not related to exertion that occurs in a small inframammary area of the chest wall
• Probably high risk: pain described as pressure, similar to a previous heart attack/angina or accompanied by nausea, vomiting, and diaphoresis
• High risk: pain that radiates to one or both shoulders or arms or is related to exertion
When combined with the patient’s age, sex, and past medical history, it should be possible to accurately identify those at low risk of cardiac pain, although these are a minority of patients who present to hospital with chest pain. Other serious causes of the chest pain still need to be considered.
Adapted from Swap CJ and Nagurney JT. (2005). Value and limitations of chest pain history in the evaluation of patients with suspected acute coronary syndromes. JAMA; 294(20): 2623–9.
This concept can be demonstrated to students with a simple class exercise of constructing the natural history of a peritonsillar abscess (quinsy). All will have experienced a viral sore throat, many will have experienced bacterial tonsillitis, and a few suffered a quinsy. Discussion of the symptoms from onset to recovery provides the group with appreciation of the natural history of a condition and its relevance to clinical reasoning (see Box 2.9).
Similarly, students who are taught lists of features of various diseases are less accurate at diagnosis than those taught the probabilities of symptoms being present [7]. For example, nearly all adult patients with meningitis in the UK have headache and fever at presentation; the next most common symptom is nausea and vomiting (70%), but only around half complain of photophobia and a stiff neck [8]. Severe or late bacterial meningitis, on the other hand, is associated with the prototypical symptoms and signs described in textbooks. Simply teaching prototypes and ‘medical triads’ is likely to encourage superficial pattern recognition that results in overconfidence, premature closure, and incorrect diagnosis.
The patient’s context, including age and gender, should be emphasised when considering the clinical probability of a disease. Epidemiological studies provide learners with an evidence base to assist in their clinical reasoning. For example, the Interheart Study [9] found over 90% of all patients suffering their first myocardial infarctions had the risk factors of smoking, hypertension, diabetes, abdominal obesity, low physical activity, and poor diet. Box 2.10 shows the percentage pre-test probability for American
Box 2.8 New York Heart Association Functional Classification of Heart Failure – correlation of symptoms, echocardiographic findings, and therapeutic options
Class I
• Ejection fraction: 40–50%
• Use these drugs: ACE inhibitor, beta-blocker
Class II
• Ejection fraction: 30–40%
• Use these drugs: ACE inhibitor, beta-blocker, low-dose diuretic, statin
Class III
• Ejection fraction: <30%
• Use these drugs: ACE inhibitor, beta-blocker, diuretic, spironolactone, digoxin, statin, sacubitril/valsartan
Class IV
• Ejection fraction: <30%
• Use these drugs: ACE inhibitor, spironolactone, digoxin, diuretic, beta-blocker (if compensated), sacubitril/valsartan
Adapted from The Criteria Committee of the New York Heart Association. Nomenclature and criteria for diagnosis of diseases of the heart and great vessels 9th edn. Little, Brown & Co, 1994. pp. 253–256.
Box 2.9 Discuss with your colleagues a possible natural history of a peritonsillar abscess (quinsy)
Students will usually come up with the following list, as it is within their own experience:
• Feverish
• Sore throat
• Neck pain
• Pain on swallowing
• Yellow spots on the tonsils
• Neck swelling
• Unable to eat or drink/throat feels blocked
• Generally unwell
Key discussion points:
• Early disease is undifferentiated
• Not every sore throat leads to a quinsy
• Early disease can be difficult to differentiate from serious disease – hence the importance of red flags
• Signs with higher LRs are likely to occur later in the disease
• Red flag/serious symptoms can be used for safety netting/shared decision-making – explanation to the patient about the possible course of events
Box 2.10 Estimating the pre-test probability (%) of coronary heart disease
The figure on the left of each box shows the percentage probability at each age and by sex when the patient had no risk factors and the number on the right-hand side of each box shows the percentage probability of coronary heart disease in patients with the risk factors diabetes, smoking, and hyperlipidaemia.
Reproduced with permission from Henderson MC, Tierney LM, Smetana GW. The Patient History: An Evidence-based Approach to Differential Diagnosis, 2nd edn. New York: Lange/McGraw-Hill, 2012.
In summary, exploring and defining symptoms, including associated symptoms, has to be accompanied by knowledge – of the underlying causes of symptoms, what particular features, or combination of features, are helpful in making a diagnosis, the natural history of disease, probabilities (rather than lists) of symptoms associated with various diseases, and epidemiology. The process of history-taking illustrated in Box 2.2 is designed to obtain accurate data, the content of which can then be interpreted by the clinician.
The Physical Examination
While the history is the key component in formulating a diagnosis, the physical examination, used in conjunction, can double the diagnostic power of the history [10]. Inadequate physical examination, mostly failure to perform one at all, is a preventable source of medical error [11]. Despite these facts, there are well-documented deficiencies in the basic clinical skills of history-taking and physical examination [12].
In diagnosing lobar pneumonia, an evidence-based approach to physical examination asks, ‘What findings most accurately increase the probability of pneumonia?’ by studying patients with respiratory complaints and comparing their physical examinations to chest radiographs. Based on evaluation of over 6000 patients, there are six traditional physical findings that reliably predict the results of the chest radiograph (see Figure 2.1). The remaining findings, whether present or absent, add very little to the diagnosis of pneumonia. Using an evidence-based approach can trim the clinician’s focus from 15 findings of unknown value to six findings with proven value, thereby increasing confidence, efficiency, and accuracy. Clinicians applying this method can then approach their next patient with cough and dyspnoea as if they had personally examined each of the 6000 patients in these studies and then recalled the value of the physical examination gleaned from that experience.
Traditional findings
• Fever
• Tachypnea
• Tachycardia
• Reduced oxygen saturation
• Grunting respirations
• Cyanosis
• Asymmetric chest excursion
• Percussion dullness
• Diminished breath sounds
• Crackles
• Egophony
• Bronchophony
• Whispering pectoriloquy
• Bronchial breath sounds
• Pleural rub
Evidence-based approach
5 findings increase probability
• Asymmetrical chest excursion
• Egophony
• Bronchial breath sounds
• Percussion dullness
• Oxygen saturation <95%
1 finding decreases probability
• All vital signs normal
Figure 2.1 Diagnosis of lobar pneumonia. Textbooks present 15 traditional physical findings of pneumonia (left), along with the assumption that each finding has similar diagnostic weight. The evidence-based method (right), based on study of actual patients, shows that five findings accurately increase probability of pneumonia, and only one finding decreases it.
patients of different ages presenting with chest pain. The patients all had normal 12-lead electrocardiograms and diagnosis was based on history alone. Even without any risk factors, age alone substantially increased the risk of coronary heart disease. The presence of risk factors was particularly helpful in diagnosing middle-aged patients. Women lagged men in the incidence of coronary heart disease until menopause was reached.
Likelihood Ratios
When a symptom or sign is present (or absent), how do we know how useful that finding is in making a diagnosis? Likelihood ratios (LRs) are diagnostic weights. The likelihood ratio is the probability of the finding in someone with the disease divided by the probability of the finding in someone without the disease. Thus, if a finding is equally likely in people with and without the disease, the likelihood ratio is 1 (i.e., unhelpful). Each finding from the history and physical examination is associated with a unique LR, a number whose value ranges from zero to infinity. An LR greater than 1.0 increases the probability of disease, and the higher the value of the LR, the greater the increase in probability. An LR of less than 1.0 decreases the probability of disease, and the lower the value of the LR, the greater the reduction in probability (see Figures 2.2 and 2.3).
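The definition above reduces to a single division. The following is a minimal sketch only; the function name likelihood_ratio is an assumption made for illustration, and the percentages are those quoted in the Figure 2.2 examples (percussion dullness in pneumonia, dysphagia in coronary artery disease).

```python
def likelihood_ratio(p_finding_with_disease, p_finding_without_disease):
    """LR = P(finding | disease) / P(finding | no disease)."""
    if p_finding_without_disease == 0:
        return float("inf")  # finding never seen in people without the disease
    return p_finding_with_disease / p_finding_without_disease

# Percussion dullness: 18% of patients with pneumonia, 6% of those without.
lr_dullness = likelihood_ratio(0.18, 0.06)
# Dysphagia: 4% of patients with coronary disease, 20% of those without.
lr_dysphagia = likelihood_ratio(0.04, 0.20)

print(round(lr_dullness, 1))   # above 1: the finding argues for pneumonia
print(round(lr_dysphagia, 1))  # below 1: the finding argues against coronary disease
```

An LR well above 1 (percussion dullness) and an LR well below 1 (dysphagia) carry diagnostic weight in opposite directions; an LR near 1 would carry almost none.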
One simple method of interpreting LRs is to memorise the association between three LR values – 2, 5, and 10 – and the first three multiples of 15 – 15, 30, and 45. A finding with an LR of 2 increases the absolute probability by around 15% (that is, the clinician adds 15% to the pre-test probability); a finding with an LR of 5 increases the probability by around 30%, and one with an LR of 10 increases the probability by around 45%.
For those LRs less than 1.0, the clinician simply inverts the 2, 5, and 10 ‘rule’ (that is 0.5, 0.2, and 0.1). A finding with an LR of 0.5 decreases the probability by around 15%; one with an LR of 0.2 decreases the probability by around 30%, and one with an LR of 0.1 decreases the probability by around 45%. Provided clinicians round off final probabilities greater than 100% to 100%, and those less than 0% to 0%, this method suffices for the purposes of clinical reasoning.
Box 2.11 summarises the absolute changes in probability for the most used LRs. Findings with LRs greater than 3 or less than 0.3 are most helpful because these values identify findings that either increase or decrease probability by 20–25% or more.
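The bedside rule can be compared with the exact calculation. The sketch below assumes the standard odds form of Bayes' Theorem (post-test odds = pre-test odds × LR), which the text does not spell out at this point. The function names, the lookup table (limited to the six LR values quoted above), and the 40% pre-test probability are illustrative assumptions.

```python
# Bedside rule from the text: LR -> approximate absolute change in probability.
BEDSIDE_CHANGE = {2: 0.15, 5: 0.30, 10: 0.45, 0.5: -0.15, 0.2: -0.30, 0.1: -0.45}

def bedside_post_test(pre_test, lr):
    """Add the memorised change, then clamp the result to the 0-100% range."""
    return min(1.0, max(0.0, pre_test + BEDSIDE_CHANGE[lr]))

def exact_post_test(pre_test, lr):
    """Odds form of Bayes' Theorem; pre_test must be below 1."""
    pre_odds = pre_test / (1.0 - pre_test)
    post_odds = pre_odds * lr
    return post_odds / (1.0 + post_odds)

for lr in (2, 5, 10, 0.5, 0.2, 0.1):
    approx = bedside_post_test(0.40, lr)  # 40% pre-test probability (illustrative)
    exact = exact_post_test(0.40, lr)
    print(f"LR {lr}: bedside {approx:.0%}, exact {exact:.0%}")
```

Running the comparison at other pre-test probabilities shows where the shortcut drifts from the exact figure, which tends to happen towards the extremes of pre-test probability.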
Can LRs Be Combined?
LRs can be combined only if the two findings are independent of one another (independence implies the LR for the first finding is the same whether or not the second finding is present). For example, typical angina (an LR of 5.8) and hyperlipidaemia (an LR of 2.2) are likely to be independent because the accuracy of a history of typical angina is unlikely to be affected by the presence or absence of hyperlipidaemia. To combine findings, the clinician can simply multiply the two individual LRs (5.8 × 2.2); the resulting product (12.8 or a +50% probability) becomes the LR for combined ‘typical angina and hyperlipidaemia’. Alternatively, the clinician could first apply typical angina (LR of 5.8 or a +35% probability), then hyperlipidaemia (LR of 2.2 or a +15% probability) to obtain the increment in probability for the combined findings (35% + 15% or a +50% probability).
(1) Detecting pneumonia: In patients with acute respiratory complaints, “percussion dullness” is found in 18% of patients with pneumonia and in 6% of patients with another cause of respiratory distress. Therefore, the LR for percussion dullness in detecting pneumonia is 18%/6% = 3.0.
(2) Detecting coronary artery disease: In patients with chronic chest pain, “dysphagia” is reported in 4% of patients found to have coronary disease and in 20% of patients with another cause of chest pain. Therefore, the LR for dysphagia in detecting coronary artery disease is 4%/20% = 0.2.
Figure 2.2 Likelihood ratios: examples. From McGee (further resources).
Figure 2.3 Likelihood ratios: diagnostic weights. Clinicians should classify LRs into three groups: those with values greater than 1.0 increase probability; those with values less than 1.0 decrease probability; and those with values near 1.0 change probability very little or not at all.
Box 2.11 Likelihood ratios and bedside estimates
Likelihood ratio    Approximate change in probability*
*These changes describe absolute increases or decreases in probability. From McGee (2002). J Gen Intern Med; 17: 646–9.
Clinicians should not combine the LRs of more than two individual findings unless clinical studies have proven that the findings are independent. If there is any possibility that the individual findings are dependent on each other, their LRs should not be combined (for example, typical angina and ‘duration of pain < 5 minutes’ should not be combined, because pain lasting less than 10 minutes after rest or nitro-glycerine is a criterion for stable typical angina).
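Combining two independent findings can be sketched as follows. The LRs are those quoted for typical angina and hyperlipidaemia; the 30% pre-test probability and the function name are hypothetical, chosen only to show that multiplying the LRs first and applying them one after the other give the same result when the calculation is done on odds.

```python
def post_test_probability(pre_test, *lrs):
    """Apply one or more LRs to a pre-test probability via odds."""
    odds = pre_test / (1.0 - pre_test)
    for lr in lrs:
        odds *= lr  # only valid if the findings are independent
    return odds / (1.0 + odds)

LR_TYPICAL_ANGINA = 5.8
LR_HYPERLIPIDAEMIA = 2.2
combined_lr = LR_TYPICAL_ANGINA * LR_HYPERLIPIDAEMIA

pre_test = 0.30  # hypothetical pre-test probability, not from the text
one_step = post_test_probability(pre_test, combined_lr)
two_steps = post_test_probability(pre_test, LR_TYPICAL_ANGINA, LR_HYPERLIPIDAEMIA)
print(round(combined_lr, 2), round(one_step, 3), round(two_steps, 3))
```

The sketch performs no independence check: if the findings are dependent, the arithmetic still runs but the answer overstates the evidence, which is the reason for the caution above.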
The Limitations of LRs
Statistical calculations are appropriate only when the clinical problem is defined by a diagnostic (or reference) standard, such as laboratory testing or clinical imaging (Figure 2.4). Examples, and their reference standards, are pneumonia (chest radiographs), ascites (ultrasonography), coronary artery disease (coronary angiography), anaemia (full blood count), and hyperthyroidism (thyroid function tests). In each of these disorders, the evidence-based approach compares findings from the history or examination to the accepted reference standard and identifies the findings most accurately predicting the results of that standard. Since many clinical problems lack reference standards, evidence-based reasoning using LRs is not always applicable. For these problems, empiric observation based on the clinician’s prior knowledge and experience of similar patients – what the clinician sees, feels, and hears at the bedside – remains the sole diagnostic standard and LRs cannot be used.
Although LRs describe how the probability changes, they cannot determine the pre-test probability of a disease. For example, the LR for the physical finding ‘fluid wave’ in detecting ascites in patients with abdominal distension is 5.0 (a +30% probability). If the clinician works in a hepatology practice in which 60% of all patients with abdominal distension have ascites (that is, a pre-test probability of 60%), the finding of a fluid wave is diagnostic (that is, 60% + 30% or a 90% probability of ascites). On the other hand, if the clinician works in a community practice where only 20% of patients with abdominal distension have ascites (the other 80% have increased abdominal fat or gas), the presence of the fluid wave is less conclusive (20% + 30% or a 50% probability of ascites). Proper application of evidence-based medicine here requires intimate knowledge of the types of diseases found in one’s own practice.
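The fluid wave example can be checked against the exact calculation. This is an illustrative sketch: the function name is an assumption, and the exact figures come from the standard odds form of Bayes' Theorem rather than from the text, which uses the bedside "+30%" shortcut.

```python
def post_test_probability(pre_test, lr):
    """Exact post-test probability from pre-test probability and LR, via odds."""
    pre_odds = pre_test / (1.0 - pre_test)
    post_odds = pre_odds * lr
    return post_odds / (1.0 + post_odds)

LR_FLUID_WAVE = 5.0
settings = (("hepatology practice", 0.60), ("community practice", 0.20))
for setting, pre_test in settings:
    bedside = pre_test + 0.30  # the text's rule of thumb for an LR of 5
    exact = post_test_probability(pre_test, LR_FLUID_WAVE)
    print(f"{setting}: bedside {bedside:.0%}, exact {exact:.0%}")
```

The exact values (about 88% and 56%) sit close to the bedside estimates of 90% and 50%. The same finding is near-diagnostic in one setting and inconclusive in the other because the starting point differs, not because the LR has changed.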
The Future of the History and Physical Examination
Increasingly, researchers are comparing clinical findings to diagnostic standards to reveal LRs for a wide variety of clinical disorders. This is done through diagnostic accuracy studies reported according to the STARD criteria [13]. These include:
• Both the test (clinical symptom, sign, or laboratory test) and diagnostic standard are clearly defined
• All enrolled patients have symptoms suggestive of the diagnosis under study
• Determination of the test result is blinded from determination of the diagnostic standard
• The study presents enough information to allow calculation of LRs and their confidence intervals.
Clinicians applying this approach can focus on findings with greatest diagnostic accuracy. Nonetheless, this does have limitations. Even when a problem has been studied, conclusions often rest on relatively few patients. Whether diagnostic accuracy depends on clinical technique is largely unaddressed, although the few studies on this subject show diagnostic accuracy with students as observers is the same as with specialists, provided the finding is well-defined. Finally, most literature on the subject focusses on individual findings, although it is well known that expert clinicians typically combine many findings simultaneously when diagnosing disease.
Point-of-care ultrasound is increasingly being used in acute care settings as an extension of the physical examination (e.g., to estimate volume status, or differentiate fluid from consolidation in the lungs). However, the same caveats for all diagnostic tests apply (see Chapter 3) – the history and physical examination remain fundamental in establishing the clinical probability of disease and ultrasound ‘findings’ need to be interpreted in light of this. Point-of-care ultrasound has several limitations and should be seen as a decision aid pending more definitive investigations.
Developing Skills in Teaching
It is challenging for busy clinicians to be experts in clinical communication and in teaching evidence-based history and physical examination. This has contributed to a decline in bedside teaching since the 1960s. Bedside teaching is, however, both a patient- and student-centred activity. The scope for evidence-based history and examination is exciting, with potential to improve patient safety. Role-modelling of reflective practice by bedside teachers can assist learners in developing resilience and dealing with the uncertainty of clinical practice. Careful planning and engagement with patients can help develop clinical teachers. Box 2.12 lists some tips for teaching evidence-based physical examination. Seeing variations in demonstration of the physical examination is a source of discomfort for students, particularly around assessment. Reasons for variation in technique should be discussed with learners to help them manage their uncertainty and to apply these critical skills.
Box 2.12 Tips for teaching evidence-based physical examination
• Practice teaching concepts of diagnostic accuracy
• Practice estimating the pre-test probability of disease
• Practice teaching methods to estimate post-test probabilities
• Know where to find evidence-based physical examination data and prepare to use it
• Prepare an answer to the question, ‘Can the likelihood ratios of multiple findings be combined?’
• Answer the common question, ‘Why should we examine patients if it is so unhelpful?’
• Teach the basics of evidence-based physical examination and prepare students for bedside teaching
• Orientate the patient to the purpose of the teaching and explicitly discuss evidence-based physical examination
• Encourage students to commit to their own description of findings
• Encourage students to commit to a next step in management
• Facilitate deliberate practice and give feedback to learn evidence-based physical examination
• Acknowledge uncertainty and follow up on unresolved issues
Adapted from Mookherjee S, Hunt S, Chou CL. (2015). Twelve tips for teaching evidence-based physical examination. Medical Teacher; 37(6): 543–550.
Summary
Practising evidence-based history and examination is challenging but rewarding. Since the history and examination are so critical to the patient’s care, a robust evidence base is essential and merits increased research. An initial step for learners is establishing the importance of the history and examination not only for the initial formulation of the patient’s problem list and differential diagnosis, but also for the correct interpretation of any subsequent investigations. Clinical teachers should be supported in developing their own confidence and skills in teaching evidence-based history and examination.
Acknowledgements
Thanks are due to my co-author of the first edition of this chapter, Steven McGee, Emeritus Professor of Medicine, University of Washington, Seattle, USA, whose work and contribution to evidence-based history and examination continue to be an inspiration, including for this revision. Thanks also to Lucille Middleton, Graduate Entry Medicine student at the University of Nottingham, UK, for contributing Box 2.1 to this chapter.
References
1. Hampton JR, Harrison MJ, Mitchell JR et al. (1975). Relative contributions of history-taking, physical examination, and laboratory investigation to diagnosis and management of medical outpatients. British Medical Journal; 31; 2(5969): 486–489.
2. Silverman J, Kurtz SM and Draper J. Skills for communicating with patients, 3rd Ed. CRC Press, 2013.
3. Silverman J. The consultation. In: Cooper N and Frain J (Eds). ABC of Clinical Communication. Wiley-Blackwell, 2018.
4. Frain J and Abdalla M. Teaching clinical communication. In: Cooper N and Frain J (Eds). ABC of Clinical Communication. Wiley-Blackwell, 2018.
5. Kilian A, Upton LA and Sheagren JN. (2020). Reorganizing the history of present illness to improve verbal case presenting and clinical diagnostic reasoning skills of medical students: the all-inclusive history of present illness. Journal of Medical Education and Curricular Development; 7: 2382120520928996.
6. Swap CJ and Nagurney JT. (2005). Value and limitations of chest pain history in the evaluation of patients with suspected acute coronary syndromes. JAMA; 294(20): 2623–2629.
7. Elieson SW and Papa FJ. (1994). The effect of various knowledge formats on diagnostic performance. Academic Medicine; 69(10 Suppl): S81–S83.
8. Thomas KE, Hasbun R, Jekel J and Quagliarello VJ. (2002). The diagnostic accuracy of Kernig’s sign, Brudzinski’s sign, and nuchal rigidity in adults with suspected meningitis. Clinical Infectious Diseases; 35(1): 46–52.
9. Yusuf S, Hawken S, Ounpuu S et al. (2004). Effect of potentially modifiable risk factors associated with myocardial infarction in 52 countries (the INTERHEART study): case-control study. Lancet; 364: 937–952.
10. Paley L, Zornitzki T, Cohen J et al. (2011). Utility of clinical examination in the diagnosis of emergency department patients admitted to the department of medicine of an academic hospital. Archives of Internal Medicine; 171(15): 1393–1400.
11. Verghese A, Charlton B, Kassirer J et al. (2015). Inadequacies of physical examination as a cause of medical errors and adverse events: a collection of vignettes. The American Journal of Medicine; 128(12): 1322–1324.
12. Holmboe ES. (2004). Faculty and the observation of trainees’ clinical skills: problems and opportunities. Academic Medicine; 79: 16–22.
13. Cohen JF, Korevaar DA, Altman DG et al. (2016). STARD 2015 guidelines for reporting diagnostic accuracy studies: explanation and elaboration. BMJ Open; 6: e012799. doi:10.1136/bmjopen-2016-012799.
Further Resources
1. McGee S. Evidence-based physical diagnosis, 5th Ed. Elsevier/Saunders, 2021.
2. Talley N and O’Connor S. Clinical examination, 9th Ed. Elsevier, 2021.
CHAPTER 3
Choosing and Interpreting Diagnostic Tests
Nicola Cooper
OVERVIEW
• Test results are affected by a number of factors which the clinician has to take into account
• There is no such thing as a perfect test
• The interpretation of new information depends on what you believed beforehand, based on your assessment of the patient
• Predictive values combine information about sensitivity, specificity, and prevalence and indicate how likely a test result is to be correct
• Thresholds provide a useful way of thinking about whether a test should be performed at all
Introduction
The history and physical examination provide a differential diagnosis and/or problem list. This is refined further using diagnostic tests. The appropriate selection of tests depends on the quality of the history and physical examination. Test results then have to be interpreted in light of the patient’s history and examination findings because test results are affected by a number of factors (see Box 1.1):
• How ‘normal’ is defined
• Factors other than disease that influence test results
• Operating characteristics
• Sensitivity and specificity
• Prevalence of disease in a population
Unfortunately, commonly used measures of test accuracy, such as sensitivity and specificity, are poorly understood. A systematic review of 24 studies found that most qualified healthcare professionals were poor at providing definitions of sensitivity and specificity, and were poor at estimating the post-test probability of disease [1]. This chapter aims to introduce key concepts and provide further resources for this important area of clinical reasoning.
How ‘normal’ is defined
• ‘Normal’ can refer to values within the reference range for the population to which the patient belongs
• It can also refer to a value below or above a pre-determined cut-off point designed to maximise true positives and minimise false positives
• It can also be an ‘abnormal’ result that is actually normal for the particular context in question
Factors other than disease that influence test results
• These are biological and/or laboratory factors that make test results ‘abnormal’ when they are not, or vary when there has not been a true change
Operating characteristics
• This refers to the method of performing the test itself which, if not optimal, can affect its accuracy
Sensitivity and specificity (see Box 3.3)
• The sensitivity of a test refers to its ability to correctly identify patients with the disease
• The specificity of a test refers to its ability to correctly identify patients without the disease
• Sensitivity and specificity are characteristics relating to the accuracy of a test relative to a reference standard
Prevalence of disease in a population (see Box 3.5)
• The prevalence of disease in a population can significantly alter the predictive value of a test
• The positive predictive value is the proportion of people with a positive test result who truly have the disease
• The negative predictive value is the proportion of people with a negative test result who do not have the disease
How Normal Is Defined
Many diagnostic test results are expressed as continuous variables on a numerical scale and many quantitative measurements in human populations have a Gaussian (normal) distribution. The ‘normal’ range is defined as those values that encompass 95% of the healthy population, or two standard deviations from the mean. This means that 2.5% of the healthy population will have values above, and 2.5% of the population will have values below, the normal range. For this reason, it is more appropriate to use the term ‘reference range’ (see Figure 3.1). Diagnostic test results in people with a disease also have a Gaussian distribution but with a different mean and reference range. In some diseases there is no overlap between results from the abnormal and normal population, but in some diseases there is. In the latter, the greater the difference between the result and the reference range of the normal population, the higher the chance that the person has the disease.
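The 95% reference range can be sketched with Python's standard library. The analyte, its mean, and its standard deviation below are hypothetical; the sketch also uses the exact multiplier of about 1.96 standard deviations, which is usually rounded to two.

```python
from statistics import NormalDist

# Hypothetical analyte in a healthy population (illustrative values only).
healthy = NormalDist(mu=140.0, sigma=10.0)

lower = healthy.inv_cdf(0.025)  # 2.5% of healthy people fall below this value
upper = healthy.inv_cdf(0.975)  # 2.5% of healthy people fall above this value
inside = healthy.cdf(upper) - healthy.cdf(lower)

print(f"reference range: {lower:.1f} to {upper:.1f}")
print(f"healthy people inside the range: {inside:.1%}")
```

By construction, one healthy person in twenty falls outside the range, which is why a result just outside it is weak evidence of disease on its own.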
Arbitrarily dividing a range of values into ‘normal’ and ‘abnormal’ has disadvantages – it does not take into account the magnitude of the result. For example, a high-sensitivity troponin T result in a patient with chest pain is more likely to indicate myocardial injury when the value is very high, as opposed to slightly raised. Some test results have a binary classification (‘normal’ vs ‘abnormal’), for example, an exercise electrocardiogram (ECG) looking for signs of ischaemic heart disease. However, in deciding where the cut-off point between ‘normal’ and ‘abnormal’ should be, there is a trade-off between sensitivity (true positives) and specificity (true negatives). The optimum cut-off point is calculated using receiver operating characteristic (ROC) analysis, which is described in more detail later.
In medicine there are some situations when a normal result is abnormal, and an abnormal result is normal. For example, in a clinically severe asthma attack, when one expects the PaCO2 to be low, a normal PaCO2 on an arterial blood gas is not normal at all and indicates life-threatening asthma. On the other hand, a raised D-dimer is normal in pregnancy. So what is ‘normal’ and ‘abnormal’ has to be interpreted in light of the clinical picture. Clinicians who use diagnostic tests should have a good working knowledge of the tests they use in their everyday practice, and how they should be interpreted in light of the patient’s history and examination findings.
Factors Other than Disease Which Influence Test Results
There are a number of factors other than disease which influence test results. They include:
• Age
• Sex
• Ethnicity
• Pregnancy
• Body position
• Chance
• Spurious (in vitro) results
• Lab error
• Critical difference values
For example, normal values for paediatric blood results can be significantly different to those of adults. Older people often have a normal white cell count in the presence of infection, and can have a significantly reduced glomerular filtration rate with a normal creatinine. Men have slightly different reference ranges to women (e.g., for haemoglobin) and healthy black adults may have an ‘abnormal’ 12-lead electrocardiogram (due to early repolarisation) that can resemble serious disease, but is in fact a ‘normal variant’ [2].
Pregnancy significantly alters many test results due to the physiological changes that occur, particularly in the third trimester. A large foetus splints the diaphragm and compresses the lungs causing supine hypoxaemia as well as a respiratory alkalosis (important facts to remember when considering the possibility of pulmonary embolism in a pregnant woman). Circulating volume increases by 50% in late pregnancy causing a flow murmur, tachycardia, and a rightward axis on the 12-lead electrocardiogram. Kidneys also swell as a result, and renal ultrasound shows increased size and dilatation.
Body position is important in some tests, for example, lung function and tests where the patient has to lie in a certain position to get optimal images. Finally, a test result may be abnormal by chance (e.g., the patient is an outlier on the normal curve); the result may be spurious (e.g., hyperkalaemia caused by haemolysis or some haematological conditions); or may be due to lab error (e.g., as a result of a technical or human error). It is always worth pausing before acting when a very unexpected test result crops up.
Lab results also vary in the same person at different times. The critical difference, also known as the reference change value, is the smallest difference between sequential laboratory results in the same patient which is likely to indicate a true change. Let’s imagine a person has their cholesterol measured every single day. The result will not be identical every time. The reason for this is natural biological variation but also lab variation. The combination of the two is the critical difference – the amount by which the test can vary before it can be considered a true change. This is calculated using knowledge of normal intra-individual variation and lab variation for different tests. The critical difference is different for different lab tests. Some calculated critical difference values for common biochemistry results are shown in Box 3.2. For a person having their serum cholesterol monitored, the critical difference is 17%. An initial value of 5.2 mmol/L can therefore vary between 4.3 mmol/L and 6.1 mmol/L without being a true change.
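The cholesterol example can be reproduced in a few lines. This is a minimal sketch: the function names are assumptions, and the repeat values of 5.9 and 6.4 mmol/L are hypothetical results chosen to fall inside and outside the band.

```python
def critical_difference_range(value, cd_percent):
    """Band around `value` within which a repeat result is not a true change."""
    delta = value * cd_percent / 100.0
    return value - delta, value + delta

def is_true_change(first, second, cd_percent):
    """True if the second result lies outside the critical difference band."""
    low, high = critical_difference_range(first, cd_percent)
    return second < low or second > high

low, high = critical_difference_range(5.2, 17)  # cholesterol example from the text
print(f"{low:.1f} to {high:.1f} mmol/L")
print(is_true_change(5.2, 5.9, 17))  # hypothetical repeat result inside the band
print(is_true_change(5.2, 6.4, 17))  # hypothetical repeat result outside the band
```

A repeat value of 5.9 mmol/L lies within the band and so could reflect biological and laboratory variation alone, whereas 6.4 mmol/L exceeds it.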
Box 3.2 Calculated critical difference (CD) for some common biochemistry results
Operating Characteristics
Before ordering a test, it is important to be aware of certain operating characteristics of the test. This refers to the method of actually performing the test itself. For example, measuring lung function requires that the patient be able to hear, understand, and co-operate with instructions, as well as hold their breath. Exercise electrocardiograms require patients to be able to walk briskly and cannot be accurately interpreted in people who have left bundle branch block.
Some tests are highly operator dependent – in other words, the skill of the operator influences the results and the report provided. Ultrasound is the best example of this, as dynamic images have to be skilfully interpreted by the sonographer. For radiology investigations in general, the interpretation of results can be highly influenced by the patient’s body habitus or clinical state. In ultrasound, for example, morbid obesity can make getting good views difficult, and in people of all sizes, intra-abdominal organs can be obscured by bowel gas. In computed tomography, images can be severely degraded by movement artefact, or interpretation can be affected by whether or not contrast was used, and whether it circulated as anticipated to get optimal images. If a report says, ‘Limited views due to … but within these limitations, no abnormality detected’ consider whether it is in fact a non-diagnostic scan, rather than a ‘normal’ scan. It is also important that radiologists, as well as other clinicians such as physiologists, are provided with a clear clinical question and key information in the history, past medical/surgical history, and physical examination. This is so that ‘abnormalities’ or incidental findings can be interpreted in light of the clinical context.
Sensitivity and Specificity
The sensitivity of a test refers to its ability to correctly identify patients with the disease. The specificity of a test refers to its ability to correctly identify patients without the disease. Even a test with a high sensitivity, for example 95%, will miss 5% of people with the disease. Unfortunately, there is no such thing as a perfect test. Test results consist of ‘true positives’ and ‘false positives’; ‘true negatives’ and ‘false negatives’. Box 3.3 illustrates this. Tests differ in their sensitivity and specificity for detecting certain diseases, so clinicians need to have a sound working knowledge of the accuracy of the tests they use on a day-to-day basis.
Box 3.3 Sensitivity and specificity
                 Disease               No disease
Positive test    A (True positive)     B (False positive)
Negative test    C (False negative)    D (True negative)
The sensitivity of a test refers to its ability to correctly identify patients with the disease, i.e. A/(A + C) × 100.
The specificity of a test refers to its ability to correctly identify patients without the disease, i.e. D/(D + B) × 100.
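The two formulas in Box 3.3 can be written out directly. The cell counts below are hypothetical and chosen only so that the sensitivity matches the 95% example in the text; the function names are likewise illustrative.

```python
def sensitivity(a, c):
    """A / (A + C): proportion of patients with the disease who test positive."""
    return a / (a + c)

def specificity(b, d):
    """D / (D + B): proportion of patients without the disease who test negative."""
    return d / (d + b)

# Hypothetical counts for the four cells of Box 3.3 (not from the text).
A, B, C, D = 95, 40, 5, 160

print(f"sensitivity: {sensitivity(A, C):.0%}")
print(f"specificity: {specificity(B, D):.0%}")
print(f"missed cases (false negatives): {C} of {A + C}")
```

Both measures are calculated down the columns of the table, that is, among people whose disease status is already known, which is why neither depends on prevalence.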
A very sensitive test will detect most disease but generate abnormal findings in healthy people. We see this with the aptly named high sensitivity troponin T. On the other hand, a very specific test may miss significant disease but is likely to establish the diagnosis beyond doubt when the result is positive. You may have heard of the acronyms ‘SNOUT’ and ‘SPIN’. SNOUT stands for ‘sensitive test when negative rules out the disease’ and SPIN stands for ‘specific test when positive rules in the disease’. However, SNOUT and SPIN are misleading. This is because the diagnostic power of any test is determined by both its sensitivity and specificity, as well as the prevalence of disease in the population – more of that later. The trade-off between sensitivity and specificity is explored in what is termed a ‘ROC analysis’.
ROC Analysis
ROC stands for ‘receiver operating characteristic’ – so called because it was developed by radar engineers during World War II for discriminating enemy objects in the battlefield. It is also known as the ‘relative operating characteristic’ because it compares two operating characteristics (true positive results and false positive results) at various settings (see Figure 3.2). It is used in medicine to select the best cut-off point for a test in a way that maximises true positives while minimising false positives. ROC analysis is conducted in a research setting whenever investigators measure the ability of a test to detect a diagnosis in a population with the disease and exclude the diagnosis in those without it. (Of course, the results of the analysis also depend on what study population was used – if the same performance is expected in practice, the test must be used in a similar population). For example, if we define an exercise electrocardiogram as ‘abnormal’ when there is at least 0.5 mm of ST depression, we could pick up every case of ischaemic heart disease but generate many false positives. On the other hand, if we define an exercise electrocardiogram as ‘abnormal’ when there is at least 2 mm of ST depression, we could detect most cases of clinically important ischaemic heart disease but with far fewer false positives, which is far more practical.
Data from Professor Trefor Higgins, Department of Laboratory Medicine and Pathology, University of Alberta.
Figure 3.2 Receiver operating characteristic (ROC) curve. The curve is generated by adjusting the cut-off values defining ‘normal’ and ‘abnormal’, calculating the effect on sensitivity and specificity, and then plotting these against each other. The closer the curve gets to the top left-hand corner, the more useful the test is. The dotted line represents a test with no discriminant value.
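The way a ROC curve is built can be sketched with a small data set. Everything below is illustrative: the ST depression values are invented, and Youden's index (sensitivity + specificity − 1) is used as one common criterion for choosing a cut-off, which the text does not specify.

```python
# Hypothetical ST depression (mm) on exercise ECG; values are illustrative only.
with_disease = [0.6, 1.2, 1.8, 2.1, 2.4, 2.6, 3.0, 3.3]
without_disease = [0.0, 0.2, 0.4, 0.5, 0.7, 0.8, 0.9, 2.2]

def roc_points(diseased, healthy, cutoffs):
    """For each cut-off, call the test 'abnormal' when the value is >= cut-off."""
    points = []
    for cutoff in cutoffs:
        true_pos = sum(v >= cutoff for v in diseased)
        false_pos = sum(v >= cutoff for v in healthy)
        sens = true_pos / len(diseased)
        spec = 1 - false_pos / len(healthy)
        points.append((cutoff, sens, spec))
    return points

points = roc_points(with_disease, without_disease, [0.5, 1.0, 1.5, 2.0, 2.5])
for cutoff, sens, spec in points:
    print(f"cut-off {cutoff} mm: sensitivity {sens:.2f}, specificity {spec:.2f}")

# Youden's index is an assumed criterion here, not one named in the text.
best = max(points, key=lambda p: p[1] + p[2] - 1)
print("cut-off with the highest Youden index:", best[0])
```

Plotting sensitivity against 1 − specificity for each cut-off gives the curve shown in Figure 3.2. A low cut-off catches every case at the cost of many false positives, and raising it trades sensitivity for specificity.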
Conditional Probability
Conditional probability is the probability that something is true given that something else is true. Bayes’ Theorem (named after English clergyman Thomas Bayes 1702–61) is a mathematical way to describe this. It estimates the post-test probability using information about pre-test probability and the sensitivity and specificity of the test.
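The calculation can be sketched in code. The sensitivity and specificity are the troponin figures quoted in the Figure 3.3 caption (95% and 80%), as are the two prior probabilities (50% and 10%); the function name and its parameters are assumptions made for illustration.

```python
def post_test_probability(prior, sensitivity, specificity, positive=True):
    """Bayes' Theorem for a binary test result."""
    if positive:
        p_result_given_disease = sensitivity           # true positive rate
        p_result_given_no_disease = 1.0 - specificity  # false positive rate
    else:
        p_result_given_disease = 1.0 - sensitivity     # false negative rate
        p_result_given_no_disease = specificity        # true negative rate
    numerator = p_result_given_disease * prior
    denominator = numerator + p_result_given_no_disease * (1.0 - prior)
    return numerator / denominator

SENS, SPEC = 0.95, 0.80  # troponin figures quoted in the Figure 3.3 caption
for prior in (0.50, 0.10):
    after_pos = post_test_probability(prior, SENS, SPEC, positive=True)
    after_neg = post_test_probability(prior, SENS, SPEC, positive=False)
    print(f"prior {prior:.0%}: positive -> {after_pos:.0%}, negative -> {after_neg:.1%}")
```

With a prior of 50%, a positive result raises the probability to about 83% and a negative result lowers it to about 6%. With a prior of 10%, a positive result reaches only about 35%, so it is not conclusive by itself.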
Figure 3.3 illustrates Bayes’ Theorem and more detailed explanations can be found in the further resources. ‘Bayesian reasoning’ is the term sometimes used for clinical reasoning using probabilities. Test results shift our thinking, but sometimes by not very much. The probability that someone actually has a disease depends on the clinical (pre-test) probability, a judgement based on the patient’s background, history and examination findings, and the sensitivity and specificity of the test. Imagine an elderly woman has been brought to the emergency department after falling and hurting her left hip. On examination, the left hip is extremely painful to move and she cannot weight bear. Both antero-posterior and lateral X-rays of the left hip are normal (see Figure 3.4). Is there a fracture? Sox and colleagues (see further resources) state a fundamental assertion, which they describe as a profound and subtle principle of clinical medicine: the interpretation of new information depends on what you believed beforehand As a simple rule of thumb, in a high clinical probability patient, a normal test result does not necessarily exclude the disease, but in a low clinical probability patient, a normal test result does exclude the disease. Let’s go back to our elderly woman who has fallen. The sensitivity of plan X-rays of the hip performed in the emergency department for suspected hip fracture is 95%. That means 5% of fractures (or 1 in 20) are missed. In an elderly woman, likely to have osteoporosis, whose left hip is extremely painful to move and she cannot weight bear, a normal X-ray does
Figure 3.3 How a test results shift our thinking using Bayes’ Theorem. The sensitivity of a troponin test is 95% and the specificity is 80%. If we imagine a patient with chest pain and our pre-test or prior probability is 50% (i.e., we are sitting on the fence) a positive or a negative result would significantly shift our thinking about whether the patient is having a heart attack. But if our prior probability was very low (e.g., 10%) a negative test result would shift our thinking by very little and a positive test result would not by itself be conclusive (dotted line). Bayes’ Theorem is a method for interpreting evidence in the context of previous knowledge. It has wide applications and constitutes a mathematical foundation for reasoning. In clinical practice, doctors do not use algebra to work out pre- and post-test probabilities, however an understanding of the principles of Bayesian reasoning is important because the ability to accurately estimate probability is important in clinical reasoning. Bayes’ Theorem:
PRDisPDis
PDisR
PRDisPDisPRnoDisPno / / // DDis where P[Dis/R+] is the chance of having the disease given a positive test result; and P is probability, Dis is disease, and R+ is a positive test result. Figure from Brush JE. Probability: Uncertainty Quantified. In: The Science of the Art of Medicine, 2015. Reproduced with permission of Dementi Milestone Publishing.
Figure 3.4 Is there a fracture?
not necessarily exclude a fracture. But if the examination of the hip was normal and she could walk easily, a normal X-ray would be enough to satisfy the clinician that there is probably no fracture. The same test result is interpreted completely differently when the clinical (pre-test) probability changes.
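The troponin figures quoted in the caption to Figure 3.3 (sensitivity 95%, specificity 80%) can be checked with a few lines of code. The following is a minimal sketch of Bayes’ Theorem, not something clinicians would use at the bedside; the function name `post_test_probability` and its parameters are illustrative assumptions and do not come from the book.

```python
def post_test_probability(prior, sensitivity, specificity, positive=True):
    """Bayes' Theorem: probability of disease after a test result.

    All arguments are proportions between 0 and 1.
    """
    if positive:
        true_rate = sensitivity            # P[R+/Dis]
        false_rate = 1.0 - specificity     # P[R+/no Dis]
    else:
        true_rate = 1.0 - sensitivity      # P[R-/Dis]
        false_rate = specificity           # P[R-/no Dis]
    numerator = true_rate * prior
    return numerator / (numerator + false_rate * (1.0 - prior))


# Troponin test characteristics as quoted in Figure 3.3.
SENSITIVITY, SPECIFICITY = 0.95, 0.80

for prior in (0.50, 0.10):
    after_pos = post_test_probability(prior, SENSITIVITY, SPECIFICITY, positive=True)
    after_neg = post_test_probability(prior, SENSITIVITY, SPECIFICITY, positive=False)
    print(f"prior {prior:.0%}: positive -> {after_pos:.1%}, negative -> {after_neg:.1%}")
```

With a prior of 50%, a positive result raises the probability to roughly 83% and a negative result lowers it to roughly 6%. With a prior of 10%, a negative result moves the probability from 10% to under 1%, while a positive result raises it only to about 35%, which is not conclusive on its own.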
The example above illustrates that when the clinical probability and the test result are discordant, we may need to think more carefully. For example, CT pulmonary angiography (CTPA) in the diagnosis of pulmonary embolism (PE) has a specificity of 98% and a sensitivity of 94%. When patients with an intermediate or high clinical probability of PE have a positive CTPA, the result can be trusted. Likewise, when patients with a low clinical probability of PE have a negative CTPA, the result can also be trusted. But what if a high clinical probability patient has a negative CTPA, or a low clinical probability patient has a positive CTPA – what then?
One study found that around 40% of CTPA results were false in these situations [3]. This is why further imaging (e.g., V/Q SPECT) may be indicated in high clinical probability patients. It is also why formal clinical probability assessment, D-dimer testing, and CTPA that includes imaging of the lower limbs are used in combination before anticoagulation is safely withheld in patients being investigated for possible PE. There are many other examples in medicine where clinical probability really matters in accurately, and safely, interpreting a diagnostic test result.
The lesson from these examples is that tests, even good tests, can be wrong.
Tests give us test probabilities, not real probabilities. Tests have to be interpreted in light of the clinical probability and estimating clinical probability requires knowledge – formal and experiential knowledge of basic science, epidemiology, clinical skills, and clinical medicine.
Prevalence of Disease in a Population
Now let’s get more complicated! Consider this problem that was given to a group of Harvard doctors: if a test to detect a disease whose prevalence is 1:1000 has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease, assuming you know nothing about the person’s symptoms or signs? (Assume no false negatives.) Just under half replied with the answer 95%. Now look at Box 3.4 for the answer.
Sensitivity and specificity are characteristics relating to the accuracy of a test relative to a reference standard. They are an assessment of the test. But as clinicians we are interested in the question, ‘What are the chances that a person with a positive result actually has the disease?’ In other words, we want to assess people. Predictive values do just that – by combining sensitivity, specificity, and prevalence of the disease in a population to answer this question (see Box 3.5). Just considering test accuracy can be misleading when the number of ‘positives’ and ‘negatives’ in different groups varies greatly.
In predictive analytics, a confusion matrix (yes, its real name) is a 2 × 2 table that reports the number of true positives, false positives, true negatives, and false negatives using information about the prevalence of disease in the population. This allows more detailed analysis than simply observing the proportion of correct classifications (or test accuracy).
Box 3.4 What is the chance a person found to have a positive result actually has the disease?
Many doctors give an answer of 95%, but the actual answer is illustrated in the table below:
| | Disease | No disease | Total |
|---|---|---|---|
| Actual | 1 | 999 | 1000 |
| Positive test | 1 | 50 | 51 |
| Negative test | 0 | 949 | 949 |
If we sent 1000 tests to the lab, we would get 51 positive results – 1 true positive and 50 false positives. The chance that a person with a positive result actually has the disease is 1 out of 51 – or 2%. This example illustrates the importance of understanding prevalence.
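The counts in Box 3.4 can be rebuilt from the three figures given in the problem: a prevalence of 1:1000, a false positive rate of 5%, and no false negatives. The sketch below works in expected counts per 1000 people tested; the function name `natural_frequency_table` and its parameters are assumptions made for illustration.

```python
def natural_frequency_table(population, prevalence, false_positive_rate, sensitivity=1.0):
    """Return expected counts (true_pos, false_pos, false_neg, true_neg).

    sensitivity defaults to 1.0 because the problem assumes no false negatives.
    """
    diseased = population * prevalence
    healthy = population - diseased
    true_pos = diseased * sensitivity
    false_neg = diseased - true_pos
    false_pos = healthy * false_positive_rate
    true_neg = healthy - false_pos
    return true_pos, false_pos, false_neg, true_neg


tp, fp, fn, tn = natural_frequency_table(
    population=1000, prevalence=1 / 1000, false_positive_rate=0.05
)
chance = tp / (tp + fp)  # chance that a positive result is a true positive
print(f"positives: {tp + fp:.2f} (true {tp:.2f}, false {fp:.2f})")
print(f"chance a positive result is a true positive: {chance:.1%}")
```

Because the code does not round to whole people, it reports 49.95 false positives rather than the 50 shown in the table, but the conclusion is the same: about 2% of positive results are true positives, not 95%.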
Box 3.5 Predictive values
| | Disease | No disease |
|---|---|---|
| Positive test | A (True positive) | B (False positive) |
| Negative test | C (False negative) | D (True negative) |
The positive predictive value – ‘What is the chance that a person with a positive test truly has the disease?’ – is A/(A + B) × 100. The negative predictive value – ‘What is the chance that a person with a negative test does not have the disease?’ – is D/(D + C) × 100.
Positive and negative predictive values are influenced by the prevalence of the disease in the population being tested. Using a test in a population with higher prevalence increases positive predictive value (and decreases negative predictive value).
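The effect of prevalence on predictive values can be seen by filling in the four cells of Box 3.5 for the same test at different prevalences. This is a rough sketch only: the test characteristics (sensitivity 90%, specificity 90%) are hypothetical values chosen for illustration, and the helper names `predictive_values` and `cells` are assumptions rather than anything defined in the book.

```python
def predictive_values(a, b, c, d):
    """PPV and NPV (as percentages) from the Box 3.5 cells.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    ppv = a / (a + b) * 100
    npv = d / (d + c) * 100
    return ppv, npv


def cells(prevalence, sensitivity, specificity, population=1000):
    """Expected A, B, C, D counts for a population of the given size."""
    diseased = population * prevalence
    healthy = population - diseased
    a = diseased * sensitivity   # true positives
    c = diseased - a             # false negatives
    d = healthy * specificity    # true negatives
    b = healthy - d              # false positives
    return a, b, c, d


# Hypothetical test characteristics, chosen only for illustration.
SENSITIVITY, SPECIFICITY = 0.90, 0.90

for prevalence in (0.01, 0.10, 0.50, 0.90):
    ppv, npv = predictive_values(*cells(prevalence, SENSITIVITY, SPECIFICITY))
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1f}%, NPV {npv:.1f}%")
```

As prevalence rises, the printed positive predictive value climbs and the negative predictive value falls, even though the sensitivity and specificity of the test never change.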
John Brush, in his book The Science of the Art of Medicine (see further resources), uses this next example to illustrate. We know from angiography results and post-mortem studies the actual prevalence of coronary artery disease in different patient groups. Young women with non-cardiac sounding chest pain have a low prevalence of ischaemic heart disease (1%). On the other hand, older men with typical symptoms of angina have a high prevalence of ischaemic heart disease (94%). If we sent a patient from each of these groups for an imaging stress test, which has a sensitivity of 90% and a specificity of 85%, and both tests came back positive, how would we interpret the results? In other words, what is the positive predictive value of the test in these two different scenarios? Aside from the fact that we should consider whether to request this test at all in patients with such extreme pre-test probabilities, Box 3.6 shows the results we would get if we tested 100 patients just like each of them.
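The arithmetic behind this example can be sketched as follows, using only the figures quoted above (prevalence 1% and 94%, sensitivity 90%, specificity 85%, 100 patients per group). The class and function names are illustrative assumptions, and the code keeps fractional expected counts, so its cells may differ slightly from a table that rounds to whole patients.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    true_pos: float
    false_pos: float
    false_neg: float
    true_neg: float

    @property
    def ppv(self):
        """Positive predictive value as a proportion."""
        return self.true_pos / (self.true_pos + self.false_pos)


def stress_test_matrix(prevalence, n=100, sensitivity=0.90, specificity=0.85):
    """Expected confusion matrix for n patients from a group with this prevalence."""
    diseased = n * prevalence
    healthy = n - diseased
    return ConfusionMatrix(
        true_pos=diseased * sensitivity,
        false_pos=healthy * (1 - specificity),
        false_neg=diseased * (1 - sensitivity),
        true_neg=healthy * specificity,
    )


groups = {
    "young woman, non-cardiac sounding chest pain": 0.01,
    "older man, typical symptoms of angina": 0.94,
}
for label, prevalence in groups.items():
    m = stress_test_matrix(prevalence)
    print(f"{label}: TP {m.true_pos:.1f}, FP {m.false_pos:.1f}, PPV {m.ppv:.0%}")
```

Under these assumptions a positive stress test has a positive predictive value of about 6% in the low-prevalence group and about 99% in the high-prevalence group, although the test itself is identical.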
This example demonstrates the flaws in believing that a positive result on a highly sensitive test indicates the presence of a condition and that a negative result on a highly specific test indicates the absence of a condition. Prevalence matters. In deciding the clinical (pre-test) probability of disease, novices tend to focus on the patient’s history and physical examination findings. A more accurate way of
Box 3.6 Confusion matrix showing results of an imaging stress test in a) a 35-year-old woman with non-cardiac sounding chest pain and b) a 65-year-old man with typical symptoms of angina
An imaging stress test has a sensitivity of 90% and a specificity of 85%. Although both patients had some kind of chest pain and both were sent for the same test, how we interpret a positive result is completely different for each one because the prevalence of disease in the group to which the patient belongs is so different (see Box 3.5 for predictive values).
estimating pre-test probability is to first ask yourself, ‘Who is my patient?’ – in other words, the prevalence of disease in the group to which the patient belongs – then add in information from the history and physical examination findings to come up with an estimate of pre-test probability: low, intermediate, or high. Then use this estimate to choose and interpret diagnostic tests. See Box 3.7 for an example that illustrates this.
Thresholds
An important consideration in the diagnostic process is whether to do a test at all. If a test will make no difference to the probability or outcome of a disease, should the test be done? Tests (when they are selected rationally, that is) are most helpful when they change the management of a patient’s condition. It is also not necessary to know the true state of the patient before deciding whether to act. The therapeutic threshold combines factors such as test characteristics, risks of the test, the risks and benefits of treatment, as well as the potential penalty for being wrong. The point at which the factors are all evenly weighed is the threshold. If a test or treatment for a disease is effective and
Box 3.7 Estimating clinical (pre-test) probability
A 30-year-old woman complained of a constant, dull left-sided headache. On examination she was tender over her left temple. A junior doctor remembered learning about temporal arteritis and requested an erythrocyte sedimentation rate (ESR), a test for temporal arteritis. The result was abnormal. The junior doctor diagnosed temporal arteritis and started steroids.
The problem with this story is that temporal arteritis almost exclusively affects people aged 50 years or more. So even with this history, the pre-test probability of temporal arteritis is close to zero in this patient, which affects the predictive value of the test, and thus the interpretation of the result.
low risk then one would have a lower threshold for going ahead. On the other hand, if a test or treatment is less effective or high risk, one requires greater confidence in the diagnosis and potential benefits of treatment first.
Summary
Tests do not make a diagnosis, clinicians do. Tests give us test probabilities, not real probabilities. A working knowledge of factors other than disease that influence test results, operating characteristics, and how accurate the test is for the disease in question is important. Assessing clinical (pre-test) probability is vital: without it, you cannot interpret any test result. Pre-test probability is derived from knowledge of the prevalence of the disease in the group to which the patient belongs and information from the individual’s history and physical examination findings. Positive predictive values and negative predictive values are the proportion of people with a positive (or negative) test result who have (or do not have) a disease. They can be thought of as the post-test probability of a disease. Finally, thresholds provide a useful way of thinking about whether a test should be performed at all.
References
1. Whiting PF, Davenport C, Jameson C et al. (2015). How well do health professionals interpret diagnostic information? A systematic review. BMJ Open; 5: e008155 (accessed April 2022).
2. Walsh B, Macfarlane PW, Prutkin JM and Smith SW. (2019). Distinctive ECG patterns in healthy black adults. Journal of Electrocardiology; 56: 15–23.
3. Stein PD, Fowler SE, Goodman LR et al. (2006). Multidetector computed tomography for acute pulmonary embolism. The New England Journal of Medicine; 354: 2317–2327.
Further Resources
1. Sox HC, Higgins MC and Owens DK. Medical decision making, 2nd Ed. Oxford: Wiley-Blackwell, 2013.
2. Brush JE. The Science of the Art of Medicine. Dementi Milestone Publishing, 2015.
3. Stone JV. Bayes’ Rule. A tutorial introduction to Bayesian analysis. Sebtel Press, 2013.