Methodological considerations in construct development
Michele Jonsson Funk
33rd ICPE, August 29th, 2017
Palais des Congrès, Montreal, Canada
Disclosures
• No funding specifically for this work.
• No relevant conflicts of interest for this presentation.
(c) 2017 Michele Jonsson Funk
Tradeoffs inherent to classification
[Figure: distributions of Outcome Score (x-axis) against Number of individuals (y-axis) for the NonDiseased and Diseased groups. Annotations: False Negatives (<100% Sensitivity) and False Positives (<100% Specificity).]

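The trade-off in the figure can be sketched numerically. The scores below are hypothetical values chosen for illustration, not data from the presentation, and `se_sp` is an illustrative helper name. Raising the cutoff trades sensitivity for specificity.

```python
# Hypothetical outcome scores; not data from the presentation.
diseased = [55, 62, 70, 74, 81, 88, 93, 101]
non_diseased = [22, 30, 35, 41, 48, 52, 58, 66, 72, 79]


def se_sp(cutoff, diseased_scores, non_diseased_scores):
    """Classify score >= cutoff as diseased; return (sensitivity, specificity)."""
    true_pos = sum(s >= cutoff for s in diseased_scores)
    true_neg = sum(s < cutoff for s in non_diseased_scores)
    return true_pos / len(diseased_scores), true_neg / len(non_diseased_scores)


# A low cutoff misses few cases but admits many false positives;
# a high cutoff does the reverse.
for cutoff in (40, 60, 80):
    se, sp = se_sp(cutoff, diseased, non_diseased)
    print(f"cutoff={cutoff}: Se={se:.2f}, Sp={sp:.2f}")
```
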
ROC interpretation
[Figure: score distribution plot, with the same axis scales as on the previous slide]
S. Yu et al. Journal of Biomedical Informatics 2014

Roles and bias due to misclassification
Bias due to misclassified outcome
• Toward the null if both are true:
  – Non-differential with respect to exposure
  – Outcome is binary
• Unbiased in special cases
  1. Perfect Sp + non-differential Se <100% → unbiased Relative Risk
     • If outcome is rare in all exposure groups, OR unbiased
     • If person-time is only impacted minimally, rate ratio unbiased
  2. Perfect Se + non-differential Sp <100% → unbiased Risk Difference

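A minimal sketch of special case 1, assuming hypothetical true risks and a hypothetical Se of 0.60 (none of these numbers come from the slides). With perfect Sp and the same Se in both exposure groups, every observed risk is the true risk multiplied by Se, so the ratio is preserved while the difference shrinks by the factor Se.

```python
def observed_risk(true_risk, se, sp):
    """Expected proportion classified as having the outcome."""
    return se * true_risk + (1 - sp) * (1 - true_risk)


# Hypothetical true risks (assumed for illustration): RR = 2.0, RD = 0.10.
risk_exposed, risk_unexposed = 0.20, 0.10
se, sp = 0.60, 1.00  # same Se in both exposure groups, perfect Sp

obs_exposed = observed_risk(risk_exposed, se, sp)
obs_unexposed = observed_risk(risk_unexposed, se, sp)

rr_true = risk_exposed / risk_unexposed
rr_obs = obs_exposed / obs_unexposed    # ratio is unchanged
rd_true = risk_exposed - risk_unexposed
rd_obs = obs_exposed - obs_unexposed    # difference is Se * true RD

print(f"RR true={rr_true:.2f} observed={rr_obs:.2f}")
print(f"RD true={rd_true:.3f} observed={rd_obs:.3f}")
```
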
Effect of interest
• Rate / risk in population of interest
  – Biased upward by poor Sp
  – Biased downward by poor Se
• Relative effect of exposure (RR)
  – Most algorithms maximize specificity because of the special case (perfect Sp with non-differential Se gives an unbiased RR)
• Absolute effect of exposure (RD)
  – Biased to the degree that the algorithm is not sensitive
  – Valuable for decision making (individual, population)

Bias due to misclassified exposure
• Toward the null if all are true:
  – Non-differential misclassification with respect to outcome
  – Error unrelated to other variables
  – No other sources of systematic error (selection bias, uncontrolled confounding)
  – Exposure is binary

Exposure misclassification in CER
• A1 vs A0: Sn 0.85, Sp 0.95; true RR = 2.00, observed RR = 1.72
• B1 vs B0: Sn 0.90, Sp 0.98; true RR = 2.00, observed RR = 1.47
• A1 vs B1: true RR = 1.0, observed RR = 1.2
Jonsson Funk M, Landi SN. Curr Epidemiol Rep 2014.

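A rough sketch of how non-differential exposure misclassification attenuates a relative risk. Sn and Sp are taken from the A1 vs A0 comparison; the exposure prevalence and baseline risk are hypothetical, and `observed_rr` is an illustrative name. The slide does not state the prevalence behind its figures, so this is not expected to reproduce the observed RRs shown there.

```python
def observed_rr(true_rr, sn, sp, prevalence, baseline_risk=0.05):
    """Expected RR after non-differential misclassification of a binary exposure."""
    risk = {1: baseline_risk * true_rr, 0: baseline_risk}
    share = {1: prevalence, 0: 1 - prevalence}
    p_classified_exposed = {1: sn, 0: 1 - sp}

    n = {1: 0.0, 0: 0.0}      # population share by classified exposure
    cases = {1: 0.0, 0: 0.0}  # expected cases by classified exposure
    for true_x in (0, 1):
        for classified_x in (0, 1):
            p = p_classified_exposed[true_x]
            mass = share[true_x] * (p if classified_x else 1 - p)
            n[classified_x] += mass
            cases[classified_x] += mass * risk[true_x]
    return (cases[1] / n[1]) / (cases[0] / n[0])


ASSUMED_PREVALENCE = 0.20  # hypothetical; not given on the slide
rr_obs = observed_rr(true_rr=2.0, sn=0.85, sp=0.95, prevalence=ASSUMED_PREVALENCE)
print(f"observed RR = {rr_obs:.2f} (true RR = 2.00)")
```

Changing `ASSUMED_PREVALENCE` changes the amount of attenuation, which is one reason the same Sn and Sp can produce different bias in different comparisons.
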
Impact of misclassified inclusion/exclusion criteria
• Inclusion
  – False negatives: smaller n
  – False positives: individuals included who are not indicated for treatment
  – Generalizability: population for inference is sicker, more medicalized
• Exclusion
  – False positives: smaller n
  – False negatives:
    • Patients ineligible for treatment (e.g. CKD)
    • Patients with prior events (e.g. cancer) which will appear to be incident (impacts outcome classification)
• If differential by exposure or outcome, leads to bias

Bias due to misclassified confounder
• Greenland (1980): "partial control" finding
  – Adjustment for a misclassified confounder still yields a less biased estimate than not adjusting
[Figure: two panels, "Good classification" and "Bad classification", each showing the positions of the Truth, Adjusted, and Crude estimates.]
Slide courtesy of Mitchell Conover

Bias due to misclassified confounder
• Toward the crude (aka partial control) if all are true:
  – Non-differential with respect to exposure and outcome
  – No qualitative interaction between exposure and confounder
  – Independent of errors for study variables
  – Binary confounder: effects on X are monotonic within Y; effects on Y are monotonic within X
  – Continuous covariate: misclassification independent of true value
  – No other sources of error
• Bias can be either toward or away from the null

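Partial control can be illustrated with expected cell proportions. Every input below is hypothetical, and the Mantel-Haenszel risk ratio is used here only as a convenient summary estimator; the slides do not specify one. Under these assumptions, the estimate adjusted for the misclassified confounder falls between the fully adjusted and the crude estimates.

```python
from itertools import product

# All inputs are hypothetical, chosen only for illustration.
P_C = 0.30                               # prevalence of the true confounder C
P_X_GIVEN_C = {0: 0.20, 1: 0.60}         # P(X=1 | C)
BASE_RISK, RR_X, RR_C = 0.05, 2.0, 3.0   # risk = BASE_RISK * RR_X**x * RR_C**c
SE, SP = 0.70, 0.90                      # non-differential classification of C


def mh_risk_ratio(strata):
    """Mantel-Haenszel risk ratio over strata of expected cell masses."""
    num = den = 0.0
    for cell in strata.values():
        n1, n0 = cell[(1, "n")], cell[(0, "n")]
        a, b = cell[(1, "cases")], cell[(0, "cases")]
        total = n1 + n0
        num += a * n0 / total
        den += b * n1 / total
    return num / den


def build(stratify_on):
    """Expected cell masses stratified on nothing, true C, or measured C*."""
    strata = {}
    for c, x, cstar in product((0, 1), repeat=3):
        p = P_C if c else 1 - P_C
        p *= P_X_GIVEN_C[c] if x else 1 - P_X_GIVEN_C[c]
        p_star = SE if c else 1 - SP          # P(C*=1 | C)
        p *= p_star if cstar else 1 - p_star
        risk = BASE_RISK * RR_X**x * RR_C**c
        key = {"none": 0, "true": c, "measured": cstar}[stratify_on]
        cell = strata.setdefault(
            key, {(1, "n"): 0.0, (0, "n"): 0.0, (1, "cases"): 0.0, (0, "cases"): 0.0}
        )
        cell[(x, "n")] += p
        cell[(x, "cases")] += p * risk
    return strata


crude = mh_risk_ratio(build("none"))
adjusted_true = mh_risk_ratio(build("true"))
adjusted_measured = mh_risk_ratio(build("measured"))
print(f"truth={adjusted_true:.2f} adjusted for C*={adjusted_measured:.2f} crude={crude:.2f}")
```
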
[Figure: diagram of misclassification types. Labels: likelihood of misclassification is related to true value (DEPEND vs INDEPEND); NON-DIFF vs DIFF; effects within strata; qualitative modification.]
Slide courtesy of Mitchell Conover

Assessing algorithm performance
• Gold standard vs alloyed
• Study specific vs general purpose
  – Performance within key groups
  – Exposure × outcome; covariate × exposure; covariate × outcome
• Over time
  – Availability of informative unstructured data sources (e.g. newly integrated radiology reports)
  – Changes in billing that influence practice (e.g. requirements for certain types of documentation)
• Subgroups of interest
  – Sex, race/ethnicity, age, insurance status, comorbid conditions

Use of BMI to identify obesity
• https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5500540/
Shah NR, Braverman ER (2012) Measuring Adiposity in Patients: The Utility of Body Mass Index (BMI), Percent Body Fat, and Leptin. PLoS ONE 7(4): e33308. doi:10.1371/journal.pone.0033308

Subgroup differences: BMI >24
• Sensitivity
  – M: 98%
  – F: 79%
• Specificity
  – M: 35%
  – F: 87%
Shah NR, Braverman ER (2012) PLoS One.

Subgroup differences: BMI >26
• Sensitivity
  – M: 96%
  – F: 62%
• Specificity
  – M: 63%
  – F: 90%
Shah NR, Braverman ER (2012) PLoS One.

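One way to see what these subgroup differences imply is to convert Se and Sp into predictive values. The sketch below uses the BMI >24 figures from the slide; the obesity prevalence is a hypothetical value applied to both sexes, so the printed numbers are illustrative only.

```python
def predictive_values(se, sp, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence."""
    true_pos = se * prevalence
    false_pos = (1 - sp) * (1 - prevalence)
    true_neg = sp * (1 - prevalence)
    false_neg = (1 - se) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Se and Sp for BMI >24 from the slide; the prevalence is a hypothetical value.
ASSUMED_PREVALENCE = 0.50
performance = {"M": (0.98, 0.35), "F": (0.79, 0.87)}

for sex, (se, sp) in performance.items():
    ppv, npv = predictive_values(se, sp, ASSUMED_PREVALENCE)
    print(f"{sex}: PPV={ppv:.2f}, NPV={npv:.2f}")
```
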
Major caveat on expected bias
• Assumptions about direction of bias are true on average
• Not necessarily in any single study
• Plan on conducting quantitative bias analysis

Quantitative bias analysis
• Simple bias analysis
• Probabilistic bias analysis (PBA)
• Bayesian bias analysis
• Multiple imputation for measurement error (MIME)
• Propensity score calibration

PBA of construct measured with error
• Iterative form of simple bias analysis
• Adjusts treatment effect point estimate & CI
• Simultaneously adjusts for other measured confounders
• Moves individuals between levels of one construct while holding others constant
• Requires estimates and distribution for:
  – Sensitivity and specificity OR
  – Positive and negative predictive values

Applying validation results
• Near perfect specificity.
  – All classified as obese will stay in this category
• Poor sensitivity.
  – Move some individuals classified as non-obese into obese category

Obese (C*)
         X=1          X=0
Y=1      38 (A1)      20 (B1)
Y=0      461 (C1)     285 (D1)
Total    499 (M1)     305 (N1)

Non-obese (C*)
         X=1          X=0
Y=1      196 (A0)     114 (B0)
Y=0      4287 (C0)    3772 (D0)
Total    4483 (M0)    3886 (N0)

Camelo Castillo 2015 JAMA Peds

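The reclassification step can be sketched as a simple bias analysis on the counts above. The single Se value of 0.30 is a hypothetical choice, `reclassify` is an illustrative name, and the calculation ignores the other measured confounders, so it is not the analysis behind the published estimates.

```python
# Observed counts from the slide, keyed by (x, y).
obese_obs = {(1, 1): 38, (0, 1): 20, (1, 0): 461, (0, 0): 285}
non_obese_obs = {(1, 1): 196, (0, 1): 114, (1, 0): 4287, (0, 0): 3772}


def reclassify(obese, non_obese, se, sp=1.0):
    """Expected true obese / non-obese counts in each X-Y cell."""
    true_obese, true_non_obese = {}, {}
    for cell, n_obese in obese.items():
        total = n_obese + non_obese[cell]
        # Solve n_obese = se * T + (1 - sp) * (total - T) for T.
        t = (n_obese - (1 - sp) * total) / (se + sp - 1)
        if not 0 <= t <= total:
            raise ValueError(f"Se={se}, Sp={sp} imply an impossible count in cell {cell}")
        true_obese[cell], true_non_obese[cell] = t, total - t
    return true_obese, true_non_obese


def odds_ratio(table):
    """X-Y odds ratio within one obesity stratum."""
    return table[(1, 1)] * table[(0, 0)] / (table[(0, 1)] * table[(1, 0)])


ASSUMED_SE = 0.30  # hypothetical single value, with perfect specificity
true_obese, true_non_obese = reclassify(obese_obs, non_obese_obs, ASSUMED_SE)
moved = {cell: true_obese[cell] - obese_obs[cell] for cell in obese_obs}

for cell in obese_obs:
    print(f"(x, y)={cell}: moved {moved[cell]:.0f} from non-obese to obese")
print(f"non-obese stratum OR: {odds_ratio(non_obese_obs):.2f} -> {odds_ratio(true_non_obese):.2f}")
```

With perfect Sp and the same Se in every cell, the obese-stratum OR is unchanged and only the non-obese stratum shifts. Very low Se values (for example 0.11) imply more obese individuals than a cell contains, which the function reports as an error.
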
Quantitative bias analysis (QBA) results

              Sensitivity                     Specificity                     X-Y assn OR
              Min   Mode 1  Mode 2  Max       Min   Mode 1  Mode 2  Max       (2.5th, 97.5th)
Naïve         1.0   1.0     1.0     1.0       1.0   1.0     1.0     1.0       1.46 (1.17, 1.81)
QBA 1         .11   .15     .32     .35       1.0   1.0     1.0     1.0       1.45 (1.16, 1.78)
QBA 2         .11   .15     .32     .35       .90   .95     .99     1.0       1.45 (1.19, 1.84)
QBA 3: Y=1    .5    .6      .8      .9        1.0   1.0     1.0     1.0       1.50 (1.24, 1.81)
       Y=0    .11   .15     .32     .35       1.0   1.0     1.0     1.0

Naïve: Assumes perfect Se and Sp (no misclassified obesity)
QBA1: Trapezoidal distribution for Se, non-differential misclassification
QBA2: Trapezoidal distribution for Se and Sp, non-differential misclassification
QBA3: Differential misclassification (better in those Y=1 than Y=0)

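A minimal count-based sketch of probabilistic bias analysis, assuming the QBA 1 inputs (trapezoidal Se with min .11, modes .15 and .32, max .35; perfect Sp) and the obesity-stratified counts from the "Applying validation results" slide. The Mantel-Haenszel summary, the function names, and the handling of infeasible draws are assumptions of this sketch. It models systematic error only and does not adjust for other confounders, so it should not be expected to reproduce the estimates in the table.

```python
import random
import statistics

# Observed counts from the "Applying validation results" slide, keyed by (x, y).
OBESE = {(1, 1): 38, (0, 1): 20, (1, 0): 461, (0, 0): 285}
NON_OBESE = {(1, 1): 196, (0, 1): 114, (1, 0): 4287, (0, 0): 3772}


def sample_trapezoidal(rng, lo, mode1, mode2, hi):
    """Draw from a trapezoidal density by inverting its CDF."""
    if hi == lo:
        return lo
    height = 2.0 / ((hi - lo) + (mode2 - mode1))
    left_area = height * (mode1 - lo) / 2.0
    flat_area = height * (mode2 - mode1)
    u = rng.random()
    if u < left_area:
        return lo + (2.0 * u * (mode1 - lo) / height) ** 0.5
    if u < left_area + flat_area:
        return mode1 + (u - left_area) / height
    return hi - (2.0 * (1.0 - u) * (hi - mode2) / height) ** 0.5


def mh_odds_ratio(strata):
    """Mantel-Haenszel odds ratio across a list of 2x2 tables."""
    num = sum(s[(1, 1)] * s[(0, 0)] / sum(s.values()) for s in strata)
    den = sum(s[(0, 1)] * s[(1, 0)] / sum(s.values()) for s in strata)
    return num / den


def corrected_or(se, sp):
    """Obesity-adjusted OR after reclassification; None if a cell is infeasible."""
    obese, non_obese = {}, {}
    for cell in OBESE:
        total = OBESE[cell] + NON_OBESE[cell]
        true_obese = (OBESE[cell] - (1.0 - sp) * total) / (se + sp - 1.0)
        if not 0.0 < true_obese < total:
            return None
        obese[cell], non_obese[cell] = true_obese, total - true_obese
    return mh_odds_ratio([obese, non_obese])


def run_pba(n_draws=5000, seed=2017):
    """Return (median, 2.5th percentile, 97.5th percentile, draws kept)."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_draws):
        se = sample_trapezoidal(rng, 0.11, 0.15, 0.32, 0.35)  # QBA 1 inputs
        estimate = corrected_or(se, sp=1.0)
        if estimate is not None:  # draws implying impossible counts are discarded
            results.append(estimate)
    results.sort()
    lower = results[int(0.025 * len(results))]
    upper = results[int(0.975 * len(results)) - 1]
    return statistics.median(results), lower, upper, len(results)


median_or, lower, upper, kept = run_pba()
print(f"median OR={median_or:.2f} ({lower:.2f}, {upper:.2f}) from {kept} feasible draws")
```

Calling `corrected_or(1.0, 1.0)` returns the obesity-stratified OR of the observed tables, about 1.46. A fuller analysis would also resample random error and work at the record level so that other covariates can be held constant.
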
Key considerations
• Construct's role in the study and the implications for bias due to misclassification / measurement error
• Assessing performance
  – Related to exposure, outcome, subgroups
  – Validation may still be needed for existing algorithms
• Evaluate impact on effect estimates and confidence intervals
  – Quantitative bias analysis

Michele Jonsson Funk · mfunk@unc.edu