‘NUMBERS ALONE AREN’T ENOUGH’: An interview with Caroline Chen ’08
BY LAUREN HARRIS, COLUMBIA JOURNALISM REVIEW
CAROLINE CHEN COVERS healthcare for ProPublica, and while she doesn’t consider herself a “data journalist,” her reporting frequently draws from and analyzes large datasets. Since early March, she has been publishing columns about how reporters can responsibly use data in their writing during the coronavirus pandemic. The key for journalists, Chen says, is to understand that data collection is a way to understand what’s happening to people. This interview has been edited for clarity and brevity.
How did the ProPublica column start?
I was about to go to NICAR, the data journalism conference, to do a panel on covering the coronavirus. One of my editors said, “You should write that up.” The headline was, “I lived through SARS and reported on Ebola. These are the questions we should be asking about Coronavirus.” I thought it was going to be limited to a very wonky audience. I did math in the post! I intended it as a guide for reporters. Then we got a ton of traffic. So many people wrote to me. That made me realize that there is a hunger for this. Smart readers want a clear explanation of all the numbers that are being thrown around in the news. First-person allows me to acknowledge parts that are confusing or contradictory.
Right now, the public wants to know, “Hey, if I get infected, how likely is it that I’m going to die?” That is a super reasonable question. My job as a reporter is to explain, “It’s very hard for me to give you the answer.” And that is what I’m trying to convey as I write these columns: not only what available data there is, but also the process through which scientists or researchers or doctors have gotten to that number and what more work needs to be done to come to a better answer in the future.
How do you toe the line between communicating what you know and pointing out the unknowable?
I’m learning along with everybody else. To give you another very specific example: the question of how many people are infected. You have options, as a reporter. You could just record a number. You could say, “There are X people infected in Y location.” You could also say, “There are 50 people infected, but that’s likely an undercount right now, because there are not enough diagnostic tests available, so we’re only testing the sickest people.” You’re giving some context around that number, and that’s already more helpful. What would be even more ideal is to be able to say, “And we’ve got another type of test coming along: antibody tests. We’re going to be able to start doing randomized testing in our population to see who has had past infections, which will allow us to estimate what percentage of the population in this city or in this state [has been infected].” And then we really have to break that down into the difference between the diagnostic tests and the antibody tests. Readers are smart enough to be able to understand, and they’re actually hungry for that.