AI PREDICTIONS CLAIM TO ALLEVIATE AN OVERCROWDED JUSTICE SYSTEM... BUT SHOULD THEY BE USED?

In 2013, the state of Wisconsin charged Eric Loomis with five criminal counts in connection with a drive-by shooting. Loomis eventually accepted a plea deal and pled guilty to two lesser charges: attempting to flee a traffic officer and operating a motor vehicle without the owner’s consent. Before Loomis’ sentencing, a Wisconsin Department of Corrections officer produced a presentence investigation report that included a risk-assessment score to help predict Loomis’ recidivism rate — his potential chance of reoffending in the future. The introduction of this algorithm would dramatically change Loomis’ case.

This risk-assessment score was computed by Correctional Offender Management Profiling for Alternative Sanctions (COMPAS)—a privately-owned algorithmic system, designed by the company Equivant, that produces recidivism predictions based upon public data and answers from a lengthy questionnaire. Once formulated, Loomis’ COMPAS score identified him as high risk for violence, high risk for recidivism, and a high pretrial flight risk.

Before a COMPAS score was introduced into Loomis’ case, the prosecution and defense had agreed upon a plea deal of one year in county jail with probation. However, at Loomis’ sentencing, the trial court referred to the COMPAS-generated risk-assessment score as a judicial tool to help in its sentencing determination. The court classified Loomis as high risk of reoffending based in part on this score, and proceeded to sentence him to six years of imprisonment and five years of extended supervision. Loomis filed a motion for post-conviction relief on the grounds that the court’s reliance on the COMPAS score violated his due process rights. While the Wisconsin Supreme Court ultimately denied Loomis’ motion, its closing remarks reflected the growing skepticism that surrounds COMPAS and the role of risk-assessment technologies in the American legal system. “While our holding today permits a sentencing court to consider COMPAS,” the court said, “we do not conclude that a sentencing court may rely on COMPAS for the sentence it imposes.”

The essence of these closing remarks is simple: according to the court, risk-assessment algorithms like COMPAS are not a replacement for human judgment.

***

COMPAS is one of the most widely used algorithms in the U.S. criminal justice system, and it has been applied or adapted by many states, including New York, Wisconsin, Florida, and California. COMPAS uses public criminal profile data and answers to a 137-question interview questionnaire to generate a risk score. The questionnaire gathers information on past criminal involvement, relationships, lifestyle, personality, familial background, and education level. It produces scores grouped by risk level, ranking defendants on a 1–10 scale: 1–4 is low risk, 5–7 is medium risk, and 8–10 is high risk.
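COMPAS’ internal weighting is proprietary, but the public-facing decile scale described above is simple to express. The sketch below, in Python, is a hypothetical illustration of that scale only; the band thresholds come from the description above, while the function name and structure are assumptions made for illustration and are not part of the actual COMPAS software.

```python
# Hypothetical sketch of the public-facing COMPAS decile scale described above.
# Only the band thresholds (1-4 low, 5-7 medium, 8-10 high) come from public
# descriptions of the tool; the function name and structure are illustrative
# assumptions, not part of the proprietary COMPAS system.

def risk_band(decile_score: int) -> str:
    """Map a 1-10 decile risk score to its published risk band."""
    if not 1 <= decile_score <= 10:
        raise ValueError("COMPAS decile scores range from 1 to 10")
    if decile_score <= 4:
        return "Low Risk"
    if decile_score <= 7:
        return "Medium Risk"
    return "High Risk"

if __name__ == "__main__":
    for score in (2, 6, 9):
        print(score, risk_band(score))
```

What the scale does not reveal, of course, is how the 137 questionnaire answers are weighted to produce the decile score in the first place; that weighting is the black box discussed below.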

First developed in 1998, the algorithm has now assessed over one million offenders. The recidivism prediction component of COMPAS, known as the Violent Recidivism Risk Score (VRRS), has been in use since 2000. However, despite the algorithm’s widespread use in courtrooms across the country, it is largely considered to be a black box: though its basic input information is available, the weighting of these inputs within the algorithm is proprietary, and thus not available to the public.

Risk-assessment algorithms like COMPAS have the potential to be beneficial tools of justice for the American criminal justice system. Much of this potential stems from their technical and data-based foundation—many scholars contend that formal, actuarial, and algorithmic methods of prediction perform better than the intuitive methods used by judges or other experts. According to Adam Neufeld, a senior fellow at Georgetown Law’s Institute for Technology Law and Policy, the systemic benefits these tools provide are abundant: when implemented in the criminal justice system, risk assessments can efficiently and effectively reduce costs, reduce crime, and avoid wasting human potential.

An example of the possible benefits of this technology is bail. Data from the Prison Policy Initiative shows that releasing a low-risk defendant on bail can reduce their recidivism rate, while detaining these individuals, or individuals with similar profiles, contributes to an estimated $13.7 billion in annual pretrial detention costs. Neufeld argues that the human potential of every individual offender released on bail is not wasted—both by keeping that offender from spending (perhaps undeserved) time in jail while awaiting trial, and by reducing a recidivism cycle that might result in that offender returning to jail in the future.

One study in particular—the National Bureau of Economic Research’s 2017 publication, “Human Decisions and Machine Predictions”—found that in New York City’s pretrial decisions, an algorithm’s assessment of risk would far outperform judges’ track records. The study further concluded that if New York relied on an algorithm to aid in bail decisions, an “estimated 42 percent of detainees could be set free without any increase in people skipping trial or committing crimes pretrial.” This study supports the assertion that risk-assessment use in New York would indeed reduce costs, reduce crime, and would not waste human potential when implemented as a judicial aid in decision-making. For a judge who might encounter any one of the 30,000 people arrested daily in America, a tool like this thus offers a cost-effective and efficient aid in adjudication.

Relying on an actuarial tool to provide a risk score with profound, life-altering implications, however, might make many people justifiably uncomfortable. Yet, Neufeld insists that though it may seem “weird to rely on an impersonal algorithm to predict a person’s behavior given the enormous stakes... the gravity of the outcome—in cost, crime, and wasted human potential—is exactly why we should use an algorithm.”

Neufeld offers a convincing case for the benefits of using these algorithms—saved money, saved time, and some needed support for an already strained criminal justice system. But as with any growing field of technology, issues arise when these tools become a depended-upon component of a system that, according to the Prison Policy Initiative, jails 443,000 people pretrial alone each year. What is at stake here is not simply systemic efforts to reduce costs and time; it is the ability of each individual to have access to a fair, equitable trial and sentence in an American courtroom. Did Eric Loomis receive this treatment? The Wisconsin court ruled he did. However, pressing concerns still exist around whether the COMPAS algorithm is capable of providing the kind of fair and unbiased judicial aid the court claimed it could.

***

In 2016, the non-profit newsroom ProPublica launched a study of Florida’s COMPAS system and found that the formula was especially likely to mark black defendants as future criminals, mislabeling them at almost twice the rate of white defendants. White defendants, on the other hand, were identified as low risk more often than black defendants. ProPublica also found that the risk scores were unreliable in forecasting violent crime: 80 percent of the people predicted to commit violent crimes did not actually do so. Part of the reason COMPAS is susceptible to bias is a lack of transparency, a factor due in part to Equivant’s right to protect its own intellectual property (a point the state of Wisconsin supported in its decision to reject Loomis’ appeal claims). Unless taken to court on charges such as impropriety—if, for example, there is suspicion that an ethics or standard-of-conduct violation has occurred in the company’s product or process—Equivant is unlikely to relinquish details pertaining to the system’s internal function.
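ProPublica’s headline finding is a disparity in false positive rates: among defendants who did not go on to reoffend, how often each group was nonetheless labeled high risk. The sketch below shows how such a rate comparison can be computed; the counts are hypothetical placeholders for illustration, not ProPublica’s data.

```python
# Minimal sketch of a false-positive-rate comparison of the kind ProPublica ran.
# The counts below are hypothetical placeholders, NOT ProPublica's actual data.

from collections import namedtuple

Group = namedtuple("Group", "labeled_high_risk_no_reoffense total_no_reoffense")

# Hypothetical counts: defendants who did NOT reoffend within the follow-up
# window, split by whether the tool nonetheless scored them as high risk.
groups = {
    "Group A": Group(labeled_high_risk_no_reoffense=450, total_no_reoffense=1000),
    "Group B": Group(labeled_high_risk_no_reoffense=230, total_no_reoffense=1000),
}

for name, g in groups.items():
    fpr = g.labeled_high_risk_no_reoffense / g.total_no_reoffense
    print(f"{name}: false positive rate = {fpr:.1%}")
```

A disparity in this rate between groups is exactly the kind of pattern that cannot be traced back to its cause so long as the algorithm’s weighting remains proprietary.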

University of Maryland Law Professor Frank Pasquale believes this secretive aspect of COMPAS is concerning because it refuses a courtroom actor an answer to the following question: “How is the algorithm weighting different data points, and why?” Each aspect of this inquiry is crucial in relation to two core legal principles: due process, and the ability to meaningfully appeal an adverse decision. Judicial processes are generally open and explicable to the public. Even after juries have deliberated, judges themselves are required to give an explanation for their rulings, particularly when adjudicating sentencing. When an algorithmic risk-scoring process like COMPAS is kept secret, it becomes impossible to challenge key aspects of that score because the system’s internal function remains protected. Subsequently, all members party to a case—from the judge to the defendant—are largely unable to question, challenge, or request a reassessment of any algorithmically-generated score.

As a result, any judge seeking to use COMPAS as a judicial aid cannot, at this moment, understand fully how a COMPAS risk-assessment score is developed, nor how factors about the defendant’s profile were weighted to arrive at the given risk score. Not only does this prohibit judges from properly understanding a judicial tool meant to assist their process, but it might also deny the defendant the ability to identify a fair trial outcome, should the judge base their final ruling in any way upon a score they both cannot fully comprehend. This issue of miscomprehension follows the defendant into the appellate arena, where higher circuit courts conceivably face the same confusion.

The secrecy surrounding COMPAS also leads to a failure of ‘narrative intelligibility’ — when outcomes create confusion between involved parties regarding a decision, where a decision came from, or how it was reached. For a jury’s verdict to be narratively intelligible, the judge, defense, and prosecution must clearly understand the procedure the jury followed to reach the verdict and what that verdict means. When this process is transparent and easily explainable, a defendant’s due process rights are not plausibly at risk; if the defendant can understand the ways in which a verdict was reached, they can choose to pursue legal action to appeal on those grounds.

Pasquale argues that COMPAS fails to be narratively intelligible. COMPAS is neither transparent nor easily explicable to a judge or defendant, let alone to the public. A defendant might be unable to clearly understand how they have been “risk-assessed,” and so they might encounter difficulty in challenging the score should they disagree with it. As a result, risk scores can function like an algorithmic brand: no matter the effort, a defendant is largely unable to change or eliminate it from their profile for the offense at hand.

Risk-assessment scores could also create problems for judges, who also encounter difficulty when trying to question a risk score’s authenticity, origin, or algorithmic weighting. Without this investigation, judicial dependence on these algorithmic scores could lead to automation bias—judges, like all humans, might defer to the technology without questioning its validity, accuracy, or possible biases. Automation bias is an issue both inside and outside of courtrooms. Research on the biases involved in algorithmic decision-making systems reveals that human decision-makers frequently rate automated recommendations more positively than neutrally, even if they are aware that these recommendations might be subject to inaccuracies or error. If left unchecked, automation bias is quite challenging for a human actor to shake, creating scenarios in which people have difficulty refuting automated recommendations. This can lead to a judge in sentencing relying heavily upon a risk score without questioning its legitimacy.

Along with the potential for automation biases, it is unclear if COMPAS scores themselves are even constitutional. In particular, the potential for COMPAS to arrive at results by disproportionately or inaccurately weighing factors like socioeconomic status, gender, or race cannot be ignored.

COMPAS takes a holistic view of the defendant’s life into account in its risk-assessment score, extending beyond a specific criminal incident—personal details ranging from gender, age, race, education level, familial background, social capability, and more are considered. University of Michigan Law Professor Sonja Starr writes in her paper, “The New Profiling,” that as a result of this holistic practice “judges and parole boards are told to consider risk scores that are based not only on criminal history, but also on socioeconomic and family-related disadvantages as well as demographic traits like gender and age.” Notably, gender-based sentencing that occurs as a result of COMPAS scores can intensify the incarceration rate of young and poor men of color. Indeed, risk-assessment instruments like COMPAS seem to specifically deem defendants riskier based on indicators of socioeconomic disadvantage, deem males riskier than females, count crime victimization and living in a high-crime neighborhood as risk factors, and also include assessments of the defendant’s attitude or mental health as risk factors. Often, Starr points out, the familial and neighborhood-related factors that these procedural instruments consider are highly race-correlated.

Sentencing decisions made in the criminal justice system which consider race and gender are explicitly defined as unconstitutional practice under the U.S. Sentencing Guidelines. Starr argues that by allowing judges to generate a ruling based in part upon such factors, the state itself might be endorsing a practice that allows certain groups of people to be considered “high-risk” or more likely to engage in violent crime because of factors they have no control over. These are judgments based upon the characteristics of who a person is, not what actions they have done. In this respect, when a judge uses a COMPAS risk-assessment score which seems to label defendants as higher risk based upon identity characteristics like socioeconomic status, race, and gender, this judge is allowing discriminatory factors entry into the legal arena.

Such practice is, at its root, unconstitutional. By using the technical language of a risk-assessment score to obscure discrimination in this manner, unconstitutional judgment practices are able to enter legal rulings in ways that would otherwise be unacceptable if stated outright. COMPAS’ potential to produce bias and discriminatory results, and further, to then allow these results access into adjudication as a formal risk score a judge may rely upon, is not only ethically concerning; it is also outside the bounds of constitutional practice.

COMPAS, and risk-assessments like it, have the potential to be beneficial tools of justice. Yet, it is still unclear whether the COMPAS algorithm is ready to serve American courtrooms. A judicial aid should be transparent, narratively intelligible, and constitutionally sound. These factors promote a judicial aid’s chance of supporting a judge in reaching a morally and legally defensible decision. They also boost the public’s confidence in the ethicality of a judge’s decision in sentencing. Without certainty that COMPAS is able to adhere to these principles, a judge’s ability to reach a fair and just decision could be in jeopardy. Until this certainty arrives, the use of COMPAS, and risk-assessments like it, should be heavily scrutinized by courtroom actors.

***

In their opinion for Eric Loomis’ case, the Wisconsin Supreme Court wrote that “it is incumbent upon the criminal justice system to recognize that in the coming months and years…the system must keep up with the research and continuously assess” the use of tools like COMPAS.

Yet, while these technologies are scrutinized and debated, there’s a lot at stake—the future lives of individual defendants; unbiased procedure in criminal justice sentencing; public confidence in the criminal justice system at large; and the integrity of a judge’s discretion in sentencing. Whether or not the judge is able to make a sound decision on another’s livelihood relies on this discretion remaining intact.

Should an algorithm designed to help alleviate a strained system do so at the potential cost of individual justice? This question remains unresolved and will face increasing scrutiny in the near future. However, until COMPAS and other similar risk-assessments are proven to consistently advance justice for the individual, courtroom actors must exercise strong caution when engaging with them. Risk-assessment tools do hold an important place in the future of automated justice efforts. Yet, to use these tools before they are ready could effectively hinder the proper delivery of justice in American sentencing procedure.

Alexandra “Mac” Taylor ‘20 studied Political Science and Art History and graduated from Stanford this past spring.