
ARTIFICIAL INTELLIGENCE & MACHINE LEARNING: HOW IT WORKS, AND HOW IT FAILS


The rise of machine learning and artificial intelligence has led to mind-boggling technological developments that have impacted our lives immensely. From viral videos of Boston Dynamics robots doing backflips to Netflix recommendations on what to watch next, we cannot go a day without AI trying to make our lives easier. But for underrepresented individuals––LGBTQ+ people, racial minorities, and others––significant harm is being done instead. How do these artificial intelligence models work? What are the basic processes behind building them? A computer is only as smart as we make it––how could these models be discriminating against these individuals, and is it possible to stop this?

WHAT IS ML?

Machine learning (ML), a subfield of artificial intelligence, is a truly fascinating and innovative technological advancement. The idea behind this process is to, quite literally, teach a machine how to learn so that it can make logical predictions based on data it has already analyzed. To those unfamiliar with machine learning (and even to those who are), it really does seem like magic. But to understand how and why underrepresented individuals are being harmed by some of these algorithms, it is important to understand how a basic machine learning algorithm works. Though there are many different algorithms out there, the main process occurring in all of them stems from a common idea: calculating distances between data points. To calculate those distances, we first need to know whether our data is quantitative (numerical) or categorical. For example, the number of streams of a song would be quantitative data, while the genre that a song fits into would be categorical data. It is important to understand what kind of data we are working with when implementing ML algorithms because we will need to adjust them accordingly.
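To make the distance idea concrete, here is a minimal sketch (not from the original article) of how one might measure how "far apart" two songs are, assuming one quantitative feature (stream count) and one categorical feature (genre). The names and numbers are invented for illustration, and a real pipeline would typically scale features before combining them.

```python
import math

# A minimal sketch of the "distance between data points" idea.
# Each song has one quantitative feature (streams) and one
# categorical feature (genre); both are illustrative.
def song_distance(song_a, song_b):
    # Quantitative feature: the size of the gap matters.
    stream_gap = song_a["streams"] - song_b["streams"]
    # Categorical feature: all we can say is "same" (0) or "different" (1).
    genre_gap = 0 if song_a["genre"] == song_b["genre"] else 1
    # Combine both gaps into a single Euclidean-style distance.
    # (In practice, features would be scaled so one does not dominate.)
    return math.sqrt(stream_gap ** 2 + genre_gap ** 2)

print(song_distance({"streams": 1200, "genre": "pop"},
                    {"streams": 900, "genre": "rock"}))
```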

One of the most common algorithms, k-nearest neighbors (KNN) regression/classification, generates a predicted value for a particular label based on a set of features. For example, given sufficient data, one could generate a predicted temperature value for a specific day based on factors such as wind, humidity, distance from the ocean, and more. But how does the algorithm really work?

Let’s continue with our example of predicting temperature. Say that we have a dataset of 730 rows, where each row corresponds to a day over the past two years. There are several columns describing these rows, such as the factors described above (wind, humidity, distance from the ocean, and more). Using this preconstructed dataset, we can iterate through each observation (day) and generate a predicted temperature value by looking at its k nearest neighbors in the dataset, where the value of k is up to us. We can find the mean of these k neighbors’ temperature values and use that as our prediction. This is a very basic machine learning algorithm, but this idea of extrapolating based on averaged distance or error between points shows up throughout machine learning. But what if the dataset we used to build this predictive model only contained weather data from, say, Antarctica?
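As a rough illustration of the procedure just described, the sketch below implements KNN regression over a handful of made-up days. The feature names match the example above, but the dataset, numbers, and function name are hypothetical.

```python
import math

# A minimal sketch of k-nearest-neighbors regression for the temperature
# example. A real dataset would have 730 rows, one per day over two years.
def knn_predict_temperature(new_day, dataset, k=3):
    """Average the temperatures of the k days most similar to new_day."""
    def distance(day):
        # Euclidean distance over the quantitative features.
        return math.sqrt(sum((day[f] - new_day[f]) ** 2
                             for f in ("wind", "humidity", "ocean_distance")))
    # Sort the historical days by how close they are to the new one.
    nearest = sorted(dataset, key=distance)[:k]
    # The prediction is the mean temperature of those k neighbors.
    return sum(day["temperature"] for day in nearest) / k

past_days = [
    {"wind": 10, "humidity": 60, "ocean_distance": 5,  "temperature": 22},
    {"wind": 25, "humidity": 80, "ocean_distance": 5,  "temperature": 18},
    {"wind": 5,  "humidity": 40, "ocean_distance": 50, "temperature": 30},
    {"wind": 12, "humidity": 65, "ocean_distance": 8,  "temperature": 21},
]
print(knn_predict_temperature({"wind": 11, "humidity": 62, "ocean_distance": 6},
                              past_days, k=3))
```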

This data would not extrapolate well to a different environment, such as Jamaica in the summertime. Since a model only learns from the data we supply, it would likely make inaccurate predictions for new, different data. To combat this, we would need to feed the model diverse data that contains many different kinds of weather observations.
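To see why an Antarctica-only dataset fails, here is a small, hypothetical demonstration: an average over frigid training days can only ever return frigid temperatures, no matter how tropical the new day's features look. All values are invented for illustration.

```python
import math

# Every training day is frigid, so the neighbor average is frigid too.
antarctica_days = [
    {"wind": 40, "humidity": 50, "temperature": -25},
    {"wind": 55, "humidity": 45, "temperature": -30},
    {"wind": 35, "humidity": 55, "temperature": -20},
]

jamaica_summer_day = {"wind": 8, "humidity": 85}

def distance(day):
    return math.sqrt((day["wind"] - jamaica_summer_day["wind"]) ** 2 +
                     (day["humidity"] - jamaica_summer_day["humidity"]) ** 2)

# Average the 2 "nearest" Antarctic days: the prediction stays far below
# freezing, because the model has never seen anything warmer.
nearest = sorted(antarctica_days, key=distance)[:2]
print(sum(day["temperature"] for day in nearest) / 2)  # roughly -22.5
```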

WHAT COULD GO WRONG?

Unfortunately, ML will inherently misunderstand and mislabel members of the LGBTQ+ community, racial minorities, and disabled people. Let’s start by examining how the LGBTQ+ community is being hurt. As cited in the article “Why Artificial Intelligence Is Set Up to Fail LGBTQ People,” anthropologist Mary L. Gray states that LGBTQ+ people are “constantly remaking” their community. For algorithms that are built upon previously observed data, this raises concerns. Individuals who identify as LGBTQ+ go through a life-long journey with their identity, and as they evolve and change, algorithms will not be able to keep up. They will continue to find the data points with the least distance to a new data point to make ‘logical’ predictions. But when data points represent people, with unique and extensive identities, this does not lead to accurate predictions. Instead, it results in misclassification that can be harmful to an individual’s security in their identity. Take Automatic Gender Recognition, for example, which is built on training data of mostly cisgender males and females. To classify a new person’s gender, the algorithm compares their physical facial features to those in the training data to make a ‘logical’ classification.

Many transgender individuals are being harmed by these superficial assumptions.

When it comes to racial minorities, we are seeing significant biases, especially in the medical field. Algorithms that predict which patients need care have been unintentionally discriminating against black people. In their study “Dissecting racial bias in an algorithm used to manage the health of populations,” Obermeyer et al. examined a commonly used medical ML algorithm and found that black patients needed to be significantly sicker than white patients to be recommended for the same type of care. The reason we see such biases in this algorithm is that the model was built on “past data on health care spending, which reflects a history in which Black patients had less to spend on their health care compared to white patients, due to longstanding wealth and income disparities” (Grant). Since the data has been strongly shaped by these racial disparities, the model reflects this and recommends care for white patients more often. This is a major issue because black patients may require care but not receive it due to the model’s biases.
