Avatars of Anxiety
Falling into the deepfake-industrial complex
by Simon van Zuylen-Wood
Illustration by Sam Mason
In February 2020, an unsung development in the realm of dystopian technology arrived with the release of the music video for the Strokes song “Bad Decisions.” The video begins with a shot of a woman, bored at home, who amuses herself by pushing a button that allows her to clone the Strokes. The clones proceed to sing “Bad Decisions” until their heads fall off and the song ends. In the YouTube comments, some viewers remarked that Julian Casablancas, the frontman, looked more showered than usual. Others observed that the song sounded remarkably similar to Billy Idol’s 1981 hit “Dancing with Myself.” It does. So much so that Idol is credited as a writer on the track. Which makes sense when you consider that Idol wrote “Dancing with Myself” after playing a mirror-filled Japanese discotheque in which people danced with their own reflections. It makes even more sense when you consider that, while all five members of the Strokes starred in the video, none of them acted in it.
It happened like this: A production company called Invisible Inc hired five actors who kind of looked like the Strokes and filmed them performing “Bad Decisions.” Invisible Inc then sent a bunch of old video footage of the actual Strokes to a guy from Canada who calls himself The Fakening. The Fakening proceeded to junk the old stuff, pull better clips he found on YouTube, train a machine-learning algorithm on the faces of the real band, and digitally graft those faces onto the actors. So there it was: a “deepfaked” music video, featuring synthetic doppelgängers of flesh-and-bone rock stars. He’d cloned the Strokes.
The creation of “Bad Decisions” hinted at tantalizing possibilities. “They didn’t have to spend a day on set to shoot their own music video,” The Fakening—a/k/a Paul Shales, age forty, who used to work a boring marketing job in Toronto—told me. “I’m sure Hollywood is hot on the trail of, ‘Can we just license Tom Cruise? He won’t have to show up.’ ” He likened the idea to the 2002 movie Simone, about a fading director who creates a computer-generated actress. Except, in this case, studios could use someone who is already a bankable star. “Hire a better-looking body double with his shirt off. You just need Tom Cruise’s face.”
Funny he should say that. A month after “Bad Decisions” came out, Shales was summoned to Los Angeles by Matt Stone and Trey Parker, the creators of South Park. Stone and Parker wanted him to join an A-Team of deepfakers, among them other viral hitmakers such as DerpFakes, Ctrl-Shft-Face, and Dr. Fakenstein. They had plans for a secret project called Deep Voodoo. Shales signed on. Shortly thereafter, the pandemic arrived, pushing the world’s white-collar laborers onto online grids of their own disembodied heads.
As Deep Voodoo got to work (remotely), the presidential election grew nearer, and worries about deepfake technology mounted. Ever since late 2017, when a Reddit user calling himself “deepfakes”—as in, creating digital fakes of people using the artificial intelligence technique known as “deep learning”— was spotted uploading manipulated video to the internet, the potential for next-gen ratfuckery had been obvious: release a clip of a politician saying something she didn’t, wreak havoc. What distinguishes deepfakes from other forms of digital manipulation—whether amateur Photoshop jobs or sophisticated special effects—is the AI; not unlike a social media algorithm, deepfakes mimic you by studying you.
In May 2019, without any technical expertise whatsoever, a guy from the Bronx slowed down a video of Nancy Pelosi, the House Speaker. “Drunk Nancy Pelosi,” a so-called cheapfake, went viral. The crappy quality of that effort, combined with the American predilection for believing anything, drove panic over the damage that could be inflicted by someone with an actual grasp of deepfake technology—especially in a QAnon-ified political atmosphere already losing its grip on consensus reality. The implications were ominous not only for politics, but for basic epistemology.
Shortly after the Pelosi video came out, Adam Schiff, a congressman from California, convened a hearing on deepfakes. A couple of months later, New York University’s Stern Center for Business and Human Rights released a report on disinformation that deemed deepfakes a primary threat to the integrity of the coming presidential election. Soon, Facebook banned non-satirical deepfakes, and California passed a law designed to do the same. The Department of Defense’s DARPA branch disbursed millions of dollars in grants to develop deepfake-spotting technology, as well as tools to track down the identities and intentions of creators. By late last year, Sensity, a deepfake detection firm, had catalogued more than eighty-five thousand deepfakes, up from about seven thousand two years earlier.
Two weeks before the presidential election, a very good deepfake dropped. Not from some troll farm, but from Deep Voodoo. Called Sassy Justice, it was the studio’s first product. At fourteen minutes, it is probably the longest and most sophisticated piece of content ever made using only deepfakes. Shales said that an NDA prevented him from telling me how it was made, so I’ll just try to describe it. It stars Fred Sassy, a deepfaked TV journalist based in Cheyenne, Wyoming, who patrols the city in a white van with a jumbo gavel on its roof, uncovering consumer scams. Earnest, dogged, and perplexed by technology, he has Donald Trump’s face; the white, curly hair of a retirement-age Harpo Marx; and a wardrobe of ascots and zebra print.
In this particular “episode” of Sassy Justice, Fred Sassy reports on the threat posed by deepfake technology. “As human beings, we all rely on our eyes to determine reality,” he begins. “Except for blind people. I’m not sure how they do it.” He conducts Zoom interviews with deepfakes of Michael Caine, Al Gore, AI researcher “Lou Xiang” (actually Julie Andrews), and a toddler version of Jared Kushner, whom the White House has put in charge of the “Anti-Deepfake Club.” (Sassy: “All of this has to have a cost. What is the taxpayer paying?” Kushner: “He said it never happened.” Sassy: “Who did?” Kushner: “My daddy-in-law, Trump. He said the Holocaust never happened.”) There is a deepfake of Fox News’s Chris Wallace interviewing a deepfake of Trump, who suffers a stroke on camera and then repents for his sins, as well as a deepfake of Mark Zuckerberg, cast as the “Dialysis King” of Cheyenne.
In other words: the most well-realized deepfake on the internet is not political propaganda, but an absurdist news segment satirizing the hysteria around deepfakes. The catch is, the video’s premise is undercut by jarringly good execution. Sassy Justice is an almost uncategorizable form of entertainment, and—save for an article about it in the New York Times—it received little press coverage, as though the culture didn’t know what to do with it. When I called the City of Cheyenne, the mayor’s press secretary told me he’d never heard of it and would have to check it out.
What’s more, the Times story seemed like a potential stunt, in which a square newspaper was hoodwinked into promoting a nonexistent deepfake studio. In the article, Parker described Sassy Justice as the “most expensive YouTube video ever made,” which made me burst out laughing and was, I assumed, uttered in the fraudulent spirit of the project. Others took their skepticism further. Recently, I called Hao Li, a decorated computer scientist now at work on a DARPA-funded deepfake detection project. (Hedging his bets, he’s also the CEO of a firm that creates “photorealistic virtual humans.”) “Deep Voodoo—I don’t think it’s a real company,” he told me. “I’m pretty sure it’s just one guy.”
On December 31, 2018, the government of the Central African nation of Gabon released an enigmatic video of its president, Ali Bongo. Bongo had been in charge since 2009, when he took over for his father, Omar Bongo, who had ruled the country since 1967. For a few months, Ali Bongo had gone quiet; it was reported that he’d suffered a stroke and was convalescing in Morocco. The video, a New Year’s message, is two minutes of pablum. “My dear compatriots,” he says, in French, “I’ll continue to put all my energy and power in the service of our country.” What made it odd was his appearance: Bongo hardly blinked and rarely seemed to move. It looked like a cross between a hostage video and a Weekend at Bernie’s situation.
Seven days later, citing the embarrassing “spectacle” of an unwell president, a faction of the Gabonese military calling itself “Operation Dignity” staged a coup attempt against the Bongo regime. Two of its members were killed by Bongo’s security forces, and the threat was quelled. But as the dust settled, journalists and Bongo’s political adversaries began to wonder if the video had been a deepfake, released by the government to quiet criticism of the president’s absence, while the real Bongo was dead or hooked up to an IV. Because it can be hard to find footage of people’s faces in profile, a common deepfake form features a seated figure, perfectly still, looking head-on into the camera, as Bongo had been. Realistic audio is hard to synthesize, so Bongo’s somewhat slurred speech felt like another potential tip-off. I was following the story at the time, struck by the prospect that a deepfake had instigated a military uprising.
Eventually, the video was run through a deepfake detection algorithm, which concluded that it likely hadn’t been manipulated. Bongo returned to the capital, Libreville, and settled back into power. In January, two years after the failed coup, I reached out to a couple of journalists who cover Gabon. Neither found the deepfake theory remotely credible, and one compared it to a contemporaneous rumor that Nigerian president Muhammadu Buhari, himself frequently out of the country to receive medical attention, was in fact being played by a body double named Jibril. (“On the issue of whether I’ve been cloned or not, it’s the real me, I assure you,” said Buhari—or Jibril—addressing the rumors.) Looking back on it, the journalists weren’t even sure that what took place two years ago could be called a military coup at all.
The Bongo video remains a telling document of our incipient deepfake era: what seems to be stoking much of the anxiety is the power of suggestion. In April 2018 there was a deepfake of “Barack Obama” calling Donald Trump “a total and complete dipshit.” The clip was commissioned by BuzzFeed, voiced by the actor and director Jordan Peele, and created using a popular open-source software called FakeApp; it was explicitly labeled as a public service announcement about deepfakes. In June 2019, “Mark Zuckerberg” asked viewers to imagine “one man with total control” over “their secrets, their lives, their futures” in a video created by the artists Daniel Howe and Bill Posters. A few months later, just before the UK election, Bill Posters (whose real name is—I think—Barney Francis) released more deepfakes, these of “Boris Johnson” and “Jeremy Corbyn,” in which each candidate endorsed the other, then revealed himself as a case study in misinformation.
As the US election approached, the pace picked up. In July 2020, MIT’s Center for Advanced Virtuality completed one of the more conceptually interesting projects to date: a deepfaked Richard Nixon giving a speech about a botched moon landing—which had, in fact, been written for him in case the astronauts of Apollo 11 did not return to Earth. At the beginning, a disclaimer read, “This is not real.” At the end, another read, “This project shows the dangers of misinformation.” In September, new deepfakes of Vladimir Putin and Kim Jong-un were commissioned by a good-government group called RepresentUs, to warn of the dangers of misinformation. In October: a deepfake of Matt Gaetz, a Florida congressman, commissioned by his opponent in order to link Gaetz to the threat of misinformation. In December: a deepfake of the queen of England, created by an Oscar-nominated special effects studio, intended to be a “powerful reminder that we can no longer trust our own eyes.”
Of these examples, none was designed to deceive; all were intended to raise awareness about the potential to deceive. Searching for evidence that bad actors were weaponizing artificial intelligence for political gain, what I found instead was an emerging field of detection firms, government grantees, startups, academics, artists, and nonprofits that seemed to depend on one another to sustain interest in deepfakes. Call it the deepfake-industrial complex—or, perhaps, a solution in search of a problem. In January, a trio of academics, from Harvard, Penn State, and Washington University in St. Louis, released a draft paper studying the efficacy of a deepfake in which “Senator Elizabeth Warren” called Donald Trump “a piece of shit.” The video, framed as a “leak,” persuaded 47 percent of 5,750 study participants. Okay, that seems a little alarming. But in order to run the experiment, the researchers, of course, had to create the deepfake. To do that, they contracted a firm called DeepFakeBlue, which makes “ethical” deepfakes to “raise awareness.” The paper, summarizing the threat posed by the technology, cited the video of Ali Bongo.
Given the disproportionate ratio of “awareness” deepfakes to “real” deepfakes, it’s hard not to wonder if artificial intelligence nerds simply enjoy the rush of designing synthetic humans but feel duty-bound to couch their monstrous creations in the language of public service. As in, say, Jurassic Park. (There is also a thriving genre of excellent, if less convincing, absurdist deepfakes—Donald Trump as Honey Boo Boo, Brazil’s Lula as Mariah Carey—that tend to be created by apolitical face-swapping wizards.) According to an academic paper about deepfakes, the number of academic papers about deepfakes has risen from zero in 2017 to sixty in 2018 to four hundred fourteen by July 2020.
Journalistic attention to deepfakes has followed a similar arc. In 2016, before the term “deepfakes” existed, a team of academics created seminal “real-time facial reenactment” software called Face2Face. One of the researchers on that team, Matthias Niessner, is now the cofounder of a London-based firm called Synthesia, which creates deepfakes for corporate clients. (Think training videos, using the cheap, and COVID-free, labor of virtual humans.) Recently, I Zoomed with Victor Riparbelli, Synthesia’s CEO and a former student of Niessner’s. I asked him where the demand for deepfakes was coming from. He pulled up a running tally. “So far, we’ve had 487 journalists contact us because they want to make a deepfake of Donald Trump or Putin,” he said. That represented 98 percent of his requests.
Brief technical interlude: How are deepfakes made? The roots of the technology date to 2014, when a Montreal-based doctoral student named Ian Goodfellow was at a bar with some grad school friends. Their field was artificial intelligence, and they were puzzling over how to improve on the computer-generated faces they’d been creating. Thus far, they’d been using a deep-learning algorithm, or neural network, trained to study and mimic real-life images. Goodfellow’s breakthrough was to create two neural networks—one to synthesize images, and another to test their accuracy—and pit them against each other in real time. The result was a Generative Adversarial Network. Goodfellow, in AI circles, became known as the GANfather. And GANs became the backbone of deepfake technology.
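To make the adversarial idea concrete, here is a minimal sketch of a GAN training loop in Python with PyTorch. Everything in it, from the layer sizes to the learning rates to the flattened-image setup, is illustrative rather than drawn from any actual deepfake tool; it shows only the two-network tug-of-war Goodfellow described.

```python
# Minimal GAN sketch (illustrative, not a production deepfake pipeline):
# a generator learns to synthesize images while a discriminator
# learns to tell real images from synthetic ones.
import torch
import torch.nn as nn

latent_dim, img_dim = 64, 28 * 28  # noise size; flattened 28x28 grayscale image

G = nn.Sequential(  # generator: random noise -> fake image
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, img_dim), nn.Tanh(),
)
D = nn.Sequential(  # discriminator: image -> probability it is real
    nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

loss = nn.BCELoss()
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(real_batch):
    """One adversarial round: D learns to spot fakes, G learns to fool D."""
    n = real_batch.size(0)
    real_labels = torch.ones(n, 1)
    fake_labels = torch.zeros(n, 1)

    # 1) Train the discriminator on real images and (detached) fakes.
    fake_batch = G(torch.randn(n, latent_dim)).detach()
    d_loss = loss(D(real_batch), real_labels) + loss(D(fake_batch), fake_labels)
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # 2) Train the generator so that D labels its output "real".
    fake_batch = G(torch.randn(n, latent_dim))
    g_loss = loss(D(fake_batch), real_labels)
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
    return d_loss.item(), g_loss.item()
```

Each round, the discriminator gets a little better at spotting fakes, which forces the generator to get a little better at producing them; that feedback loop, scaled up, is what eventually yields photorealistic faces.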
Within a few years, the sophisticated free software tools Faceswap and DeepFaceLab appeared. The field grew. Eventually, three distinct deepfake genres emerged: First, there is the simple “face swapping” version, in which you can, say, blend your face into that of a celebrity. People got into that technique last summer using Reface, a Ukrainian app that rose to number one on the App Store. Second, there are mouth-manipulation deepfakes, in which only lip movements are synthesized. BuzzFeed used this technique for the Jordan Peele video. Third, there are “puppet master” deepfakes, in which digital personas are overlaid onto footage of hired performers, as in Sassy Justice. (The “puppet” in that case was Trump; the “master” the actor voicing him and moving his body.) Rather than manipulating footage after the fact, a puppet-master deepfake brings to life a new character altogether.
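The face-swapping genre, as implemented in open-source tools like Faceswap and DeepFaceLab, generally rests on an autoencoder with one shared encoder and a separate decoder per identity. The sketch below, with invented dimensions and none of the face alignment, masking, or blending a real pipeline needs, illustrates only the core trick: encode a frame of person A, then decode it with person B’s decoder.

```python
# Simplified face-swap autoencoder (illustrative dimensions; real tools
# use convolutional networks plus alignment and blending steps).
import torch.nn as nn

code_dim, img_dim = 128, 64 * 64 * 3  # latent code size; flattened RGB frame

def make_decoder():
    # One decoder per identity: each learns to redraw one person's face.
    return nn.Sequential(nn.Linear(code_dim, 512), nn.ReLU(),
                         nn.Linear(512, img_dim), nn.Sigmoid())

# A single shared encoder learns pose, expression, and lighting
# from training footage of BOTH identities.
encoder = nn.Sequential(nn.Linear(img_dim, 512), nn.ReLU(),
                        nn.Linear(512, code_dim))
decoder_a = make_decoder()  # trained to reconstruct person A's frames
decoder_b = make_decoder()  # trained to reconstruct person B's frames

# Training minimizes reconstruction error: A-frames through
# encoder -> decoder_a, B-frames through encoder -> decoder_b.
# The swap happens at inference time, by crossing the wires:
def swap_a_to_b(frame_a):
    """Re-render a frame of person A wearing person B's face."""
    return decoder_b(encoder(frame_a))
```

Because the shared encoder never learns who it is looking at, only how the face is posed, decoder B happily paints B’s features onto A’s expression, which is the entire effect.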
Anyone who has a pretty good PC and a graphics card can learn to make a deepfake. That includes the Russian government: ahead of recent protests in Moscow in support of the opposition leader Alexei Navalny, the Kremlin appears to have geotagged #redsquare with GAN-generated still images of random people, as a means of reducing the overall proportion of pro-Navalny content. But creating a convincing deepfake takes money and artistry, without which chins blend into necks and eyes fail to blink. “There’s still not a push-button technology to create a compelling deepfake, particularly if you want to have audio,” Matt Turek, who oversees DARPA’s deepfake detection grants, told me. (Turek promised me that DARPA isn’t helping the US government create deepfakes.)
That hasn’t stopped journalists from embracing the drama, as we are wont to do, especially when the drama involves technology that distorts people’s perception of the world (something that the press used to have more control over). A few years ago, anxious coverage about the rise of creepy augmented-reality tools (e.g., Google Glass) wound up far outpacing the technology’s real-world impact. The same has been true thus far of deepfakes. In early 2018, BuzzFeed labeled the advent of deepfakes “a potential ‘Infocalypse’ ” and a “fake news Manhattan Project.” Countless alarmed headlines followed, in both the tech and mainstream press. In March 2019, The Verge ran a piece, “Deepfake Propaganda Is Not a Real Problem,” yet continued to publish regular warnings about deepfake propaganda. Cut to the present: zero known propaganda deepfakes were weaponized during the 2020 election.
In the meantime, all the awareness-raising stories have effectively seeded the notion that our eyes are constantly betraying us. “They imply we’re surrounded by deepfakes,” said Sam Gregory, the program director at a Brooklyn nonprofit called Witness, which focuses on the “threats and opportunities” presented by emerging technologies and itself occupies a key position in the advocacy wing of the deepfake-industrial complex. “They imply the scale of visual deception around us is far greater than it is.” (Unless, of course, deepfakes have gotten so good that we are in fact surrounded by them at all times. In which case, we live in Plato’s cave, and the problem is beyond solving.)
The winners in this ecosystem of constant possible delusion are not the technical creators of deepfakes, but the people who exploit fears of deepfakes to create plausible deniability about real-life events. There’s even a handy new academic term for this principle: “the liar’s dividend.” On January 7, Trump delivered an address from the White House, condemning the mob that stormed the Capitol the day before and signaling to his supporters that he would no longer contest the results of the election. The next morning, there was rampant speculation on Twitter, Parler, and 4chan that the address was a deepfake, released against Trump’s wishes. “This is a terrible #deepfake,” wrote Parler user @OldBear. “The head movements, his cadence, he never went off-script.”
In 1985, Donna Haraway, the scholar and critic, published an influential essay called “A Cyborg Manifesto.” In it, she argued that the advent of machine-human hybridity created a framework for women to escape prescribed gender norms. Since a cyborg had no biological origins and bore no innate physical characteristics, it was liberated from expectations about reproduction, parenthood, and sex. “Why should our bodies end at the skin?” Haraway asked. “The cyborg is a kind of disassembled and reassembled, postmodern collective and personal self. This is the self feminists must code.”
As one might have predicted, things veered in another direction. Roughly 95 percent of the known universe of deepfakes, according to a report published in 2019, consists of the heads of celebrities being grafted onto the bodies of porn actresses. Put another way, female cyborgs are now all over the internet, but created by men to star in bespoke video fan fiction, for the purposes of masturbation. And none of this should come as a huge surprise, given that the original Reddit user known as “deepfakes” was the guy who put Gal Gadot’s head on the body of an adult-film star to make “her” do a scene with “her” “stepbrother.” That pornography is already a performed version of sex further underscores the warped fantasia of the deepfaked version.
In the deepfake universe, there are two main forums where creators congregate to trade tips and content. There is a “SFW” (safe for work) chat room used by devotees of the Faceswap software, hosted on the message board platform Discord. Then there is a porn site, whose name I see no good reason to publicize. (A couple of years ago, Vice ran an investigation about an AI-powered app that “undressed” women. The app’s server crashed after the article came out, due to increased demand.) I think of myself as fairly thick-skinned, but the deepfake porn site was one of the most disturbing places I’ve visited on the internet.
The site is divided into two sections. First there are the videos themselves, searchable by model (“celebrities” or “pornstars”) and category (there is, for example, a “gay” section, though it has only about forty videos); the most popular have a hundred thousand or more views. The second section is for the “community,” where its more than three hundred ninety thousand members talk to one another and request to buy custom deepfakes from creators. Many of their posts are incongruously innocent, suffused with technical chatter about deepfake production. Others are not innocent. Here is a comment from a user who is also a prolific poster of deepfakes: “As a long time porn addict going back to the golden age of porn, I will say this: Deepfake Celeb Porn trumps all other porn for me. I rarely fap to the real thing anymore.”
Dpfks, the site’s administrator, told me that he started the site after Reddit and Pornhub announced bans on deepfake porn, in 2018. He figures there are around a hundred “talented creators” of pornographic deepfakes, a third of whom are active. Commissions for a ten- to fifteen-minute video, he estimated, run about two hundred dollars. There are a few rules—dpfks maintains a ban on child pornography, bestiality, and images of noncelebrities—and an important stipulation: “We do not tolerate people trying to pass deepfakes as ‘real’ videos,” he told me. The zero-tolerance policy on deception seems to underscore a critical misapprehension about the risk that deepfakes pose: It’s not their supposed veracity that is causing trouble. Nobody on the porn site, or any other, seems fooled. It’s unreality that people are after.
There are other problems, however, notably that of consent, and the potential for deepfaked revenge porn. Shales—The Fakening—told me that he gets a ton of requests for adult “commissions.” They all start the same way: someone sends him an image of a woman, claiming that she is their wife or girlfriend, and asks him to put her in a porn video. Then Shales asks the person to send him a video of the woman agreeing to be in the deepfake. “They all vanish,” he said. “Every single one of them. No one is looking at it on the legit level.”
In this subterranean realm, the distinction between deepfake and real collapses under the potential for harassment. It is also the realm in which, maybe, the first actual political deepfake was weaponized. In April 2018, Intidhar Ahmed Jassim, a professor at a university in Baghdad, was running for a seat in Iraq’s parliament when an explicit three-minute video of a man and woman in bed went viral; the footage was blurry, but social media users suspected from the voice and appearance that it was her. Jassim ended her campaign, though she insisted that the video was fake. International coverage followed; The Economist suggested that the video was posted to discredit her candidacy. (Jassim could not be reached for comment.)
Yousif Astarabadi, who had recently started a company called NotEvil, which analyzed and debunked deepfakes, caught wind of the video and contacted Jassim, offering to analyze it. NotEvil’s report, which Astarabadi forwarded to me, concluded with 95 percent certainty that the video was deepfaked. It pointed to the appearance of a third eyeball, evidence of a “replacement face,” and a missing left earring. Still, there was room for doubt. In February, when Sensity released a free public version of its software, I dropped the video in and, after five minutes, was notified that it did not exhibit evidence of a face-swap. But that result couldn’t be trusted as certain, either, and it belied a larger truth: the firm’s own research has tracked the abundance of deepfake porn that features women without their consent; the deepfake porn site displays several female politicians. As for Jassim, the damage is already done—a potential voter would have a hard time dissociating her from the invasive footage, whether it was real or forged.
In 2019, Astarabadi gave up on NotEvil. Now he’s got a new startup: The Oasis, which uses deepfake technology to create live avatars for people to use on video chats. The company’s slogan is “Be who you really are.”
In December, Sassy Justice released a second video. This one featured an uncannily good “Donald Trump” in a Christmas sweater, reading aloud from a children’s book about a reindeer who dies after an election has been stolen from him. It was well done, but I still couldn’t shake the suspicion that Deep Voodoo was less a studio than a high-concept joke. I asked Shales. “We’re all real,” he replied. Shales then connected me to his boss, a certifiably authentic South Park producer named Frank Agnone. Over numerous phone calls, Agnone promised me exclusive clips of forthcoming content and a Zoom call with Matt Stone and Trey Parker. But they kept pushing me off, and the footage never came. I had to wonder.
What I gleaned about Deep Voodoo is this: Sassy Justice evolved from Sassy Trump, a character created by a British actor and South Park collaborator named Peter Serafinowicz, who had been playing around with deepfakes on his own. When he, Parker, and Stone teamed up, they originally envisioned a full-length film, in which Fred Sassy somehow bumbled his way into Trump’s orbit. The project was apparently self-financed as a proof of concept; Stone and Parker were banking on a distributor to pick it up. When the pandemic hit, they scaled down their ambitions and settled for the short version, promoting it cryptically on TV and radio stations in Wyoming. Stone and Parker did most of the writing and Serafinowicz much of the voice work, with assists from his wife, an actor, and Parker’s seven-year-old daughter, who played the role of Jared Kushner.
In a way, it’s fitting that I never got the chance to interview the comedians behind Deep Voodoo. After all, Sassy Justice spoke for itself, capturing deepfake technology’s muddled implications better than any of the reporting thus far published on the subject. If press coverage about deepfakes has tended toward earnest alarmism—which paradoxically winds up flattening the experience of watching a deepfake— Sassy Justice performs a kind of anti-advocacy, refusing to take a position on how threatened we should feel.
At one point in the original video, Fred Sassy (Serafinowicz) conducts a Zoom interview with deepfake Michael Caine (also Serafinowicz). What gives a deepfake away, Caine tells him, is sound. “The human brain is a very clever thing, and it can detect the difference in what is a real voice and what is a fake voice,” he says, in his trademark Cockney. “A perfect impersonation. It can’t be done.” Even in the oversaturated market of Michael Caine impressions, Serafinowicz’s is about as perfect a Michael Caine as you’ll hear. Which is to say: In almost any other circumstance, deepfake Michael Caine would have a point. Except in this scenario, he’s being voiced by a guy who specializes in impressions, rendering the deepfake utterly convincing.
If the success of your deepfake depends on something as analog as the caliber of your voice actor, maybe technology itself is beside the point. It’s easy enough to spread misinformation without fancy videos; the same study that created the Elizabeth Warren deepfake, for instance, found that manipulated video was no better at fooling anyone than faked headlines or audio clips. (When I asked Hao Li, the computer scientist, why he thought Deep Voodoo was bullshit, he said he couldn’t remember, but thought he read it somewhere. “Wikipedia,” he said. “Or something.”) Besides, who says we don’t want to be duped? The average cable news junkie is searching for narratives that confirm her priors; the person who buys into a politically damning deepfake may, on some level, be embracing the reality she wants.
Where Fred Sassy goes from here, I do not know. But in late February, something almost as impressive appeared on the internet. A TikTok account called “Deep Tom Cruise” posted perhaps the most realistic celebrity deepfakes yet: Tom Cruise doing a magic trick, Tom Cruise golfing, Tom Cruise telling a joke. I recognized them as puppeted deepfakes, à la Sassy Justice: someone had hired an actor to impersonate Cruise, then grafted on Cruise’s deepfaked features. The tech was seamless, the impersonation pitch perfect. The videos instantly went viral—and generated countless bemused clickbait headlines. A BuzzFeed journalist plugged the deepfakes into Sensity’s test system: “No faceswap detected.”
Seeking to find the videos’ creator, I messaged Shales and another Deep Voodoo deepfaker named Chris Ume, a thirty-one-year-old Belgian. Shales didn’t answer me. Ume did: “Im not doing any interviews about it.” Of course it was him. I should have known; just weeks earlier, Ume had posted a nearly identical Tom Cruise deepfake on his YouTube page, working with an actor named Miles Fisher, who already resembles a Jerry Maguire–era Cruise. Ume seemed surprised that he hadn’t received more inquiries from journalists about the provenance of Deep Tom Cruise. “Most of them don’t have a clue, but keep writing articles, just copying them from another newspaper,” he messaged me. “And of course, ‘it’s the end of the world’ in caps.” Eventually, Ume’s name got out, and the real Tom Cruise created a TikTok account.
Science fiction brims with anxiety over our inability to distinguish between real and synthetic humans. The most famous example is Blade Runner’s replicants, but there are countless others, from the robot Maria of Metropolis to the pod-people of Invasion of the Body Snatchers and the fembots of The Stepford Wives. In the 1987 Arnold Schwarzenegger movie The Running Man, the United States government grafts images of wanted fugitives onto footage of actors, to pretend the bad guys have been apprehended. Now art just imitates life. Deepfakes are a sinister feature of both the recent BBC drama The Capture and the Netflix hit Lupin.
In the real world, however, the deepfake-industrial complex has a strong utopian streak. Lisha Li, the CEO of a San Francisco startup called Rosebud AI, has a vision that’s not far from Donna Haraway’s. Rosebud made the Matt Gaetz deepfake, but its main suite of products includes AI-generated stock photos (think: custom appearances to suit targeted demographics) and an app called TokkingHeads, which can create a video-and-audio representation of a person using only one photo. The promise there, especially intriguing in the remote-work era, is to decouple people’s talents and personalities from the way their bodies look. Imagine appearing on Zoom in the form of a deepfaked avatar, Li said: “You’ve actually opened up a huge amount of opportunity for a lot of people who just don’t want to be out in the public sphere using, say, their own identity.”
We’ve already seen similar technology grow popular on social media: last year, the talent agency CAA signed a freckled Instagram star named Lil Miquela, who has several million adoring fans—and is not a real person. When Miquela arrived on the scene, with a sexy-CGI persona, Instagram was already ground zero for people altering their images to achieve a uniform, synthetic kind of beauty. Lil Miquela took the medium to its logical end point. As Emilia Petrarca noted in an enthralling New York “profile,” Lil Miquela started off looking distinctive, then gradually began to adopt the robotic mannerisms of human influencers. “The effect is twisted,” Petrarca wrote. “Miquela seems more real by mimicking the body language that renders models less so.” At the same time, Miquela’s pixel DNA has granted her a cyborgian form of personhood, by which she can evade labels altogether. “I’m not sure I can comfortably identify as a woman of color,” she posted at one point. “ ‘Brown’ was a choice made by a corporation. ‘Woman’ was an option on a computer screen.”
The scariest kind of AI dystopia is the one where the machine acquires a mind of its own. When I spoke with Bill Posters, the deepfake artist, he suggested that we were already living in such a realm. “Most images today,” he said, “are being shared by machines, between machines, with no human oversight or intervention at all.” The data extracted from our images, in turn, is what social media companies sell to advertisers to make money. In general terms, he’s right. But deepfakes are reliant on a Pygmalion-like human touch. We’ve created them to elicit reactions from other humans— negative, positive, hateful, creepily titillated, whatever. For the moment, we’re in control. cjr