Speakers: John Walsh, Nina Schick, Jason Chipman and Matthew Ferraro
Chipman: Hi, Nina and Matt. It's fantastic to have you here. It's really exciting. I can't think of two better people to talk to about deepfakes. So, welcome. Thank you very much.
Schick: Thank you. It's great to be here.
Ferraro: A real pleasure.
Chipman: Nina, I'd like to start with the basics if we can. What is a deepfake? Where does that term come from?
Schick: Well, I think it's a really good starting point, because this entire field is so new that the taxonomy really hasn't been settled. The way I define it in my book is a piece of media that is either manipulated or entirely generated by artificial intelligence. That could be an image, an audio clip, text, or video. And I think the key point to understand here is that this ability of artificial intelligence to actually create fake media or manipulate media is something that's only been in the realm of the possible for about the last five years. There's a negative connotation to "deepfake" because it implies a piece of synthetic media that's somehow used in a malicious context. However, it's worth saying right from the outset that not all synthetic media will be used for bad purposes, so the way I define a deepfake is a piece of media generated or created by AI that's used for the purposes of mis- or disinformation.
Chipman: So are we seeing deepfakes out in the world today? Could you share some examples of what a real deepfake looks like?
Schick: Yeah, we are. And again, the name itself, deepfake, comes from the first, most malicious, and most prevalent use of this kind of fake media: non-consensual pornography. The name was a play on the words deep learning and fake, and it emerged on Reddit at the end of 2017, where an anonymous user (we still don't know who he is, but we know he was someone interested in machine learning) basically figured out how to use some of the advances emerging from the cutting edge of the AI research community to make fake pornographic films featuring actresses. Now, fake pornography is not a new phenomenon. It's been around for as long as Photoshop has, but there was something very different about the creations he was making on Reddit, because they weren't simply an actress with her face photoshopped onto a porn star's body. These were actual films where AI was used to superimpose the face of an actress onto the body of a porn star. Needless to say, as so often with the internet, pornography was the pioneer, and as soon as this anonymous user had made these initial creations, the Reddit thread went wild; it was eventually shut down. But since then, and remember, this was only at the end of 2017, an entire deepfake pornography ecosystem has emerged online, and the very alarming thing about it is that all the victims are women. It is undeniably a gendered phenomenon, but it's not only celebrities who are in the firing line. Increasingly, we are starting to see AI being weaponized not only against ordinary women but even against minors in the creation of deepfake pornography.
And again, I think the point to make here is that this might seem like a tawdry women's issue, but to me it is only the harbinger of a much deeper societal problem, because once you can make AI-generated fake media, the first thing people have done is make fake pornography, but obviously it's going to be used in other contexts as well, and we are really starting to see that emerging now.
Chipman: Wow, that's really extraordinary. I think, Matt, I'm curious about how accessible the tools are to do what Nina's describing. Is this something that a savvy computer user can create, or does it take a lot more skill than that?
Ferraro: So, the short answer is that it is becoming increasingly accessible. There are apps available. Zao is a very popular Chinese app that does face swapping. There are websites, like Deepfakes Web, which similarly do face swapping, where you take your face and basically put it on someone else's body. The best deepfakes are still created by researchers. The most recent really good example was created by MIT to demonstrate how easy it is, in fact, for them to create a very believable deepfake. They took a speech prepared for Richard Nixon, hired a voice actor, and used AI to alter the speech and the voice to make it sound like Richard Nixon was announcing the failure of the Apollo moon landing. For those who don't remember, the Apollo 11 moon landing succeeded, but this was Nixon giving a speech in which he announced that it had failed, and it's very believable. You can find it online; it's called "In Event of Moon Disaster." And so, the very best are still done by researchers, but increasingly you're able to create fairly good fakery just online, using widely available technology.
Chipman: Are there counter-deepfake tools? Can a discerning user tell what's a deepfake and what's real? Can a sophisticated person who understands the technology make that distinction? Or is the point that you can't tell, that nobody can tell if a deepfake's been made well?
Ferraro: Well, I think there are certain tells, right? You can look and the ears look kind of funny. Sometimes the corners of the mouth don't move properly in a video, because that's the part that's been copied over, as it were, by the computer system. And there are technological ways of dealing with deepfakes, basically two methods. One is to detect after the fact, and there have been a lot of advances in that regard. Microsoft recently came out, I think it was around October 2020, with a tool that gives basically a confidence score on imagery, on whether or not it is false. So that's one approach: detect it afterwards. The other, an emerging field, is what's called provenance media, and that basically flips the equation on its head. You say: let's not try to find what's fake, let's verify what's true. The idea is that you take a piece of media (mostly photographs, but it can apply to any media) and you more or less tag it with metadata, and that data goes up to a distributed ledger, to blockchain technology. Then, if you ever alter that media, there are going to be breadcrumbs back to the original. The hope is that, in the same way you look at a browser today and see a lock next to the URL and know the website is secured with secure sockets layer, you'll be able to look at an image and see a similar kind of tell that it is in fact a verified image, and perhaps click on it, if you're on a computer, and get its history.
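The provenance idea Matt describes can be sketched in a few lines of code. This is a minimal illustration, not any vendor's actual system: a plain Python list stands in for the distributed ledger, and a SHA-256 fingerprint stands in for the richer signed metadata that real initiatives embed.

```python
import hashlib

# Minimal sketch of provenance media: hash a piece of media at capture
# time, record the hash (plus metadata) in an append-only ledger, and
# later check whether a file still matches any recorded fingerprint.

ledger = []  # stand-in for a real distributed ledger / blockchain

def register(media: bytes, metadata: dict) -> str:
    """Record a media fingerprint and its metadata; return the digest."""
    digest = hashlib.sha256(media).hexdigest()
    ledger.append({"sha256": digest, "metadata": metadata})
    return digest

def verify(media: bytes) -> bool:
    """True if this exact byte sequence was ever registered."""
    digest = hashlib.sha256(media).hexdigest()
    return any(entry["sha256"] == digest for entry in ledger)

original = b"...raw image bytes captured by the camera..."
register(original, {"device": "example-camera", "time": "2021-03-01T12:00Z"})

assert verify(original)          # untouched media checks out
tampered = original + b"edit"
assert not verify(tampered)      # any alteration breaks the match
```

Because even a one-byte change produces a completely different hash, any manipulation leaves the "breadcrumbs back to the original" described above; the hard part, as the speakers note later, is industry-wide adoption rather than the cryptography itself.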
Schick: And just to add to that, when it comes to some of the methods that have been used to detect deepfakes: digital forensics (people looking at what doesn't look right, as Matt already suggested) is going to become irrelevant soon, because the technology is accelerating so quickly that to the naked human eye, we're not going to be able to tell when something's been manipulated by AI. And even if we could, it wouldn't be a sensible way to identify which videos were synthetic, because they are going to become ubiquitous as these tools become accessible not only to those who have a lot of resources but basically to everyone, through easy-to-use applications on smartphones. On the detection side, therefore, you have to think about how we use AI to detect these deepfakes, because a human is not going to be able to do it. And interestingly, the jury is still out on whether we will reach a point where the generation technology, the actual deepfake, has become so sophisticated that no detector can tell it's a deepfake; the researchers I've spoken to are unsure whether we'll ever reach that point. As for what's currently available, Matt just mentioned, for example, the Microsoft deepfake detection tool. The problem with such tools is that they often score highly in terms of how many deepfakes they can detect, but they are only as good as the training data they're fed, and when you're potentially looking at many different types of deepfakes created by various machine learning models in the wild, it's going to be very difficult, I would say impossible, to create a one-size-fits-all detection tool. So I think ultimately this is always going to be an adversarial game of cat and mouse, just like all of cybersecurity: just as your detection capabilities get better, so will the ability to fool those detectors.
When it comes to media provenance, one last point on that: very exciting work is being done in industry. For example, Adobe is leading an initiative with Qualcomm and Truepic, and they've just launched their first prototype, where they embed provenance technology into the hardware of a phone. People said it wouldn't be possible; they basically rolled it out within 12 months. I think the key question there is how you get industry-wide acceptance of this as a new media standard. So, it's very interesting to see Adobe leading the work there along with other really interesting partners.
Chipman: Nina, you've described deepfakes in your writings as a societal problem. I think you just mentioned that a moment ago. Can you talk about that for a moment? Is this, in your mind, a political problem, an economic problem, is it a business problem, maybe all of the above? How do you see the issues that deepfakes are going to force us to confront?
Schick: I think it's an all-encompassing problem, covering everything from economics to society to politics, and this is really where Matt and I, when we first started discussing this issue while I was researching my book, completely saw eye to eye, because we both come from a geopolitical, military, intelligence type of background. The context in which deepfakes are developing is this: one of the side effects of the past 30 years of technological progress, the exponential age, is that we've created a new information ecosystem. At its inception it was hailed as an unmitigated good for humanity, but what's become abundantly clear, especially in the last decade of my work in geopolitics, is that we are actually facing a monumental crisis of mis- and disinformation that has been accelerated by this new information ecosystem. We see it changing politics not only at the international level but at the national and domestic levels, and beginning to affect all businesses and all private individuals in a way that wasn't possible before. So the context in which this technology is evolving is an already very dangerous information ecosystem, with a plethora of mis- and disinformation, and now you have this tool, which within a matter of years is going to allow anybody to create the most sophisticated visual disinformation known to mankind, basically for free, and release it into that information ecosystem, to the world. When you start to consider it in these terms, the implications of what deepfakes could do in this already rapidly evolving information ecosystem are stark, vast, and, as I mentioned right at the top, all-encompassing. It is not only a political issue, it is not only an economic issue; it's a society-wide issue that we have to understand as a new paradigm, really.
Ferraro: Yeah, if I could just add to that, I would say that any circumstance in which you rely on media to ascertain truth is an area in which deepfakes pose an evolving threat. I completely agree with a lot of what Nina said, because we live in an age of increasing disinformation, loss of trust in institutions, and the democratization of voices, which leads, I think, increasingly to the fact that if every voice is as equal as every other, people will believe whatever their own biases compel them to believe. Then you add to that the accelerant of extraordinarily believable media, and you're in for a real mess. Just to give you one example from the political context, think of the January 6th insurrection here in Washington. The people there were essentially victims of disinformation; they were driven to delusion by falsity that is as basic as it could be, right? Just written and spoken lies about the election, about phantom fraud, about what the vice president could or couldn't do. Now imagine adding into that witches' brew very believable video of, say, an election worker wearing a Biden t-shirt shredding Trump ballots, or quote-unquote undercover video of Nancy Pelosi and AOC plotting to steal the election. I think the delusion would have been much more broadly shared and much harder to shake, and we're in a state where millions of people will believe that stuff anyway. We're lawyers, Jason, and we think about, and I know Nina does too, the private sector. There are so many areas in which this poses private-sector risks. There's company reputational risk: imagine a CEO being caught on an undercover video, a false video, saying horrendous things, maybe using racial epithets. Or a false video promoting a merger that isn't going to happen, which can goose the stock price, so there's a level of market manipulation at work. And there's terrible fraud.
I mean, there's been one example reported in the Wall Street Journal of someone using deepfake audio, which is a very similar kind of technology, just not video, to conduct a fraud scheme. They telephoned the CEO of a company, impersonated the CEO of the parent company, and tricked the CEO into wiring a quarter of a million dollars to a false recipient. It turned out, of course, that it wasn't the parent company's CEO on the phone; it was just somebody using an AI-enabled technology. That's the kind of thing that I think is an ambient business risk for a growing number of companies. Anytime you need to rely on media for the truth, deepfakes can pose a real danger.
Schick: I would love to add one more point to that. I absolutely agree with everything Matt has said, and perhaps I should have clarified at the top of the podcast that one of the things that is so astounding about deepfakes and synthetic media is the ability of AI to replicate humans. We've tried to do this for ages, and even the best CGI and computer effects available now don't quite work. A good example is Martin Scorsese's film The Irishman, with its epic storyline spanning seven decades, in which he de-ages his protagonists. To film that, it took him five years; he had the best special effects artists in Hollywood, a multimillion-dollar budget, and a special three-camera rig. And if you saw the movie when it came out in 2019, the de-aging effect was good but, you know, perhaps not really convincing as a viewer. Fast forward to 2020, and a single YouTuber with a budget of zero used some open-source AI software, i.e., deepfake technology, to have a crack at de-aging those same protagonists. You can see the videos on YouTube, and arguably his end result, which took him a week, is far better than anything Scorsese had managed to do from 2015 to 2019. So the ability of AI to visually replicate humans is not only superior to anything we've seen before but, another very important point, it can also be trained to hijack your biometrics, right? The way this technology works is that it is given training data and then learns to look like you or sound like you, and because the technology is accelerating at such a rapid rate, the amount of training data needed is becoming less and less. For instance, when I first started looking at this at the end of 2017 and early 2018, if you wanted to create a deepfake, let's say I wanted to synthesize Donald Trump, we would have needed hours and hours of video footage of Trump.
Hours and hours of his voice audio, to get the AI to learn what he sounds like in order to recreate him as a deepfake. Now, barely three years later, there are already companies out there that say they can synthesize a voice from a clip of 10 seconds. Even if that's a bold claim, I think the direction of travel is clear. Anybody who has any kind of digital footprint (you don't even have to be a politician or a public figure; if you have an Instagram profile, a Facebook profile, any digital footprint whatsoever on the internet) could potentially become the target of a deepfake, and I think that's the thing that is so alarming about it.
Ferraro: You know, hearing Nina talk about it makes me think of one other thing I should add, which is that we speak a lot about audio, video, and imagery, but there's also text. Just written text, and the ability to create at scale text that sounds like it was written by humans, can also pose major dangers. One area where this has already been documented is administrative notice-and-comment procedures here in the US. If you want to enact an administrative regulation, it's open to public notice and comment, and there's been one example already of that process being hijacked by false text, that is to say, computer-generated text that certainly sounds human, and that can manipulate the input into government functioning. And then, of course, there's the online sphere. Think of how much effort is spent by businesses, and by politicians for that matter, moderating, controlling, and shaping online discussion. If you can create, at scale, millions of tweets that sound like they were written by millions of individual users but were really produced by a deepfake network, you can radically alter business perceptions, corporate perceptions, and perceptions of politicians, and again, it creates this morass in which truth and falsehood are ever more intertwined.
Schick: That's such an important point. We don't often talk about synthetic text generation. OpenAI released their synthetic text-generating model, it's called GPT-3, last summer, and it's arguably the most powerful AI of its kind in existence. There's been a very interesting study by the Middlebury Institute's center on terrorism: they tested the model, because OpenAI allowed researchers to run certain prompts and tests on this machine learning system, and they found that it could actually be a very powerful tool of radicalization. Essentially, if you think about the new model of how you radicalize individuals on the internet, it's through speaking to them one on one, down on the message boards, and now imagine you can do that at scale with AI. I think the implications are potentially really severe.
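The scale problem described above can be illustrated with even the crudest text generator. Real systems like GPT-3 are large transformer networks; the bigram Markov chain below is a deliberately simplistic stand-in, using an invented toy corpus, that shows the core economics: once word-to-word statistics are learned from sample text, endless new variations can be emitted essentially for free.

```python
import random
from collections import defaultdict

def train(corpus: str) -> dict:
    """Map each word to the list of words observed to follow it."""
    model = defaultdict(list)
    words = corpus.split()
    for a, b in zip(words, words[1:]):
        model[a].append(b)
    return model

def generate(model: dict, start: str, length: int, seed: int = 0) -> str:
    """Walk the learned word statistics to emit a new word sequence."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = model.get(out[-1])
        if not followers:
            break
        out.append(rng.choice(followers))
    return " ".join(out)

# A toy (hypothetical) corpus of the kind of claims discussed above.
corpus = "the election was stolen the election was rigged the count was wrong"
model = train(corpus)

# Each different seed yields another cheap, vaguely human-sounding variant.
comment = generate(model, "the", 6)
print(comment)
```

Where this toy model emits word salad after a dozen words, a large language model produces fluent paragraphs, which is exactly why flooding comment systems and social platforms becomes cheap at scale.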
Chipman: What you're describing sounds to me like the commoditization of what used to be the realm of governments or organizations with enormous resources that would be required to create something like what you're describing, but you're talking about a world where anyone can create effectively a realistic looking video that seems real and post it out into the information ecosystem. What should we do about that? What should governments do about it? Nina, is this something that requires new laws? Is it something that requires private public partnerships? How do we think about even coming to grips with the problem you're describing?
Schick: I think it requires all of that. The conceptual starting point, and this is really why I wrote my book, was to put a conceptual framework around the information ecosystem itself. I called it the infocalypse, and I described it as this increasingly dangerous and untrustworthy ecosystem in which everything now exists. You have to focus on fixing the integrity of the information ecosystem itself. Now, that's a lot easier said than done. I think the starting point is conceptualizing it, and then beginning to see that this problem is so vast that no single entity or part of society can tackle it alone, right? Industry and the private sector are going to have to work with government, and the tech companies are going to have to be involved. You really have to take a networked approach. But as to your question about government, legislation, and regulation: absolutely, there's a huge role to play. From my own experience of working with government and political leaders, however, I think these changes have sometimes happened so quickly that some policymakers are not equipped to handle the realities of this new information ecosystem. For instance, I used to advise the former secretary general of NATO, Anders Fogh Rasmussen, and of course, when he talked about war, he was talking more about artillery, tanks at the border, where the little green men are, while I was talking about deepfakes. So I think there's a generational divide to be bridged there. That's not to say all policymakers are the same; some really do get it. But the bottom line, again, is that this is not something government can take the lead on alone.
It needs cross-society collaboration, and it also needs really forward-thinking policy on things like digital education: how are we going to change schooling so that, from a young age, people know that we live in this untrustworthy information ecosystem and can begin to understand how to protect themselves in it as well?
Chipman: Matt, what's happening in the US at the federal and state level on this topic?
Ferraro: There's been a surprising amount of action when it comes to legislating on synthetic media and deepfakes, and some of it, you could say, is successful and some unsuccessful, but I'd like to say that they're trying. As a baseline, five states (California, New York, Maryland, Texas, and Virginia) have barred the use of deepfakes in some manner or another. Those laws fall primarily into two baskets. Basket one is deepfakes that influence voters, or that depict politicians within a certain number of days of an election. Basket two is deepfake porn, which Nina was talking about and which is such an epidemic. It varies: most of the laws are civil, allowing for civil remedies, although a few are criminal. So far as I know, there have been no successful prosecutions yet, but then, they've only been enacted in the past couple of months, in some cases a year. At the federal level, no laws have changed in terms of the criminal code, but interestingly, Congress has adopted, I guess it's probably now four bills on deepfakes, depending how you count, in the past 14 months. The first required the director of national intelligence to write a report on how foreign states were using deepfakes to affect US national security. The most recent major bill was passed and signed by the President in December 2020, and it requires the Department of Homeland Security to write a yearly report, for five years, assessing the dangers of deepfakes across the range of harms, from national security to fraud to their effects on vulnerable groups and civil rights. It also requires the DoD to write intelligence assessments of the threat posed by foreign governments creating deepfakes to target the military.
So, there's been a lot of bill making. And I should say, for those out there in podcast land who aren't totally familiar with how this all works, these kinds of government reports can lay a predicate for further legislating, whether that's creating criminal laws or perhaps more administrative action. So I do think there's actually a lot of work on it. Part of it, honestly, is Nina's work raising consciousness of this issue, and that of others who are just very concerned about what this might mean for the future. But it's definitely evolving very quickly, and I'm not sure, to be honest, that all problems have solutions.
Chipman: Matt and Nina, we've been a bit doom and gloom here. We've talked a lot about the dangers of deepfakes. Is there anything positive about deepfake technology? Is there a way to cultivate the use of this technology in a manner that's good for society? How should we think about that?
Schick: Absolutely. Like all powerful technologies of the exponential age, this one is simply an amplifier of human intention. Now, as it happens, with my background in geopolitics and information warfare, I've been very concerned about what deepfakes could mean for a corroding information ecosystem. But look at how synthetic media, and I use the term synthetic media for the positive applications of deepfakes, is going to change the world. It's going to be one of the most transformational moments in the history of human communication, right? You're looking at a future where, increasingly, all of the media we interact with is going to be generated or synthesized by artificial intelligence. It's going to mean that content creation opens up and is democratized for almost anyone. A lot of people who work on the generation side of synthetic media are saying that this is a real boon for creatives. If somebody on YouTube can create video footage that is just as compelling, with fidelity and effects just as high, as something that right now is exclusively accessible to Hollywood studios, what does that mean for the future of industries like entertainment, sports, art, and corporate communications? There's a synthetic media generation start-up I know very well, based in London, called Synthesia, whose entire vision is to change the future of corporate communications. They basically want to make creating a video as easy as typing an email, and they already work with Fortune 500 companies whose CEOs want to communicate across several different regions of the world, in several different languages. All of that can be generated with a few clicks of a button on their back end, right? So, for them, this is a tremendous boon.
Another potentially amazing use case: as I said, AI is fantastically good at recreating your biometrics, so I know of another group of researchers who are already looking into how they can use synthetic voice to give those who've lost the ability to speak, for instance through a neurodegenerative disease or a stroke, a synthetic voice to speak with again. So, I think the most important thing to understand here, again, is the point I made at the very beginning: this technology is not all bad. It's going to be an amplifier of human intention. It's extremely powerful, and just as it will be used for many malicious purposes, it will be used for good. This makes the question of how we regulate it, of what we do about it as a society, increasingly difficult, because it's not going to be a case where you can just say, oh, well, all synthetic media is bad and all deepfakes are bad, so we can ban it. There's going to be a lot of gray area, and it's a paradigm change.
Ferraro: I agree that there are potential positive use cases for deepfakes, particularly in addressing disabilities and restoring voice to those who have lost it to disease, as Nina says. But I do think that the expansive use of deepfakes, even for good purposes, poses concerns that society should ponder further. Let me make two quick points. First, the increased use of deepfakes will invite universal skepticism of all media. This has been called by some "disbelief by default," and it can give power to the powerful to deny the veracity of real media if it shows them doing or saying things they don't like. In that world, a leader can simply claim that any video is merely a deepfake, and the public will be more and more primed to believe him. Second, I think the truth has intrinsic value. The broad use of deepfakes can create an atmosphere of falsity that undermines this abstract good, and by that I mean an empirical, verifiable world whose essential reality is shared by all. Going forward, I think we as a society have to grapple with what it means to live in a world where we all inhabit our own realities, with tailor-made media to buttress our beliefs no matter the actual truth. It's not a today problem, but it's the kind of thing we need to think about going forward.
Chipman: Matt and Nina, I feel like we're at the tip of the iceberg on this extraordinary topic. Thank you very much, Nina. Thank you very much, Matt, for joining us today to talk about this exciting topic. We really appreciate it.
Schick: A real pleasure to be here. Thank you for having me.
Ferraro: Thank you so much, Jason and Nina. This was so much fun.
Walsh: I agree, that was a lot of fun, and it's a great topic, although candidly a little disturbing. A big thanks to all three of you for joining us on this episode of In the Public Interest and for letting me listen in on what was really a fascinating discussion. It was a particular privilege to have Nina join us and share her expertise and insights on this subject. We are all going to be hearing a lot more about deepfakes and digital misinformation as the technology continues to evolve and as we all, including our governments, figure out how to deal with it. If you enjoyed this podcast, please take a minute to share it with a friend and also to subscribe, rate, and review us wherever you get your podcasts. See you next time on In the Public Interest.