CLIP
You are listening to a Frequency Podcast Network production in association with CityNews.
Jordan Heath Rawlings
You may or may not use it, but you’ve probably been offered it at some point as a way to easily and safely verify your identity. It’s called voice ID, and while it does accurately recognize your voice and use it to log you in, your voice, it turns out, doesn’t have to come from you.
CLIP
Welcome to Lloyds Bank. So we can get you to the right place, in your own words, please say the reason for your call. “Check my balance.” Okay. Please say, “My voice is my password.” “My voice is my password.”
Jordan Heath Rawlings
That is a clip of our guest today using an artificial intelligence program to mimic his voice and log into his online banking portal. When voice ID was created, it likely was totally secure. And depending on how yours is set up and with whom, it might still be. But there is no denying that as artificial intelligence gets better at imitating humans from examples we provide, security programs like this become less and less reliable. In this case, all someone would need is a lot of audio of you speaking, like, say, episodes of a podcast that you host. And then AI can say things using your voice. It is the beginning of an online security arms race. And if it sounds scary now, just wait.
I am Jordan Heath-Rawlings. This is The Big Story. Joseph Cox is a reporter who covers hackers, crime and privacy for Motherboard at Vice. Hi, Joseph.
Joseph Cox
Hi. Thank you, uh, for having me.
Jordan Heath Rawlings
You’re welcome. This story is a little scary, watching that video footage you put together.
Joseph Cox
Yeah, totally.
Jordan Heath Rawlings
And I, I mean, I’m still surprised that it even worked.
Joseph Cox
Yeah.
Jordan Heath Rawlings
And that this sort of thing is even possible, and how quickly it has become possible to do this sort of thing. I want to get into exactly what you did and how you did it, but maybe first, and this is something kind of near and dear to the heart of people who podcast for a living: before we talk about that hack, how far has AI-generated audio come recently compared to, uh, the text and images that we’re constantly seeing?
Joseph Cox
Yeah, so I mean, on one side you’ll have, you know, applications like ChatGPT, which are purely handling text, or DALL-E, which is handling images. And there’s been a massive boom, uh, in the quality and the quantity of material produced by those. Audio has had something very, very similar, but people are almost not paying attention to it, despite the quality being incredible at this point. Just over the past, I don’t know, 3, 4, 5 months, something like that, an entire industry of companies has sprung up that offer to generate and create synthetic, artificial intelligence powered voices. So rather than, you know, I’m gonna make an image with AI, or I’m gonna make an essay with artificial intelligence, I can clone somebody’s voice, potentially anybody’s voice, and then have that AI read out a script. And the quality today is impressive, just to say the least.
Jordan Heath Rawlings
You mentioned 3, 4, 5 months, this has really ramped up. How long has this been attempted for? Because I do remember a couple of years ago hearing that this was kind of here and coming, and, uh, I played around with it on one platform and it sounded disjointed. It sounded kind of nothing like a natural person talking in a natural voice. When you listen to it now, how different is it?
Joseph Cox
Yeah. We’ve had the same issue where we’ve actually tried some voice AI solutions or software one or two years ago and we cloned our own podcast host’s voice and it wasn’t great, to be perfectly honest.
It really wasn’t that impressive. Now it’s not just the quality which is up there, including being able to clone my British accent, right? Which is actually something that some of the AI programs do have a little bit of trouble with, because it defaults to American. It’s not just that quality, but it’s the speed. And one of the companies I use, that I’m sure we’ll talk about in a minute, can do this near instantaneously. While others can take, you know, from half a dozen hours to a dozen hours, up to sometimes 24 hours to generate the voice, some companies are doing this on the spot.
Jordan Heath Rawlings
Well, let’s talk about that company now. I think it’s called ElevenLabs, right?
Joseph Cox
Yes, that’s right.
Jordan Heath Rawlings
What is their offering? What does their AI program purport to do?
Joseph Cox
So ElevenLabs was made by an ex-Google and an ex-Palantir pair of workers. What they’re trying to do is make it really easy to do AI-powered voices across languages, I believe. So, you know, I have a video or a film or some sort of content in English, and maybe I want to automatically translate it convincingly into Spanish or whatever. Now that’s one side of their business. The other side is voice cloning, and that’s what I was focused on and what a lot of the abuse is focused on as well. And what you could do there is say, I’m a voice actor and, you know, I want to replicate my voice, and then I could almost artificially do more recording sessions because I’ve just made a clone of myself. That is sort of what they offer, and they target, you know, newsletters, maybe you wanna read that out. They target audio, uh, more generally, like podcasts and film as well. So they really are trying to target a whole spectrum of industries there with this technology.
Jordan Heath Rawlings
If you can, without getting way too technical and above my head, how does this actually work? What is this program doing? Or is that actually kind of a secret and we don’t really know?
Joseph Cox
I mean, very generally speaking, these systems will, you know, use some combination of machine learning or artificial intelligence, in the same way ChatGPT does, or the image generating ones, where they have this massive corpus of material, right? They have a massive data bank of stuff that they’ve pulled from, be they images or text, or in this case audio. And they’re finding a way to be able to replicate, uh, the voices there. And of course, as we said, they really vary in quality. But ElevenLabs here, there’s just something else about it. It is uncanny in a lot of ways, and I honestly can’t tell you exactly how they do it, especially how they do it so quickly, as in near instantaneously, but they’ve clearly got something going on there, which is attracting all sorts of people to that platform.
Jordan Heath Rawlings
And I was gonna ask you, you know, about the use cases, uh, for this kind of thing, the legal use cases, but I think you’ve laid those out pretty well. So, what you set out to do. In fact, why don’t you just introduce us to the whole scheme you, uh, you tried to run here? What did you set out to do and how did you go about it?
Joseph Cox
Sure. So I saw a lot of people using ElevenLabs, and it made me realize, wait a second. My bank in the UK, Lloyds Bank, uses this sort of voice verification thing where, when I phone up and I wanna check my balance or something, it will say, “Okay, please say ‘my voice is my password,’” and it will recognize it’s me and it will let me in. So I thought, well, can I fool that with AI? Can I use ElevenLabs or some other company to generate a synthetic version of my own voice and use that to break in? And I’ve actually been trying this for several weeks, not just because, you know, the system failed, and it did a couple of times before I eventually got in, but, as I mentioned, mostly because of my British accent and very few of these companies facilitating that. I was given a quote by one company to pay something like 2,500 US dollars just to turn the British setting on. Uh, and I was like, I’m okay. I’m not going to pay that. I don’t think Vice would cover that for this article. So I went about continuing, and eventually ElevenLabs came up. So I started that, and what it involved was you log into the website very easily, and I recorded just five minutes of audio.
Uh, I think you can do it straight into the site, or, as I actually did, record some high quality audio and upload it, and then instantly my voice was ready to use. So here I am, ready to phone up Lloyds Bank and try out this voice. It didn’t work immediately. I had to sort of poke and prod the AI at ElevenLabs a little bit to make the cadence right. The cadence of “my voice is my password” actually took a little bit of work to get going, and the trick I found was, well, if I get the AI to say a long paragraph or two of text, it somehow makes the cadence a lot more natural. So I did that. I then opened up some audio software, trimmed out the clip I wanted, and eventually played that. And then what everybody sees in the video that we uploaded in our article is that, yes, this AI successfully tricked the security system at my bank into letting me in, even though it wasn’t really me, of course.
Jordan Heath Rawlings
So there are some voice verification programs that can, you know, give you a balance or give you a statement or let you access stuff. To be clear, what kind of access did you have using this artificial intelligence? Could you get into your bank? Could you have, say, sent me a thousand bucks?
Joseph Cox
It really varies from bank to bank, so I can only really talk about Lloyds specifically, but what I definitely could do was check a balance and check recent transactions. And then for transfers, I believe in the case of Lloyds, you may need another piece of verification, as in it may actually connect you on the line to a bank representative, but I would say that only applies to Lloyds. You know, I couldn’t say if that applies to all the different banks, apart from the fact that more and more banks are using this across the US, across Europe and the UK. This is an increasingly popular, uh, method for verification and authentication from banks.
Jordan Heath Rawlings
Is it just banks? How common is this technology? And uh, it’s kind of weird to me that, you know, we’re describing how quickly AI is outpacing it, and yet you’re talking about it as though it’s a verification form still on the upswing.
Joseph Cox
Yeah, I mean, you could even go back to things like Siri, you know, on iPhone, and I know that’s not necessarily a security tool, but it’s something where you are speaking to your phone to change settings or to search the internet or search YouTube or whatever. In some limited circumstances, Siri can actually be used by police forces or other sorts of third parties to extract information from a phone. You know, maybe the phone is locked, but they can turn Siri on. They can request information that way. So there is sort of a whole other avenue for these artificial intelligence voices there, beyond banks. I mean, it’s still the financial industry, but you’ll have places where you can trade stocks and that sort of thing. And if you can get into an account with an AI-generated voice, into, I don’t know, Fidelity or Vanguard or whatever, I mean, potentially, uh, at bottom, you could at least find some very interesting information on the target if you’re a hacker, you know, potentially seeing how much money they have in there.
Jordan Heath Rawlings
What is the selling pitch for this kind of security? Given that, you know, you’re already kind of breaking it, why would they use voice verification theoretically?
Joseph Cox
Yeah. I mean, in 99.9% or whatever of cases, it is absolutely going to be more convenient for ordinary consumers. You phone up the bank and, rather than fumbling over a PIN they wrote down somewhere and can’t remember, or another passcode, or their Social Security number or their date of birth or whatever, it is undoubtedly easier just to ring up and say, “My voice is my password,” and then it’ll let you in. And arguably, and this is what Lloyds Bank told me, it is a lot more secure for most people. You know, phishing, where a hacker will send a text message or a call or an email and try to extract login information from you, is simply an everyday reality of the internet and the World Wide Web that we have today. So why not make something that can’t be phished, theoretically? And that’s your voice. Now, those systems were made, you know, a few years ago. I think that’s when Lloyds introduced it for me anyway, so they’ve been designed for a long time. They were clearly designed and implemented in an age where you couldn’t easily clone somebody’s voice. And as I said, in the past few months, that has now completely flipped. That’s just not the reality anymore.
Jordan Heath-Rawlings
Did you reach out to Lloyds after you got into your account this way, and did you reach out to, you know, any of the other banks, uh, or places that we’ve talked about that use this kind of stuff? And what did they say when you’re like, “Hey, your few-years-old technology is already vulnerable”?
Joseph Cox
Yeah, so for Lloyds, I sent them the video itself. You know, I wanted to give them a very fair chance to comment and not keep it vague. I wanted to show them, hey, look, this is literally me going into my bank account on your platform. Uh, and they were actually very helpful in that, you know, they provided me a long statement and some additional information, and they said that, you know, voice verification is great for protecting against phishing. They said they’ve seen a significant drop in fraud since the introduction of this. But with that being said, they are aware of artificial intelligence powered voices, and they are paying attention to it, not just themselves, but technology partners as well, and I believe across the industry. You know, so even though it may be, right at this moment, a sort of fringe issue where you could only potentially be targeted if there’s tons of your voice out somewhere on the internet, such as a YouTuber, a podcaster, or a streamer, and for the majority of people it will still be secure, you absolutely cannot ignore this threat, especially if you’re a bank, obviously in a highly regulated industry.
Jordan Heath-Rawlings
That was kind of gonna be my next question, which is, you know, you did this, uh, with your own bank account and your own voice, and, you know, presumably you already know all your personal information and whatever else they could ask you. How easy would it be for you, as somebody who does have familiarity with hacking and is used to doing this kind of stuff, to do this to somebody who wasn’t you? How would you do it to me?
Joseph Cox
So there will be two things, at least with Lloyd’s when you first phone up, it does ask for your date of birth, which is sort of the first little piece of information. Now you may think, oh, well that’s the end. You’re never gonna get that. No. The date of birth is very easy to get. You know, maybe it’s in your credit report that you can just buy online. There are plenty of data breaches. Maybe it’s in there.
Jordan Heath-Rawlings
I probably posted a meme that, uh, told you, you know, if I was born in this month and this year and this day, then my porn star name is X, Y, z. You know the people that do those things.
Joseph Cox
Exactly. Or not even that. Somebody can just, all their friends are telling them happy birthday on Facebook. It’s really not difficult to figure out the day. You just have to guess the year.
Jordan Heath-Rawlings
Right, right. Okay, so you got a birthday. What else do you need?
Joseph Cox
Right, exactly. Well, so I have that first piece of authentication, whatever it is, and now it’s about getting the voice. It’s about building this synthetic voice.
Jordan Heath-Rawlings
Oh, crap.
Joseph Cox
Yes. The way you’re gonna do that is obviously going to your previous recordings of the podcast and simply just ripping them out, trimming them, and getting five minutes of audio and uploading it to one of these services. It would almost be trivial. It would be trivial to do this to specifically a podcast host who is constantly recording high quality audio. I do think a YouTuber would be similar. A Twitch streamer, uh, a politician, somebody who gives speeches, somebody who appears on TV a lot. You really do not need a lot of source material to build this sort of thing. So those are the sorts of people I think it would be pretty damn easy to make a voice clone of. Now, whether you could use that voice clone specifically to break into a bank, you know, that’s gonna take a little bit of, um, uh, poking and prodding, but you could absolutely make a voice. Yeah.
Jordan Heath-Rawlings
What would you do if they weren’t a podcast host or a public figure? I guess, do you call them up and get them talking the same way? I guess that’s phishing, uh, just a new version of phishing.
Joseph Cox
Yeah, exactly. I mean, it would be a little bit labor intensive, but we already have, you know, call center scams, right? We already have scams where you phone up a family member or a friend or whatever, and they’re saying, oh, we saw your account was getting hacked, you need to give us your code, whatever. Just get them to talk for five minutes. So now, to be clear, I haven’t seen any evidence of that actually happening, but I could certainly at least see somebody giving it a go, because if you need only a few minutes, then just get them on the phone for that amount of time.
Jordan Heath-Rawlings
Did you talk to online security companies or experts, um, who sort of fight back against this stuff, and show them what you’ve done? And what did they say about it?
Joseph Cox
Yeah, I spoke to the CEO of, uh, one cybersecurity company. They’re focused on social engineering, you know, that is tricking the human element in security rather than the software or the hardware. Uh, and of course, at bottom, this is almost, uh, more social engineering than technical, you know, because we are talking about the clone of a human here. So I spoke to them, and they basically thought that this sort of voice authentication should not be used by banks, and they recommended they switch to something else.
Um, multifactor authentication. Now, you know, plenty of people will know that as the text message you get or the code you have in the little app, or maybe you even have something like a security key that you plug via USB into your laptop. They recommended that. I would push back against that just slightly, in that when we are talking about banks, these banks operate at such a massive scale that you cannot ignore just how big they are, right? So if voice verification is good for 99% of people, even 90% of people, that is, you know, millions, tens of millions, hundreds of millions, whatever, of people. And, you know, if you switch all of them to text verification, well, then you have the issues of text verification as well. If you switch them all to an app, there’ll be issues there. So I think it’s a difficult decision, but now the banks still need to factor the existence of AI voices into that decision.
Jordan Heath Rawlings
When you look at the security technology and then the advancements in AI that allowed you to do this, is there any way institutions like banks are going to be able to outpace, uh, the hackers out there as these technologies become more and more widespread and available to anybody who wants to sign up for a beta, right?
Joseph Cox
I think it’s gonna get more difficult for sure. Lloyds, they say that, you know, they analyze a hundred different characteristics of your voice, including, you know, how wide you open your mouth and all this other stuff that’s imperceptible to the human ear, apparently. Well, I had a recording of my synthetic voice and that was still good enough to get in eventually. So, you know, I dunno what those hundred characteristics are exactly, but maybe they need to do 200 or 300 or something like that. But it is going to get harder, as you say, because of not only the quality of these AI voices now, but the crazy availability of them. It’s one thing to have a technology, it’s another to put it in the hands of basically anybody, including the sorts of hackers who may not be super sophisticated, but they don’t really care and they’ll just keep trying and trying and trying until they get in. They’re almost, well, they’re sometimes the ones you have to worry about more. The ones who don’t care and they’ll just keep trying.
Jordan Heath Rawlings
One last question. What’s your advice for regular people who aren’t interested, perhaps in the ramifications of any of this, but just wanna be safe online? How does this change their approach?
Joseph Cox
I would check whether your bank offers this sort of service, and I would check whether it is opt-out or opt-in. You know, maybe it would actually be very convenient for you and you don’t want to have to worry about remembering a password, that sort of thing. It’s probably gonna be a good call for you, but before you do that, just check: how do I turn it on? How do I turn it off if I don’t want it? And that’s usually a Google search away and available on the bank’s website. So I would just do that quick Google search, it’ll take you 30 to 60 seconds. And then if you want to turn it on, do it, and then you don’t need to worry about it. But I would just take half a minute to think about whether you want this or not.
Jordan Heath Rawlings
Joseph, thank you so much for this.
Joseph Cox
Thank you. I appreciate it.
Jordan Heath Rawlings
Joseph Cox, writing for Motherboard. That was The Big Story. For more, you can head to thebigstorypodcast.ca. We’ve done lots of episodes on AI. We could do an episode about a scary AI thing literally every day for a month. In fact, how do you know that I’m not an AI talking to you right now after being fed 500 episodes of The Big Story podcast? I promise I’m not. Jordan is still employed. You can talk to us anytime. The real us, I promise. There are humans looking at these things on Twitter at @TheBigStoryFPN. You can write to us via email at hello@thebigstorypodcast.ca, and you can call us, use AI to mimic your voice, and leave us a voicemail: 416-935-5935. Thanks for listening. I’m Jordan Heath-Rawlings. Really. I am. I swear. We’ll talk on Monday.