Phishing Tests Are Getting Downright Mean
WSJ Tech News Briefing | February 10, 2025 | 00:11:55

Phishing scams are growing increasingly sophisticated, and IT departments seeking to help people outfox them are throwing sensational test traps at their employees and students. But as WSJ reporter Robert McMillan says, some of those who fall for them say the tests have gone too far (“free football tickets, anyone?”). Plus, AI reporter Belle Lin on how big tech wants to solve AI’s hallucinations using hard math. Sign up for the WSJ's free Technology newsletter. Learn more about your ad choices. Visit megaphone.fm/adchoices


[00:00:00] Sure, we can multitask. But when it really matters, we like to focus on one thing. That's now possible with the new Samsung Galaxy S25 Ultra. Click the banner and discover your personal AI companion. Activate Google Gemini and simply ask the AI, for example, for restaurant options and share them with your contacts. That sounds like this: Hey, find me an Indian restaurant nearby and send it to Luca. Find out what else the Galaxy S25 Ultra can do at samsung.de.

[00:00:33] Welcome to Tech News Briefing. It's Monday, February 10th. I'm Pierre Bien-Aimé for The Wall Street Journal. How useful can artificial intelligence really be if it sometimes makes stuff up? Amazon is turning to so-called automated reasoning to cut down on hallucinations. And IT departments routinely try to fool their employees in order to get them to recognize hackers' phishing attempts. And some say they've gone a little too far.

[00:01:02] Hackers who engage in phishing, sending deceptive emails aimed at stealing sensitive information, are cooking up some increasingly sophisticated scams. As a result, IT departments at companies and universities are throwing sensational tests at their employees and students. The idea is, if you opened this email and clicked on the link, you've failed the test. And failure comes at a cost. Phishing, spelled with a PH, was the first step in about 14% of cyberattacks last year.

[00:01:31] That's according to an analysis of data breaches done by Verizon. Bob McMillan writes about computer security for The Wall Street Journal, and he reported on the value of these test traps. So Bob, how does phishing typically work? Well, they try to play in your mind. They try to get you in some kind of panic mode. So usually what happens with these phishing emails is there's some very, very important piece of information they promise. Like, your vacation days are being cut.

[00:02:00] And you're like, what? My vacation days are being cut? You click on the link. Then you have to log in. You don't even realize you're not on your corporate website. You're on some fake website the hackers set up. So you're giving them information right there. What are some of the examples of these kind of phishing traps set up by IT departments to lure people and maybe trick them into falling for it? There was one email. It was about a lost puppy dog in a parking lot.

[00:02:26] There was a guy who sent an email to NASA staffers promising, you know, a chance to win a ticket to see the final space shuttle launch. And he apparently made a staffer there cry when she realized it was a fake. The craziest example that I heard of was the University of California, Santa Cruz, which last summer sent a phishing email test themed Ebola outbreak at, you know, on campus.

[00:02:54] And it basically sent some people into a panic there thinking that there was a case of Ebola on campus. There wasn't. They were just trying to do a phishing education test. What are IT departments learning as far as the most effective way to spread awareness and boost resilience against phishing, which is the whole idea? I did interview a guy from Google who was talking about some other approaches to curbing this. Like education is not a bad thing.

It's the idea of embarrassing people and putting them in an adversarial position and then going like, now listen to me. There are other ways of doing education, having phishing awareness months and fun, less shameful kinds of ways of teaching people to report phishing emails and to spot them. There's some research from the University of California, San Diego that basically looked at a variety of phishing email tests and then educational responses to them.

And they found that basically the sort of classical approach to doing this yields negligible results. At best, they found sort of a 2% improvement in the likelihood of the targets to avoid phishing emails in the future. Now, the phishing attempts of yore look pretty ridiculous. Things like free prizes, "I LOVE YOU" in all caps, or, you know, the notorious email from a Nigerian prince.

[00:04:13] But what do really advanced phishing attempts look like now? The hackers are getting very clever. They know how corporations work and they know which kind of emails are high priority. They know like an email from the CEO demanding some kind of immediate response about corporate facts or what's going on with this pitch or something like that. They know those kinds of emails are very successful.

[00:04:36] At the open enrollment time of year, you know, when you're re-upping your medical, they know that an email with that theme that's sent around November, you know, gets a very high response rate. The problem is these phishing attacks, they lead to ransomware. They lead to, like, catastrophic consequences for some corporations, for some hospitals. And there's a sense of urgency around stopping them from working. That was WSJ reporter Bob McMillan.

[00:05:03] Coming up, AI bots occasionally say the darndest things, giving flat-out wrong answers. We hear about the obscure field of research that could help solve that problem. That's after the break. Artificial intelligence is known to sometimes make up answers and to share these so-called hallucinations with confidence.

[00:05:32] Now, Amazon's cloud computing unit, Amazon Web Services, is looking to automated reasoning for hard mathematical proof that these errors can be stopped, at least in certain areas. Some analysts say that success could mean millions of dollars' worth of AI deals with businesses. Belle Lin writes about AI and enterprise technology for The Wall Street Journal, and she joins me now. Okay, so, Belle, how does this automated reasoning work, this mathematical concept that Amazon is turning to to solve hallucinations, in part?

[00:06:00] Automated reasoning is actually a branch of AI. So, in some ways, you can think of it as using AI and math to sort of fight back against a different form of AI's hallucinations or propensity to spit back this inaccurate data. And automated reasoning is really using computers to automate the mathematical logic behind putting rules into AI and sort of hard-coding it.

[00:06:23] So, machine learning differs from automated reasoning in that it basically hoovers up a bunch of data, and that can be structured or unstructured data. It can be words or text. It could be numbers. And it teaches machines or computers how to capture patterns from that data. So, how to separate a dog from a cat, how to identify a number from a letter.

And so, that's how the machine captures or gets its intelligence, whereas in automated reasoning, you're sort of hard-coding a set of rules and logic into a system. So, the AI is able to check itself for errors in a way? Yeah, that's right. So, similar to the way that we've heard that some large-language models can reason through problems, is the system working the way that it's intended? Is the model spitting out an answer that's accurate based on a predefined set of rules?

[00:07:16] And those rules for a company can be a set of internal company guidelines for employees, or it could be a product guidebook for customers to know what sorts of services and products you have in your catalog. Okay. So, speaking of customers, has Amazon had much success taking this approach to market? It's relatively new, and it's something that they call in preview. So, they're certainly testing it and hoping that it really unlocks a lot of business deals for them.
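The rule-checking idea described above can be sketched in a few lines of Python. This is a rough illustration only, not AWS's actual system: the rules and field names here (vacation days, an enrollment deadline) are hypothetical stand-ins for the kind of company-guideline rules mentioned in the conversation. Each rule returns True (passes), False (fails), or None (can't evaluate the answer), and the checker rolls those up into a verdict.

```python
# Minimal sketch of checking a model's answer against hard-coded rules,
# in the spirit of automated reasoning. Hypothetical rules, not a real API.

def check_answer(answer: dict, rules: list) -> str:
    """Return 'valid' if every rule passes, 'invalid' if any rule fails,
    and 'undecidable' if some rule cannot evaluate the answer."""
    saw_unknown = False
    for rule in rules:
        verdict = rule(answer)  # each rule returns True, False, or None
        if verdict is False:
            return "invalid"
        if verdict is None:
            saw_unknown = True
    return "undecidable" if saw_unknown else "valid"

# Example rules drawn from a hypothetical internal benefits guidebook.
rules = [
    # Policy: employees get at least 20 vacation days (skip if not mentioned).
    lambda a: a.get("vacation_days") is None or a["vacation_days"] >= 20,
    # Policy: open enrollment closes November 30 (None if answer is silent).
    lambda a: None if "enrollment_deadline" not in a
              else a["enrollment_deadline"] == "November 30",
]

print(check_answer({"vacation_days": 25, "enrollment_deadline": "November 30"}, rules))  # valid
print(check_answer({"vacation_days": 10}, rules))  # invalid
```

The point of the sketch is the contrast with machine learning: nothing here is learned from data, so a failed check is a hard guarantee that the answer violated a stated rule.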

Because right now, hallucinations are a big blocker, not just for consumers like you and me to use chatbots more fully in our daily lives, but for businesses who need that reliability when they're doing things like creating advertisements for pharmaceuticals and they can't run afoul of regulations or, at worst, promote something that is completely inaccurate. So, for instance, PricewaterhouseCoopers, the big audit, accounting and tax firm, is actually a customer of Amazon's.

[00:08:11] And so, that's important because, on one hand, the AI may be trying to help PricewaterhouseCoopers achieve the goal of creating really good advertising, but that runs up against the goal of ensuring that the regulations are met for how these drugs are marketed. And so, you need automated reasoning to come in and say, yes, we are adhering to the regulations or no, we're not adhering to the regulations. What are experts saying about automated reasoning? Is it something that's really going to put an end to hallucinations? Oh, absolutely not.

[00:08:41] When you pose this question to an automated reasoning system, the AWS scientist who told me about how they're using these systems said the answer is, quote, undecidable. And so, that's a really interesting answer that I interpreted as no, because when automated reasoning can't tell you something with accuracy 100% of the time, that means it's probably a no because it can't say that it's 100% a yes. Are there some other possible solutions?

[00:09:09] The solution that Amazon is pushing and its competitors like Microsoft and Google also have something similar for reducing chatbot hallucinations. And so, they're saying that maybe the hallucinations can be mitigated or chatbots might be taught to say, I don't know, rather than eliminating them altogether.
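The "taught to say, I don't know" idea can be sketched as a simple confidence-threshold wrapper. This is a hypothetical illustration, not any vendor's actual mechanism: the candidate answers and confidence scores are assumed inputs, and the threshold value is arbitrary.

```python
# Minimal sketch: abstain ("I don't know") when the model's best candidate
# answer falls below a confidence threshold. Hypothetical scores, not a real API.

def answer_or_abstain(candidates: list, threshold: float = 0.8) -> str:
    """candidates: (answer, confidence) pairs from a model.
    Return the top-scoring answer only if it clears the threshold."""
    if not candidates:
        return "I don't know"
    best_answer, best_score = max(candidates, key=lambda c: c[1])
    return best_answer if best_score >= threshold else "I don't know"

print(answer_or_abstain([("Paris", 0.95), ("Lyon", 0.03)]))  # Paris
print(answer_or_abstain([("Paris", 0.40), ("Lyon", 0.35)]))  # I don't know
```

Abstaining like this doesn't eliminate hallucinations; it just trades some coverage for fewer confidently wrong answers, which matches the mitigation framing in the conversation.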

[00:09:26] There are actually really great uses for hallucinations in creative sectors and fields where you want a really wacky image because you are a painter or you want some out-of-the-box song lyric because you're a lyricist. So, it's by design that these chatbots hallucinate, but we really do want them to not hallucinate in circumstances where it really, really matters. But until then, it seems like you'll always maybe want a human in the chain to check that things are right. Yeah, that's right.

[00:09:54] None of the big tech companies are saying that humans should be out of the loop altogether, that automated reasoning and other methods like retrieval augmented generation should supplant the need for a human to basically check the output of a chatbot. Or for a doctor to check the output of a medical question you input into the system. PricewaterhouseCoopers still has their legal team review the advertisements, for instance. So, the chatbot and automated reasoning forms the first layer of checking. That was our reporter, Belle Lin.

[00:10:23] And that's it for Tech News Briefing. Today's show was produced by Julie Chang with supervising producer Catherine Millsop. I'm Pierre Bien-Aimé for The Wall Street Journal. We'll be back this afternoon with TNB Tech Minute. Thanks for listening.