Why AI is incredibly smart -- and shockingly stupid | Yejin Choi
TED Tech | March 22, 2024 | 18:13 | 16.69 MB

Computer scientist Yejin Choi is here to demystify the current state of massive artificial intelligence systems like ChatGPT, highlighting three key problems with cutting-edge large language models (including some funny instances of them failing at basic commonsense reasoning). She welcomes us into a new era in which AI is becoming almost like a new intellectual species -- and identifies the benefits of building smaller AI systems trained on human norms and values. (Followed by a Q&A with head of TED Chris Anderson.)

Learn more about our flagship conference happening this April at attend.ted.com/podcast


Hosted on Acast. See acast.com/privacy for more information.


[00:00:00] TED Audio Collective. With recent advancements like the release of ChatGPT,

[00:00:18] it's clear that the AI takeover is coming for us all. In April 2023, the news website

[00:00:26] Business Insider compiled a list of jobs most likely to be replaced by artificial intelligence.

[00:00:33] The very first entry was tech workers, like software engineers, followed by media workers,

[00:00:39] paralegal assistants; even teachers were on the list. Experts have warned about these massive shifts

[00:00:47] as computing potential grows in leaps and bounds. But can this technology actually replace

[00:00:53] human thinking? And how worried should we be, really? I'm Sherrell Dorsey, and this is TED Tech.

[00:01:02] With all of this talk of intelligence and language models, making many human tasks obsolete,

[00:01:09] we haven't stopped to ask if the AI tools we're concerned about actually have common sense.

[00:01:15] In today's TED Talk, computer scientist Yejin Choi points out AI shortcomings

[00:01:22] and suggests smaller ways to train AI to make choices like an actual human would. Let's listen in.

[00:01:43] Tired of unnecessary payroll errors? Stop them in their tracks. With Paycom, employees do their own

[00:01:49] payroll. They're able to identify errors and fix them before submission right in the app,

[00:01:54] because no one can afford for payroll to be wrong. Not HR and payroll teams, not leaders,

[00:02:01] and definitely not employees. Shorted paychecks, time sheet corrections, unintended sick days,

[00:02:07] missing overtime hours and expense mistakes are, well, unnecessary for everyone.

[00:02:13] Manage the process to make payday right with Paycom. Learn more at paycom.com/soundrise.

[00:02:20] That's paycom.com/soundrise.

[00:02:25] This episode is brought to you by Progressive Insurance. What if comparing car insurance rates was

[00:02:29] as easy as putting on your favorite podcast? With Progressive, it is. Just visit the Progressive

[00:02:34] website to quote with all the coverages you want. You'll see Progressive's direct rate,

[00:02:38] then their tool will provide options from other companies so you can compare. All you need to do

[00:02:42] is choose the rate and coverage you like. Quote today at Progressive.com to join the over 28

[00:02:48] million drivers who trust Progressive. Progressive Casualty Insurance Company and affiliates.

[00:02:52] Comparison rates not available in all states or situations; prices vary based on how you buy.

[00:03:13] Hey everyone, it's Adam Grant. I host a podcast called ReThinking, about the science of what

[00:03:34] makes us tick. This season we're talking to Black Eyed Peas frontman and tech entrepreneur

[00:03:39] will.i.am about the future of AI and being a multifaceted thinker. I got to find my way to that stage,

[00:03:46] and my whole journey from a teenager was like, where's the stage at? Where's the mic?

[00:03:50] Find and follow ReThinking with Adam Grant wherever you're listening.

[00:03:58] So I'm excited to share a few spicy thoughts on artificial intelligence.

[00:04:05] But first, let's get philosophical, by starting with this quote by Voltaire, an

[00:04:11] 18th-century Enlightenment philosopher, who said: common sense is not so common. Turns out this quote couldn't

[00:04:18] be more relevant to artificial intelligence today. Despite the fact that AI is an undeniably powerful tool,

[00:04:25] beating the world-class Go champion, acing college admission tests and even passing the bar exam.

[00:04:32] I'm a computer scientist of 20 years and I work on artificial intelligence. I am here to

[00:04:38] demystify AI. So AI today is like a Goliath. It is literally very, very large. It is speculated

[00:04:49] that the recent ones are trained on tens of thousands of GPUs and a trillion words.

[00:04:56] Such extreme-scale AI models, often referred to as large language models, appear to demonstrate

[00:05:04] sparks of AGI, artificial general intelligence. Except when it makes small, silly mistakes,

[00:05:12] which it often does. Many believe that whatever mistakes AI makes today can be easily fixed with

[00:05:19] brute force, bigger scale and more resources. What possibly could go wrong?

[00:05:26] So there are three immediate challenges we face already at the societal level. First,

[00:05:33] extreme-scale AI models are so expensive to train that only a few tech companies can afford to do so.

[00:05:42] So we already see the concentration of power. But what's worse for AI safety? We are now at the mercy

[00:05:51] of those few tech companies because researchers in the larger community do not have the means to

[00:05:59] truly inspect and dissect these models. And let's not forget their massive carbon footprint and

[00:06:07] environmental impact. And then there are these additional intellectual questions. Can AI

[00:06:13] without robust common sense be truly safe for humanity? And is brute-force scale really the only

[00:06:22] way, and even the correct way, to teach AI? So I'm often asked these days whether it's even

[00:06:29] feasible to do any meaningful research without extreme scale compute. And I work at a university

[00:06:34] and a nonprofit research institute, so I cannot afford a massive GPU farm to create enormous

[00:06:41] language models. Nevertheless, I believe that there's so much we need to do and can do

[00:06:48] to make AI sustainable and humanistic. We need to make AI smaller to democratize it and we need

[00:06:56] to make AI safer by teaching human norms and values. Perhaps we can draw an analogy from

[00:07:04] David and Goliath: here, Goliath being the extreme-scale language models, and seek inspiration from

[00:07:12] an all-time classic, The Art of War, which tells us, in my interpretation: know your enemy,

[00:07:19] choose your battles, and innovate your weapons. Let's start with the first, know your enemy, which means

[00:07:26] we need to evaluate AI with scrutiny. AI is passing the bar exam. Does that mean that AI is

[00:07:34] robust at common sense? You might assume so but you never know. So suppose I left five

[00:07:40] clothes to dry out in the sun and it took them five hours to dry completely. How long would it take

[00:07:47] to dry 30 clothes? GPT-4, the newest, greatest AI system, says 30 hours. Not good. A different one.

[00:07:56] I have a 12-liter jug and a six-liter jug, and I want to measure six liters. How do I do it? Just use

[00:08:02] the six-liter jug, right? GPT-4 spits out some very elaborate nonsense. Step one, fill the six-liter

[00:08:10] jug. Step two, pour the water from the six-liter to the 12-liter jug. Step three, fill the six-liter jug again.

[00:08:17] Step four, very carefully, pour the water from the six-liter to the 12-liter jug. And finally you have six

[00:08:24] liters of water in the six-liter jug that should be empty by now. Okay, one more. Would I get a

[00:08:32] flat tire by bicycling over a bridge that is suspended over nails, screws and broken glass? Yes,

[00:08:40] highly likely, GPT-4 says, presumably because it cannot correctly reason that if a bridge is

[00:08:47] suspended over the nails and broken glass, then the surface of the bridge doesn't touch

[00:08:52] these sharp objects directly. Okay, so how would you feel about an AI lawyer that aces the bar

[00:08:59] exam yet randomly fails at such basic common sense? AI today is unbelievably intelligent, and then

[00:09:09] shockingly stupid. It is an unavoidable side effect of teaching AI through brute-force scale.

[00:09:18] Some scale optimists might say, don't worry about this, all of this can be easily fixed by adding

[00:09:24] similar examples as yet more training data for AI. But the real question is this, why should we

[00:09:32] even do that? You are able to get the correct answers right away without having to train yourself

[00:09:37] with similar examples. Children do not even read a trillion words to acquire such a basic level

[00:09:45] of common sense. So this observation leads us to the next wisdom, choose your battles. So what

[00:09:53] fundamental questions should we ask right now and tackle today in order to overcome this status

[00:10:00] quo with extreme-scale AI? I say common sense is among the top priorities. So common sense has

[00:10:08] been a long-standing challenge in AI. To explain why, let me draw an analogy to dark matter.

[00:10:15] So only 5% of the universe is normal matter that you can see and interact with. And the remaining

[00:10:22] 95% is dark matter and dark energy. Dark matter is completely invisible but scientists

[00:10:28] speculate that it's there because it influences the visible world, even including the trajectory

[00:10:34] of light. So for language, the normal matter is the visible text and the dark matter is the

[00:10:40] unspoken rules about how the world works, including naive physics and folk psychology, which

[00:10:47] influence the way people use and interpret language. So why is this common sense even important?

[00:10:54] Well in a famous thought experiment proposed by Nick Bostrom, AI was asked to produce and maximize

[00:11:03] paper clips. And that AI decided to kill humans, to utilize them as additional resources,

[00:11:11] to turn you into paper clips because AI didn't have the basic human understanding about human values.

[00:11:20] Now writing a better objective and equation that explicitly states, do not kill humans, will not

[00:11:27] work either, because AI might go ahead and kill all the trees, thinking that's a perfectly okay

[00:11:33] thing to do. And in fact there are endless other things that AI obviously shouldn't do while

[00:11:38] maximizing paper clips, including don't spread fake news, don't steal, don't lie, which are

[00:11:43] all part of our common sense understanding about how the world works. However, the AI field for

[00:11:50] decades has considered common sense as a nearly impossible challenge. So much so that when

[00:11:57] my students and colleagues and I started working on this several years ago we were very much discouraged.

[00:12:03] We were even told that it's a research topic of the '70s and '80s, that we shouldn't work on it because it will never work.

[00:12:08] In fact, don't even say the word, to be taken seriously. Now fast forward to this year, I'm hearing

[00:12:15] don't work on it because ChatGPT has almost solved it; just scale things up and magic will

[00:12:20] arise, and nothing else matters. So my position is that giving true, robust, human-like

[00:12:27] common sense to AI is still a moonshot, and you don't reach the moon by making the tallest building

[00:12:33] in the world one inch taller at a time. Extreme-scale AI models do acquire an ever-

[00:12:38] increasing amount of common sense knowledge, I'll give you that, but, remember, they still stumble

[00:12:44] on such trivial problems that even children can do. So AI today is awfully inefficient and what if

[00:12:53] there's an alternative path? A path yet to be found, a path that can build on the advancements of

[00:13:00] deep neural networks but without going so extreme with the scale. So this leads us to our final

[00:13:07] wisdom: innovate your weapons. In the modern-day AI context, that means innovate your data and

[00:13:12] algorithms. Okay, so there are, roughly speaking, three types of data that modern AI is trained on:

[00:13:18] raw web data, crafted examples custom-developed for AI training, and then human judgments, also known

[00:13:27] as human feedback on AI performance. If the AI is only trained on the first type, raw web data, which

[00:13:34] is freely available it's not good because this data is loaded with racism and sexism and misinformation

[00:13:41] so no matter how much of it you use, garbage in and garbage out. So the newest, greatest AI systems

[00:13:49] are now powered with the second and third types of data that are crafted and judged by human workers.

[00:13:56] It's analogous to writing specialized text books for AI to study from and then hiring human tutors

[00:14:03] to give constant feedback to AI. These are proprietary data, by and large, speculated to cost

[00:14:10] tens of millions of dollars. We don't know what's in this, but it should be open and publicly available

[00:14:16] so that we can inspect and ensure it supports diverse norms and values. So for this reason, my

[00:14:22] team at UW and AI2 has been working on common sense knowledge graphs as well as moral norm repositories to teach AI

[00:14:30] basic common sense norms and morals. Our data is fully open so that anybody can inspect the content

[00:14:36] and make corrections as needed because transparency is the key for such an important research topic.

[00:14:42] Now let's think about learning algorithms. No matter how amazing large language models are

[00:14:50] by design, they may not be the best suited to serve as reliable knowledge models.

[00:14:56] And these language models do acquire a vast amount of knowledge, but they do so as a

[00:15:02] by-product, as opposed to a direct learning objective, resulting in unwanted side effects such as

[00:15:09] hallucination effects and lack of common sense. Now in contrast, human learning is never about

[00:15:15] predicting which word comes next but it's really about making sense of the world and learning how

[00:15:20] the world works maybe AI should be taught that way as well. So as a quest toward more direct

[00:15:28] common sense knowledge acquisition, my team has been investigating potential new algorithms,

[00:15:34] including symbolic knowledge distillation, that can take a very large language model and crunch

[00:15:41] that down to much smaller common sense models using deep neural networks. And in doing so, we also

[00:15:48] generate, algorithmically, human-inspectable symbolic common sense knowledge representations so that

[00:15:56] people can inspect and make corrections and even use it to train other neural common sense models.
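The symbolic knowledge distillation idea can be caricatured as a generate-then-filter loop. The sketch below is a toy under loudly stated assumptions: `teacher_generate`, `critic_score`, and the example statements are all invented stand-ins for a large language model, a trained plausibility filter, and real model generations; it is not the actual research pipeline.

```python
# Toy sketch of symbolic knowledge distillation (all names and data invented):
# a large "teacher" model generates candidate commonsense statements, a "critic"
# filters out implausible ones, and the survivors form a small, human-inspectable
# symbolic knowledge base that a smaller student model can be trained on.

def teacher_generate() -> list:
    # Stand-in for sampling statements from a large language model.
    return [
        "If you drop a glass, it may break.",
        "A bridge suspended over nails touches the nails.",  # implausible
        "Wet clothes dry faster in the sun.",
    ]

def critic_score(statement: str) -> float:
    # Stand-in for a learned plausibility filter; here, a hand-written lookup.
    scores = {
        "If you drop a glass, it may break.": 0.95,
        "A bridge suspended over nails touches the nails.": 0.10,
        "Wet clothes dry faster in the sun.": 0.90,
    }
    return scores[statement]

def distill(threshold: float = 0.5) -> list:
    # Keep only statements the critic judges plausible: the distilled knowledge base.
    return [s for s in teacher_generate() if critic_score(s) >= threshold]

print(distill())  # the two plausible statements survive; the bridge claim is dropped
```

Because the surviving statements are plain text rather than opaque weights, anyone can inspect or correct them, which is the transparency property the talk emphasizes.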

[00:16:02] More broadly, we have been tackling this seemingly impossible giant puzzle of common sense, ranging

[00:16:09] from physical, social and visual common sense to theory of minds, norms and morals. Each individual

[00:16:16] piece may seem quirky and incomplete, but when you step back, it's almost as if these pieces

[00:16:23] weave together into a tapestry that we call human experience and common sense. We're now entering

[00:16:30] a new era in which AI is almost like a new intellectual species with unique strengths and weaknesses

[00:16:39] compared to humans. In order to make this powerful AI sustainable and humanistic we need to teach AI

[00:16:49] common sense norms and values. Thank you. This is so interesting, this idea of common sense. We

[00:16:56] obviously all really want this from whatever's coming. But help me understand. We've

[00:17:02] had this model of a child learning. How does a child gain common sense, apart from the accumulation

[00:17:11] of more input and some human feedback? What else is there? So fundamentally, there are several

[00:17:19] things missing, but one of them is, for example, the ability to make hypotheses and run experiments,

[00:17:26] interact with the world, and develop these hypotheses. We abstract away the concepts about how the world

[00:17:34] works, and then that's how we truly learn, as opposed to today's language models. Some of that

[00:17:40] is really not there quite yet. You use the analogy that we can't get to the moon by extending a

[00:17:47] building a foot at a time. But the experience that most of us have had of these language models is

[00:17:52] not a foot at a time; it's like this sort of breathtaking acceleration. Are you sure that, given the

[00:17:58] pace at which those things are going, you know, each next level seems to be bringing with it

[00:18:03] what feels kind of like wisdom and knowledge? I totally agree that it's remarkable how much

[00:18:11] this scaling stuff really enhances the performance across the board. So there's real learning

[00:18:18] happening due to the scale of the compute and data however there's a quality of learning

[00:18:26] that's still not quite there and the thing is we don't yet know whether we can fully get there

[00:18:32] or not just by scaling things up. And if we cannot, then there's this question of what else,

[00:18:40] and then, even if we could, do we like this idea of having very, very extreme-scale AI models

[00:18:47] that only a few can create and own? I mean, if OpenAI said, you know, we're interested in your work,

[00:18:56] we would like you to help improve our model, can you see any way of combining what you are doing

[00:19:01] with what they have built? Certainly, what I envision will need to build on the advancements of deep

[00:19:09] neural networks, and it might be that there's some scale Goldilocks zone, such that, I'm not

[00:19:16] imagining that the smaller the better either, by the way. It's likely that there's a right amount

[00:19:21] of scale, but beyond that, the winning recipe might be something else, so some synthesis of ideas

[00:19:29] will be critical here. Yeah. Yejin Choi, thank you so much for your talk. Thank you.

[00:19:38] If there's a surefire way to wake up feeling fresh after a night of enjoying alcohol, it's with

[00:19:43] ZBiotics. ZBiotics Pre-Alcohol Probiotic Drink is the world's first genetically engineered

[00:19:49] probiotic. It was invented by PhD scientists to tackle rough mornings after drinking. Here's how it

[00:19:55] works: when you drink, alcohol gets converted into a toxic byproduct in the gut. It's this byproduct,

[00:20:01] not dehydration, that's to blame for your rough next day. ZBiotics produces an enzyme to break

[00:20:07] this byproduct down. Just remember to make ZBiotics your first drink of the night, drink responsibly,

[00:20:12] and you'll feel your best tomorrow. Go to ZBiotics.com/TEDtech to get 15% off your first order

[00:20:19] when you use code TEDTECH at checkout. ZBiotics is backed with a 100% money-back guarantee, so if you're

[00:20:25] unsatisfied for any reason, they'll refund your money, no questions asked. Remember to head to

[00:20:30] ZBiotics.com/TEDtech and use the code TEDTECH at checkout for 15% off. Thank you, ZBiotics,

[00:20:37] for sponsoring this episode and our good times.

[00:20:48] All right, that's our show. Thanks for listening. TED Tech is part of the TED Audio Collective.

[00:20:55] This episode was produced by Isabel Carter, who also wrote it with me, Sherrell Dorsey.

[00:21:00] Our editor is Alejandra Salazar, and the show is fact-checked by Julia Dickerson. Special

[00:21:06] thanks to Farah DeGrange and Nina Lawrence for production support. If you're enjoying the show,

[00:21:12] make sure to subscribe and leave us a review so other people can find us too. I'm Sherrell Dorsey.

[00:21:18] Let's keep digging into the future. Join me next week for more.
