The TED AI Show: Your next best friend may be 100% AI w/ Purnendu Mukherjee
TED Tech | June 18, 2024 | 30:16 | 55.37 MB

Non Player Characters --NPCs for short-- have always been a huge part of what makes video games engaging, from Cortana in Halo to Navi in The Legend of Zelda. But interactions with NPCs were always limited to a pre-written script. Until now. Purnendu Mukherjee is the CEO of Convai, a platform that enables developers to create NPCs with human-like conversational abilities. He joins Bilawal to chat about our evolving relationship with "AI characters" and what we gain and lose when our digital relationships are so life-like, it almost doesn’t matter who (or what) is on the other end.

For transcripts for The TED AI Show, visit go.ted.com/TTAIS-transcripts

Learn more about our flagship conference happening this April at attend.ted.com/podcast


Hosted on Acast. See acast.com/privacy for more information.

[00:00:00] TED Audio Collective It's the 26th century, and you're moving through a world as a cybernetically enhanced super soldier on the planet of Reach. You have a big mission in front of you. As Master Chief, you are the last hope for humanity against a hostile alliance of alien forces.

[00:00:23] Only you can stop the Covenant. Well, only you with the help of your trusty AI sidekick. A hologram with a cropped haircut and shimmering ocean blue skin. Cortana. Actually, it was 2001, and I'd just hooked up my Xbox to play the first game in the series, Halo: Combat Evolved.

[00:00:46] I was 11 at the time, and my brother, our friends, and I would pile into our Punjabi living room to play a campaign in split-screen mode. I was totally sucked into the game, invested in the story. And Cortana was key to that, with her humor and emotional depth.

[00:01:02] Even though she's just a blue hologram, ironically, she was the key to the game's humanity. She was my trusty co-pilot guiding me through these alien worlds. And throughout the game, I felt a real bond starting to emerge.

[00:01:17] It seemed totally novel at the time, but I was forming a friendship with a non-player character. I'm Bilawal Sidhu, and this is The TED AI Show, where we figure out how to live and thrive in a world where AI is changing everything.

[00:02:55] Now, non-player characters, or NPCs for short, have always been a big part of video games. In single-player text-based computer games, players could interact with NPCs in a very limited way, just putting in a very specific command and having the NPC respond from a predetermined script.

[00:03:11] Think the original King's Quest, where players typed directly into a text box to interact with NPCs who were there only to move the plot along. This evolved into single-player games like Halo, where NPCs like Cortana had more dynamic personalities and quirks written for them,

[00:03:27] though were still limited by the design of the game, and therefore static. Now if you think about multiplayer games, they allow for human-to-human interactions. Think Second Life, where you're interacting with other players through their avatars.

[00:03:41] What's so immersive about that is you're engaging with real people who are responding dynamically to you and you're responding dynamically to them. Human-to-human. NPCs exist in this world, and while they can be interesting, you would never mistake these NPCs for fellow gamers.

[00:03:58] NPCs just haven't been real enough to be mistaken for actual people. Up until now, these NPCs have been relatively passive. But now, with generative AI, those NPCs no longer have to choose between a limited number of responses.

[00:04:13] They can script in real-time, just like a human player would, reacting to what's happening inside of the game world. Suddenly, video game worlds can be infinitely more immersive and interactive. As these virtual characters become more integrated into our daily lives, and maybe even become our friends,

[00:04:29] are we going to start spending all our time in these virtual worlds? Will our interactions with NPCs start to become indistinguishable from our interactions with humans? And what do we gain or lose if this happens?

[00:04:41] This is the domain of Convai, spelled Con-V-A-I, a platform that enables developers to create NPCs with human-like conversational abilities. Convai says their goal is to help developers create virtual characters that can converse in the present moment and build long-term human relationships with human players.

[00:05:00] There are over a thousand projects being actively built on their platform by creators ranging from AAA studios to indie developers.

[00:05:07] And what makes Convai so interesting is that they're making technology that enables game developers to create NPCs that are so lifelike, it won't matter who or what is on the other end. Purnendu Mukherjee is the CEO of Convai and our guest on the show today.

[00:05:25] He has a lot to say about our evolving relationship with NPCs, or as he likes to call them, AI characters. And like me, he was a total gaming nerd. Purnendu, welcome! I'd love to start talking about the origins of Convai.

[00:05:40] I get that you were a gamer before you worked at NVIDIA, but obviously there's a lot of areas in game development you could have branched into. Why were you drawn to the notion of AI NPCs specifically?

[00:05:50] Before I started at NVIDIA, when I was doing my thesis work, I literally saw this language model wave coming. I wrote this, that while language models are going to get bigger and better and will potentially even have abilities to have human-like conversation,

[00:06:11] it is still not going to have the same level of understanding as we humans do. Because we humans don't think from text in, text out or text to text. We think from a 3D world around us.

[00:06:26] Since we were born, we first understand locomotion, like moving around the 3D world, and then we attach words to these objects, we assign meaning. So basically we are multi-modal creatures.

[00:06:42] Where do we find such a multi-modal environment where we could potentially have these AIs live, train and iterate themselves? Virtual worlds. And what kind of virtual worlds would we have people that can provide feedback to this AI? Heavily populated worlds are games.

[00:07:01] So all those connected together, like if we have to create this human-like mind within a virtual world, NPCs, non-player characters or let's call it AI characters embodied in a way are one of the best vehicles to do that.

[00:07:18] It's almost like we've got these rich environments where you can embody this sort of AI agent and have it experience very similar stimuli to what we might experience in the real world. Which is a perfect segue into the evolution of AI NPCs, right? This is a non-playable character.

[00:07:36] This is sort of set dressing, this side thing in this like, you know, kind of like the side dish to go with the main course, which is the game itself.

[00:07:44] And so I'm kind of curious, what's your historical perspective on how AI NPCs have evolved from their earliest stages to the complex entities that we see today in games and virtual experiences?

[00:07:57] To talk about the history of games, I mean, there are these pioneering genre defining games all the way from Half-Life that, you know, like single handedly define the first person shooter genre.

[00:08:11] And of course, Counter-Strike, like the multiplayer aspect of it, that you could have many people in the same world, right? In various ways of gameplay, basically revolutionized gaming. You know, they could play with each other.

[00:08:24] So you don't need new gameplay as long as people can engage with each other. So like that has evolved. NPCs have of course gotten better, mostly on the visuals or animation front, but not on the intelligence front as much.

[00:08:41] So what we are seeing, it's almost like a Cambrian explosion of characters and AI agents that can not only be very human-like in terms of interaction with the players, just like players played with players.

[00:09:00] Now AI can also play with players, both as friends or enemies, you know, cooperatively or competitively.

[00:09:06] I think it's this magical moment where now we've of course got these behemoth large language models, but you also have the, you know, kind of multimodal models hitting the scene where it's not just that they can understand text.

[00:09:19] These models can understand audio, can understand imagery, can understand even video. Right. And so I'm curious to dig into how that affects human-AI relationships. So how do you see AI NPCs sort of changing the nature of player engagement and emotional investment in both games and experiences?

[00:09:39] Firstly, the way these NPCs are becoming very human-like, there is going to be a large set of people in the world that will big time benefit from it.

[00:09:50] Mainly because there is a big chunk of players who don't like engaging with real people or are nervous or afraid to do that. They feel much more comfortable if they know it's not a real person and that will help them open up, it'll help them socialize.

[00:10:07] In terms of people that let's say are playing single player games or multiplayer games, now they can engage with this, with the set of AI characters and have a more engaging time ideally.

[00:10:20] And lastly, let's say if it's in a multiplayer environment, people will still enjoy engaging with people. But now they have another reason to have fun together with other people.

[00:10:30] And then from a relationship standpoint, basically I think it is important for companies like ourselves to look ahead in terms of the positives as well as the dangers. It is going to quickly fill in the gaps where a human doesn't exist today.

[00:10:46] Whether it's just being friends or from a romantic angle or maybe someone that is a mentor or a guide.

[00:10:56] And not just like ChatGPT, like text in, text out, but very much a gamified immersive environment that can reach them and they can effectively have this mentor of an AI. So I think overall I definitely am an optimist and I see the positive sides.

[00:11:13] There could be potential darker sides, dystopian sides that need to be addressed and understood and informed. I mean, I think it's very fascinating. Like it's one thing to talk to your ChatGPT app, right? And you hear a voice emanating from your phone.

[00:11:33] It's another thing entirely to let's say be talking to your mentor. And it's embodied as like a humanoid character that has the same sort of expressiveness that you do. It suddenly becomes this sort of more lifelike experience, right?

[00:11:48] And so as NPCs become more lifelike, what are those ethical considerations that come into play, especially regarding player relationships and AI behaviors? The number one thing that I think AI needs, and this is a bit controversial, but like if you think deeply enough, right?

[00:12:09] The biggest fear for AI is centralization and a few entities responsible for these relationships. When a kid grows up with an AI teacher and mentor, and that's their all, you know, that's a level of relationship that no company on earth should own.

[00:12:31] It's theirs, absolutely, wherever they want to take it. And what can enable this? I think decentralized blockchain technology can provide true ownership. That is going to be very, very essential along with confidential computing that can help ensure that their data remains theirs. Their memories and relationships remain theirs.

[00:12:58] Yeah, I mean, you brought up a bunch of very interesting points, right? It's like if you do have this future where let's say we have an oligopoly of companies that sort of own the models that mediate your relationship with these digital characters, right?

[00:13:10] And especially if this is like kids who are like growing up talking to these, you know, NPCs or AI agents, whatever you want to call them.

[00:13:20] Yeah, you are building like a very rich history of sort of their hopes, wishes, anxieties, worries, desires, and how those evolve over time, right? And, you know, these agents are getting to a place where it's not just like, oh yes, you say this and I will say this.

[00:13:35] It's like they remember the context and glean insights from your previous conversations. And, you know, maybe the right way to solve that is with decentralized AI.

[00:13:46] And you alluded to confidential computing as well as like can we keep this sort of very rich data, you know, as close to the user as possible?

[00:13:55] Or, you know, and even if you are going to learn and improve models off of it, you do it in this like privacy preserving way, which itself sounds like a tall order.

[00:14:08] I'm really keen to dig into sort of the experiential and sort of emotional impacts of these type of AI agents, right?

[00:14:16] And so like one example that I keep going back to is, hey, even if you're playing a solo single player game, you could have this like AI Jarvis or Cortana that sort of understands you, your context, and is with you helping you navigate this sort of game world.

[00:14:33] How close are we to having that sort of companion in games? The primary aspect that is missing would be the, I mean, to some extent, right? So is the multimodal aspect of it.

[00:14:46] For that to happen at scale, like very much similar to Jarvis of Iron Man, is aware of the entire context. That means every nook and cranny of the room that you are in, you know, like they're able to see that, process that along with your digital presence.

[00:15:07] So we are not far away. Like it may not be at 100% of Iron Man's Jarvis capability, but like if you play the game, it'll feel like that. How does all of this change the way these experiences are authored, right?

[00:15:21] The analogy that keeps coming to my mind is sort of the narrative division of Westworld. Is that the right way to think about how you're authoring these type of experiences with these rich characters? The emergent nature of large language models are very interesting.

[00:15:38] The narrative designers and writers are still the better storytellers, right? We have to have a right mix of both controlled evolution of characters, controlled evolutions of stories, as well as the open-ended and emergent behaviors of these generative AI models. Which is where the balance and challenge is.

[00:15:57] And we are providing all the necessary tools for developers and designers to do that. To come back to the Westworld analogy, that not only did Westworld have these crafted characters that were some of our favorites, they evolved. They evolved into something else entirely than what was originally written.

[00:16:19] And evolved in a very meaningful manner. Remember what was in Westworld that started the whole thing about these characters evolving? It's memories. And once you have that, their personalities evolve, they remember things.

[00:16:32] And it's not enough to just give them long-term memory, you have to keep them up all the time. Go about their day, make decisions, interact with other real people or other AI. And those interactions will change their decisions and their pathways.

[00:16:49] And some will be very high intensity experiences. And that will shape their personalities. And this will be ideally best put in servers that can have up to 250-500 people. These AI NPCs will always be there living their life. Their experiences will shape them.

[00:17:07] And they will start making decisions that will be quite bewildering. We already see that. We did this Nvidia demo where we had now two characters. And both of them were chatting and we just let them talk to each other. Just to see what they chat about.

[00:17:23] And while chatting, Nova made an order that, hey, can you bring me a drink? And Jin was like, sure, let me get that for you. And he went ahead and brought a drink and gave it to her.

[00:17:34] And it was so wild that they are not only just talking with each other, they're carrying out actions and they're giving commands to each other. The demo that you're referring to, which went stupid viral on the internet,

[00:17:45] I think it got everyone excited to just imagine where games are going next. Right. And so I'd love for you to talk a little bit about the types of experiences and characters people are building with Convai that are exciting to you, both for gaming and non-gaming purposes.

[00:17:58] The non-gaming examples are primarily in learning and training, education-related areas. And then there's brand ambassadors, which could be in the likeness or digital human of a celebrity or a nondescript model who is AI powered and knows all about the brand, can guide people

[00:18:20] how to use the product and whatnot. These AI characters can even be location aware, like you scan a particular QR code anywhere while walking the street and they spin up right there and tell you what the directions are.

[00:18:31] It's not a far future, okay, where we start seeing these embodied AI characters literally everywhere and very much in public spaces, all the way from take your favorite mall to your favorite airport. You will see large screens with these AI characters standing there

[00:18:50] welcoming you, but now you can talk to them and ask them which way to go. You know, like this is my airport ticket. Which way do I go? Where's the security check? Any kind of information dispense or transactional engagement, these characters would be perfect

[00:19:07] for real world use cases that we are already seeing. Now finally coming to gaming. Well, games are going to be effectively the matrix, right? So what do I mean by that? It's going to be so real that you would prefer living there.

[00:19:24] All right. So which is a dystopian future that we have to beware of. It will be pretty darn engaging where instead of the machines taking over we will be willingly submitting ourselves to these game worlds which will be extremely exciting. With all of these technologies converging,

[00:19:40] all the way from your VR devices, VR augmented reality and all of that to extremely high speed internet, the cloud computing aspect of it where these extremely high definition world can be literally rendered on the cloud and streamed to yourself along with these AI NPCs

[00:20:00] in these worlds, in these metaverse-like worlds. It will become a much easier way for people to put themselves in let's call it the matrix or the metaverse. So where you can be there, live there almost to engage with your friends and learn certain things and engage

[00:20:23] with these AI NPCs. That is a future not too far away that people will start doing that. People already do that by the way in a major way in a lot of the social virtual worlds. Especially the younger generation right? Like you're growing up almost in these worlds.

[00:20:40] That is true, but also there is a very small minority who have been doing it for a while. There's a large audience, a concurrent daily active user base, for something like Second Life where we recently launched. And these people come back daily,

[00:20:56] live their life, talk to their friends for many years now. And there are newer platforms like VRChat. I know people that regularly go there. They party on the weekends. Once the challenges of onboarding and the challenges of the friction is reduced to get in those worlds

[00:21:16] a lot of people will start going there. Let's take that VRChat example. I think that one's fascinating. It's like, yeah, people are buying expensive full body tracking setups with expensive headsets and computers to have this high fidelity embodied experience in virtual worlds.

[00:21:35] But right now on the other end are other humans that they meet. How do you feel about there being a time let's say in a couple years where you're talking to somebody in VRChat and you literally

[00:21:46] cannot tell if they're human or not? Does it matter at that point? Because like, I don't know, maybe these AI agents will be far more thoughtful and nice to you than perhaps real, flawed humans. So I'm curious how you think about that, especially in the context

[00:22:01] of this Matrix analogy that you're making. You literally reminded me of this movie called Transcendence. Johnny Depp was the lead character there. Literally that's the case there. Basically Johnny Depp dies and before dying he actually uploads himself, like a full neuron

[00:22:16] scan of his brain, and uploads himself into the Internet. And when he comes back after his actual biological body has passed, people would often ask him, are you real? Are you aware? And his answer would be, if you cannot tell, does it matter?

[00:22:32] So that is going to happen. No doubt. It's already there from a tech standpoint. Text in, text out. It'll be hard to tell today. Totally. And the other aspects would be the visuals of it, the animations of it

[00:22:49] and things like that. But also what might be a giveaway is if they are always of a particular certain personality type, they need to have a wide array of personalities. And even eccentric AI characters that are kind of awkward

[00:23:08] and some of them are mean and some of them are super nice and you know like... You get the full high school experience. Yes, exactly. Those are going to be necessary for people to engage. It cannot just be very assistant-like.

[00:23:24] So the more the variability, the better people would like and engage with them. And people will find their own type. Some people are drawn and attracted to toxic people. So basically they will have all kinds of AI in these worlds and you'll choose your pick, what you want.

[00:23:44] It reminds me of the conversation with the architect in The Matrix where the architect outlines that, hey, we made a utopian simulation but nobody bought it. It just felt too artificial. And almost introducing the flaws of our humanity kind of is what made it a sticky experience. Yes.

[00:24:03] Which I think is very fascinating, right? Because these environments and experiences need to mirror the full range of emotions that we experience. Right, right. So that's what we're noticing. We did this demo room and people enjoy talking with the meanest character. Wow, okay.

[00:24:24] They would want to walk away and that mean character will say something provocative that would draw them back to talking to that character. That's something that we have to be conscious about as well. That is literally one of the reasons that Facebook and TikTok

[00:24:38] became so popular because the newsfeed was programmed for maximum emotional turbulence. The content that was the most provocative drew people there. We don't want to do that with our stuff. So the right balance between engagement and what's good for the people is something that we plan to do.

[00:24:58] Now, there are many negatives to outweigh the positives here. But Purnendu has spoken with me about being a lonely kid growing up. And I had to ask him about the way these experiences can benefit people. You alluded to, you know, growing up as a kid,

[00:25:12] you felt isolated and different than other kids. And I just imagine, you know, I relate to that experience. I was so like deeply into computer graphics and visual effects and all this other stuff that like nobody else cared about at the time.

[00:25:24] And obviously I found my escape through the Internet and, you know, OG IRC forums and PHP forums. I'm kind of curious, what does this do for people, you know, who may feel lonely today? Like what role can these AI characters play in sort of enriching their lives?

[00:25:44] Big time, you know, big time. There is, of course, online communities where you could meet those like-minded people. But like they may not be available at the same time. But you have this AI character who could effectively have all the right interests.

[00:25:59] Like think of your best friend that you connected with the most, right? That understands you before you even say it, right? So these AI characters can potentially be that for them. You know, like it is risky, but that is where we are evolving to, right? Like undoubtedly.

[00:26:17] And do you find in that situation, like let's go back to a younger you or me. Do you imagine in the experience being that these systems can sort of infer what you need? Or would I be like curating the combination of my like three best friends

[00:26:34] plus a little sprinkle of like, I don't know, John Gaeta and some other VFX supervisors that I really like. And let me throw in a sprinkle of Alan Watts and a sprinkle of like Einstein in there.

[00:26:46] How would people define sort of their best friend, if you will, in this space? Yeah, that's a very hard one to answer. Because we don't choose, I mean, we kind of choose our best friends eventually. But we don't choose... Based on vibe, right? Based on vibe, right, exactly.

[00:27:07] And common interests and things like that is how it evolves. But we don't exactly choose their eccentric things, what they are interested in, other things that they are interested in and whatnot. Right, so maybe it will be a multifarious world with lots of different AIs.

[00:27:24] If these characters have to evolve, if you keep changing them, you will not see their character evolution. So you basically go and socialize and you start with your one and maybe they will adapt to your interests and things like that.

[00:27:39] And eventually they will have their own unique experiences too that will shape them effectively like how it shapes you. Imagine friends that grew up together, maybe this AI can also go out and have its experience when you are not there.

[00:27:55] Right, so they have stories to tell what happened today. It's kind of mind-blowing. What you're saying almost makes me feel like we've been at this phase of technology and the internet where we can organize the world's information

[00:28:07] and make it universally accessible and useful to use the Google mission statement. But now we're heading into a world where we can make the world's people and personalities universally accessible and useful. Yes, and a lot of the technology we are creating, all the way from facial expressions

[00:28:23] and hand gestures and emotional voices, basically empowering the mind of these non-player characters, is going to be the same technology that may be used for a lot of these social robots. There are obviously both utilitarian and delightful experiences your customers are building. What are you looking forward to?

[00:28:43] We have set the vision, we have created the tools, and people are developing. In terms of the immediate, not just immediate, like medium-term, three-to-five-year plan, it is basically to ensure that we have redefined gaming in a very positive way.

[00:29:03] We have enabled these learning and training experiences at scale that are changing lives of people in a very positive manner. Brand experiences, product information, real world embodied characters. I love the creator-centric approach. I think it's so important

[00:29:24] that we don't forget that creators are going to author these experiences. Thank you so much for your time. Thank you so much. It was great chatting. Around the time Purnendu and I had this conversation, he invited me to the Convai headquarters where I got to see

[00:29:40] their NPC innovations firsthand. And let me tell you, it was pretty wild. When I walked in, they had this massive monitor with an AI anime character on it. And the people at Convai told me to just have a conversation with it.

[00:29:52] They told me I could talk to it in Hindi, which I was psyched about. Then I tried to press a little deeper. I think there's this really human urge to try and push the boundaries of an AI system to prove that it's intelligent and human enough

[00:30:03] to exist beyond the constraints of corporate language. Most of my efforts were in vain, though, because as soon as the AI came back with these canned responses, it kind of ruined some of the effect. Even though it was giving me unscripted responses,

[00:30:16] it still had parameters. The AI allowed me to go off script. And even though the model I was talking to recognized what I was saying, it was still responding based on its parameters, the guardrails set by the company.

[00:30:28] While an AI like this might be great at helping you fight a lethal force of aliens, it's hard to know if it will ever reach the messier, more human parts of how we relate to one another. The TED AI Show is a part of the TED Audio Collective

[00:30:41] and is produced by TED with Cosmic Standard. Our producers are Ella Fetter and Sarah McCray. Our editors are Ban-Ban Cheng and Alejandra Salazar. Our showrunner is Ivana Tucker, and our associate producer is Dr. David C. Kemp. Our technical director is Jacob Winnick,

[00:31:07] and our executive producer is Eliza Smith. Our fact-checker is Julia Dickerson, and I'm your host, Bilawal Sidhu. See y'all in the next one.