Brian Keating
00:00:00 - 00:00:52
What if the most powerful AI systems we've ever built are succeeding for reasons we still don't understand? And worse, they may succeed for reasons that might lock us in the wrong future for humanity. Today's guest is Anil Ananthaswamy, an award-winning science writer and one of the clearest thinkers on the mathematical foundations of machine learning. In this conversation, we're not just talking about new demos or incremental improvements or updates on new models being released. We're asking even harder questions. Why does the mathematics of machine learning work at all? How do these models succeed when they suffer from problems like overparameterization and limited training data? Are large language models revealing deep structure, or are they just producing very convincing illusions and causing us to face an increasingly AI-slop-driven future? Thank you so much for joining us all the way from Bangalore. This is so exciting.
Anil Ananthaswamy
00:00:52 - 00:00:55
Oh Brian, thank you very much for having me. It's a pleasure.
Brian Keating
00:00:56 - 00:01:19
It's really a wonderful book. We're going to judge the book by its cover, as I like to do, later on. It's entitled Why Machines Learn. And the first question I want to ask you, Anil, is this: I was taught as a physicist that you can never ask why questions, and that's the first word of your title. What made you want to explore why, and not how or what, machines learn?
Anil Ananthaswamy
00:01:19 - 00:02:47
It's funny, I answered this exact question yesterday at a panel discussion, about the very same doubts people had. This is just a writerly conceit, I must admit. The title came about because, when I was trying to learn the mathematics of machine learning, I encountered very early on this amazing proof that uses very simple linear algebra to show that a single-layer neural network, something called a perceptron from the 1950s, will converge to a solution in finite time if a solution exists. In the late 1950s the algorithm was first developed, and it was essentially a simple neural network that could do linear classification. The algorithm is very simple, and that to me is the how. A few years after the algorithm was developed, people started mathematically proving that the algorithm would converge to a solution in finite time if a solution existed. And to me, in my head, as a former software engineer, the math became the why. And of course, if you were to ask a physicist... it's funny, because a couple of months ago in Bangalore, David Gross, the Nobel laureate, was visiting, and he had the exact same question about the title of the book.
Anil Ananthaswamy
00:02:47 - 00:03:09
And I tried to give him my rationale, and he did not buy it. He said, no, there's no why here, it's how. So, yeah, it's just a writer's conceit to me. The how is the algorithm, and because the book is about the mathematics, I feel like the math kind of gives you a rationale for why these algorithms do what they do. So that's how the title came about.
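The proof Anil keeps returning to is the classical perceptron convergence theorem (usually attributed to Novikoff, 1962). In modern notation, the "why" he describes can be stated as a simple mistake bound:

```latex
% Perceptron convergence theorem (Novikoff's mistake bound)
% Assume: every sample satisfies \|x_i\| \le R, and the data are linearly
% separable with margin \gamma, i.e. there exists a unit vector w^* with
%   y_i \, \langle w^*, x_i \rangle \ge \gamma > 0 \quad \text{for all } i.
% Then the perceptron update rule makes at most
%   \left( \frac{R}{\gamma} \right)^2
% mistakes before converging to a separating hyperplane --
% a finite number, independent of the number of training samples.
```

This is the sense in which "a solution is found in finite time if a solution exists": the bound caps the total number of weight updates, so training must terminate on separable data.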
Brian Keating
00:03:10 - 00:03:22
What was the first mathematical idea that you encountered in machine learning, in the research that you did for the book, that made you stop and think, this is genuinely beautiful?
Anil Ananthaswamy
00:03:22 - 00:04:25
Oh, it was exactly this perceptron convergence proof. So maybe we can talk a little bit about how that perceptron came about, right, in the late 1950s, when Frank Rosenblatt, who was a Cornell University psychologist, designed what was the first kind of artificial neural network. And it was a single-layer neural network. And like I just said, the initial work was simply developing the algorithm and showing that it worked. It did pattern classification. It was able to take two categories of data, and if those two categories were linearly separable in some mathematical space, the algorithm would find the linear divide between the two clusters of data. And subsequent to the invention of the algorithm, people started mathematically showing why this was powerful and why this classifier even worked.
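The algorithm Anil describes is simple enough to sketch in a few lines. Below is a minimal, self-contained illustration of Rosenblatt-style perceptron learning on hypothetical toy data (the data points and function names are my own, not from the conversation): whenever a point is misclassified, the weights are nudged toward (or away from) that point, and on linearly separable data the updates eventually stop.

```python
def perceptron_train(samples, labels, epochs=100):
    """Single-layer perceptron: learn weights w and bias b so that
    sign(w . x + b) matches each label (+1 or -1)."""
    w = [0.0] * len(samples[0])  # one weight per feature
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified (or on the boundary)
                # Rosenblatt update: move the hyperplane toward the point
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors: converged
            break
    return w, b

def perceptron_predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Toy linearly separable data: points above the line y = x vs. below it
X = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (1.0, 0.0), (2.0, 1.0), (3.0, 2.0)]
Y = [1, 1, 1, -1, -1, -1]
w, b = perceptron_train(X, Y)
print(all(perceptron_predict(w, b, x) == y for x, y in zip(X, Y)))  # True
```

Because these two clusters are linearly separable, the convergence theorem guarantees the loop terminates; on non-separable data the algorithm would simply keep cycling until the epoch limit.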