The Stack Overflow Podcast

How Google is helping developers get better answers from AI

Episode Summary

Today’s guest is Logan Kilpatrick, a senior product manager at Google, who tells Ben about his journey from software engineering to machine learning to product management, all with an emphasis on reducing developer friction. They talk through the challenges of non-determinism in AI models and how Google is addressing these issues with a new feature: Grounding with Google Search. Plus, what working at the Apple Store taught Logan about product management.

Episode Notes

Logan previously worked at OpenAI, where he led developer relations. He's now a senior product manager for Google AI Studio, the fastest way for devs to get started with the Gemini API.

Logan’s team just rolled out Grounding with Google Search, a feature built to help developers get fresher, more accurate responses from the Gemini models aided by Google Search. Learn more here.

Connect with Logan on LinkedIn

Props to Stack Overflow user Jonik, who earned a Populist badge by explaining How to write an S3 object to a file?

Episode Transcription

[intro music plays]

Ben Popper Maximize cloud efficiency with DoiT, an AWS Premier Partner. Let DoiT guide you from cloud planning to production. With over 2,000 AWS customer launches and more than 400 AWS certifications, DoiT empowers you to get the most from your cloud investment. Learn more at doit.com. DoiT.

BP Hello, everybody. Welcome back to the Stack Overflow Podcast, a place to talk all things software and technology. Today, I am chatting with Logan Kilpatrick, who is a Senior Product Manager at Google. He's leading Product for Google AI Studio. We want to chat about all the stuff that they've been working on. This is obviously a fast-moving space that is kind of at the center of the conversation in the world of technology, and also extremely relevant to our audience –software developers– because a lot of tools have emerged over the last couple of years that claim to assist in productivity and give software developers ways to write more code, test it, debug it, who knows. So without further ado, Logan, welcome to the Stack Overflow Podcast. 

Logan Kilpatrick Thank you, Ben. It's been a long while since I've looked at my Stack Overflow profile, but I spent much too much of my life answering Stack Overflow questions back in the day with the Julia programming community. 

BP If you want to share your username, we'll shout it out in the show notes. We'll shout out your rep. I'm sure the listeners would appreciate that. So talk to us just really quick, level set for the audience who's listening. How did you get into the world of software development and what led you to the role you're at today? 

LK I think my initial excitement with building software came back when the original Flappy Bird game was created. I had heard a crazy Flappy Bird story and I was like, “Wait, this seems like a really simple product for how much craziness it's capturing in the world,” and it got me really excited about what I would be able to do. This was while I was in late middle school or early high school, something like that. I had never coded before. I started watching Matt Heaney Apps– some random guy in Europe making videos about how to make iOS apps– and started going to the library and using a Mac to try to build apps. And I think that that sort of got me hooked, and then I studied computer science, was a software engineer, became a machine learning engineer, and then ultimately in April joined Google to lead Product for AI Studio, really focused on how we help developers benefit from all of the excitement that's happening in AI by putting language models into their hands so that they can go and build all of the things that they're excited about. 

BP Very cool. I have to say that this fits a pattern. For 90% or so of the folks who come on, when I say, “How did you get into software development?” there's some element of gaming involved in how they first got excited about it. So that's a cool story about Flappy Bird. You said you started in computer science but then switched to machine learning, so did you change your focus while you were still in academia, or did you go into industry and start working on ML there? 

LK So I was regular or pure CS for undergrad in my studies, and then became a software engineer and was doing normal CS software engineering stuff, and then while I was actually at a company I sort of transitioned from software engineer to machine learning engineer. I was also doing a master's at the time. The academic/professional parallel was super helpful for me at that time. Hearing about the theory behind the stuff that I was actually applying in practice at work was really nice; it felt like having both sides of my brain activated. 

BP Cool. I was checking out your LinkedIn and there's an incredible amount of stuff there. You had time as a Lead Developer Community Advocate with Julia, which you mentioned earlier, you did stuff related to NASA, and I see OpenAI on here. Tell us a little bit about that. Obviously you're now working full-time at Google, but were you dabbling in a lot of different things or advising? It's a rich tableau here on LinkedIn. 

LK I appreciate that. I sort of have two career trajectories that started around the same time. I actually started working at the Apple Store while I was in college, which was a super formative experience for me. I don't think people appreciate the kinds of skills you learn working at the Apple Store, but they really still pay dividends for me today.

BP Those are the soft skills, man. The people skills for product managers. Learn them at the Apple Store. 

LK A thousand percent. And I think the beautiful thing about the Apple Store or any sort of environment like that is that people show up from very different backgrounds. There could be people who are literal engineers, and there's my grandma who can barely work her way around a phone, and having to sort of speak to both of those audiences I think was a great experience. But also in a similar timeframe, I started interning at NASA just through random cold outreach. I had pinged a bunch of people on LinkedIn and was like, “Hey, I'm interested in learning more about your work.” One of them answered and was like, “Come on in and chat,” and I started doing a bunch of coding and data science stuff for them. I ultimately transitioned to another team at NASA which happened to be using Julia, which sort of brought me into the open source world, brought me onto Stack Overflow, brought me into the world of developer relations, which then took me to OpenAI, took me to PathAI, and then ultimately brought me to Google.

BP Very cool. Let's talk a little bit about your time at Google. Did you arrive here kind of after that moment in which it became clear that Gen AI was going to be a principal focus for all the large technology companies or did you arrive prior to that? 

LK So I joined OpenAI at the end of 2022, and all the craziness was starting right around then. The first Gemini model was released sometime at the end of 2023– December of 2023, I believe. So Google was already sort of in full swing with Gemini when I joined in April, and then I got to be a part of Google I/O, which was awesome, where we released a bunch of new models and sort of told the world about Google's broad plans for AI across developers and other product surfaces. And then we've been in what feels like an all-out sprint for the last six to eight months since then, and no end in sight. 

BP Exactly. This is interesting. I know we want to focus on your time at Google, but this is also a little bit about you. You wrote all the OpenAI developer docs and you answered thousands of questions. So this was a time when all of a sudden the attention was pouring in and people wanted to use the API and try to build their own app. Their sort of app store was opening up. You mentioned being at the Apple Store, but now all of a sudden the attention of the world's software developers is turned to you. What was that like? 

LK I think that really caring deeply about that one-to-one engagement actually scales really well to one-to-many engagement as well. And I've always looked at it the same way: whenever I'm doing something for developers, it really is to help proverbially –however that word is pronounced– the developer who's sitting right in front of me whose problem I'm trying to solve, and I think having that mindset really helps when you're also doing things at a much broader scale. But it was a lot of excitement, a lot of learning. There were so many great collaborators; my friend Ted Sanders, who wrote the OpenAI Cookbook, is a great example of the collaborations that were happening at OpenAI while I was there. So it was a ton of fun, with really, really smart people willing to extend themselves to tell the world about how to use this technology, because there were so many people who were just ramping up and trying to understand. I think we're still on that ramp-up of the maturity curve, but a little bit less so now, because there's a ton of content that's been put out about how to make use of these models and systems.

BP I know we're going to get to talk about something interesting in terms of grounding AI models, but like you said, one of the interesting things is that people want to work with this really powerful technology, these thought calculators that can do so much in so many different ways, but they're nondeterministic and that's not really how people are used to working with software. That's not what they learned when they were getting their CS degree or even up to this point, unless they were working in machine learning but that wasn't really very commercialized or at least not in a consumer-facing way. So tell me a little bit about how, exactly as you said, you introduced developers to these new technologies and new ideas, and then let's just touch on what brought you from OpenAI to Google. Let's discuss that switch and then maybe we can talk a little bit about just generally how you speak to developers about these new technologies and try to build the best API for them to interact with it. 

LK I think what brought me to Google was the opportunity to go back to the roots of the same reasons that I joined OpenAI. I joined OpenAI as a small startup just beginning this journey. They'd had GPT models before, but it was really the beginning of figuring out how we externalize this technology and put it into the hands of the world. And for Google in the developer space, obviously Google has been doing AI. It's been the core of the product offering, whether people see it or not. AI is essentially what makes Google Search work the way that it does; it's just a different flavor of AI than we've historically seen with this latest iteration. But really, it's been an opportunity for me to go back to that, despite Google being a big company: that small feel of a team of people who are really pushing hard on getting this technology out into the world's hands. And from a developer perspective, it's starting from ground zero. There weren't Gemini models and an API available for the last few years like there was with OpenAI models, so that zero to one experience is a ton of fun to me, and really where I think I provide the most value. With OpenAI as a good example, how much more value can I provide in scaling from millions of developers to many more millions of developers? I'm sure there are people better suited to that than me. But even back in the world of Julia, it was, “How do we go from this very small number of users to getting the world on board with this vision for scientific computing?” And it feels like getting to do that again now at Google is just the most fun that I can possibly imagine having. So a lot of hard work, a lot of fun. 

BP I think one of the things that's super fascinating, if you think of yourself as a person who loves to live in that zero to one space, is that there has been, I believe, an outwardly communicated cultural shift at Google– and you mentioned feeling like you've been on a sprint for the last six months– that it's okay to make mistakes in public, it's okay to release things but let people know that it's still kind of a beta or still quickly evolving, and that in this area, in order to keep pace as a leader in the AI space, Google has returned to a sort of startup mentality with respect to a lot of this product line. Would you say that's fair? 

LK That's a good question. I think the challenge for Google– and this isn't actually a unique challenge for Google, I think OpenAI now has the same challenge– is that when you're at the top, it's very easy for people to want to pull you down or find problems, and the expectations and the responsibility are much higher as well. I live in the developer world, and the nice part in the developer world for the Gemini stuff is that we really are starting from scratch, so there are fewer of those priors. Developers have used other Google products, maybe around developer services, but a little bit less in this specific domain, so there are fewer expectations and it's easier for us to have the freedom and flexibility to move around and be quick. That's versus the other product surfaces at Google, which I don't work on, where the stakes are a lot higher in the sense that there's a large existing user base, whatever it is. So it's nice for us to be able to work in an area where we can just get things out to developers quickly and get that feedback cycle, which I love. 

BP With Google Maps or Gmail, you’ve got to be careful when you’ve got a billion people actively using it. You're releasing tools to developers that they know are going to be updated continuously and so you're a little bit more in that zero to one headspace. All right, so let's talk a little bit about what it’s like to try to build the best API or developer tools or docs for folks in this space. You're specifically doing that for Gemini. How do you communicate that? What are the key components of those tools or APIs that make it easy for developers to work with Gemini or to build things with Gemini? What have you learned over the last seven months about what people are seeking and how you give it to them? 

LK I think the core piece of this is that for developers, the expectation is for there to be no friction involved in getting started and actually building with the thing, and I think that had historically been our challenge: there was just a little bit more friction in getting started than on other platforms. From the very formation of AI Studio, which is the entry point for developers to build with Gemini, that's been the core principle: how can we remove as much friction as possible? So, things like being able to just sign in with your Google account, get an API key, and access our latest models without having to put a credit card in or do any of that type of stuff. Literally just use your Google account and you're good to go. It is a core principle of how we make this technology more accessible and put it in the hands of more developers. And then on the developer experience and docs side, I think we still have a lot more work to do. Part of the iteration cycle is iteratively getting to the place that we need to be; at least in my mindset, that's what's worked in the past. We don't need to have the most polished docs and API experience on day one, but we will get there, and in many cases that means optimizing for the speed of putting this into the hands of developers, which I think developers actually appreciate. The message that I've seen from people is that that's been resonating. The expectation is not that you release the world's perfect product for a developer; it's that developers can actually see the trend line of where you're going. And I'm hopeful that we've done a good job, at least as of late, of communicating what that trend line looks like for developers.
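
For listeners who want to try the low-friction flow Logan describes, here is a minimal sketch using the google-generativeai Python SDK. The model name and prompt are illustrative; the API key comes from AI Studio.

```python
# Minimal sketch: sign in to AI Studio, copy an API key (no credit card),
# export it as GOOGLE_API_KEY, and call a Gemini model.
import os

import google.generativeai as genai

# Configure the SDK with the key generated in AI Studio.
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative model name
response = model.generate_content(
    "Explain retrieval augmented generation in two sentences."
)
print(response.text)
```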

BP You make a great point. I'm sure you learned a lot about that in the open source world of Julia, and I can relate to it from within Stack Overflow as a company and also just from externally messaging to a community. What folks want is transparency. They want to know that you're hearing them, that things are getting on the roadmap, and then to see v1.7 and then v1.8. What they don't like is hearing nothing. Even if you're grinding away for six months to release something with a ton of fixes, if that's not being communicated, it doesn't build trust in the same way. Developers are also very interested in being able to participate, to say, “This is our number one issue,” or, “Can I contribute in an open source way to fix these things?” So I think that's a good point you make.

LK It's funny, back to your point about how much of these problems are communication problems: if you can communicate well with other people, and communicate well in that sort of one-to-many capacity, you end up saving yourself a whole lot of other pain of having to figure out how to fix things after the fact.

BP Yeah, exactly. You've got an army of people, perhaps, or a collection of people who are there to sort of fix the typos in your docs or do a little bit of the work and take credit for it, which keeps things clean around the edges. All right, so I wanted to circle back to one thing we said earlier, which is: what is it like to try to educate people and work with developers when you're offering them a product that, unlike most software in the past, is fundamentally nondeterministic? They might want to make a chatbot that speaks with a customer who's buying makeup, and they've trained it with Gemini in the background, and a customer might ask a question and another customer might ask the same question and the chatbot might give two different answers, one of which is 20 or 30% better than the other, and that question might get asked a hundred times a day. So how do you work with software that is nondeterministic to get the best outcomes? And from there I think we can talk a little bit about grounding and how that connects back to, “Oh, okay. Maybe there are other ways outside of just using the LLM that we can improve the user experience while still relying on Gen AI as sort of the base of what we're doing.”

LK So I think there's two angles to this, and my assertion is that this nondeterminism is likely one of the fundamental blockers for adoption by people building interesting things. I think it actually captures a lot of the air in the room and mindshare; people are like, “Ah, we need to solve this problem,” which probably takes away from other things, like exploring the different interfaces through which AI might be exposed to people from an app perspective. So all that's to say, I think this problem has a cost, and I think the real cost is perhaps not necessarily the 20% uncertainty around different answers and their quality. It's more that developers have a limited amount of time and they're spending it trying to solve this nondeterminism problem. The exciting part, though, is that there have been a ton of interesting companies and products built to actually help you solve this problem. There are many eval products out there, many observability products out there, through to this thread of grounding with Google Search, which is something that we're working on. There's a lot of different innovation from the product perspective that can be done to help solve this problem, and I'm really excited for us to move past this part so that people can do more of the interesting exploration around what is the real way that I should be building with and interacting with AI?
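
One small, concrete piece of the nondeterminism story: most LLM APIs, Gemini included, let you turn down the sampling temperature so repeated calls with the same prompt vary less. A quick sketch, again assuming the google-generativeai SDK; this reduces run-to-run variation but does not fully eliminate it.

```python
# Sketch: lower the sampling temperature to make outputs more repeatable.
# Assumes genai.configure() has already been called, as in the earlier sketch.
import google.generativeai as genai

model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative model name

for _ in range(3):
    response = model.generate_content(
        "List three common causes of flaky tests.",
        generation_config={"temperature": 0.0, "top_p": 1.0},
    )
    print(response.text)  # outputs should now be much more alike run to run
```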

BP It's interesting. We've both been talking about just how fast this is moving and how most technology companies feel an imperative to understand how to adopt Gen AI and how to work with LLMs inside of their business. That's pretty widespread across the industry, except for some folks who are highly regulated. But to your point, a lot of people built things and then said, “Wait, do I have any way to evaluate which one of these thousands of outputs is good versus bad and then to tweak so that I can improve it? Do I have any observability in place that works with a model and a vector database and tells me when something's gone wrong?” All of the fine-tuning, iteration, and perfecting that is typical in a software stack, and all of the observability and redundancy and SRE that is typically built into a software stack, has to kind of be created anew, or at least given a fresh flavor, for this new Gen AI world, which is now, “Okay, now I'm working with an LLM.” That's not something most companies were doing prior to November of 2022. 

LK Again, the thread of that that gets me excited is that it just means there are cool opportunities to build businesses that are good for developers at the end of the day, and I think there are probably a bunch of other nondeterministic niches that I'm less familiar with that are benefiting from a lot of this work going into how you take this fundamentally nondeterministic thing and build it into systems that kind of need determinism for you to evaluate them well. But I also think the other thread of this is that eval is sort of the root of all evil. Eval for LLMs, even for the people training LLMs, is really, really difficult. We make new models, we put them out to developers, and we have an internal perspective on how the latest Gemini model is going to be better for people, but actually, again, we don't know the answer until we put it in the hands of developers, which is why it's so important to be able to externalize this technology and why we do experimental model releases. All that stuff is because it's just hard to know how the evals track to real-world use cases. So it's been interesting to see that be a universal problem across everyone who's building or training or working with LLMs. 
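
To make the eval point concrete, here is a toy exact-match harness in the spirit of what Logan describes. Real eval suites are far richer; the two test cases below are invented for illustration, and the SDK setup is assumed from the earlier sketches.

```python
# Toy eval harness: run fixed prompts with known expected answers through
# the model and score exact matches. Assumes genai.configure() was called.
import google.generativeai as genai

CASES = [
    ("What is 2 + 2? Reply with just the number.", "4"),
    ("In what year did Apollo 11 land on the moon? Reply with just the year.", "1969"),
]

model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative model name

def run_eval() -> float:
    hits = 0
    for prompt, expected in CASES:
        answer = model.generate_content(prompt).text.strip()
        hits += int(answer == expected)  # exact match: crude but simple metric
    return hits / len(CASES)

print(f"exact-match accuracy: {run_eval():.0%}")
```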

BP Definitely a startup opportunity to be the top eval provider. I wanted to touch on this grounding idea because I think at Stack Overflow we've kind of been saying the same things. Again, this might get cut. We'll run it by folks on your side for approval, but Stack Overflow has openly announced that we are providing data to Gemini and to OpenAI to train their future models, and that the thing we're most interested in, along with funneling resources back into the community, is attribution– that somehow these models can give an answer but then provide grounding with a link back to the Stack Overflow answer that was the source of this. So talk a little bit about how that's going to work with Gemini and Google Search. 

LK I think the general principle with grounding with Google Search is that you make use of exactly the technology that Google has built over the last 20 years to retrieve information at the scale of the internet, and actually bring that into the world of Gen AI by allowing users to ask questions. For you as the developer, you can sort of enable or toggle on grounding with Google Search, and there are some granular levels as well around the types of queries you want grounded, the level of grounding, et cetera. And then the model will go out, retrieve information from Google Search, use that information to give the answer, and then, to your point, Ben, it actually provides the attribution and links out to those sources with a sort of standard user interface that Google is also providing as part of grounding with Google Search.
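
As a rough illustration of the toggle Logan describes, grounding is exposed to developers as a tool attached to the model. The sketch below follows the pattern in the Gemini API docs at launch, but treat the exact names (the tools value, the model version) as assumptions to verify against current documentation.

```python
# Hedged sketch: enable Grounding with Google Search by attaching a Search
# retrieval tool to the model. Tool and model names follow the docs at
# launch and should be verified; genai.configure() is assumed already called.
import google.generativeai as genai

model = genai.GenerativeModel(
    "gemini-1.5-flash-002",
    tools="google_search_retrieval",  # assumed shorthand for the Search tool
)

response = model.generate_content("Who won the most recent Formula 1 race?")
print(response.text)
# Grounded responses also carry grounding metadata with the supporting
# Search links, which is where the attribution Ben mentions comes from.
```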

BP So is that a quick sort of retrieval augmented generation loop that's running? Ask a question, okay, do a Google search, look at the top five pages, RAG that, now give an answer including links, and then go from there?

LK I'll make the caveat that there's slightly more nuance, but generally that's a good approximation of what's happening. And I think at the end of the day, there are probably five core challenges of working with LLMs. We talked about the observability/nondeterminism problem, and we talked about the eval problem. One of the other ones is that people love to talk about hallucinations in LLMs, and I think this is one of the lowest-hanging-fruit ways that developers will be able to drive that hallucination rate down. And I'm excited to see, again on the eval thread, there are probably a bunch of really great eval benchmarks out there, and it'll be really interesting to see how much this moves the needle in getting those last few nines of reliability for people with LLMs.
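
For readers who want the shape of Ben's approximation in code, here is a schematic retrieval-augmented-generation loop. This is not Google's actual pipeline, just the textbook RAG shape; search_web and ask_llm are hypothetical callables standing in for a search API and a model call.

```python
# Schematic RAG loop: search, condense the top hits into context, ask the
# model to answer from that context, and return the answer with citations.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Hit:
    url: str
    snippet: str

def grounded_answer(
    question: str,
    search_web: Callable[[str, int], List[Hit]],  # hypothetical search API
    ask_llm: Callable[[str], str],                # hypothetical model call
) -> str:
    hits = search_web(question, 5)  # retrieve the top pages
    context = "\n\n".join(f"[{i+1}] {h.snippet}" for i, h in enumerate(hits))
    prompt = (
        "Answer the question using only the sources below, citing them by number.\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    answer = ask_llm(prompt)  # generate an answer grounded in the retrieved text
    links = "\n".join(f"[{i+1}] {h.url}" for i, h in enumerate(hits))
    return f"{answer}\n\nSources:\n{links}"
```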

BP Cool. I want to say one thing before we go and then we can wrap, which is that you mentioned working at the Apple Store, where you meet all kinds of people. Some of them are seasoned technologists, some of them are older folks who can't really understand some of the intricacies. I have to say that I think the AI answers provided in Google Search are probably one of the things I see being adopted most quickly and most widely. My dad is like, “Oh, I asked Google a question and it gave me this answer. It was actually kind of useful. It summarized it.” I'm not going to then dive into a 30-minute conversation with him about what that's about, but just from the perspective of a globally recognized and used product having a fairly significant change to how the user interface works, it seems like people are finding value in the result. And I'm sure it's also extremely valuable for you and your team to see how often that result satisfies a user, or whether they click inside of it, explore further, or ask a follow-up question. You're getting so much user data on how this is working. 

LK I love that. I just had another tangential personal experience: I was sitting with my girlfriend, and she had sent an email for her job and then realized that she needed to unsend it very quickly because her coworker had just sent an email to their client, et cetera, et cetera. She went on Google and looked up how to unsend an email in Outlook and got one of the AI Overview answers. And it was literally great. It solved exactly the problem, and she got it instantaneously. So it's cool to see this actually being integrated into the behavior that so many users already have, which is to start with Google and ask a question.

[music plays]

BP Okay, everybody. It is that time of the show. Let's shout out someone who came on Stack Overflow and shared a little bit of knowledge. Congrats to Jonik, who earned a Populist Badge awarded five hours ago. Jonik, your answer was not the accepted answer, but it earned so many points that it now outscores the accepted answer. You explained to folks how to write an S3 object to a file, and over 63,000 people have viewed that question, so a lot of people benefited from your knowledge. Congrats on the Populist Badge. As always, I am Ben Popper. I'm the Director of Content here at Stack Overflow. Find me on X @BenPopper. If you have questions or suggestions for the show, want to come on as a guest, or want to hear us talk about a topic, email us at podcast@stackoverflow.com. And if you liked the conversation today, do us a favor– subscribe and leave us a rating and a review. 

LK And my name is Logan. I'm a PM at Google. And Ben, this was an awesome conversation.

BP Terrific. All right, everybody. Thanks for listening, and we will talk to you soon.

[outro music plays]