The Stack Overflow Podcast

A conversation with the folks building Google's AI models

Episode Summary

We sit down with Forrest Brazeal, head of developer media at Google Cloud, to discuss all the AI-powered goodies announced at today’s I/O event. Plus, a conversation with Paige Bailey, lead product manager for generative models at Google, who contributed to many of the projects that debuted today.

Episode Notes

Learn more about Forrest on his website and check out his newsletter.

You can follow Paige on Twitter or her LinkedIn.

Get on the list to try out some of the new stuff released today here.

Episode Transcription

[intro music plays]

Ben Popper Hello, everybody. Welcome back to the Stack Overflow Podcast. I have a special treat for you today: we are kicking off a new series with some of the folks over at Google. We are going to be talking about what's going on in the world of AI, which has sort of set the whole world abuzz, not just software. We're going to be talking about Google Cloud and some of the news that's coming up that's relevant to developers. I want to introduce my co-host who's going to be joining me, a bard over at Google. You said bard before Bard, right, Forrest? 

Forrest Brazeal Yeah, they called me the Cloud Bard. I think Bard has deprecated that title now, though. I’ve got to find a new one. 

BP But Forrest, for folks who are listening from the Stack Overflow Podcast side, introduce yourself and maybe give them a little backstory on how you and I connected. 

FB For sure. So my name is Forrest Brazeal. I'm Google Cloud's Head of Developer Media. Before that, I spent a number of years as a cloud architect, builder, engineer, as well as working with a lot of folks throughout the cloud community, helping them figure out how to use cloud services most effectively. And so I'm excited to be here now doing that full-time at Google Cloud. And Ben, I think you and I first connected, boy, sometime last year talking about how this generative AI is eating the world at this point. It's majorly affecting what's going on at Google and I think what's going on at Stack Overflow. It seemed like a perfect combination for us to talk about. So we're excited over the next few episodes to be bringing in a bunch of really interesting guests from inside of Google to talk about how we think about generative AI, what we're building, why we're building what we're building. We have, I believe, the first of those guests with us today. 

BP Yes, I am super excited for today's guest who just made some big news about new capabilities that are rolling out from Google Bard or with Google Bard or with Google's bard. Okay, forget it– that joke has gone on too long. But after we chat with her, Forrest and I are going to reconvene at the end of the episode and just go over a few little news items that will be dropping at Google I/O related, again, to AI and cloud, which software developers who are listening will probably find interesting. So stick around until after the interview and we'll swing back with a little coda and get you some news. We want to welcome to the show Paige Bailey, who is the Lead Product Manager for Generative Models, all things generative AI, and as of this week, also works at DeepMind, so congratulations. Paige, welcome to the program. 

Paige Bailey Awesome, thank you for having me. And thank you, this is news hot off the press. Brain and DeepMind had our merger into Google DeepMind just yesterday. 

BP Yeah, that's very cool. So for listeners, give them a quick flyover. I know you've been working in the field for quite a long time before sort of the Big Bang of AI happened, and since then have been involved in a lot of really cool projects. Tell folks a little bit about how you got into this world and then catch them up to what it is you do day to day. 

PB Absolutely. So I started doing machine learning around 2008-2009, back when the coolest thing you could do was basically decision trees or support vector machines or something similar. My background is kind of in the planetary science space– geophysics and applied math. That's what I did for undergrad and grad school. I was a practitioner for many, many years, and only came to product relatively recently, like three or four years ago. I'm a boomerang back to Alphabet: during the pandemic I got to spend just over a year at GitHub, helping out with fun projects like GPUs and Codespaces, VS Code, and then also of course Copilot. I came back to Alphabet particularly because I wanted to continue working on tools for large language models, and on large language models just generally– and of course we're expanding into the multimodal space now. But also on giving all of the developer tools that we have at Alphabet, both internally and externally, this machine learning ability– not just for inner loop software development, but also for these outer loop activities. That's what makes me jazzed. 

BP Very cool. I have to ask– how do you make the jump from geophysicist to computer scientist or AI specialist? What was the pathway there? 

PB So the fun thing about geophysics is that it's basically just an excuse to do computer science if you like rocks. Most of your day-to-day is analyzing large swaths of data, pre-processing it, post-processing it, and geophysicists actually adopted GPUs before most of the machine learning industry and world did. 

BP Got into GPUs before they were cool.

PB Yeah, exactly. It's always surprising for people to hear that I started learning about CUDA specifically because of geophysics problems, and not because of 2015, when TensorFlow had just been open sourced and people were starting to figure out how to wrangle it to work with GPUs. It was a natural evolution. And basically, I like to tell people that if you're in any field that does a lot of work with large swaths of data, using programming tools to analyze it, you're pretty well-positioned for learning more about the machine learning space and about generative AI in particular. 

FB So you might say you've gone from thinking about rocks to teaching rocks to think, right? 

PB Exactly, very cool. And these are some very fancy rocks these days, though. If you've seen the new GPUs from Nvidia, they are pretty rock solid. And then also the TPUs on GCP are pretty great too. 

FB I do want to jump in real quick though and go back, Paige, because you mentioned coming back to Alphabet, to Google specifically, to work with LLMs, which of course had been a thing for a bit. But do you remember a moment –this probably came earlier than December of 2022 for you, unlike for a lot of us– when you first went, “Oh, this changes everything. This field is about to change,” when you looked at LLMs? Was there kind of a specific light bulb moment for you?

PB I think that working with Copilot was honestly that light bulb. So historically there had been many different models that were kind of single task or few task, highly specialized systems. So you might have a model for code completion and you might have a model for debugging and you might have a model for build repair, or those sorts of things. And it was really kind of mind boggling to me to see that there were these general purpose models capable of so many things all at once, and not only could they give you really nice code recommendations and sort of code generation embedded within your IDE, but you could also figure out ways to get these models to explain your code for you. And so the foundational models that we create at Alphabet, they're becoming increasingly capable of doing not just things like source code. So they're pretty great at being software engineers, but they're also really wonderful at math and reasoning and they can generate recommended music for you. They can write emails for you. They can create a melody for you if you want it to have a specific tone. And so seeing these capabilities arise around the GPT-3, GPT-3.5 timeframe was really, really cool. I think PaLM also helped show folks that some of these capabilities would be possible. I mean, having a model explain a joke to you, who would've thought that would've been something that we would see? 

BP No more Urban Dictionary for me. All my memes will be explained by chatbots now. Thank God. 

PB That's actually a thing. We're all going to be out of a job, you know? But it's just been honestly beautiful to see how these things are revolutionizing the entire industry.

BP So you mentioned that Copilot was kind of the spark for you where you realized just how capable and flexible some of these models were. You yourself wrote a blog post that just came out today –let's timestamp, this is Friday, April 21st– making some news about Bard and its ability to interact with software. Can you just tell us a little bit about that? 

PB Absolutely. So Bard released the ability –or unlocked, I should say– the ability to code today, in addition to some other features. So we have ‘export to Colab’, and more export capabilities will be coming soon. We also have the ability to do recitation checking, so if the model generates some code that is a verbatim copy from GitHub, it will point you back towards the repo that the open source author created and will also tell you about its license, which is quite nice. But the foundational model underlying Bard is capable of doing not just code. It's pretty good at code, but it's also wonderful at writing emails or generating itineraries or helping you plan your kid's birthday. That's kind of the idea, and it's based on a foundational large language model backbone. We can't share specifics about dataset mixtures because we're in this brave new world of no publications really, or at least no detailed technical publications. But as for the proportions, you could imagine lots and lots of source code tokens, lots and lots of math and reasoning data, lots and lots of science examples. 
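
To make the recitation checking idea concrete, here is a minimal sketch of what such tooling might look like. Every name and the similarity threshold here are hypothetical– Bard's actual implementation is proprietary and certainly more sophisticated– but the shape is the same: match generated code against indexed open source and surface the repo and license on a near-verbatim hit.

```python
# Hypothetical sketch of recitation checking. None of these names are
# Bard's real internals; this only illustrates matching generated code
# against indexed open source and surfacing provenance on a match.
from difflib import SequenceMatcher

# Toy "index" of open source snippets with their provenance.
OPEN_SOURCE_INDEX = [
    {
        "code": "def fib(n):\n    a, b = 0, 1\n    for _ in range(n):\n        a, b = b, a + b\n    return a",
        "repo": "https://github.com/example/algorithms",  # made-up URL
        "license": "MIT",
    },
]

def check_recitation(generated: str, threshold: float = 0.9):
    """Return provenance info if `generated` is a near-verbatim copy."""
    for entry in OPEN_SOURCE_INDEX:
        similarity = SequenceMatcher(None, generated, entry["code"]).ratio()
        if similarity >= threshold:
            return {"repo": entry["repo"], "license": entry["license"]}
    return None  # No recitation detected; show the code without attribution.

match = check_recitation(
    "def fib(n):\n    a, b = 0, 1\n    for _ in range(n):\n        a, b = b, a + b\n    return a"
)
if match:
    print(f"Source: {match['repo']} (license: {match['license']})")
```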

FB That is so cool, Paige. And I want to zoom in real quick on this idea of recitations, because one of the foundational AI principles at Google is this idea of accountability, of making sure that we can defend the factuality of things that these AI models spit out. How do you think about that more broadly in the work that you do, even beyond this specific example of Bard's coding ability?

PB That's a great, great question, and I think the responsible AI and trust and safety teams at Google spend an awful lot of time thinking about these questions in particular. Something that I also think resonates particularly well with developers is that I've spent a lot of my lifetime doing open source work, and being able to feel recognized for your contributions matters. And just as an end user, if I want to have a function that does foo, bar, baz, and somebody says, “Well hey, here's an implementation, and also here's a link to a GitHub repo that has a library that already does what you need,” that's much more helpful to me than copy/pasting code that I would have to maintain over time. So as an end user, it makes me feel much more confident about the outputs that I'm seeing and can also point me in the general direction of other things that might be useful. And as a person whose source code perhaps lived on GitHub and helped to train such large models, I very much appreciate it when my work is referenced as it's shown to someone.

FB Sure. I feel like something I've heard a lot from developers is, “Well, is it even worthwhile contributing to open source anymore if it's all just going to get ground up in training and all this?” But I think you make a great point about how some of these models used responsibly can actually help surface that and provide that credibility for developers. 

PB Yeah, and I have to say, as a person who worked at GitHub, a lot of the code on GitHub is real bad, like super bad. And the code review process for most repos is nonexistent. It's like, “You push changes and yolo,” and so the kind of code quality that is available on GitHub leaves a lot to be desired. Some languages are well represented– so Python, JavaScript, et cetera– but the long tail languages are not represented much at all. And so I think that if we want to improve these code generation tools over time, we should be contributing code– contributing great code, code that has been code reviewed. And if you have a favorite long tail language, whether that's Turbo Pascal or Fortran or whatever it happens to be, then you absolutely should submit examples. Because if you do, then the models will just get better over time.

BP I have a friend who works in software development and I was asking him about what the impact of Copilot has been, and he said exactly what you were referencing there, Paige. He works in Ruby and he says it just isn't very helpful for autocomplete, and that he has other friends who work in Python and JavaScript and it's ready and willing to help them out. But for some of the longer tail languages, it's not always so prolific with its recommendations. 

PB I was just going to say, and sometimes it also recommends dated APIs– like APIs from an older version of a framework, or something that's not super contemporary– and hopefully that will improve over time as well. 

BP Right, there was the merging of Google Brain and DeepMind. There was your announcement today about Bard being able to code. Stack Overflow put out a blog post recently about how we're trying to adapt to the world of Gen AI, making some of the same points: like you said, we want people to continue contributing knowledge to this sort of open community where folks can come in and discuss quality or suggest improvements or an update if something is out of date, and not just become a training set that's behind a black box or that you have to pay in order to access. So there's a lot of questions that are arising from this, but I think also just so much excitement among developers, or even folks like myself– I'm really more on the marketing team, but I could whip up a little Firebase app now, going back and forth with the AI as my tutor. And so that's an exciting development, to have more people maybe be able to jump into the world of software. 

FB Ben, I want to jump in on something you just said, because one of the things that I'm hearing a lot from software engineers is, “Well, I mean, anybody can generate some code now with some of these tools, but we're concerned about the quality of what's being generated.” I mean, think about reviewing a 7,000 line pull request that somebody on your team wrote. It's very, very difficult to do that and give meaningful feedback, and it's not getting any easier when it was an AI that generated this huge amount of code. So we're rapidly entering a world where we're going to have to come up with software engineering best practices to make sure that we're using Gen AI effectively. Being on the cloud side, we see this with config as well as with more code-heavy types of implementations. Paige, maybe you could tell us a little bit about some best practices you would recommend, both for software developers and for more broad DevOps-y/IT people who are trying to figure out how to responsibly generate and use AI-assisted code as part of their production software applications.

PB That's a great question. I think all of the same rules apply as with code you find on the internet: don't just copy and paste it. At best, you should view it as something that an L3 SWE helper at your bidding has produced for you, and you should really rigorously look it over. Over time, I do have the expectation that large language models will start kind of recursively applying themselves to the code outputs. There's already been research from Google Brain showing that you can recursively apply LLMs such that if there's generated code, you say, “Hey, make sure that there aren't any bugs. Make sure that it's performant, make sure that it's fast, and then give me that code,” and then that's what's finally displayed to the user. So hopefully this will improve over time. 
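
In code, that recursive self-application might look something like the loop below– a rough sketch only, where `generate` is a placeholder for whatever LLM API you're using, not a real library call and not Google's actual research implementation.

```python
# Minimal sketch of recursively applying an LLM to its own output.
# `generate` is a placeholder for a call to any LLM API.
def generate(prompt: str) -> str:
    """Placeholder for a call to your LLM of choice."""
    raise NotImplementedError("wire this up to a real model")

REVIEW_PROMPT = (
    "Review the following code. Make sure there aren't any bugs, make "
    "sure it's performant, make sure it's fast, and then give me that "
    "code:\n\n{code}"
)

def generate_and_refine(task: str, passes: int = 2) -> str:
    """Generate code for `task`, then run it back through the model."""
    code = generate(f"Write code that does the following: {task}")
    for _ in range(passes):
        # Each pass asks the model to critique and rewrite its own
        # output; only the final version is displayed to the user.
        code = generate(REVIEW_PROMPT.format(code=code))
    return code  # Still review this yourself before committing it.
```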

BP You're saying this is real. I mean, I was going to make a joke about Auto-GPT and just you put the code out there and then ask it to refine it and shorten it and test it. But I guess that's really something that will be coming along, isn't it? 

PB Yeah, harder, better, faster, stronger. And then there are also retrieval techniques. So you have a large language model, and perhaps you say, “Hey, just give me recommendations that are specific to the code conventions that you see in my company’s repos, in my company's assets, or these specific style guides.” So I do think there's a lot of onus on us to kind of bake those best practices into the tooling, because no human is perfect. None of us will ever be able to remember all of the best practices all of the time. But in the interim, now that we've unlocked this brave new world of generative AI in the machine learning developer space, I think that, again, at best, consider this as code that was recommended to you by an L3 SWE who is at your beck and call, and that you should really, really rigorously look at before you just autocommit to whatever repo you're working in. 
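
A sketch of that retrieval idea follows– every function here is a made-up placeholder, just to show the shape of grounding a prompt in your own company's conventions and code rather than any real product's API.

```python
# Hypothetical sketch of retrieval-augmented prompting: pull the
# company style guide and related in-house snippets into the prompt so
# the model's recommendations follow your conventions. All names are
# invented for illustration.
def fetch_style_guide() -> str:
    """Placeholder: load your company's coding conventions."""
    return "Use snake_case. Prefer explicit imports. Docstrings required."

def fetch_similar_snippets(task: str, k: int = 3) -> list[str]:
    """Placeholder: search your company's repos for related code."""
    return ["def load_config(path):\n    ...  # existing in-house helper"]

def build_grounded_prompt(task: str) -> str:
    """Assemble a prompt that grounds the model in company conventions."""
    snippets = "\n\n".join(fetch_similar_snippets(task))
    return (
        f"Follow these conventions:\n{fetch_style_guide()}\n\n"
        f"Here is related code from our codebase:\n{snippets}\n\n"
        f"Task: {task}"
    )

print(build_grounded_prompt("Parse a config file into a dataclass"))
```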

BP So Paige, one of the things you mentioned that is really interesting about today's Bard announcement is that you're able to do attribution, and I do think one of the things that emerged within the last six months was this question of, to what degree are these LLMs a black box? Do they know where they're getting the information from or how to validate it? Is attribution an emerging capability of the LLM itself, or is there some additional outside tooling or work that has to happen in order to be able to say, “Hey, we got it from reading this link, or from this repo”?

PB So I will say that the way that it works in Bard is through kind of additional tooling around the model output itself, such that we are capable of doing those back references and retrieval checks. I think there's also a lot of appetite from the search team within Alphabet to make sure that any of the recommendations that are given by these products are grounded, or as grounded as we can make them. It's cool if you're using a generative tool as a makeshift DM for your D&D game, because infinite cool stories, all the time. But if you're actually using it for work that impacts a person's life or could potentially go into production code systems, then there needs to be a lot more rigor and oversight. And that's something that we care an awful lot about at Alphabet. We've been doing machine learning for a long time, and I've really appreciated the kind of maturity with which we're approaching integrating these LLMs into our product surfaces. 

BP So one of the things that is really exciting and which I've seen demoed is that these models aren't just sort of flexible and general purpose when it comes to text, but they can be multimodal. You mentioned that before, and you mentioned music. Are you telling me that I'm going to get to play around with a model soon that I can talk to, it can do text, maybe it'll write me a little code, produce an image, and then sing me a song? Because I haven't played with a model that does all that in one place yet, so I'm very excited if that's on the way. 

PB I will say that there is work in flight to make such dreams a reality. I also think that in the interim you can do really interesting things, such that if you have a model that's perhaps multimodal– so it can accept images, video, audio, et cetera, and then perhaps it just generates text as an output– you can also cap it with a diffusion model, such that it will take the text that's generated and then generate images or video for you. So you could easily imagine a situation where somebody inputs a whole bunch of video into a model, or they ask a model, “Show me a video of chopping celery,” and it's capable of generating that on the fly. And we already have models that do this. There's something called Phenaki that I think was referenced recently during a 60 Minutes episode about Google Brain– if you input text, it will output video for you. And we also have MusicLM, which will take text as input– so, “Play me classical jazz that I would listen to in a coffee shop on a rainy day”– and it will create audio for you. 
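
That "cap it with a diffusion model" pattern is really just two stages in a pipeline. A rough sketch, with placeholder functions standing in for the real models– neither of these reflects an actual Google API:

```python
# Sketch of chaining a text model with a generative media model.
# Both functions are placeholders for real models, not actual APIs.
def text_model(prompt: str) -> str:
    """Placeholder: a multimodal model that accepts text (and perhaps
    images, video, or audio) and emits text, such as a rich scene
    description."""
    return "Close-up of hands chopping celery on a wooden cutting board."

def diffusion_model(description: str) -> bytes:
    """Placeholder: a text-to-image or text-to-video diffusion model
    that renders the description into pixels."""
    return b"...image or video bytes..."

def generate_media(user_request: str) -> bytes:
    # Stage 1: the language model expands the request into a detailed
    # prompt for the media model.
    description = text_model(f"Describe a video of: {user_request}")
    # Stage 2: the diffusion model renders that description.
    return diffusion_model(description)

video = generate_media("chopping celery")
```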

FB So you're deprecating me as the Cloud Bard in more ways than one, Paige.

PB Like I said, I am very stoked. I play piano, but I am terrible at constructing beautiful tunes. I can pluck out a melody, but not the rest of it. So having a backup Paige ensemble would be super cool. 

BP Yeah, that'd be great. An arranger for you to help with some of those ideas and clean them up a little bit. All right, Paige. Well, thank you so much for joining us, especially when you had so much big news to share this week. I'm sure there's a million amazing projects you're trying to run lightspeed at, so again, thanks for taking the time and I hope we get the chance to talk again in the future.

PB Absolutely. Thank you so much for having me. This was a blast.

[music plays]

BP All right everybody, it is that time of the show. I'm going to do the outro and the lifeboat, but first, Forrest and I are going to chat just for a quick second about some of the big stuff that's happened this week which he was able to bring me first. So Forrest, take it away. 

FB Cool, thanks Ben. So look, by the time you hear this, Google I/O will have happened, and one of the big things you'll have heard about is some new AI-assisted coding experiences that are coming to Google Cloud. So depending on where you're writing your code– if you're in VS Code, you'll have some contextual ability for Google's AI models to suggest code for you. You'll also have that in other editing surfaces: if you use Cloud Workstations or Cloud Shell, it'll pop up there. Or even if you're writing SQL in the BigQuery editor, you'll have some of those AI-assisted capabilities available as well. And that's exciting, but I do just want to put it in perspective: this is not Google Cloud all of a sudden having AI where it didn't before. There's a pretty sick stack of things built on Vertex AI and BigQuery ML and other tools that have been around for a bit, so if you're just learning about those for the first time, go check out that whole stack. It's really sweet. But what we're seeing now is some new developer tools that will bring that AI assistance right into your development workflow, as well as into the final product that you ship. So that is super, super exciting. We're happy about that. And I think it ties in well to the conversation we had with Paige, Ben, about this ubiquity of AI programming assistants that are coming out and what that's going to mean for the developer workflow. 

BP Yeah, absolutely. I mean, I think to some degree, like you said, it's an extension of abilities or things that already existed. There was maybe some autocomplete or a bit of a tutorial that it took you on. Now that journey is just so much more ubiquitous and so much more powerful. You have this little thought calculator that can come out and help you start to put concepts together or finish your sentence where you couldn't before. And I guess, as you mentioned during the episode with Paige, one of the things that's really interesting to think about is code quality and provenance. So it'll be really interesting to see how Google implements all of that in the cloud systems, and also how developers start to think about that stuff as they go from just building toys that you can share and get everybody excited about on Twitter to trying to push something into production that has code just co-written by an AI.

FB That's exactly right. And we've shipped some generative AI support in Vertex AI, which I already mentioned is kind of Google Cloud's flagship AI workflow product. But as well, there's this thing that's come out called Gen AI App Builder. And if you're totally new to building apps with AI, you might find that interesting. It's got maybe a little bit of a Visual Basic 6.0 drag and drop vibe to it. I don't know if that's your jam or not, but if it is, maybe check that out, see what that can do. There's a waitlist for some of these things, but you can check that out at cloud.google.com/ai and you can get to the front of the line to be able to test these products as they become available.

BP All right, everybody. It is that time of the show. Let's shout out a user who came on Stack Overflow and helped to spread a little knowledge or save some from the dustbin of history. Someone is trying to reference System.Drawing in a .NET Core console application, but the assembly is not there. Yikes. Well, J. Doe has an answer for you, and over the years has helped over 50,000 people. So we appreciate it, J. Doe. Thanks for coming on Stack Overflow and spreading some knowledge, and congratulations on your Lifeboat Badge for saving that question. As always, I am Ben Popper. I'm the Director of Content here at Stack Overflow. You can find me on Twitter @BenPopper. Email us with questions or suggestions, podcast@stackoverflow.com. And if you like the show and you're enjoying what Forrest and I are doing, send us an email to let us know what you want to hear about in the future. We have a bunch more ideas for great guests we're going to bring on, folks who have helped to build some of the foundational technologies that Google and others are leveraging, as well as folks who are pushing new product into the world. And if you like the show, hey, you could also leave us a rating and a review, because it helps. 

FB Well, thanks as always, Ben. I'm Forrest Brazeal, the Head of Developer Media at Google Cloud. You can find me on Twitter @ForrestBrazeal, and if you want to connect more deeply, go to cloud.google.com/innovators. I send a pretty thorough email newsletter update every week covering stuff like this that happens in the Google Cloud ecosystem. So make sure you sign up for that, you can get it for free. 

BP Awesome. All right, everybody. Thanks for listening, and we will talk to you soon.

[outro music plays]