The Stack Overflow Podcast

What Gemini means for the GenAI boom

Episode Summary

The home team talks about Google’s new AI model, Gemini; the problems with regulating technology that evolves as quickly as AI; how governments can spy on their citizens via push notification; and more.

Episode Notes

Gemini, Google’s new AI model, is great at competitive programming, among other things.

AI Explained is a YouTube channel that covers the latest developments in AI.

One problem with regulating AI is that the technology evolves (much) faster than regulators can.

Wikifunctions is an open repository of code that anyone can use or contribute to.

Did we need another study to tell us that longer commutes are bad for mental health? Probably not, but here’s one anyway.

Are governments spying on you through push notifications? Sounds like it! In fact, push notifications are a privacy nightmare.

Episode Transcription

[intro music plays]

Ben Popper Now, according to this year's Stack Overflow Developer Survey, 49% of technologists said they learned to code from online courses and certifications. If you're looking to upgrade your skills, start a free trial with Pluralsight today to see if their tech courses, skill assessments, and hands-on labs are right for you. Visit pluralsight.com/stack to learn more. 

BP Hello, everybody. Welcome back to the Stack Overflow Podcast: astrological sign edition, Gemini plus pro ultra edition. 

Ryan Donovan Gemini in the house. 

BP I am your host, Ben Popper, Director of Content at Stack Overflow. I identify as a Pisces, kind of an Aries-cusp Pisces, but roll your own, and I'm joined today as I often am by the wonderful members of the Content Team: Eira May and Ryan Donovan. How's it going, y'all? 

RD Good enough. 

BP Oof. 

Eira May I identify as a double Leo over here. 

BP Double Leo, ooh cool, and you have twins. Gemini, Leo. So let's start with the big news from yesterday– Google released its Gemini model, which is meant to be a competitor to all the other AI models out there and which they claim is state of the art, top of its class, performs better than everything else, because it's multimodal from the beginning. So the really fun stuff they do is they'll talk to it and they'll say, “Hey, I'm going to put this paper ball down on the table here and then I'll cover it up with one cup and two other cups,” and it's like, “Oh, I see what you're doing here. You want to play a game? Let's play find the ball.” And then you start humming a tune and it's like, “Oh, you like country western? Let me make you a song.” It's just got all these other modalities now: video, image, sound, text. It can do some pretty fun things. I'll take first thoughts, first reactions, and then we can dive into AlphaCode 2, which is a little bit more pertinent to this podcast. They made a fine-tuned version, or a sub-version, of Gemini that's all about competitive coding, which we should discuss. 

RD Sundar Pichai's demo video is certainly full of very cool stuff like drawing something and describing what he's drawing. It's like, “Oh, it's a duck, and it's blue.” It's like, “I don't know of any blue ducks in the wild,” and then it's a rubber duck. It seems to have reasoning and that's really cool, but I would love to see somebody who doesn't work at Google play with this and do a demo. 

BP Yes, we won't get to see it in the wild. You can play with an enhanced version of Bard, which is supposed to have better reasoning capabilities in chat, but does not yet give you any of the multimodalities, still just text only. So a super fun, as you point out, demo video. No chance to go hands-on and no users have gone hands-on aside from maybe a YouTuber, Mark Rober, who made a video with it, but again, there I think it was mostly just suggesting things to him by text and he was kind of acting them out. Eira, thoughts on our AGI overlord?

EM I'm not sure that I have thoughts on that specifically, but I did read an article today that was all about how governments in Europe and in the United States are sort of failing to regulate AI because it is evolving too quickly and it's taking shape too fast for them to keep up with, so this does seem kind of relevant here. 

BP Excellent point, and I will say my favorite YouTube channel, AI Explained, this person always goes and reads the full 60-page technical report when something like Gemini or AlphaCode comes out and then does a video. He is a Londoner and was disappointed to say that none of this is coming to the EU, precisely because Google is like, “We're not really sure what the regulatory situation is like. We're not exactly sure what's going to happen.” So for now it's going to be a US-only release, although it is worth pointing out another thing that this model does, which is a bit different from ChatGPT, not completely different, but where it excels is that it speaks like 60 languages and can go back and forth between all of them, and you can speak to it, for example, in Chinese, and it won't translate that to text. It'll understand it as audio and therefore would understand tonality or something like that. So for a global behemoth like Google, this is really interesting. All right, let me push us forward a little bit to AlphaCode 2. So something we've referenced a bunch in our writing on the blog as we discuss AI assistants and where Stack Overflow is going to sit in the new world of Gen AI, is AI that's going to help you code and where the modern programmer sits in relation to AI. Will it replace them, will it enhance them, et cetera. So they made a version of AlphaCode back in the day, which in part draws on the AlphaZero lineage– worth mentioning here: the DeepMind programs that were the first to beat humans at Go, beat StarCraft pros, and cracked protein folding. That's the DeepMind side of the company. So AlphaCode, now powered by Gemini under the hood, previously was an average competitive programmer. It scored in the 50th percentile. 
Now it scores in the 85th percentile, and they say that what is so interesting about this is that to do these kinds of competitive programming questions and to score at this level, you have to be able to ideate, theorize, test, iterate, and in some ways be novel. They made it take tests that had never been leaked on the internet– so they say, who knows– so that it had to sit down and have a bit of a mind of its own. So I think this continues just to sort of push forward the discourse of how soon this kind of program will be able to replace a software developer in a certain capacity. Google I think is on board with the messaging that this is meant to augment, not replace, and they mentioned that when AlphaCode is paired with a human, it does even better. So not to worry, every AI will get a human buddy, okay?

RD The poor competitive programmers, they're out because they're competing against this machine that can do reasoning now and has access to all code ever written on GitHub or wherever. 

BP Right. So the rebuttal there, Ryan, is that people continue to play chess and Go, they just don't play it against a computer, except with a tuned down rating, and they've learned things from this stuff and the average skill level of a human is now better because we've picked up some strategies and we've upped our game. We can still play for the joy of it, these are games after all. I guess if you're a competitor, it's your sport, but I don't know if you can make a living at competitive programming. We should look that up, I'm not sure. 

RD I mean, you can put it on your resume and then make a living.

BP Right. One thing worth mentioning and one thing that I always thought was interesting when people are testing out an AI system is that they'll say, “I asked it to write this algorithm, and 30 seconds later it spit back this thing and it wasn't quite right.” And it's like, “Well yeah, if you asked a human being to write that algorithm and expected it back in 30 seconds, it probably wouldn't be great either.” We have this expectation that if we ask it something, it should be right on the first try. And in the competitive programming test that they let it take, they let it do chain of thought, so think it through in steps, and they let it test out 10 ideas, write them all down, and then pick the one it thinks is best. And so again, maybe a human being would test a few ideas before going with one. Obviously an AI has the capacity to test way more ideas way faster, and so in that sense, it's not really fair– not that it's meant to be fair, but its approach is scalable in a way that human creativity and intelligence isn't.
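The sample-and-select approach Ben describes can be sketched roughly like this. All the function names and the pass/fail rule here are illustrative placeholders, not AlphaCode's actual pipeline:

```python
def generate_candidates(problem: str, n: int = 10) -> list[str]:
    """Stand-in for a model sampling n candidate programs.

    In a real pipeline this would be an LLM writing n solutions,
    each with its own chain-of-thought reasoning steps.
    """
    return [f"candidate #{i} for {problem}" for i in range(n)]


def passes_sample_tests(candidate: str) -> bool:
    """Stand-in for actually executing a candidate against sample tests."""
    # Placeholder pass/fail rule; a real system would run the code.
    return sum(map(ord, candidate)) % 2 == 0


def best_of_n(problem: str, n: int = 10) -> str:
    # 1. Sample many independent attempts.
    candidates = generate_candidates(problem, n)
    # 2. Keep only the attempts that pass the visible sample tests.
    survivors = [c for c in candidates if passes_sample_tests(c)]
    # 3. Submit one survivor (real systems cluster and rank here).
    return survivors[0] if survivors else candidates[0]
```

The point of the pattern is exactly what's described above: generating many attempts and filtering them scales in a way a single human attempt doesn't.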

RD I think that's what made this release so interesting. It's got the chain of thought reasoning. From that video, it's doing chain of thought reasoning on ‘what is a duck that's blue’ and then kind of revising it and revising it. And I think the coding video talked about it coming up with a dynamic programming solution, and they're like, “Oh, this is taking a concept and then applying it to a different problem,” and that's an interesting chain of thought there– solving a problem and then adjusting it to be better. 

BP I think it beating all these benchmarks is probably less interesting in the long run than the fact that it continues to make progress and develop new capabilities– that's what's interesting– and that we continue to move towards multimodality that is closer and closer to the way a human would interact with the world. Ryan, you made the point about dynamic programming and what it can and cannot do. They also made the claim that it can contribute to scientific discovery. The way they framed that was, here's a paper where it's sort of a meta review. They went and looked at a thousand different genetic papers and they pulled out ones that are relevant and therefore they could do a statistical analysis and therefore they could drive the field forward by saying, “If you look at these hundred papers out of these thousand, we're learning some X.” And the machine is also very capable of doing that. In fact, it's way better at reading a thousand papers, figuring out which ones are salient, and then going forward. So it's not making a new discovery in the sense of Einstein saying, “I think time is relative or flexible,” but it's able to help folks who are in the scientific community make progress by being sort of a knowledge worker in the background. And this is similar, Ryan, to what you and I talked about with the folks at Sorcero. Nobody can keep up with all the medical literature, but if you have this AI assistant, it can point you in the right direction on a day by day basis if you're focused on solving a specific problem or disease or looking for a cure.

RD And I think that specific example took new data and added it to a graph: it found the right data, aggregated it, figured out how the existing graph was plotted, and then added the new data to that plot. 

BP All right, we have given a bunch of air time to Gemini. It certainly is interesting. We continue to write on the site about all this stuff. Ryan, Eira, and I are involved right now in writing a big ebook about AI that hopefully will be out for y'all to check out next quarter, as well as going through sort of an overview of what we think is going to be happening in the world of search and IDEs, knowledge ingestion, and how chat will change in 2024. So Eira, back to your point about whether governments are keeping up with this, I guess there's a lot of messaging that came with the release of Gemini: We have a red team, we're not releasing the ultra version yet because we don't think we've fully tested it, ethics is baked in from the beginning. I'll play devil's advocate and say that at no time in history have people at the forefront of technology spent so much time, as they release each new thing, talking about safety and red teaming and ethics. When we were doing the internet revolution and the social media revolution and the mobile revolution, there was none of this. So maybe we're not keeping up, but companies are sort of hyper-aware. I don't know, what would you say to that? 

EM I think they're very visibly aware because I think this is something that is sort of an exciting topic to talk about– robots taking our jobs. I think that was actually an article title that we used. Sort of this tongue in cheek reference to the fact that people do have some actual existential anxiety around the topic of AI that they don't necessarily have around other pieces of technology. I think that might be a part of what makes it different, the sort of psychological power. 

RD And it seems to be being developed more in the open. There's the story from the development of the atomic bomb: “Oh, this test could destroy the universe, there's a 10 percent chance.” But people didn't know it was being developed, so there was nobody to assuage but themselves. 

BP I think it is worth mentioning that there is obviously a strong, vocal, and public debate between scientists and researchers and various companies about how fast things should go and when it's safe to release a model and where you should draw the line between trying to build a business around this and being mostly focused on making sure that whatever comes out of this is aligned with our interests as a species. A huge part of the field's history is that people who were very early at OpenAI and very influential there left to create Anthropic, which is probably number two outside of the big tech companies in terms of developing this stuff and has taken a different approach. And this is no secret, we're not disparaging anybody, but there was a conflict within the board of OpenAI about how fast to move and whether or not the company was sticking to its original mission as a nonprofit, which is that the only thing that matters above all else is to make sure this stuff is developed safely and that it doesn't get out of control. So it feels like we're living in a sci-fi movie, it's pretty fun. 

RD And I think there were some people who did think this was put out really quickly and it had the benefit of being first, but we just put out a post with IBM about how they built watsonx and they said they'd been doing research for years, doing these sorts of research experiments, like the regular Watson. And then ChatGPT comes out and it's like, “Well, all right. We've got to figure out how to make this a product and do it safely because the cat's out of the bag.”

BP Other things that we wanted to chat about on this episode– Ryan, you sent along a thing about Wikifunctions, an open repository of code that anyone can use and contribute to under the Wikimedia Foundation. Tell me a little bit about what this is. 

RD So the announcement has been kind of vague about what it actually is and it's hard to interpret, but I think it's sort of a wiki of reusable functions that people can use at will, like any of the wiki topics, but for code. And I think that's definitely an interesting addition to the world, and I'm sure people will be copying and pasting Stack Overflow functions into there to get the rep. 

BP I have to say, shame on Wikifunctions for saying, “At the same time, it will increase the productivity of developers everywhere as they can use a large library of code instead of relying on properly copying and pasting answers from Stack Overflow.” No, sir. They may copy from you, they may copy from us. We are fans of copying and pasting from anywhere and as much as possible. It's not an either/or, Wikifunctions.

RD I mean, check your code, but it's going to be essentially the same as what a lot of NPM scripts do. You download a little script that is a single-serving function.
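A “single-serving function” is a package whose entire useful surface is one small function. A hypothetical Python analogue of the famous left-pad NPM package gives the flavor of what a Wikifunctions entry might look like:

```python
def left_pad(text: str, width: int, fill: str = " ") -> str:
    """Pad `text` on the left with `fill` until it is `width` characters.

    The entire "package" is this one reusable function -- the kind of
    tiny, self-contained snippet a repository like this would collect.
    """
    if len(fill) != 1:
        raise ValueError("fill must be a single character")
    padding_needed = max(0, width - len(text))
    return fill * padding_needed + text
```

For example, `left_pad("42", 5, "0")` gives `"00042"`, and padding a string already at or past the target width leaves it unchanged.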

BP And will this have the Wikimedia approach that anyone can come in and edit, or anyone can come in and contribute? 

RD I think so.

BP Because the issue with NPM and all these things is provenance. This is what we talked about with Sigstore. What's going to stop somebody from seeing whichever Wikifunction is most popular and then coming in and writing a backdoor on it? That's the thing we need to prevent. 

RD And how are people preventing that on Wikipedia? What's to prevent someone from going into any historical figure's page and saying they liked wearing pink hair and wigs? 

EM Sometimes nothing. 

BP All right, cool. Well, Wikifunctions– we're all about open source code and more places to do it. We had a great piece up recently from Eira about return to office, and whether it actually is scientifically shown in any way to be associated with productivity. And Ryan, you have something here from the scientific community in Korea. Commuting to work makes you sad? I don't know if you needed a study to tell you that, I could have told you that. What does the study say? 

RD Well, so they got the data around it. Korea apparently has one of the longest average commutes, something like an hour, and they also have a very high rate of depression. And somebody correlated the data of how long your commute was and how bad your self-reported depression was, and the two ended up correlating pretty strongly, independent of all other factors. So the more time you have to waste getting to the office, the sadder you are.

EM Can confirm. 

BP All right, it's team ‘work from home’ here. Except for me, I like to go to the coworking space. 

EM I'm also team ‘work from home,’ except I also have two toddlers, so sometimes that requires me to be team ‘work from coffee shop’ just to get a little bit more benevolent background noise. 

BP See, these are the things we don't talk about. Working mothers need a place out of the home to get away from the children. It's called an office and they deserve one. They should all have an office. 

RD There are definitely things I miss about the office, like talking to the people and the Friday kegs, but the commute isn't really one of them. 

EM Right. 

BP Yeah, the commute is not the part you miss, that's true. All right, let's go out on an interesting, slightly chilling note. Eira, this is from your home state. Senator Ron Wyden is telling us that foreign governments are spying on us through push notifications. I feel validated. I turn almost all push notifications off. What's happening here? 

EM I've done the same thing. I think it's just a best practice for life, but another reason to do it if you need one is that foreign governments, and it sounds like probably our own government, have been spying on user data through push notifications. So Ron Wyden, who's a Senator from Oregon who's kind of been an internet privacy guy and forward-thinking on this stuff since the early '90s, sent a letter to the Department of Justice basically saying that foreign officials are demanding the data from Google and Apple servers, and they want to know who is spying on whom and why. So no details about what's actually happening and who's being spied upon by whom, but just another sort of potential attack surface to be aware of in your personal cyber security stance, I guess. 

RD I think the thing that people don't really think about there is that all of those push notifications go through the Apple and Google servers. So you get all the metadata. I don't know if you get the data. So if you have a notification about your messages that includes part of the message, does Apple or Google get that? 

BP So this blog post that was linked here, “Push notifications are a privacy nightmare,” is basically saying that rather than you deciding when to open the app and sort of what action you want to take in there, the push notification requires your device to constantly be pinging in the background off to Firebase, AWS, whoever, Apple or Google, and then behind that some server, which routes the information and decides when to wake up your phone. So just as you point out, Eira, way more attack surface, way more instances where data is being sent both ways, and often unencrypted, it looks like. So we'll be sure to link that blog in the show notes.

[music plays]

BP All right, y'all. Let's take it to the outro. Like we do, we try to thank somebody who came on Stack Overflow and shared a little knowledge. “What is the exact definition of group(0) in re.search?” This is Python group(0). What does it mean, the mysterious zero? Well, if you're curious, manuel_va has an answer for you and was awarded a Lifeboat. Thanks so much, Manuel, for coming on and sharing some knowledge. This question was asked seven years ago and has helped over 25,000 people, so we really appreciate it. As always, I am Ben Popper. I'm the Director of Content here at Stack Overflow. Find me on X @BenPopper. Hit us up with questions or suggestions for the program: podcast@stackoverflow.com. If you want to come on and talk about something, email us there. And if you like the show, leave us a rating and a review. 
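For anyone curious about the question itself: in Python's `re` module, `group(0)` is the entire text matched by the pattern, while `group(1)` and up are the individual parenthesized capture groups:

```python
import re

# Search for an email-like pattern with two capture groups.
match = re.search(r"(\w+)@(\w+)\.com", "contact: alice@example.com")

# group(0) is the whole match; groups 1+ are the captures.
print(match.group(0))  # alice@example.com
print(match.group(1))  # alice
print(match.group(2))  # example
```

So the “mysterious zero” just means “everything the pattern matched,” which is also what `match.group()` returns with no arguments.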

RD I'm Ryan Donovan. I edit the blog here at Stack Overflow. You can find it at stackoverflow.blog. And if you want to find me, you can find me on X @RThorDonovan. 

EM I'm Eira May. I'm a content writer at Stack Overflow. I also work on the editorial team and work on a lot of our product marketing content, so I've been thinking a lot about AI from that perspective as well. And you can find me at all the places @EiraMaybe. 

BP Sweet. All right, everybody. Thanks for listening, and we will talk to you soon.

[outro music plays]