The Stack Overflow Podcast

Your whole repo fits in the context window

Episode Summary

The home team discusses the challenges (hardware and otherwise) of building AI models at scale, why major players like Meta are open-sourcing their AI projects, what Apple’s recent changes mean for developers in the EU, and Perplexity AI’s new approach to search.

Episode Notes

AI shops are now releasing LLMs optimized for RAG

Turn a repo into a prompt for a long-context LLM.
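
To make that concrete, here is a minimal sketch in Python of what "turn a repo into a prompt" can look like: walk a repository and concatenate its text files into one prompt string for a long-context model. The extension filter and file-size cap are our own illustrative assumptions, not taken from the tool linked above.

    # Minimal sketch: flatten a repo into one prompt string for a long-context LLM.
    # The extension filter and size cap are illustrative assumptions.
    import pathlib

    TEXT_EXTENSIONS = {".py", ".js", ".ts", ".md", ".txt", ".java", ".go"}
    MAX_FILE_BYTES = 100_000  # skip very large files so the prompt stays manageable

    def repo_to_prompt(repo_path: str) -> str:
        parts = []
        for path in sorted(pathlib.Path(repo_path).rglob("*")):
            if (path.is_file()
                    and path.suffix in TEXT_EXTENSIONS
                    and path.stat().st_size <= MAX_FILE_BYTES):
                # Label each file so the model knows where one ends and the next begins.
                parts.append(f"--- {path.relative_to(repo_path)} ---\n"
                             f"{path.read_text(errors='ignore')}")
        return "\n\n".join(parts)

    if __name__ == "__main__":
        prompt = repo_to_prompt(".")
        print(f"Prompt is roughly {len(prompt)} characters")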

Perplexity AI is an AI-powered search and discovery tool.

Good news for developers: Apple will not remove progressive web app support on iOS in the EU.

Basil Bourque earned a Lifeboat badge for his answer to "How to get full name of month from date in Java 8 while formatting."

Episode Transcription

[intro music plays]

Ben Popper Hello, everybody. Welcome back to the Stack Overflow Podcast, a place to talk all things software and technology. I'm your host, Ben Popper, joined as I often am by my colleague and collaborator, Ryan Donovan. What's happening, Ryan? 

Ryan Donovan Hey, Ben. How’re you doing today? 

BP I'm good. So I have been writing a piece that's kind of like the ‘build versus buy’ question in the Gen AI world– what should your organization do? You’ve got lots of different options. And there's a section at the beginning that just sort of says, “Look, if you want to build your own model, you can, and you can build them at different sizes, but if you want to get something pretty good, you're going to need to bring a lot to the table– a lot of data, a lot of hardware, a lot of expertise.” So today, Meta released a post open sourcing the details on the hardware, network, storage, design, performance, and software that they're using to train their next model, Llama 3, and it kind of put everything in perspective. To give you a sense, they're open sourcing what it's like to work across two 24,000-GPU clusters, and by the end of the year, with this infrastructure build out, they're aiming to have 350,000 NVIDIA H100s as part of a portfolio with compute equivalent to 600,000 H100s. Now obviously, most companies are not trying to do what the Metas, Apples, Googles, and Microsofts of the world are doing, but to give you a sense of scale for a state-of-the-art foundation model, there are some really interesting details in here. 

RD It is interesting to see the actual numbers there. I read something in the last couple of days. Somebody was trying to hire a Meta senior AI researcher, and they said, “I'm not even going to talk to you until you have 10,000 H100s,” and that's the sort of minimum.

BP Yeah, there's stuff that this person wants to do in the field that's interesting and cutting edge, and for them, that's the table stakes. 

RD And we've seen research models trained on a hundred A100s, which is a sort of step down. So you can build a smaller model, but if you want to roll with the big dogs, you’ve got to have a lot of bark. 

BP This was the first time I heard about Grand Teton, which is Meta's in-house open GPU hardware platform that they contributed to the Open Compute Project, so that was interesting to hear about. Grand Teton, Open Rack, and PyTorch are the three sort of core technologies that are open to everyone and that they're building this on top of. And it's really interesting because there are a lot of questions out there about what should be open source or closed source and what's safe to share when it comes to cutting-edge AI, and it seems like some of the major players are going to take very different approaches to that, and it will be interesting to see how that plays out. What does Meta gain by letting everybody know how to do this in terms of hardware? Well, I guess people come around and improve the technology that they're building. If a hundred thousand other people end up using Grand Teton and offering bug fixes and pushing it to the next version, that's all to Meta's benefit. And they're not planning to sell, at least for the moment, AI services directly to anyone. 

RD And maybe they don't think that AI is sort of a primetime threat. I think the other thing is that it's nearly impossible for anybody to duplicate their setup because of the shortage of chips. They can be like, “You should get this billion-dollar boat, that's how you sail to this location.” 

BP Yeah, exactly. Build this, I dare you. I've already bought all the H100s. Good luck to you, sir. Exactly. They also sort of just mentioned casually, they're like, “This is stuff that we're doing. We want to push forward research, we want to make it open source,” and then they're starting to roll this stuff out into apps. One of the ones I like is the stickers that you can use for conversations across WhatsApp, Messenger, Instagram, and other platforms too. The stickers are just text-to-sticker: whatever sticker you want, it's generated, it appears– the raccoon on the skateboard holding stuff. And I don't know if we mentioned this on the last episode, but there have been some pretty enjoyable– I don't know how technically impressive they are, but functionally impressive– upgrades recently. Stability AI and–

RD Midjourney.

BP Midjourney. So the open source Stability image generator can now do text super well, which was one of the things that always held it back from being a complete graphic design solution. “Make me a T-shirt for my kid’s band that looks like this and says the band name in the middle,” and the band name in the middle would be all morphed or the letters would be all mixed up or whatever. Now that it has kind of a ControlNet for doing the text, it is pretty incredible what you can do. And then just today, which is March 12th, Midjourney introduced something called ‘character reference.’ So you can put in a picture of somebody you know and then ask it, “Make this person as an explorer climbing a volcano,” and I will put it in the show notes– there I am climbing the volcano. It's a striking resemblance, I will say. I feel good about it. It looks like me, could have fooled me, and so that makes it a lot more enjoyable. 

RD I think it's interesting that you referenced ControlNet, seeing some of the background there. They're using a combination of the sort of diffusion models and some more rigid image recognition, and I think that points to a future where models will be using these combinations of things, not just pure diffusion or statistical learning. They'll have knowledge graphs and whatever. 

BP I think even at launch we saw that ChatGPT could do a bunch of things but not math or current events research, and now it has a calculator built in, it has a code interpreter built in, it has a connection to Bing search, and when it needs to avail itself of those tools, it can do so. What was it the other day that somebody said to us when we were on with Roie from Pinecone? This is a classic Andrej Karpathy quote, but Gen AIs, LLMs– they're dream machines. Their job is to create content, not necessarily to be accurate, to guess what would fit into this blank space, and the same with an image diffusion model– “Dream me up something amazing that looks like this.” And then these other AI agents, some of which are built differently, can be more structured, can enforce more accuracy to the prompt or inject legible text or things like that, which makes it a more fully-fledged tool for going from, “I can't do anything” to “I can whip up my own graphic design solution for my small business,” or whatever it may be. 
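
For anyone curious what “availing itself of those tools” can look like in code, here is a minimal sketch of a tool-dispatch step in Python. It is not OpenAI's actual function-calling API– the tool names, the calculator, and the placeholder search are illustrative assumptions.

    # Minimal sketch of tool dispatch: the model requests a tool, we run it,
    # and hand the result back for the model to fold into its answer.
    # Tool names and the example request are illustrative assumptions.
    import ast
    import operator

    def calculator(expression: str) -> str:
        # Evaluate simple arithmetic safely instead of calling eval().
        node = ast.parse(expression, mode="eval").body
        ops = {ast.Add: operator.add, ast.Sub: operator.sub,
               ast.Mult: operator.mul, ast.Div: operator.truediv}

        def walk(n):
            if isinstance(n, ast.BinOp):
                return ops[type(n.op)](walk(n.left), walk(n.right))
            if isinstance(n, ast.Constant):
                return n.value
            raise ValueError("unsupported expression")

        return str(walk(node))

    def web_search(query: str) -> str:
        # Placeholder: a real system would call a search API here.
        return f"[search results for: {query}]"

    TOOLS = {"calculator": calculator, "web_search": web_search}

    def handle_model_request(tool_name: str, tool_input: str) -> str:
        # Route the model's structured tool request to the matching function.
        return TOOLS[tool_name](tool_input)

    print(handle_model_request("calculator", "1234 * 5678"))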

RD Right. That's why you see so many of the applications now being kind of toys, because it is wonderful to see, “Oh, wow, what a surprise you've given me with these wild things.” And the Midjourney update is much better at handling very descriptive prompts, but I almost miss the random basket of prompts that it used to give me, where it's like, “What are you going to give me? What is Dracopunk or heat wave or whatever?” 

BP Yeah, there's going to be a lot of nostalgia for the ‘Will Smith eats spaghetti’ era, the goofy hands era of AI when it was generating these occasionally nightmarish, sometimes wonderful mistakes. Those were good times. 

RD The defining feature of any medium is its flaws. 

BP Exactly. 

RD So speaking of image generators, there was a post I saw recently about one created in 1974, an image generator called ‘Aaron.’ And it's not a full image generator in the way that we see them, it kind of creates scribbles and then the artist is able to fill it in, but it was doing the early sort of algorithmic computer-assisted art back in 1974.

BP I was watching a talk from Geoff Hinton, a pioneer in the field of neural nets and deep learning who was at Google for a long time, has since left, and is out warning of the dangers, as he sees them, of superintelligent AI. But he was sort of saying, “Look, here's essentially a language model I built in 1990. It's extremely limited, it has 100,000 neurons, and it can tell you that if Nancy is Bill's mother and Bill is Joan's husband, then Nancy is Joan's mother-in-law,” or whatever. It can work over a few semantic concepts and come up with ideas, but it was cool to see sort of the germ of the idea that then grew into the bigger thing. 

[music plays]

BP Develop skills to build accurate, explainable Gen AI apps with online courses at Neo4j Graph Academy. You’ll learn to ground LLMs with knowledge graphs, and how to develop a reliable chatbot. Start today at neo4j.com/llms. 

[music plays]

BP All right, some big news for developers everywhere. Apple announced three further changes for developers in the European Union. Again, we're recording this Tuesday, March 12th. Developers can distribute apps directly from web pages, choose how to design in-app promotions, and more. In this podcast we don't really get into the nitty-gritty of throwing stones, but I think this is important news for developers. Obviously, there's been ongoing litigation between Apple and other companies about the fees assigned to in-app payments and the ways in which folks can distribute or not distribute their apps, so the EU has been passing new regulations and it seems like Apple has decided to make some meaningful changes. I don't know, I guess we'll see. 

RD I know this comes as a reversal– initially they decided to ban progressive web apps and then walked that back. So personally, I think it's good that Apple is opening up a little bit. It was a bit of a walled garden, but it also goes against one of the things that people sometimes like about Apple, that it's very protective of its users.

BP Right. Maybe you get a spam ad now that drives you to a website where you download something onto your iPhone and all of a sudden you've been pwned. Nobody wants that. 

RD They just set up a fake-looking App Store and you download whatever app.

BP Careful what you wish for, I guess. We'll see what consumers think a few years from now. All right, in other news, Cognition AI, which is a relatively small AI startup– I think they raised $20 million or something– introduced a new product today, which is March 12th: Devin. They call it the first AI software engineer. I don't know exactly what they mean by that, in the sense that there are plenty of AI helpers out there. I guess what they're pitching here is that this is a fully autonomous AI software engineer. It's a tireless, skilled teammate, equally ready to build alongside you or independently complete tasks. We'll include some of their examples in the show notes, but it can read a blog post, run ControlNet on Modal, and produce images with concealed messages. It can build an interactive website to simulate the Game of Life and deploy it to Netlify. It can help you maintain and debug code. So this is a level of autonomy that most of the folks we've talked to– from Replit, and more recently the security company we spoke with– say is not where they want to go. They want to do developer augmentation: help you write code, not come in and do it all for you. This company is pitching an agent that gets in there and rolls up its sleeves and does a lot of the work for you. So we'll see if it catches on, I guess. 

RD I think this is where a lot of the sort of hype and fear comes from– that AI is going to take people's jobs– and this seems to be exactly that, as a product.

BP Yes. So if folks play around with Devin and they want to give us some feedback, let us know. Unfortunately, yes, it's out there taking jobs off of Upwork and doing the side hustle. Ugh, Devin! Shakes fist. Exactly. So a couple of other housekeeping notes. We released a great podcast earlier this week with Ryan Polk, who is the new Chief Product Officer here at Stack Overflow– he hadn't been on the podcast before. If you're interested in Stack Overflow and the folks who are helping to lead our technical and product organization, it’s a great interview with him, with a lot of discussion about what our stance is on ethical AI and how we would partner with big LLM providers through our API licensing. And then last week we announced some pretty cool changes to the Teams homepage. So if you're a Teams user, a lot of big changes rolled out there– a lot of quality-of-life improvements– so you can check out that blog if you're interested. Ryan, anything else you want to chit-chat about before I take us to the outro?

RD So I heard about this new search engine called Perplexity that's trying to be rooted in AI. And it's trying to stay completely up to date– it's basically one big RAG pipeline for the internet with AI over it. And I think what's sort of interesting about this, and what makes me hopeful, is that it's making SEO a little less relevant. I think that's a little weird to say as a content marketer, but I think a lot of people have been just gaming SEO to put up chum in the water– the same thing that everybody else has– and hopefully this will bias towards more original content. 
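
As a rough illustration of the “one big RAG pipeline” idea: retrieve the documents most similar to a query, then hand them to a model as context. The tiny in-memory TF-IDF index and the placeholder generate() below are illustrative assumptions, not a description of how Perplexity actually works.

    # Minimal RAG sketch: retrieve relevant documents, then answer with them as context.
    # The in-memory corpus and the placeholder generate() are illustrative assumptions.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    documents = [
        "Meta is training Llama 3 on two 24,000-GPU clusters.",
        "Apple will keep progressive web app support on iOS in the EU.",
        "Perplexity AI is an AI-powered search and discovery tool.",
    ]

    vectorizer = TfidfVectorizer()
    doc_vectors = vectorizer.fit_transform(documents)

    def retrieve(query: str, k: int = 2) -> list[str]:
        # Rank documents by similarity to the query and keep the top k.
        query_vector = vectorizer.transform([query])
        scores = cosine_similarity(query_vector, doc_vectors)[0]
        top = scores.argsort()[::-1][:k]
        return [documents[i] for i in top]

    def generate(query: str, context: list[str]) -> str:
        # Placeholder: a real pipeline would send this prompt to an LLM,
        # which would answer using (and citing) the retrieved context.
        return f"Question: {query}\nContext:\n" + "\n".join(f"- {c}" for c in context)

    print(generate("What is Perplexity?", retrieve("What is Perplexity?")))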

BP I think there's a give and a take there. As a former journalist who rode different waves of SEO and social referral and search referral traffic, that was a difficult thing to contend with, and maybe it became too much of the focus versus, “Hey, let's do great journalism.” At the same time, I would worry that as these AIs get great at synthesizing the information out there, they no longer send any traffic anywhere. So Perplexity just reads The New York Times, reads Vox Media, reads Fox News, and comes back with a synthesis of the answer, and none of those media providers see a click at all. So that is a puzzle that also has to be worked out: how would Perplexity reward those sources, and how does the full value stream work there if Perplexity is sort of gleaning these answers from folks? But I see more and more that at least it seems like the hip thing. I see tech startup folks and venture capitalists being like, “I'm using Perplexity AI. I’m never going back to traditional search.” Well, give it a shake. 

RD And I think that respecting sources is a question that AI in general has to answer. We've got our new partnership that we announced last week, which I think is a good start in treating the data the way it deserves. 

BP Yeah. The knowledge community and the data are an integral part, and there needs to be attribution, there needs to be value sharing, there needs to be a give and take. So yeah, it was a good announcement and I'd like to see how that plays out in the realm of search, for example. Ours is more in the realm of code gen, but the realm of search would also be interesting. And I have to be honest, my kids are getting to the point where they need to do independent book reports– “Go learn something about a Norse god, go learn something about the indigenous people who lived here, go learn something about tsunamis”– and it's way more enjoyable for them to have a discussion with an LLM and go back and forth and learn things and ask questions than to do a traditional search, go to a website, and try to scroll through a bunch of text. There's an immediacy to that conversation that's really rewarding, but it does lack citations.

RD I think Perplexity, since it's based on big RAG, does provide citations, but I do think you miss some of the context you'd get by going through and reading an entire source. And I get to do the ‘back in my day’ where we read books and encyclopedias and all that.

BP I'd be like, “I’ve got to do this report. Which CD-ROM out of the 10 CD-ROMs from Microsoft Encarta contains this information?” 

RD Which of the 27 books I have on my shelf? 

BP Go to the library, go to the card catalog. Oh boy. All right, we're shutting this down before we get too old. 

RD That's right. Did the whole Dewey Decimal System. 

BP Stop it. 

RD We knew how to file then!

[music plays]

BP All right, everybody. It is that time of the show. Let's shout out someone who came on Stack Overflow and shared a little knowledge with the community. Thank you, Basil Bourque, who was awarded a Lifeboat Badge for providing a great answer and saving a question from the dustbin of history: “How to get full name of month from date in Java 8 while formatting.” I think you pronounce it ‘Bisal’– it's not basil like the herb, but Basel or Basil– either way, we appreciate it. You've helped over 35,000 people, and congrats on your Lifeboat Badge. As always, I am Ben Popper. I'm the Director of Content here at Stack Overflow. Find me on X @BenPopper. Hit us up with questions or suggestions for the show: podcast@stackoverflow.com.

RD And I am Ryan Donovan. I edit the blog here at Stack Overflow. You can find it at stackoverflow.blog. And if you want to reach out to me, I'm on X @RThorDonovan. 

BP Woo woo. All right, thanks for listening, and we will talk to you soon.

[outro music plays]