The home team convenes to discuss AI deepfakes, the legal implications of generating an AI version of a dead comedian or a famous singer-songwriter, whether leaderboard rankings reflect reality, and the relationship between agile development and burnout.
The estate of the late comedian George Carlin is suing the creators of an hour-long AI-generated comedy special that mimics Carlin’s distinctive delivery and material. [Ed. note: not actually AI, still lawsuit.]
Prefer your AI more Freudo-Marxist? Here’s a never-ending, AI-generated conversation between Werner Herzog and Slavoj Žižek. You’re welcome.
Google’s Bard surpassed GPT-4 to claim the second spot on the LMSYS Chatbot Arena Leaderboard.
Agile development is faltering at big companies, and a recent report cites developer burnout as a factor. But maybe the problem lies in companies’ (mis)understanding of agile.
Shoutout to Stack Overflow user Emil Laine, who earned a Lifeboat badge with their answer to “How can I include all of the C++ Standard Library at once?”
[intro music plays]
Ben Popper If you’re building AI apps with popular models like YOLOv8 and PaDiM, you’ll want to visit intel.com/edge for open source code snippets and helpful guides. Speed up development time and make sure your apps deploy seamlessly where you need them most. Go to intel.com/edgeai.
BP Hello, everybody. Welcome back to the Stack Overflow Podcast: Portland Bandwidth Test Edition. I am joined today by my colleagues, Ryan Donovan and Eira May, and we have a lot of fun stuff to talk about today. I'm going to kick us off with a link that y'all shared with me. I think one of the obviously biggest trends that we'll be hearing about over the next year in the United States where we're headed for an election is what impact will generative AI have on disinformation and deepfakes and things of that nature. And so Ryan, you shared an interesting link from The Hollywood Reporter. The famous comedian George Carlin has passed and his estate is suing creators of an AI-generated comedy special which I guess uses his likeness.
Ryan Donovan So it uses his voice and uses his style of comedy. It's an hour long and it sounds like Carlin enough that you could believe it. It sounds like Carlin in that it is an old man complaining about things.
BP To me, the really important question here is, is it good or funny? It's scarier to me if it's good and funny and it's like, “Hey, this isn't a bad George Carlin set.” That to me would be terrifying where it's like, “Well, actually AI is doing a pretty good job.” That to me is just like, “Well, what do we need people for anymore?”
RD I think it's decent enough, but it's not nearly as good, because one of the things they're suing over is that it's a bad introduction to George Carlin. People will see this and be like, “Eh, it's no good.” And the folks who did it, one of them, at least, is this guy, Will Sasso, who was on Mad TV. He is a comedian and has been doing comedy for decades.
BP I wonder if this will end up being like Mickey Mouse, which recently came out of copyright, where once the comedian has passed or their work is 100 years old, then it's fair use. Use it as you will, but the estate gets a certain length of time where they can benefit from the works and it doesn't go into fair use. What is it called– fair use arena? [Ed. note: the term is public domain. Fair use is a separate doctrine covering limited use of works still under copyright.]
RD Yeah, and fair use. I think there was an article I read a while back that was: “Is AI the end of copyright?” And that was a positive for it because copyright has been extended and extended over time, and if the copyright terms had stayed unextended, we'd be basically getting things from the 60’s.
BP Yes, one of the big benefits of our dysfunctional Congress this year was that they finally forgot to do the corporate bidding and extend the copyright again.
RD But this is one of the things that the writers’ strike was about and the actors’ strike was about, that it is now pretty easy to create George Carlin specials forever, talking about things that are current and contemporary. Bruce Willis sold his likeness. You can make Bruce Willis movies forever.
BP Eira, thoughts on this?
Eira May I didn't see the George Carlin AI special, but I read some folks who were reacting to it and it sounded like at least that version was sort of a poor imitation of the original. Like Ryan said, people who aren't familiar with his brand of comedy or his style or whatever are going to see this and be like, “Oh, this guy sucks. I don't want to watch any more of this.” So it seems like, at least in this case, it's maybe not quite yet to the point where it's maybe going to be a competitor for people who really identify as fans or for whom this is a really primary interest, but for folks who are just scrolling and encountering stuff. It is kind of scary to think about how easily stuff like that could slip past people's attention.
RD There was one a while back that I thought was a more fun version of this. It was an AI generated conversation between Werner Herzog and Slavoj Žižek, and it was just infinite. It just went on forever and it was this beautiful nonsense. Nobody would mistake it for them.
BP I guess one thing I would quibble with here is that the bar for whether or not this should be legal is not aesthetic. It's not like, “Is this comedy funny enough that encountering George Carlin for the first time you'll get the right impression?”
EM Oh, for sure. I didn't mean to be speaking to that piece of it. I was sort of keeping my mitts off that one for the time being.
BP No, I know you weren't supporting that, I just think that's funny that that's kind of the main complaint here.
RD That’s right– the vibes are off.
BP It's more like, “What right do you have?” What is a copyrighted work? Comedy is a weird world where you can't really copyright a joke. It's just an informal rule within comedy that you shouldn't be stealing each other's jokes or you'll get excised from the community or whatever.
RD And it's self-enforcing. There's a famous bit where Carlos Mencia was accused of joke stealing and Joe Rogan went up on stage to confront him about it.
BP Yes, that was a seminal moment in the history of comedy. All right, so one other thing I wanted to mention here is that we're going to be seeing a lot of this in different ways. At the same time as this George Carlin thing was happening this week, there was something really unfortunate with deepfakes of Taylor Swift spreading all over Twitter, putting her in completely inappropriate poses, and it just feels like the tools that are available to the public are getting better at a very rapid pace and the regulations and laws that govern this are not.
RD It's going to be an interesting dividing line. How much of this is derivative work, how much of it is an impression? Can Will Sasso without the AI go up and perform this set doing a George Carlin impression?
BP I think the answer is yes. You can cover somebody live. That's legal.
EM Right. I mean, cover bands are a thing. Sort of like a cover band for comedy?
BP One thing is with live it's not infinitely reproducible. If you then recorded that and tried to sell it, that would be a problem, I think. And then also an AI copy can infinitely replicate. It's not a one-off thing where somebody goes on stage and does an hour of material. They can run this George Carlin all day, every day if they wanted to.
RD Brave new world.
BP Brave new world. Let's move on to a related topic. There is a leaderboard that Eira brought to our attention. Is this on Hugging Face? Where does this leaderboard come from?
EM This comes from the Large Model Systems Organization.
BP Okay. Google's Bard has just taken the second spot to GPT-4 on the leaderboard. The race is heating up. These scores in the Elo arena really mean nothing to consumers. They're only interesting to observers of the industry and maybe within the companies and the engineers themselves, but it is great that Bard, which was a late entrant– the more competition, the better, I guess, is all I would say on that front.
RD What are the metrics here for this? What are they winning on?
EM That's one of the things that I thought was interesting about this and why I wanted to get your read, because there's just kind of a discussion obviously happening in the replies under this tweet about what kind of metrics are being used and whether the score is reflective of users’ actual experience using the technology, whether it actually works as well as the ones it's being ranked with. It's not something that I know a lot about or have done a lot of work in, but it's interesting to see that disconnect possibly between where it ranks on the leaderboards and where it ranks in terms of usage and popularity.
BP How is Arena Elo calculated? “We present the Chatbot Arena, a benchmark platform for large language models that features anonymous randomized battles in a crowdsourced manner.” The leaderboard uses the Elo rating, which is borrowed from the chess community, and they invite people to join this effort by contributing new models. I think it might be that they're masking what system it is and then they're going through a thousand tries with the user and then seeing how it's scored. I'm not sure.
RD I want to know. It looks like you enter a prompt and then they have two models respond and you say which one's the better one.
BP Got it, okay. So they crowdsource it as a blind taste test. So Coke and Pepsi are neck and neck.
EM Okay, thank you for putting that in context I can understand. I thought of wine tasting, but I'll accept Coke and Pepsi.
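[Ed. note: For readers curious how a blind "which answer is better?" vote can turn into a ranking, here is a minimal sketch of the textbook Elo update borrowed from chess. It is an illustration under stated assumptions, not necessarily how the leaderboard itself is computed: the starting rating of 1000, the K-factor of 32, and the model names are all hypothetical.]

```python
from collections import defaultdict


def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(ratings, model_a, model_b, outcome, k=32.0):
    """Apply one vote. outcome is 1.0 if A won, 0.0 if B won, 0.5 for a tie.

    k (the K-factor) is an assumed value; it controls how far one vote
    moves the ratings.
    """
    delta = k * (outcome - expected_score(ratings[model_a], ratings[model_b]))
    ratings[model_a] += delta  # the winner gains what the loser gives up
    ratings[model_b] -= delta


def rate_battles(battles, initial=1000.0, k=32.0):
    """Replay (model_a, model_b, outcome) votes in order and return ratings.

    initial is an assumed starting rating given to every unseen model.
    """
    ratings = defaultdict(lambda: initial)
    for model_a, model_b, outcome in battles:
        update_ratings(ratings, model_a, model_b, outcome, k)
    return dict(ratings)


if __name__ == "__main__":
    # Hypothetical votes: each tuple is one anonymous head-to-head comparison.
    votes = [
        ("model_x", "model_y", 1.0),
        ("model_y", "model_z", 0.5),
        ("model_x", "model_z", 0.0),
    ]
    for name, rating in sorted(rate_battles(votes).items(), key=lambda kv: -kv[1]):
        print(f"{name}: {rating:.1f}")
```

[Ed. note: An upset win against a higher-rated model moves the numbers more than an expected win does, which is why a handful of votes is not enough to separate models that are close together.]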
BP This is like when people are like, “9 out of 10 dentists recommend.” They used to do this– “Do people like McDonald's or Burger King better this year?” and they would do it. And it's pretty meaningless. That's not what people are making their decisions based on. It's just sort of a fun metric for the companies to compete in the industry.
RD I wonder if we're going to get to a future where people are just operating on the biases, like the GM/Ford arguments where somebody just likes Ford for whatever historical reason. Somebody is going to be like, “I’m an OpenAI man.”
BP “I've been a ChatGPT user since 2023,” 20 years from now.
EM Exactly. That's going to happen with any brand. Anytime there's brands in competition and they're in the market for long enough and they're both successful enough and have carved out market share, people are going to start identifying with them the same way we identify with everything else culturally.
BP I think there's an even deeper level here, which the tech ethicist Tristan Harris talked about in a talk, which is, let's say that going forward at some point they start to have memory so they remember the chats they've had with you, you can reference back a chat from a year ago, and they're personalized to say, “Well, I remember that time we talked about X,” or “I remember when we shared this idea,” or “After I read your book it made me think of Y.” And so if you build up that relationship, kind of like the way I feel about Spotify and its recommendation algorithm, the switching cost might be higher than you expect.
EM Yeah, for sure. I think we get comfortable with our tools, and like you said, when the recommendation engine is a part of it and we start to sort of rely on the system's knowledge of what we're going to be interested in, you don't want to have to rebuild that relationship necessarily from the ground up with a new platform.
BP Yeah. So there was a post at the top of our programming today: “Agile development is fading in popularity at large enterprises and developer burnout is a key factor.” I read this piece, and my takeaway is that the larger your organization is, the more bureaucratic everything is, including your approach to Agile.
EM Wait, really?
BP Stop the presses, breaking news. Agile works great at a 10-person company all the way up to 50 people, but when you have a 5,000 person organization, it stops feeling Agile for some reason. It's like, “Yeah, and?”
EM Right. Tell me more.
RD It's interesting going through the Reddit thread that you posted about it because it seems like any time people talk about Agile, it's a competition between people being like, “Oh, Agile sucks” or “You're doing it wrong.” A lot of it is Agile being used as a monitoring tool or standups being monitored.
BP I do think one of the key things that comes out in the Reddit thread, which I'm sure many developers feel, is that there's a piece of this in the original Agile manifesto, which is like, “This will help us better align the business needs and the software development process, and we won't have sort of specs for everything we want and we get halfway through and we realize it's broken, but there's no mechanism really to change and adapt.” But most of the complaints in the Reddit thread, which is delicious and delightful, are just like a very classic developer conversation: Agile is great except for my ludicrously ignorant product manager who I have to explain things to for the Xth time. Or yes, Agile is great, except for the business stakeholder who came in and asked me to change blue to green again. So it's kind of that Agile is not the problem so much as working cross-functionally with other people.
EM People are the problem. Agile is not the problem.
BP People are the problem, exactly.
RD I think AI is trying to change that.
BP AI will replace the people, Ryan.
RD There you go.
EM Problem solved.
BP Problem solved.
RD A lot of it I think is that people are missing the team autonomy. If you're being asked to report on your progress, you should have the ability to make changes to better progress instead of just being blamed.
BP Right. There's so many– “Every company I've ever worked with claims to be agile and runs like waterfall with scrums. Let's not forget the definition of sprint, which actually means marathon or death march.” These are great. There's so many gems in here. Love this thread.
RD Nobody complains like developers.
BP Nobody complains like developers. So juicy.
EM It's true.
[music plays]
BP All right, everybody. It is that time of the show. Let's shout out a user who came on Stack Overflow and helped save a little knowledge from the dustbin of history. A Lifeboat badge, awarded to Emil Laine for answering “How can I include all of the C++ Standard Library at once?” Working on a class project, I need it all. There is an answer here for you, and we've helped 67,000 people, so we appreciate it, Emil. All right, everybody. I am Ben Popper. I am the Director of Content here at Stack Overflow. You can always find me on X @BenPopper. If you have questions or suggestions or you want to come on the pod and talk about something, hit us up: podcast@stackoverflow.com. And if you enjoyed the program, leave us a rating and a review.
RD I'm Ryan Donovan. I edit the blog here at Stack Overflow. You can find it at stackoverflow.blog. And if you want to reach out to me on X, you can find me @RThorDonovan.
EM And my name is Eira May. I'm a Senior Writer at Stack Overflow. I usually write the show notes for the podcast and some blog content and other good stuff, and you can find me on most platforms @EiraMaybe.
BP Thanks for listening, and we will talk to you soon.
[outro music plays]