The Stack Overflow Podcast

How the Python team is adapting the language for an AI future

Episode Summary

In part two of their conversation, Ben and Kyle chat with Python core developer and Steering Council member Pablo Galindo Salgado about balancing consistency and new features in language design, the importance of gathering community feedback on new iterations, and why he’s focused on making Python faster.

Episode Notes

Pablo is a Python core developer, Steering Council member, and release manager of Python 3.10 and 3.11. He’s currently a senior software engineer at Bloomberg.

Looking for a comprehensive guide to contributing to Python? The Python Developer’s Guide is the place to start.

The Zen of Python is a list of the language’s guiding principles, including, “There should be one—and preferably only one—obvious way to do it.”

Find Pablo on LinkedIn, Twitter, and GitHub.

Find Kyle, a senior software engineer on Stack Overflow’s public platform, on LinkedIn, Twitter, and GitHub.

Episode Transcription

[Intro music plays]

Ben Popper Big topics in data architecture call for big conversations. Big Ideas in App Architecture, the new podcast from Cockroach Labs, invites innovators to discuss their experiences building reliable, scalable, maintainable systems. Visit cockroachlabs.com/stackoverflow to listen and subscribe. 

BP All right, everybody. If you are tuning in, this is part two of our interview with Pablo Galindo Salgado. He is a physicist and software engineer. He works at Bloomberg as a compiler software engineer on the R&D Python infrastructure team, but he is also a member of the Python Steering Council. So we talk about how he went from physics to software engineering, how he went from fixing a small typo in the Python docs all the way to helping guide releases and shape what the compiler looks like for Python, a very popular and well-loved language. So it’s a very cool episode, this is part two. If you didn’t catch part one, which aired last Friday, go check it out. Hope you enjoy, and thanks for listening.

[music plays]

BP You mentioned that something about the limitation of the early setup was that it was always just looking for the next token, and that triggered in my mind one of the questions I wanted to ask, which was the rise of generative AI and large language models in recent years, and just sort of a lot of focus at big companies on applying this. How does that interact with Python, which I think a lot of machine learning engineers rely on? Do you see any new language features or frameworks emerging to meet that need? 

Pablo Galindo One of the things that is quite interesting to see is that, from the core team, we have been really bad at predicting what the next big thing in Python would be. Making predictions is very hard, especially when they are about the future, so for a long time, the big thing many years ago was web development. Django was a big thing, and then data science became a thing, and now it's AI, and we are always a bit behind. We have two strategies here in general. One of them is that we say, “Well, we cannot predict what is going to be the next thing. Now it seems to be AI, but who knows what the next thing is or if AI is going to keep going?” So we try to make all possible workflows, as many as possible, available and possible with Python. And this is really great. For instance, most of these AI frameworks that are surging are using Python, even if they're actually not coded fully in Python; most of them are C++ or Rust or even Swift in some cases, and they have a layer of Python on top that is what people use to interact with. So people write the code in Python, but it's actually C++ or other things underneath. And that is great. We like that, because Python has had some advantages for a long time, and one of them certainly is not speed, at least compared with these other things. But we have a solution for that, which is that you can bring your compiled code to Python. And that is great, and people really like that because it unblocks this speed of development that is very important, especially, for instance, for AI, which has a lot of research components where you try things. So it is really good because it allows you to keep going and it gives you this momentum as a developer. You run a little experiment and something happens and you can have instant feedback, and then you can move things around and you don’t need to wait for compilation and whatnot. So that’s great.
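[As a rough illustration of “bringing compiled code to Python,” which is not how any particular AI framework does it, the standard library’s ctypes module can load a compiled C library and call its functions directly. The library name lookup below assumes a typical glibc Linux or macOS system:]

```python
import ctypes
import ctypes.util

# A minimal sketch of calling compiled C code from Python via ctypes.
# find_library may return None on some systems; fall back to the
# common glibc name for the C math library.
libm_name = ctypes.util.find_library("m") or "libm.so.6"
libm = ctypes.CDLL(libm_name)

# Declare the C signature so ctypes converts arguments correctly:
#     double sqrt(double);
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

print(libm.sqrt(9.0))  # 3.0
```

[Real frameworks use richer binding layers such as C extension modules, Cython, or pybind11, but the principle is the same: Python on top, compiled code underneath.]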

BP Yeah, you make an interesting point, which is that a lot of what goes into AI model development is research, testing, poking and prodding, something where the outcome is often very fuzzy. You're not quite sure what about changing the weights or the dataset produced that outcome, and so you want to be able to move quickly and hopefully cheaply to run those kinds of experiments over and over until you get where you want to land. 

PG Exactly. So sometimes we actually change the language to help these communities. One example, for instance, is that years ago we added the matrix multiplication operator, the ‘@’, to the language, and that was mainly for the data science people, because NumPy likes to multiply matrices and that's a very common operation. And this is something that is not even used in the standard library. So if you check, there is a test for it, but it’s not like there is some module in the standard library that leverages this, because we don't tend to use matrices, but in NumPy it is everywhere. And that was because we thought, “Well, having an infix operator for matrix multiplication is very important for this community and this community is huge,” so there was enough justification. In the case of AI, there are some things. For instance, we have provided for a long time this idea of optional static typing in Python, so you can optionally add these type hints. You say, “Okay, this is an integer and this is a float,” and it's not enforced at runtime, but then you have tools that can operate with that, and that's very important when you have these big codebases. And there has been a lot of development in that world for the AI people. So for instance, a lot of them use these tensors, which as a physicist I have a thing with the word because I think it's misused, but let's not go there. They have these multidimensional arrays, let's call them multidimensional arrays, and those have been very important, and they can have different shapes and whatnot. And we even added some slight modifications to the syntax to allow them to tell these tools that are doing the static checking, “This is a multidimensional array of this shape.” And you can transform them, so the tools can say, “Hey, you're actually using this tool incorrectly, so don't even run the code because it’s not going to work.” So that's one example. 
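[The ‘@’ operator Pablo mentions dispatches to a class’s `__matmul__` special method, which is how NumPy hooks into it. A toy sketch with a hypothetical 2x2 matrix class, not NumPy’s actual implementation:]

```python
class Matrix2x2:
    """Hypothetical toy 2x2 matrix, just to show how '@' is wired up."""

    def __init__(self, rows):
        self.rows = rows  # [[a, b], [c, d]]

    def __matmul__(self, other):
        # 'a @ b' calls a.__matmul__(b); this is the standard
        # 2x2 matrix product: result[i][j] = sum_k a[i][k] * b[k][j]
        return Matrix2x2(
            [[sum(self.rows[i][k] * other.rows[k][j] for k in range(2))
              for j in range(2)]
             for i in range(2)]
        )

m = Matrix2x2([[1, 2], [3, 4]])
identity = Matrix2x2([[1, 0], [0, 1]])

print((m @ identity).rows)  # [[1, 2], [3, 4]]
print((m @ m).rows)         # [[7, 10], [15, 22]]
```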

BP Along with adding new features, what can you do, what can you change to help Python keep up with the trends and activity in the software development landscape?

PG Right now, I think for us what is most important is that Python doesn't get in the way, because of this idea of experimentation and things like that. So we like the productivity of developers and we don't want the language to get in the way. And there are some cases when it does, unfortunately. For instance, Python, like many other non-compiled languages such as Ruby, has for a long time been restricted so that Python code can only run in one thread at a time. This is what is called the Global Interpreter Lock, or the GIL. And this means that Python can be multi-threaded, but it cannot be parallel. So the threads are kind of switching between themselves, but you can never have two threads running Python code at the same time. And for a long time, that was okay. It has advantages, so it's not just something that is bad and that's it. But right now we are feeling that in the data science world, and especially in the AI world, there are all these super parallel libraries underneath in C++, and they cannot really leverage the whole power of the machine because of this restriction. And for a long time, this has been a super hard problem. It's a really hard problem. It's not that people have not tried to solve it. It's really, really hard, and it's really hard to solve with the guarantees that you want, because obviously you can solve it, but if it means that now Python is twice as slow, that's probably not acceptable. 
And just now we are literally, in the Steering Council and in the Python community, discussing the possibility of dropping it, because one person–Sam Gross– he works at Meta, he has an implementation actually, so not only the idea but he has actually made the proposal with code, to drop the GIL in a way that doesn't have that much impact on performance, but there is a lot of trade offs that we need to consider, and compatibility, and it introduces potential problems in the community. So it's not an easy decision, but right now this is a very impactful thing that can actually make a huge, huge change in the Python community, and especially on the people doing artificial intelligence and data science because it will allow them to leverage these parallel workflows when you don't have a lot of communication in the threads maybe, or you’re leveraging the actual raw power of the C++ libraries. But it’s a very hard change to accept just without thinking, because we know what happened with Python 2 to Python 3. We have learned that lesson, and we don't want to repeat that even in the slightest, and that's very important for us. 
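[A small sketch of the behavior Pablo describes: under the GIL, threads running CPU-bound Python code all compute correct results, but they take turns executing bytecode rather than running in parallel. The prime-counting workload here is just an illustrative stand-in:]

```python
import threading
import time

def count_primes(limit):
    """CPU-bound stand-in workload: naively count primes below `limit`."""
    total = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            total += 1
    return total

results = [None] * 4

def worker(i):
    results[i] = count_primes(10_000)

# Four threads all run real Python bytecode, but under the GIL only
# one executes at a time, so this CPU-bound work gains no parallel
# speedup on CPython -- the threads merely take turns.
threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(results)  # [1229, 1229, 1229, 1229] -- correct, just not parallel
```

[On today’s CPython, CPU-bound parallelism is typically reached via the multiprocessing module or by dropping into a C/C++ library that releases the GIL, which is exactly what the no-GIL proposal aims to make unnecessary.]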

Kyle Mitofsky Okay, so not to quote Zen of Python at you, but to do exactly that– this is in PEP 20, there's an axiom in there that there should be preferably one way to do something in Python. And I feel like that's a reasonable proposition for most languages. You also sit at the seat where you're a language designer and you're evolving the way that the language behaves. And I think you gave some examples of matrix multipliers where maybe that operator didn't exist before and so maybe that's a net new thing. But a lot of times, the language features give us ways to more tersely express something that was already possible in the language, you just had to have workarounds for it. And it seems like every single time you add something, you almost create two different ways to accomplish that same goal: the way you were doing it previously, and how the new language evolves. So how do you balance that need to have kind of consistency in what your expectations are with, “I want to grow the language, I want to make it more powerful or terse or expressive in certain ways.” How do you balance that against trying to stay with conventions that may predate language features that you want to introduce? 

PG That's very interesting. Also, I have to say that you did mention matrix multiplication, but I think string formatting is a better example. I think we have six ways to do it. 

KM Sure do. 

PG I would modify the sentence to say that there has to be only one very good way to do it, let's say. But by the way, there are other languages that really like to say the opposite. If I remember correctly, Ruby or Perl, one of the two, says there have to be many ways to do the same thing, and they take pride in that, and that's fine. But yes, as you say, for a language that has said for a long time that there should be only one way to do it, we have several ways to do it. And as a language designer that's quite important, but there are two things here. One of them is that it's impossible to be omniscient and know the best way to do it. So obviously if you start with that premise it's going to be very hard for you; you're going to find reality a bit harsh. So you need a way to make these decisions, and it's very important what you bring there, because the worst situation is having two ways to do the same thing when one of them is not obviously better. Because if you have the old way and the new way, and the new way is always better, then the old way will die. This happened with string formatting in Python. The new way is almost always better, and almost nobody uses the old way unless they want compatibility, so that's good. But if you have two ways and one of them has one advantage and the other has others and you need to know both, then we're entering the problematic realm. So from a language point of view, what we try to do is what is called syntactic contextuality. The idea here is that, for instance in Python, you introduce a bunch of concepts, for instance, the literals. You have lists, you have tuples, you have dictionaries, and in these three cases each one of them uses a different symbol. Lists are square brackets, tuples are round brackets, and dictionaries are curly braces. So then one of them is list, tuple, dictionary. And then we have another concept which is totally different, which is the comprehension. 
And this is a way to construct these collections by looping, but in a very terse way, an in-line way, and a very, I think, readable way. So you say something like, “number squared for number in my collection,” and that's the comprehension. But then this is the beautiful part. If you surround that comprehension with square brackets, suddenly it's a list comprehension and it produces a list. If you change the square brackets to round brackets, then you have a generator expression that you can pass around. And you can create a dictionary comprehension by using curly braces and saying, “number: number squared,” and that's a dictionary. And sets, for instance: sets are also curly braces, but without the colons separating keys and values. So if you say “number squared for number in my collection” and you use curly braces, you have a set comprehension. And it's fantastic, because if you know what a comprehension is as an abstract idea, and then you know the symbols for the literals, which you learn very quickly, by the power of your mind you now know all the comprehensions, and you don't need to learn each of them because you already know. And that's the ideal way we construct these new features. Can you construct a list of numbers without a comprehension? Yes, you can. You can create an empty list and you can loop and say, “for number in numbers, append the number to the list.” Or a set, the same way. You could already do that, but this new way has some advantages. You don't want to use it all the time, sometimes it's not that useful, but it has a lot of advantages, and you already know them if you know what a comprehension is and you know the literals. And that's the ideal way we like to do the rest of the stuff. Do you already know how to add numbers, subtract numbers, and things like that, and override these operators? 
So now you have this extra symbol and you already know how to override it, because you have been overriding the others. So the idea is that you don't need to put all this extra information in your head, because it's arbitrary. It's like, “Oh yeah, this was this thing and this was that thing. And how was this again? And what was this special method?” We don't like that. Sometimes we need to do it because we have some constraints there.
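[The pattern Pablo walks through looks like this in code. One caveat on the round-brackets case: they give a lazy generator expression, not a “tuple comprehension”:]

```python
numbers = [1, 2, 3, 4]

# One abstract idea -- "expression for item in iterable" -- wrapped in
# each literal's brackets produces that kind of collection:
squares_list = [n ** 2 for n in numbers]     # square brackets -> list
squares_set = {n ** 2 for n in numbers}      # curly braces -> set
squares_dict = {n: n ** 2 for n in numbers}  # braces + colon -> dict

# Round brackets give a *generator expression*: a lazy stream of
# values, not a tuple. Here we drain it into a list to look at it.
squares_gen = list(n ** 2 for n in numbers)

print(squares_list)  # [1, 4, 9, 16]
print(squares_dict)  # {1: 1, 2: 4, 3: 9, 4: 16}
```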

BP Right, right. So you’re balancing the idea of evolving the language with backwards compatibility?

PG Yeah, but in general, the idea is to have this contextual information there in the syntax. So you have some syntax, you have a new concept, and then you want that to play nice, in a very easy way, let's say. And we as language designers try to do that as much as possible. Obviously it's not always possible, and sometimes we need to create an entire new world of concepts that is just a new world of concepts, because it's very difficult to bring the past into it, just because it's new. And we have done that, for instance, very recently with the match statement in Python 3.10, because this is a new world of things, and there are some people that say, “Well, it looks like this, but it's not really like that,” and that could be seen as a bad thing. With this, I try to highlight that it's not always possible, but we try to do it as much as possible. And I think that has given the language, historically, a lot of good things that people really like to use, and they learn very quickly because it has this property that you basically refer to the old stuff to learn the new stuff. 

KM I really like that framing that as long as we can deliver something that the community is going to want to adopt at a high rate compared to the previous iteration, then we didn't create two ways. We created one kind of preferred cow path there and we'd like people to go down that road, and it kind of relies on getting that early feedback from the community, from the PEP process, to say, “Hey, does this seem like a real solid win? If not, maybe it's not the right way to design it or implement it or it's not a good fit right now for the community.”

PG Exactly. 

BP You're in kind of a unique position getting to guide a language and discuss it with other members of a steering council. What are you excited about for 2023? Are there some things you can hint at or tell us about that you're happy to be working on, or a direction that you think is pretty exciting for the broad array of Python users?

PG Right. I think I'm very excited about the possibility– let's underline ‘possibility’ at this stage, hopefully we have another session when it happens, but who knows– of dropping the GIL in Python, which I mentioned before. We are still discussing it, so it may not happen, but I would like to think that there is a world where this can actually happen, just because this has been blocking a lot of possible workflows and things in the language, and for a long time it has kept Python from leveraging modern hardware. Because right now, for instance, you have your phone. Your phone doesn't have a dramatically more powerful processor compared to last year; it has more cores. And right now, if you have more cores, you have more parallelism, and if you have a language that cannot really leverage that parallelism fully –there are ways to do it, but not fully– then it's kind of always lagging behind. And these days you see almost every year it's more cores and more cores. It's never like 20 gigahertz; it's like, “Now you have 30 cores,” and therefore 30 threads, and with hyperthreading twice as many. So one thing I'm very excited to see is whether we manage to find a way, without making it too difficult for the community to adopt, to really bring in this work that Sam has made to drop the GIL. That's one of the things. The other thing that I'm very excited about is the work from Guido’s team at Microsoft, the faster CPython team. For a long time we have been focusing on all these features and community and whatnot, but since Guido started working at Microsoft– by the way, for the people that don't happen to know, Guido van Rossum is the creator of the language. He started at Microsoft two years ago, I think, and they have been focusing on making Python faster. And I’m quite excited about that as well. It's not going to be as fast as C++, so if anyone is thinking that, get that reality check. But free speed is important. 
The fact that you are now using Python 3.11 and you get 40% faster execution on your particular script, that's great, because you don't need to touch anything and you get free speed. Compare that with the compiler world, where compilers between versions get sub-2% speedups. So your code gets a bit faster, but it's like 1% faster, and that involves a huge pile of work to implement and thousands of lines of code. So there is, I wouldn't say low-hanging fruit, but maybe middle-hanging fruit that we can access in the Python world. So I'm very excited to see how much we can push it until we get into the ‘2% is a really big win’ situation. But hopefully there are a lot of years still until we arrive there.

BP Sweet, very cool.

[music plays]

BP All right, everybody. It is that time of the show. We want to shout out someone from our community who came on and contributed some knowledge. Thanks to Peter Lawrey. Awarded two days ago, “What is the purpose of using direct memory in Java?” Peter, thanks for coming on and providing an answer. “Never understood,” the question actually says, “why we use direct memory. Can someone give an example?” Peter did, and has helped over 14,000 people. So thanks, Peter, for sharing some knowledge, and congrats on your Lifeboat Badge. As always, I am Ben Popper, Director of Content here at Stack Overflow. You can always find me on Twitter @BenPopper. Email us with questions or suggestions for the pod, podcast@stackoverflow.com. And if you like the show, leave us a rating and a review. It really helps. 

KM I am Kyle Mitofsky, a Senior Software Developer here at Stack Overflow. You can find me on Twitter @KyleMitBTV, and you can also find me on Stack Overflow with User ID 1366033.

PG So I'm Pablo Galindo. I am a Python core developer, a Steering Council member and release manager. I work at Bloomberg in the Python infrastructure team, and you can find me online on Twitter mainly or GitHub, with the username pablogsal. 

BP All right, everybody. Thanks for listening, and we'll talk to you soon.

[outro music plays]