The home team talks with Luca Galante of Humanitec about how platform engineering is more art than science, how self-service platforms empower developers with “golden paths,” and why he’s excited, not anxious, about AI tools (at least for now).
Luca currently heads up product at Humanitec, a platform orchestrator that provides self-service “golden paths” for developers.
Get up to speed (or refresh your memory) on what platform engineering involves and what an internal developer platform is.
Dynamic configuration management (DCM) is a methodology for configuring compute workloads.
Stop by the Platform Engineering Slack channel.
Hear from top DevOps and platform engineering leaders at PlatformCon 2023, a virtual event held June 8-9.
Find Luca on LinkedIn and Twitter.
Cheers to Lifeboat badge winner Devart for rescuing How can I show the table structure in SQL Server query? from the dustbin of history.
[intro music plays]
Ben Popper Hello, everybody. Welcome back to the Stack Overflow Podcast, a place to talk all things software and technology. I'm your host, Ben Popper, Director of Content here at Stack Overflow, joined as I often am by my colleague and collaborator, Ryan Donovan. Hey, Ryan.
Ryan Donovan Hey, Ben. How are you doing today?
BP So back in my day when I was a journalist and working a lot with startups and VCs, people would always say, “We're not a service. We're not a technology company. We're a platform.” It was exciting to be a platform. That meant you were worth more and other people were going to build on top of you. But we've done a few blogs and discussed a few times on the podcast the idea of platform engineering which I think is a bit different. Do you have a definition of that in your head, Ryan?
RD It's actually fairly similar, except the platform that people are building is internal to the software itself. They're building a platform to run a larger networked application, sometimes multi-service. So it's doing a lot of the same stuff.
BP Gotcha, gotcha. So instead of internally having various services which maybe aren't well connected, you have a platform that all the engineers can work off of and that makes it easier for them to build out new tooling.
RD And it handles a lot of the complicated stuff– the traffic routing, a lot of that sort of stuff.
BP All right. Well, today we have a great guest coming on, Luca Galante, who is coming to us from Humanitec to discuss exactly this topic: platform engineering, and where it sits alongside things like DevOps. So Luca, welcome to the show.
Luca Galante Thank you. Thank you for having me.
BP So give folks a quick flyover. How did you get into the world of software and technology, and what are you doing day-to-day in your current role?
LG So I lead Product at Humanitec. The original idea of Humanitec came from a bunch of us basically working at pretty large companies. One was Xing that probably most people in the US don't know, but it's kind of the equivalent of LinkedIn in Europe. Still a pretty big company, like 2-3,000 engineers. Our CTO came from Google, actually. And basically what we realized was that these very large sort of leading engineering organizations were building these platforms, what we call internal developer platforms, to enable developer self-service and all the things that we can talk about, and they were sort of slowly but surely trickling down to mainstream, and so we decided to basically start building Humanitec to give other engineering organizations, not necessarily small teams, but kind of your midsize and also large enterprise orgs, the toolbox to go and build their IDP instead of having to kind of reinvent the wheel, because not everybody has the same resources that Google or Airbnb can kind of throw at the problem. And also the problem is actually a little bit different, and that's something that I'm happy to get into, but something that I see teams fall for is like, “Oh, if Google does it, it's good for me as well,” and it's not necessarily the truth. So that's me. I'm probably better known on Twitter for spamming people with benchmarking reports or stuff like that, but really I am one of the kind of core contributors to the platform engineering community. I help moderate a Slack over there, I host PlatformCon and a bunch of other initiatives.
BP Very cool.
RD I gave a rough definition of platform engineering. How would you describe it?
LG So for me, platform engineering is kind of the art, because it's really more of an art than a science, of basically taking all the different tech and tools that you have floating around, especially in the enterprise today, and binding them into a golden path that alleviates cognitive load on the individual contributor, on the developer, and enables self-service for engineers. And the superset of these golden paths is what is often referred to as an internal developer platform, or IDP for short, which is really the end product of the platform team, the platform engineering org, and is sort of developed with a platform-as-a-product mindset, which is kind of one of the sort of major tenets that we advocate for in the community.
BP Is there a distinction, or an important one, between a platform and a portal? We've talked to a bunch of different companies that have built their developer portal, and often I feel like that serves some of the same roles. It allows them to create integrations easily, it allows them to figure out how things are connected or why things were built. What is the distinction between those? Because a bunch of companies we've talked to, from Spotify and others, exactly what you said, it was really useful within a large company and then that thought process was then open sourced and now is being shared by smaller organizations who find a lot of value in it.
LG A hundred percent. And I think it's a great question and one that confuses a lot of people. I think it still confuses a lot of people to be honest, both in the community and outside of it. So Gartner actually published a very clear definition six months ago or so, basically outlining that a portal, you can think of it as effectively a UI layer on top of your platform layer. And as you said, Ben, we see a lot of companies basically do what I call a prioritization fallacy when they start on their platform journey, and it's basically an antipattern where they have this mandate from management or whatever to say, “Hey, improve developer experience.” And so then they basically look at developer experience and they try to break it into the different steps of it like, what does a developer start doing? They create a service, then they add features, then they push it to dev, prod, and so on and so forth. And basically they look at it chronologically and they sort of take the first step and they're like, “Oh, great. Service creation, let's start from there.” And the issue with that is how often do you actually create a new service? Now, if you're a huge streaming company, very often you have a lot of economies of scale, if you automate that throughput and so on. But in the vast majority of cases, maybe it's like 0.5 or 1% of the time. And so the ROI of taking Backstage, which again, is a great product. We have worked at Humanitec very closely with their product team. We have an integration because, again, a portal with a platform layer like Humanitec can actually go hand in hand very well. But if you start from there you're risking just basically capturing very little ROI for what actually is a lot of work.
I've seen people taking six months plus to implement their Backstage instance and then have developers being like, “So what do I do exactly with this?” And by the way, that's an issue that can happen with any sort of platform initiative, not just kind of portal-centric. But that's, I think, something that people really want to be aware of. And the solution to that, frankly, is pretty simple. It’s looking for pain. And so we have published on Humanitec research that we did across the industry where we basically asked people, “Hey, for every hundred deployments, how often do you perform these types of things?” So service creation, or spin up a new environment, provision a new database, change configurations, do a rollback, whatever it is, and then basically you get a percentage per hundred deployments as well as we ask them, “How much time does the action take both on the development side of things and operations?” And then through that you basically get out of it a matrix, and this is going to be different for everybody. I mean, there's obviously clear patterns. And through that it's a really easy way to look at, “All right, this is clearly where people are spending the most time and what seems to be one of the most painful steps of the way,” and that's what you should focus on. And usually in our experience that is application and infrastructure configuration management. That's where people go crazy and you have all this sprawling of files that can become really quite problematic. And so that's probably a better place to start than service creation. Again, highly depends on your setup, but that's the main difference, really.
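The matrix Luca describes can be sketched in a few lines of Python. This is a minimal illustration, not Humanitec's actual methodology: the `Task` class, the `rank_by_pain` helper, and every number below are assumptions made up for the example. The only idea taken from the conversation is multiplying how often a task happens per hundred deployments by the dev and ops time it takes.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """One row of the hypothetical pain matrix."""
    name: str
    per_100_deploys: float  # how often the task occurs per 100 deployments
    dev_hours: float        # developer time per occurrence
    ops_hours: float        # operations time per occurrence

    @property
    def hours_per_100_deploys(self) -> float:
        # Frequency times combined dev and ops time per occurrence.
        return self.per_100_deploys * (self.dev_hours + self.ops_hours)


def rank_by_pain(tasks):
    """Return tasks sorted from most to least total time spent."""
    return sorted(tasks, key=lambda t: t.hours_per_100_deploys, reverse=True)


# Illustrative, invented survey answers (not real research data).
tasks = [
    Task("create a new service", 1, 8.0, 8.0),
    Task("spin up a new environment", 5, 1.0, 3.0),
    Task("change configuration", 40, 0.5, 0.5),
    Task("roll back a deployment", 4, 0.5, 1.0),
]

for task in rank_by_pain(tasks):
    print(f"{task.name:28s} {task.hours_per_100_deploys:6.1f} hours per 100 deployments")
```

With these invented inputs, a rare but slow task such as service creation ranks below a frequent, cheap one such as configuration changes, which is the pattern the prioritization fallacy overlooks. Real numbers will differ for every team.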
BP Look for the pain first. Seek out the pain before you invest in building something to potentially solve problems that you may or may not have.
BP All right, everybody. Today's episode has a very special sponsor– yours truly, Stack Overflow. Now a lot of us are being asked to do more with less right now, so spending hours a week searching for answers across wikis and emails and chat threads isn’t cutting it. Luckily, there’s a more efficient way to collaborate and share knowledge. Stack Overflow for Teams is a knowledge base that has all the features you already know from stackoverflow.com, but reimagined for your organization. Its Q&A format, integrations, and content health capabilities make it easy to share knowledge so your team spends less time searching for answers and more time building solutions. It’s also used by companies of all sizes, including Microsoft, Expensify, Bloomberg, and Dropbox. So what are you waiting for? Become your team’s hero and get started for free at s.tk/teamhero. Alright, spiel over. Let’s get on with the show.
RD So what are the important pieces of any internal developer platform? What would you say are the foundational building blocks?
LG Usually, basically what you take is the different cloud native technologies that are floating around and you bind them in this golden path as we said. And what I see successful platform initiatives do is start post-commit, because frankly, everything up to commit is pretty much solved and pretty much commoditized at this point. And so again, there the question I think is from the platform engineering team’s perspective. And another antipattern that we see actually is they reinvent the wheel because they’re engineers and they love building and they're like, “Oh, let me rebuild the whole thing from scratch.” And actually as a platform team, you should really focus on where can I deliver the most value? And so the question becomes, okay, basically look at platform engineering as this unopinionated toolbox to go and build your own opinionated workflows and your own opinionated platforms that will have different golden paths and different levels of abstraction and context that you provide to different types of users from different types of teams. And so it's not really about what is the ideal stack. It's rather, “Okay, how can I take what's out there, both commercial stuff like Humanitec or open source stuff like Argo and Backstage, and how can I combine them into something that makes sense for my team?” And so then your job as a platform team really becomes that sort of last mile optimization, because there's nobody else that can do it for you. Humanitec cannot do it for you, and if they can do it for you, then it's called a PaaS, and that's kind of the key difference, which is basically when you have a product team as a PaaS provider. The most famous OG one is Heroku, now you have a lot of them.
They're more specialized in different use cases and they basically say, “Hey, I, the product team at PaaS provider x, have figured out what's the minimum common denominator across the industry and what's the right level of abstraction that works,” kind of blanket, one size fits all for everybody. And that can be great. I often encourage teams to go for a PaaS rather than building an IDP if they’re small. If you're 10 developers, 20 developers, you basically have two paths, usually. Either true DevOps, so everybody does everything pretty much, which can work at a smaller scale, assuming that everybody's comfortable with the Terraform, Helm charts, whatever. It just doesn't scale because you can't expect that when you're hiring a hundred engineers a year that everybody understands. And you shouldn't, because it's not their job. And so that's one path. The other path is, okay, you’re small and you already are in a place where you don't want everybody to understand everything and so you just build a PaaS layer and it's really simple and it's usually turnkey and you can do very little customization. But when you go above that kind of 50-100 developers threshold, that's when the pain points start to emerge. Then that's where you want to start building an IDP, and then the value of the platform team is really around figuring out what are the right problems, what are the right pain points, and iterate right from there. And another thing that I see a lot is people thinking that my platform is going to fail because of the tech stack that I picked. It never fails because of that. It fails because of cultural problems, not because of technology problems. It fails because you weren't able to sell internally to the different stakeholders, whether it's your C-level, your infrastructure teams, your dev teams especially, or it fails because you didn't build the right golden paths. You didn't make a product that was actually 10x better than what people have right now.
But not because of, “Oh, I picked Flux instead of Argo.” That's not why it fails.
RD Right. I mean, I get how this is an extension of DevOps. This is sort of the team that builds the tooling that lets the automation and all the self-service stuff happen. Do you think that this will encompass DevOps or do you think that DevOps will still remain a separate thing from platform engineering?
LG Yeah, that's a great question. And I love having the chance to explain this on podcasts, because we were running this ‘DevOps is Dead’ campaign a few months ago that made a lot of noise and pissed a lot of people off, and I think it was really important to get the conversation going in the market, in the community, so I'm still happy we did it. But one of the challenges with that was that the velocity of the ‘DevOps is Dead’ message and its virality was too high for the velocity of the conversation to actually catch up and keep up with it. And so a lot of people I think got this message out of context, which can rightly make you angry and so it's actually counterproductive at that point. And so it's helpful to be on a podcast with a bit of longform content and not just a tweet, because if you think about DevOps, DevOps has been around for like 15 years at this point. The world that it was sort of first brought into was a very different one than the one we live in today. We used to develop a monolith, maybe running on bare metal. The infrastructure was far less complex. You had no Kubernetes, no IaC, no 10,000 tools, CNCF landscape type of thing. And the initial idea was very simple and a really good one. It was basically to take down the barrier between developers and operations and facilitate collaboration. The issue I think is when you layer on top all the converging trends that we've had in the last 15 years: cloud, cloud native, Kubernetes, Terraform, GitOps, and whatever your buzzword du jour, then the actual developer experience of expecting everybody to do everything becomes, A– unrealistic, and B– it just creates a lot of friction between the new Ops team, which are now called DevOps teams, and developers. And so that doesn't scale, and the first companies that realized that are the companies that I was mentioning at the beginning, and that's where it all started and then it trickled down.
And so if you think about platform engineering through that lens, it’s really not a DevOps killer, it is rather a DevOps enabler. It is actually what enables true you-build-it-you-run-it at the enterprise scale. And so in that sense, I think platform engineering is really an evolution of DevOps for the cloud native era.
RD Yeah. I mean, somebody's going to have to manage the code when it runs in production. Somebody's going to have to deal with the inevitable breaks.
LG Yeah, absolutely. And that's why I think, kind of going back maybe to one of the antipatterns we see, is platform teams not clearly articulating what their mission is. And so basically they end up being yet another DevOps team and then you fall back into the same issues of ticket ops and I'm just putting out fires instead of building and shipping a product, which is your IDP. And so that's, I think, really, really important. The platform team doesn't replace your SRE team. In large companies, you still have actually multiple infrastructure teams, multiple cloud ops, and even multiple platform teams, which is also something interesting we can talk about, because you basically compete with each other and I think it creates a better product obviously. But it's important, and if you try to do both at the same time, you will fail both at keeping your prod up and at actually shipping a product people want, because you're not focusing on either.
BP I could see how ‘DevOps is Dead’ in a tweet might get lost on some DevOps engineers, but hopefully if they dig a little deeper, they'll understand that there's some nuance to it. Just to ask your opinion on sort of the hot topic of the day– within the context of the new generative AI tools that are emerging, is there a world in which this kind of platform engineering can be facilitated by agents that are able to communicate back and forth with the developers, answer questions about why certain code or architecture was built a certain way, explain why certain errors are happening, or sniff out areas where things are slowing down or leaking memory that shouldn't be?
LG That's a great question and something that we've definitely been thinking about from a product perspective. I think the answer is yes, a hundred percent. I think if I look at this sort of vintage of AI tools, probably the largest impact is going to be on coding and on software creation. From an AI evolution perspective, I think we're definitely in that step of changing how we create software. I certainly believe it will have an impact. In terms of platform engineering specifically, I think, Ben, you nailed it. To me, it's about the interface. It's about bringing this conversational interface into platforms. So right now, if I think about Humanitec, we're fully API-based, you have a CLI, you have a UI, and then you have a fully code-based system. So what's missing there I think from our perspective is providing, for instance, a conversational interface that can obviously work for both users, both on the platform engineering side figuring out what's the best way of doing things, but I think more importantly, working on the developer side of things, because that's really where you can have a much clearer design of golden paths by using something like AI, where instead of having basically the platform engineer go and speak to everybody, which is what they need to do right now, to figure out what's the right level of abstraction, you basically have the user in the system conversationally figuring that out on their own and going from there. And so I think that's really exciting.
BP Yeah, I like the term you used. This vintage does seem like it applies immediately to software developers, especially experienced ones, who can find a lot of productivity gains here.
Whereas folks like me, maybe it'll make my essay better but I'm not necessarily sure I want it writing my essays just yet. Code is functional. If it writes something that works well that would've taken me a few hours, great.
LG Yeah, and I think, by the way, that's super interesting, specifically for software engineering. I was listening to this podcast the other day where they were saying the usual debate of if it is going to replace a bunch of people, because they were talking about this thing that went half viral in Silicon Valley about this designer who was like, “Hey, I've been replaced by generative AI and so on.” And I actually think we have such a huge backlog in software engineering of software that needs to be built that I think for the next five years at least it's just going to be like, “How can I take my current x developers and make them x times y,” as opposed to replacing a bunch of them. Because if you think about the main constraint of every single product company that I've ever seen, it’s always engineering. It's always like, “Oh, we don't have enough resources to do this or do that feature,” or whatever. So I think it's exciting and people shouldn't be afraid of it, at least for now.
BP I like it.
RD It's going to do all the easy coding for you.
BP Woe is the soon-to-matriculate software CS grads. But no, I think to that point, I've seen it immediately really increase the velocity of product releases across every technology company. So I think in that sense, if it drives more innovation and efficiency, it might lead to more product, more companies, and so more demand for software developers. It certainly also seems like when you have x number of coders, now they can be more ambitious and so they're not necessarily doing less. In fact, it seems like teams are suddenly being asked to turn up the volume because everybody realizes we're at the start of a new era, and so people want to move fast and feel very engaged. So we'll see how it all shakes out.
RD Once they start coding themselves, that's when you’ve got to nuke the data center.
LG They already did. Did you see? I saw a video the other day.
BP Okay, Ryan is team ‘nuke it from orbit’. I wasn't sure where you sat on that.
RD Is that skynet.com?
BP We all have to weigh in ahead of time because it's listening and keeping track, so just make sure you pick your side.
BP Alright, everybody. As we do this time of the show, let's shout out a member of the community who came on and helped out, spread a little knowledge, saved a question from the dustbin of history. Congrats to Devart, awarded March 31st, a Lifeboat Badge for saving a question with a great answer: “How can I show the table structure in SQL Server query?” Devart has the answer for you and has helped almost 600,000 people, so we really appreciate it. I am Ben Popper. I'm the Director of Content here at Stack Overflow. You can always find me on Twitter @BenPopper. Shoot us an email with questions or suggestions for the show, firstname.lastname@example.org. And if you like the show, do us a big favor, leave us a rating and a review because it really helps.
RD I'm Ryan Donovan. I edit the blog here at Stack Overflow. You can find the blog at stackoverflow.blog. And if you want to reach out to me, you can find me on Twitter @RThorDonovan.
LG And I'm Luca, @luca_cloud on Twitter. If you want to find out more about platform engineering I think the best place to start is PlatformCon. Platformcon.com is free, it's happening in June and is the largest conference around platform engineering and DevOps in the space, so go check that out, and thanks for having me.
BP Awesome. All right, everybody. Thanks for listening and we will talk to you soon.
[outro music plays]