The Stack Overflow Podcast

Managing Kubernetes entirely in Git? Meet GitOps

Episode Summary

In this episode, we chat with Paul Fremantle, VP of Technical Product Strategy at Weaveworks, about managing Kubernetes entirely within Git. It's GitOps! It's a philosophy where you externalize your runtime configuration as a set of resources in a Git repository.

Episode Notes

Weaveworks helps DevOps folks manage their Kubernetes settings entirely within Git.

Paul's first computer was a Sinclair ZX80, which had a clock speed of 3.25 MHz, 1 KB of static RAM, and 4 KB of read-only memory. Pretty good for 1980.

Weaveworks based their project on Flux, an open source engine. If you're not a big corporation and you want to use it, it's free!

Before there was Kubernetes, Google created Borg, an internal cluster manager. It has yet to be assimilated by Kubernetes.

Ben thinks that, if it gets too easy to manage Kubernetes clusters, we'll be out of a job talking about the pain of cluster management.

Today's lifeboat badge goes to Daniel Ribeiro for the answer to How can I run Go binary files?

Episode Transcription

Paul Fremantle I think the deeper aspect of this is really about the visibility and the human processes. I'm a big believer that good technical capabilities come from the interaction of people, and processes and technology. And so Git has this set of ways of working around pull requests and reviews and merges that really work well in managing your production environment. And your staging environment and the whole pipeline.

[intro music]

Ben Popper Couchbase is a modern, SQL-friendly, NoSQL JSON document database. For building applications with agility, performance and scale. For tutorials, videos and documentation, as well as best practice tips, quick start guides and community resources, visit the Couchbase Developer Portal at couchbase.com/stackoverflow.

BP Hello everybody, welcome back to the Stack Overflow Podcast. I am Ben Popper, Director of Content here at Stack Overflow. And I am joined, as I often am, by my colleague Ryan Donovan, the editor of our blog and newsletter. Hi, Ryan.

Ryan Donovan Hey, Ben, how you doing?

BP I'm pretty good. So we had the annual dev survey come out recently. And we know that when it comes to version control, Git is the end-all, be-all for the modern developer. We've also talked a lot on this show about DevOps, and how, in some ways, it has become ubiquitous, in the sense that once upon a time it was something you could have a department focus on, maybe a philosophy you wanted to spend some more time on. Now it seems like it's kind of table stakes. As you're thinking about building a company, the question is how are you going to do DevOps, not are you going to do DevOps. So today, we're going to combine those two things and talk about GitOps. I'm not going to attempt to define that. I'll leave it to our guest. But yeah, it should be exciting. I guess, when you hear GitOps, if you knew nothing, and we were just playing a little trivia game, what would you think that is?

RD I would guess it was management of Git? But apparently, I would be wrong. [Ben & Ryan laugh]

BP Okay, very good. So I want to welcome on our guest today, Paul Fremantle, he is the VP of Technical Product Strategy at Weaveworks. Paul, welcome to the show.

PF Hey, thank you, Ben and Ryan.

BP For people who don't know, yeah, tell them a little bit about yourself. You know, a little bit about your background and what landed you in the position you're at today.

PF Yeah, I'd love to. So my background. Wow. So I started out not in computing, but as a management consultant. But I was like a super geeky teenager, I had a computer long before anyone else did back in the early 80s.

BP And don't be afraid to date yourself. What kind of computer?

PF I had a ZX80. And I was literally like one of the first people in the UK to get it, because I persuaded my parents to put together like all my birthday and Christmas and everything presents to get this, put the order in. And it just didn't come and it didn't come and it didn't come and then somehow some friend of my mother's knew dad, and was like, oh, there's this kid. And he's 12 years old and desperate for his ZX80, and I literally got to go meet him. And he handed me one in person. So that was awesome. Anyway, so then I ended up doing all sorts of stuff, security consulting and things. And then I ended up in a software group in IBM. And then in 2005, I decided to set up an open source company called WSO2 with another IBMer called Sanjiva. That's an API management company, all based around Apache and the Apache License and everything. And I did that for a long time. And then I took some time out to do my PhD. And then, about a year ago, in October, I joined Weaveworks. So that's really cool.

BP So going back a little in time there, coming from IBM and then starting a company focused on open source. What did your former colleagues think at the time? Now, I think it's pretty widely accepted that open source can be a great place to do business and that even big corporations who once kind of railed against it are giving open source a big, friendly bear hug. But at the time was that something that sort of struck people as new, as unique, as maybe counterintuitive?

PF IBM was actually remarkably supportive of open source at the kind of grassroots level. In other words, they were very happy to build small components. They hadn't yet got to the idea that their core products could be open source. So I think that was a challenge. But, you know, people, I think, thought we were slightly crazy, but were reasonably supportive. There were a lot of people saying, hey, when it doesn't work out, you can come back. So that was kind of fun. [Ben laughs]

BP And so then yeah, catch us up to the present day. Have you had the same role your whole time? Or have you changed roles as you've grown into what you're doing today?

PF So why did I join Weaveworks? It's really two reasons. One is, so Alexis is the CEO and founder of Weaveworks. And I was at a talk of his, a KubeCon talk in London, about three or four years ago, where, I think it wasn't the first time but maybe the second time, he talked about GitOps. And it just kind of blew me away. Because during my PhD, I had done a lot of work on building a whole system, and everything was in Git. Basically, I had a single script that would kick off all the machines in DigitalOcean, bootstrap them, start them up, Docker containers, everything. So that kind of really resonated with me. And so I was really excited about GitOps. And especially, one of the things I really found over the last sort of 10 years was this kind of massive shift to how important DevOps was as part of the core business proposition. I did a lot of work at WSO2 on trying to automate and standardize our DevOps processes, not just for ourselves, but for our customers. And I had the DevOps, you know, evangelical fervor, I guess, is the right way of putting it. And then the other thing that I kind of believe strongly is, you know, I've been using Linux since the early 90s. I had my first Linux laptop in 1994. And I've seen the impact that Linux has had on the industry. And I feel that Kubernetes is well on its way to being the kind of next Linux. It's sort of a cluster OS rather than a machine OS. And so that ability to really be part of that Kubernetes journey, and to be in a company that's put its eggs all in the Kubernetes basket, I thought was really cool. So those were some of the things I really enjoyed. And one more thing, of course, is open source. You know, Weaveworks is a massive open source proponent, with loads and loads of key projects. Weave Net, Weave Scope, Cortex, many, many more. Flux, of course, and Flagger are the ones that do the GitOps stuff. And so that kind of real belief that we can build a successful company around open source and help the community as well as build a commercial organization is really interesting to me.

RD It's interesting, I don't think I've heard Kubernetes described as an OS. Can you talk a little more about how it functions as an OS?

PF Well, what does an OS do? An OS basically allocates resources to processes. So you have processes and they need access to hardware, they need access to memory, to disk. And it acts as that resource scheduler that gives those and controls those. And you can, you know, you can say, hey, I want to nice this to minus 10 and reduce its access to CPU or increase it and change its priority and so forth. If you think of container instances as your processes, then what Kubernetes does is exactly that same thing, except for containerized workloads across a cluster. So it gives access to storage, it gives access to network, it controls priority, it controls scheduling, it controls scaling. So fundamentally, in my mind that's the same sort of thing.

RD So it's almost sort of a meta OS, because I think you have an actual OS running in each container. Correct?

PF And of course, each machine has a real OS with a kernel on it as well. So yeah, I mean, there are multiple OSes. It's an analogy, I wouldn't take it too far. But yeah, I like that word meta. I think that's kind of cool.

BP So Paul, talk to us a little bit, yeah, about what the division is, and where value is delivered on both sides between, as you said, the open source and the community, and then the business aspect, like how the business model works.

PF You know, there's always been this question, is open source a business model? And I think fundamentally it's kind of more of a marketing model in a way. It's more of a way of saying, let's find the people who want access to what we're good at, and help them know about us, and find out what we do. And I think that's really key. And I think the other thing that's really important, and this is something Sanjiva and I used to say, you know, there are people who have time but no money, and they're willing to kind of hack the open source and support it themselves and do patches and contribute to the upstream and so forth. And then there are people who have money but no time, who are focused on their core business. And so the open source business models are about, you know, I think, trying to harness those two aspects. So finding the right people, finding that fit, not having to spend your time going out hunting for those people, but letting them find you. And then letting them decide, is this something that is valuable enough to me that I need more and I want to pay for it? Or is this something that I'm just gonna use anyway?

RD Yeah. And it lets the very motivated engineers solve their own bugs. No longer are you sitting around waiting for that to get prioritized. You can go in there and do it yourself.

PF Yeah, and not just that, but also understand the code better. See how it works. It helps build ecosystems. So, you know, one of the things that we've been doing, so Flux is the core GitOps engine in the CNCF that Weaveworks donated and works on. And we've been doing a lot of work with how Flux interacts with OpenShift. I was talking to one of my colleagues who's doing a really nice talk at KubeCon on how Flux and Jenkins work together. So those kinds of interactions are so much easier in the open. And I think that just creates this massively powerful ecosystem. And I think that's a big part of what Kubernetes is about, building up the ecosystem. And that's, I guess, my analogy with Linux. You know, Linux created an ecosystem that changed the way we do computing. Changed the way we build servers and everything around it. And I think Kubernetes is doing the same.

BP And so when people are getting involved in this at, you know, a basic level, are there sort of various tiers of products that make more sense for somebody who's building something, you know, on their own as a student, or a hobby, somebody who's a small to medium sized company, and then like a big enterprise offering?

PF I mean, maybe we should talk about what GitOps actually is in a minute.

BP Yes!

PF But the way we've built out our product line is that, you know, there's the CNCF project, which is Flux, which is a highly effective project. And we do kind of ad hoc support for that in Slack, as best we can. And then we have a product, an open source product called Weave GitOps Core, which is really aimed at teams. And it takes Flux, and it wraps it up in a supported, documented and opinionated approach. So if you want to tweak all the settings, Flux lets you do that. If you want to have our best practice of how to do GitOps, then Weave GitOps encodes that out of the box. And the other thing that Weave GitOps has is a clear upgrade path to our Enterprise Edition. And our Enterprise Edition does all sorts of things around multi-cluster, hybrid stories where you can do GitOps against public cloud and local infrastructure. And it does some really nice things around how teams can have their own space. So it's a kind of lightweight multi-tenancy called team workspaces, which has RBAC and access control and so forth. So we've tried to kind of say, you know, if you're just a small team, and you want this for free, you can use our core offering. If you want to pay for it, you can buy support. And if you want more enterprise platform operator features, then there's an enterprise capability.

RD So back to the original question, what exactly is GitOps?

PF That's a great question. [Ryan laughs] So GitOps is fundamentally the philosophy that you take your runtime configuration, and that could be your dev, staging, or production environment, and you externalize it as a set of resources in a Git repository. That sounds very simple. And of course, you instantly get version control for that. So the most obvious benefit is you can do a rollback. You know, you make a change to your production system, something doesn't look right, you can just roll back that change. But I think the deeper aspect of this is really about the visibility and the human processes. I'm a big believer that good technical capabilities come from the interaction of people, processes and technology. And so Git has this set of ways of working around pull requests and reviews and merges that really work well in managing your production environment, and your staging environment and the whole pipeline, because you can see exactly what changes have been made. You have a record of every change to your production system. You have that audit capability. But you also have the review process, where you can set up your Git merge facility so you have review processes. And you can really make sure that you manage these things in just a really nice logical way. And I guess if you're into DevOps, you know about the DORA metrics. So the DORA metrics are from DevOps Research and Assessment, which got subsumed by Google. And they categorize these key metrics, which are really important. And one of the things they found is that people who are really good at these metrics, who are really good at DevOps, are also really good at business. They actually do better than their competitors. They win in the market.
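[Editor's sketch] The rollback and audit-trail idea Paul describes can be illustrated with a toy model. This is a minimal sketch, not how Flux or Git work internally: `ConfigRepo`, its methods, and the `web` image names are all hypothetical, and a plain Python list stands in for a Git history.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigRepo:
    """Toy in-memory stand-in for a Git repository holding runtime config."""
    history: list = field(default_factory=list)  # (message, config) commits

    def commit(self, message: str, config: dict) -> None:
        # Store a shallow copy so later edits to the caller's dict
        # cannot silently rewrite an earlier commit.
        self.history.append((message, dict(config)))

    def head(self) -> dict:
        # The desired state is whatever the latest commit says.
        return dict(self.history[-1][1])

    def revert(self) -> None:
        # Roll back by adding a NEW commit that restores the previous
        # config, so the record of what happened is preserved.
        if len(self.history) < 2:
            raise ValueError("nothing to revert to")
        last_message = self.history[-1][0]
        previous_config = self.history[-2][1]
        self.commit(f"Revert '{last_message}'", previous_config)


repo = ConfigRepo()
repo.commit("Deploy web v1", {"image": "web:1.0", "replicas": 3})
repo.commit("Upgrade web to v2", {"image": "web:2.0", "replicas": 3})
repo.revert()  # something doesn't look right in production

print(repo.head())
for message, _ in repo.history:
    print(message)
```

The rollback is itself a recorded change: the history ends up with three entries, which is the audit capability mentioned in the answer above.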

RD Yeah, I mean, DevOps is all about making sure your app functions in production, right. Any network application needs a strong DevOps team.

PF Absolutely, absolutely. So that ability to make sure things are working in production is key. But then it goes even further, to people who are elite at velocity, at moving new versions into production, who have a low defect escape rate, who don't have to go and roll back often, who can roll back quickly and who can fix problems fast. It's basically evolution. Right? If you can evolve faster than your competitors, you can beat your competitors, right? So the people who can do all those things faster basically have faster evolution. They go through evolutionary cycles faster and better. And they can figure out what works well.

BP I have one question that I want to ask because my knowledge of Kubernetes is pretty surface level. But I know that it begins from the metaphor of like a boat and a helmsman. And I was on your site here, and you're talking about managing drift. And when I think about, you know, putting things into loads of different containers, and having, you know, in essence, as Ryan was saying before, the sort of meta OS that lives on top of lots of real and virtual machines, it does seem to me that yeah, it could, after a while, be hard to keep track of and understand the performance of all these things. So in that context, what is drift? What does drift mean? And when you think about, like, the lifecycle of these containers, how do people manage them effectively?

PF That's a really awesome question. The first time I really saw drift in action was before Kubernetes was launched. And I was at a conference somewhere. And it was the first time Google was talking about the precursor to Kubernetes, which was called Borg. [Ben laughs]

BP Not as friendly. Glad they changed that.

PF Not as friendly. And they actually did a live demo of Borg. And what they announced at the time, I don't know if this has changed, was that any Google developer could launch their code on up to 10,000 instances of servers around the world using Borg. And so the guy did this, right, he just typed a command line and up popped this thing. And he was tracking how many instances it was running on, and it was going, you know, 9996, 9997, 9998, 9995, because when you're running at that scale, there are failures, right? So, you know, the configuration that we want in a large-scale production service, it is a desire, right? It's not necessarily what's there. And so what was happening there was the server was trying to ensure that they got 10,000 instances, but it was not possible. So it was constantly pushing against that drift, and trying to get to 10,000 and getting near, and then a machine would go down or fail or a network would die. And so that's the kind of first aspect of drift. But unfortunately, it's not just machines, right? At scale, we have machine challenges, but then there's also humans as well. Which is that, you know, someone goes in and does a kubectl apply, which maybe they shouldn't have done. And, you know, of course, you can turn this off or not. But if you really buy into GitOps, you don't want random kubectl applies in your environment. You want the environment to match exactly what's in Git. And so, you know, one of the things we've done at Weaveworks is to develop what's called a GitOps maturity model, which talks about how people move through different levels of maturity. And I think the most important move there is from saying, okay, we're going to do a one-time bootstrap of our infrastructure using what's in Git and then just leave it, and then go in and kubectl apply and change it and fix it and hack it, and then end up not knowing where we're at, not having visibility, maybe somebody does the wrong thing, to saying, actually, we're going to stop doing that, we're going to make the changes in Git and have a software agent do reconciliation and make sure that we're always trying to fix that drift and get the running system to match the declared configuration and desired state in Git.
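[Editor's sketch] The reconciliation idea above, where an agent keeps comparing the running system with the state declared in Git and corrects any drift, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the algorithm Flux actually uses: `diff`, `reconcile`, and the workload names are hypothetical, and plain dictionaries stand in for both the Git repository and the cluster.

```python
def diff(desired: dict, actual: dict) -> list:
    """Return the actions needed to make `actual` match `desired`."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name, spec))
        elif actual[name] != spec:
            actions.append(("update", name, spec))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name, None))
    return actions


def reconcile(desired: dict, actual: dict) -> list:
    """One pass of the loop: compute drift, then correct `actual` in place."""
    actions = diff(desired, actual)
    for verb, name, spec in actions:
        if verb == "delete":
            del actual[name]
        else:
            actual[name] = dict(spec)
    return actions


# Desired state, as it would be declared in Git.
desired = {
    "web": {"image": "web:2.0", "replicas": 3},
    "api": {"image": "api:1.4", "replicas": 2},
}

# Running state after drift: a failure removed "api", and someone
# hand-edited "web" and started an unreviewed "debug" workload.
actual = {
    "web": {"image": "web:2.0", "replicas": 5},
    "debug": {"image": "busybox", "replicas": 1},
}

for action in reconcile(desired, actual):
    print(action)
print(actual == desired)
```

A real agent would run this pass repeatedly, since, as in the Borg demo, machines keep failing and the running state never stays matched for long. Running `reconcile` a second time here returns an empty list because nothing has drifted in between.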

RD That's interesting, because you always have that source of truth in Git, instead of DevOps coming through and, you know, changing things on the fly as they need them. Which always gets people in trouble.

PF It always does. Actually, once you've got GitOps in your head, if you're ever on an incident and you see somebody, you know, start SSHing into the server and changing things, or kubectl applying this and that and the other, once you've got the GitOps religion, oh my God, it's really painful to watch. You're like, well, how do you roll that back? What happens if you did the wrong thing? How do you keep track of what you've just done? Have you got, you know, a log of all these changes you're making to the server? How are we going to know which of these changes fixed it and which didn't? You know, all these questions jump into your head, and you're like, oh no!

RD Yeah, even if they fix the problem, there's no way to duplicate that.

PF So right. So right. That was a real moment of truth for me, when I realized I'd got the GitOps religion, because I was on an incident, on Slack and Zoom, trying to sort something out, and someone did something not through Git, and I'm like, no, no! [Ryan laughs]

BP Paul, do you feel like there's anything else you want to talk about before we head to the outro?

PF There was one more aspect that I'm really passionate about, applying GitOps to the infrastructure, not just to the applications. I mean, one way to lead into that would be to talk about this GitOps maturity model. And I sort of talked about that first change. And then the second change is to say, okay, not just my apps, but I'm now gonna start managing my clusters, cluster config, with GitOps as well.

RD Yeah. So you're actually managing the infrastructure resources, right?

PF Yeah. So that's something, you know, that Terraform helps you do. But also, I don't know if you've come across the Kubernetes Cluster API.

RD Not personally, no.

PF The Kubernetes Cluster API is a kind of cool bit of technology. It's a little weird to get your head round. Effectively, you have a master cluster, right, which could even just be a simple kind cluster, could be really small. And you deploy the Cluster API, called CAPI, into it. And now there are a set of Kubernetes resources and APIs and YAML files that will then go and spin up or manage clusters in AWS, Azure, VMware, there are all these different providers. DigitalOcean. So effectively, it turns managing clusters into another Kubernetes API. And of course, once it's a Kubernetes API, we can GitOps it, right. So now we can spin up a cluster just by deploying a YAML into a Git repo. We can change that cluster by doing a PR against the YAML, merging it, and it automatically gets synced.
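[Editor's sketch] To make "change that cluster by doing a PR against the YAML" concrete, here is a small illustration that treats a cluster's worker pool as plain data and shows what a reviewer would see change. It assumes, without verifying against the Cluster API documentation, that the field names below match the `cluster.x-k8s.io/v1beta1` `MachineDeployment` resource. The manifest is deliberately abbreviated and would not be accepted by a real cluster as-is; `worker_pool`, `changed_fields`, and the `staging` name are hypothetical.

```python
import copy
import json


def worker_pool(cluster_name: str, replicas: int) -> dict:
    """Abbreviated MachineDeployment-style manifest (not a complete resource)."""
    return {
        "apiVersion": "cluster.x-k8s.io/v1beta1",
        "kind": "MachineDeployment",
        "metadata": {"name": f"{cluster_name}-workers"},
        "spec": {"clusterName": cluster_name, "replicas": replicas},
    }


def changed_fields(old: dict, new: dict, prefix: str = "") -> list:
    """List dotted paths whose values differ, as a reviewer would see in a PR."""
    changes = []
    for key in sorted(set(old) | set(new)):
        path = f"{prefix}{key}"
        before, after = old.get(key), new.get(key)
        if isinstance(before, dict) and isinstance(after, dict):
            changes.extend(changed_fields(before, after, path + "."))
        elif before != after:
            changes.append((path, before, after))
    return changes


# What is currently merged in the (hypothetical) Git repo.
merged = worker_pool("staging", replicas=3)

# A pull request that scales the worker pool from 3 to 5 machines.
proposed = copy.deepcopy(merged)
proposed["spec"]["replicas"] = 5

print(json.dumps(proposed, indent=2))
print(changed_fields(merged, proposed))
```

The point of the sketch is the workflow rather than the schema: the cluster is described as data, the change is a small reviewable diff, and after the merge a reconciling agent is what brings the real cluster in line.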

RD That's very cool.

PF And, you know, the shift here is that you know, what Kubernetes did was this—I don't really like this analogy. I don't know if you know this analogy of pets and cattle.

RD Oh yeah, yeah.

PF I don't think mine is so clear. But do you know what coppiced trees are?

BP No.

PF It's like certain trees that you can just chop down to the root, and then they grow up lots of branches, and you use it for wood burning or for basket weaving, all sorts of things.

BP Right, bamboo you can just harvest over.

PF Yeah, you can just harvest over. So my analogy is like bonsais versus coppice trees. Because I don't feel the cattle one is very vegan friendly. [Paul & Ben laugh] But back to my point, which is that, you know, what Kubernetes does is it lets you treat your workloads like cattle, right? It lets you treat your apps and your container instances like cattle, and you don't have to care about individual instances anymore. What Cluster API and GitOps for clusters do is let you treat clusters like cattle. Literally, hey, I've got a problem on this cluster. I can just spin up another one. Do a blue-green handover, kill off the old one. And now, suddenly, I can spin up clusters and manage clusters for my test environment, for my dev, staging, production.

BP In the long run, this is gonna be bad for Ryan and I, we're gonna go out of business. Because most of our sponsor blog posts end up being about the pain of having to rearchitect some monolithic infrastructure that you spent decades building. And now you've got to modernize it and get faster, and it's so much work. But if people just start building on Kubernetes from the jump and managing their clusters and spinning them up and down, we're not going to get to tell these great war stories about the moment you had to bite the bullet and rearchitect everything. It'll just be too easy.

RD We'll just be out of business. [Ryan laughs]

PF Sorry, Ben. I apologize for that in advance. I think you've still got a few years left!

RD There'll be some other problem with it. I mean, it's another level of abstraction. And somebody has got to go down below the abstraction at some point, right?

[music]

BP All right, awesome. Well, Paul, I want to say thanks so much for coming on. I will, as I do at the end of every episode, give a shout-out to the winner of a Lifeboat badge on Stack Overflow. Somebody who came on and found a question with a score of negative three or less, gave it an answer and got it up to a score of three or more. Today's goes to Daniel Ribeiro for 'How can I run Go binary files?' So if you want to know the answer, we will throw it in the show notes. And thanks to Daniel for sharing some knowledge on Stack Overflow. I am Ben Popper, the Director of Content here at Stack Overflow. You can always find me on Twitter @BenPopper. You can always email us at podcast@StackOverflow.com. And if you liked the show, please leave a rating and review. It really helps. Ryan, who are you?

RD I'm Ryan Donovan. I'm editor at the Stack Overflow blog and I put out the newsletter. I'm a ghost on twitter @RThorDonovan. And if you have a great blog idea, you can always email me at pitches@stackoverflow.com.

BP Paul, who are you? And where can you be found on the internet if you want to be found? And where should people go to learn a little bit more about Weaveworks?

PF Thank you, Ben. I'm Paul Fremantle. I'm the VP of Technical Product Strategy at Weaveworks. I am @pzfreo pretty much everywhere. GitHub, Twitter, anywhere you like. If you want to find out more about Weaveworks, go to weave.works and hit us up and have a look.

BP All right everybody. Thanks for listening. We'll talk to you soon.

[outro music]