The Stack Overflow Podcast

One quality every engineering manager should have? Empathy.

Episode Summary

Ryan talks with senior engineering manager Caitlin Weaver about how her childhood fascination with computers led to her leading CLEAR’s Cloud Infrastructure Engineering team, her experiences in DevOps, the role of empathy in engineering management, and how the platform engineering landscape is evolving.

Episode Notes

CLEAR is an identity company trying to take the friction out of air travel (such as with TSA PreCheck, available through CLEAR), stadium events, and other experiences that require security screening. 

Find Caitlin on LinkedIn

Shoutout to Stack Overflow user Patrick Pijnappel, who earned a Populist badge with their answer to “Redirect all output to file using Bash on Linux?” It’s helped 230,000 people and counting.

Episode Transcription

[intro music plays]

Ben Popper Can a blockchain do that? Algorand has answers. Developers are using the open source Algorand blockchain to build solutions disrupting finance, supply chain tracking, climate tech, and more. Hear from devs, learn about the tech, and start building on-chain. Blockchain solutions aren’t hypothetical, they’re here. Check out canablockchaindothat.com. Can a blockchain do that? Algorand can.

Ryan Donovan Hello everyone, and welcome back to the Stack Overflow Podcast, a place to talk all things software and technology. I'm Ryan Donovan, your humble host, and today we're talking with Caitlin Weaver, Senior Engineering Manager at CLEAR. Welcome to the program, Caitlin.

Caitlin Weaver Thanks, Ryan. Thanks for having me. Happy to be here. 

RD All right, awesome. At the beginning of the show, we like to get to know our guests, find out their path into software and technology, and I know you have an interesting path in. 

CW Yeah, it's definitely not very direct by any means. And it's funny, I feel like the last time I did a job interview, they were like, “Okay, tell us how you got into what you're doing,” and I kind of end up starting when I'm 11 or something, which is a little far back. But it's true. 

RD So what happened when you were 11? 

CW Okay, so everyone will know how old I am after hearing some of the milestones in this conversation, which is perfect, but basically I was very excited by the family computer and all the things that it could do, and certainly the internet. And 90s, early 2000s internet was very much a creative space– people were spinning things from whole cloth. And having things you could click on, or things you could log into, or be able to do things on the computer, to me that really felt like making something real. I'm a member of a family of artists and my dad's a mechanical engineer as well, so we're sort of very creative people and so I had this just obsession with making real things, and so I wanted to make a website because a website seemed so very– I don't know. 

RD Very real.

CW Yeah, obviously it's not tactile, but there was something official about it. It's like the glossiness of a magazine or something. I was always intrigued, I made zines and things like that as well. I wanted to make stuff that felt like other people could hold it, other people could interact with it, and so I just got very excited by ways that I could actually create stuff with technology and create things on the internet and so I learned HTML when I was a kid and started signing up for all the various websites where you could create your own site, whether it was GeoCities or whatever. And then there was a certain age at which I started to collect allowance money or Ziploc bags of quarters and be like, “Hey, mom. I have $25. Can I use your credit card to buy a domain name for the year?” And so I'd have a new domain name every year and try whatever creative thing I could online. And so there was nothing really very advanced about what I was doing, it was really just sort of the presentation of it and being able to express myself creatively and do things like that. And then I got excited about, okay, well, what if people could log into my website and they could actually have an account, that seems extra real. So kind of evolving that forward a little bit was just a theme of all my teen years. I had some website that was my personal site and sometimes collaborations with friends and things like that. And then also around the age of 11 or so, there was this PC program called GameMaker. I think it still exists in some form. 

RD Was that the Garry Kitchen one? I had one when I was younger and it was Garry Kitchen's GameMaker and it was a real old Commodore 64 one. 

CW I'm not sure if our friend Mr. Kitchen was involved in this one, but this one I think was a Windows XP program. And I would make drag and drop paper doll kind of things with my friends, or I had a side scroller game of my cat collecting popsicles, which I sold at my middle school on floppy disks to other kids for five bucks each. I think I made like $15, I sold three copies. 

RD That’s almost another domain name.

CW Yeah, exactly. So lots of creative stuff like that. So anyway, enough about the early years type of thing, but that basically created this level of momentum and excitement about creativity and technology that led me to do some creative collaboration, actually, with my dad, who, as I mentioned, is a mechanical engineer, and he's also an athlete. And so I got even more interested in the sort of physical side of computing and programming microcontrollers and soldering bits and bobs together and trying to make something actually tactile this time.

RD The really real part. 

CW Yeah, exactly. And when he realized that, he got really excited because he's like, “Oh, I have ideas for inventions.” He's a very invention-minded person, and so he gave me a blank check to go to Radio Shack to buy the supplies to create an idea that he'd had, and we built several different things together.

RD That's amazing. But then I saw in the pitch that you didn't go to school for programming.

CW No, I sure didn't. My family also has a sort of language background, I'd say. My mom is from Argentina and I grew up in Alabama, so I think we always felt sort of different in a positive way as a family, and her dad was a very interesting and unusual person who had this incredible collection of bibles in multiple languages. Any language that I could think of as a child, he had multiple bibles in that language. And so I grew up surrounded by these bookcases filled with all these different languages and different alphabets and all sorts of stuff that was just so exciting, and so languages was definitely a big inspiration to me and so I pursued an interdisciplinary undergraduate degree that skewed towards linguistics. We didn't actually have a linguistics department formally, but we did have an excellent interdisciplinary program and so I took every class that could be conveyed as a linguistics class that was available at my alma mater and then some, and really kind of came out with this really fun, holistic undergraduate education that I really value. But to just sort of converge the timelines of what we've been talking about here, these inventions with my dad, that whole activity came right after I graduated from undergrad. So then I was thinking, “Okay, what do I want to do next? What's really next for me?” And I thought, very much so, that I wanted to be a professor of linguistics and get into academia and do all of that, but I didn't really understand what it took to apply for a PhD, nor did I have any corpus of research or mentorship with different professors. Really, again, we didn't have a linguistics program. 
I naively applied across the board, and I cherish my Harvard rejection letter so much that I literally got it framed. And then I saw something amazing on the internet, which was just a student project from New York University's Interactive Telecommunications Program, and it was something somebody had done with physical computing and I thought, “Wow, that's just so cool.” It was so simple, really. It was just a cube that would be cold or hot depending on how it felt outside, and I thought, “Wow, you get to do that at school? That's cool. Maybe I should do that.” So I applied and then I just started doing that kind of stuff at home because I thought, whether I get in or not, this is fun and I want to learn how to do it and do it. And then I was admitted, and so that was what came next. And that program is very unusual. It's very interdisciplinary, they bring people from many backgrounds, and so my concentration there– I mean, you don't really have a concentration there– but my intention while I was there was to end up doing interactive museum exhibit design, which I did to some degree in a couple of internships, but life has a funny way of leading you all sorts of places.

RD Indeed. And these are all interesting, fun projects. I think for a lot of people, eventually you find a way to apply those skills to real world jobs. So talk about how you went from wanting to do the museum design to engineering manager. 

CW Gosh, I was working at a really small startup after internships and so forth, and I had landed myself essentially a technical marketing job, sort of, and I was wearing a lot of hats in order to do that. And marketing is never something I intended to do, and so while I was able to do some interesting creative projects with the product that we were selling and hone my skills in writing documentation and blog posts and things like that, which was all quite fun, I definitely wanted to be more technical in my day to day. And so I ended up pursuing a little bit of education in computer science, so I did an intensive program at NYU intended to help people get the requirements for entering a master's in computer science, learned a bunch of computer science stuff that way, then actually did enter one of their programs towards cybersecurity, which, spoiler alert, I ended up just doing a lot more of getting serious about my day to day job and skipped too many semesters, so I didn't really finish that one. But I learned a whole lot from it, and I really enjoyed the time that I spent there and getting the depth of understanding, really the nuts and bolts of things. It bothered me that I had so many aspirations in the things that I wanted to make with technology, but I felt like I was constantly hacking it and just sort of hitting two bricks together and hoping that it would do what I wanted it to do without really understanding how things worked or what levers I could pull to have the right effects. So long story short, in pursuing this program, I requested that I have a more technical position at that company. Unfortunately, there wasn't really space for me to do what I had experience with, which was a lot of front end design and user interaction design and front end proper– HTML, CSS, whatever, JavaScript– but a need opened up precipitously, like the next day. 
A friend of mine who was working a DevOps position announced her resignation and they said, “Hey, so would you be interested in DevOps?” And I was like, “Okay, yes, but you know I don't know how to do that, right?” and they said, “Yeah, we know, but it's okay, we'll train you.” So I'm endlessly grateful to my first DevOps manager at that company because I really got my first hands-on experience with a domain that I’m more or less in now. 

RD I think that's interesting. You mentioned not knowing how the levers worked and not knowing how things bash together, and now you're in a DevOps platform engineering role where you build the levers and the machines that bash things together. 

CW Yeah, no doubt. 

RD Do you see it as a straight line like that, or did you have to learn more? Do you have to pick up more skills to be able to understand the platform engineering part?

CW There were a lot of skills along the way. I mean, I think there's a lot of hands-on stuff to do. There's a lot of learning how to be an operator. I mean, I think anybody who's been in the sort of DevOps space for long enough or different iterations of it over time has had to be an operational engineer to some capacity at some point. And that's been a really important formative skill I think for me, even though I'm absolutely not doing that in my day to day now as a manager of managers. But understanding what it is to be in the moment of a pretty high-pressure situation, you might be up in the middle of the night, you're making changes to a live production system, maybe you work for a public company, so there's a lot of stuff on the line if you screw it up, and knowing how to mentally prepare yourself to make these changes in a way and to sort of scaffold yourself in that scenario. You can write all the automation you can possibly write and test it well, but there are also these moments where you have to do some procedure, you have to perform as a human, and knowing what that's like in the highest stakes way, I think has also given me a lot of empathy and understanding of what it is to be making changes to production systems at any level. Even if you're just releasing a new microservice or something, you're going to have to sit there and watch the changes happen, and you're dealing with a lot of complexity the entire time. We can't pretend like, “Okay, we have Kubernetes and now there's no complexity.” Kubernetes is complex, and asking all of the folks who work in the engineering department that I support to learn a bunch of things about a bunch of infrastructure abstractions while they're also trying to push hard and do feature work that's really quite complex and they have their own incentives.
Having that kind of empathy for driving complexity differently or abstracting it in a safe way, an actually safe way, has been really interesting for me as sort of a platform product manager now, and I think it's been really helpful and key in that.

RD You mentioned the sort of incident response. When was the worst time you got paged on an incident? 

CW Oh, no. Worst time, okay. So quite a few years ago, we had Kafka clusters that we maintained ourselves. I'm really happy that we have managed Kafka now because it's not a trivial task. And for the life of me, I can't remember what specifically was wrong, but what happened was everything was down. All the production systems were down and every Kafka broker, everything was out of memory, and so there’s, I don't remember how many brokers, six or eight, and fortunately I had a lot of notes already prepared from an unrelated operational activity a few nights before: updating the TLS certificates. We had to terminate TLS there, so I had gone and just replaced certs that were about to expire. And so my style at the time, in order to do it quickly but safely, and we didn't have playbooks and things like that to do this like Ansible, so I had basically all of the IP addresses ready to export as env vars, just a quick copy/paste, broker one, broker two, matching exactly the name tags in AWS for each host. And I had all these commands just ready to go to copy and paste to do everything, and at the time, my terminal-fu was very, very fresh and so I was screen sharing on Zoom with the CTO on the phone and I had my tmux up with all the screens split so I had every single broker up. I'm popping through all of them very quickly, doubling the heap size on every broker and restarting each one and watching all these consumer groups rebalance, and it was pretty horrible, I'm not going to lie. But at the same time, I think for better or worse, I'm one of those people who gets this sort of zen calm in the storm kind of thing when there's an emergency like that, and so I was just plugged in and just going da, da, da, da, da while somebody else is checking logs and trying to figure out some other things. And so that wasn't fun.

RD That's good. So let's talk a little bit about platform engineering. Obviously you saw it from the other side, the sort of code in production, the failures, manually copying, pasting IP addresses. What sort of platform engineering fixes have you done to improve that code in production for your team?

CW For sure. I mean, I think one of the biggest shifts that's made a big difference here is to go from thinking of the things that we do as a collection of projects to thinking of it as a product, and progressing towards a smoother and smoother surface of safer and safer abstractions that are easier and easier to reason about in order to make everybody as productive and safely productive as possible, and also hopefully help them to follow the rules by default. If you just use our stuff, compliance isn't going to be a problem, audits aren't going to be a problem. That's really the idea is to provide all this leverage to everybody. And so for my teams directly, figuring out when can we remove a responsibility from ourselves, how do we reduce the complexity of the things that we own so that we in turn have more bandwidth to reduce the outward facing complexity for the rest of the engineering organization as well. So while we run our cloud, we run our cloud platform systems, we also manage the interfaces that everybody uses to interact with them. And from idea to production runtime, what does that journey look like? 

RD And I think when people talk about platform engineering, I think there's a lot of variability in what that platform is. 

CW Absolutely.

RD Like you said, it's interfaces with the cloud system, whether it's service meshes or whatever proxy traffic shaping you have under the hood. What is the sort of extent of platform engineering? 

CW Ooh, in my context? 

RD In your context or in all contexts. 

CW Goodness, okay. I think I'll speak about my direct experience before I try to generalize too hard here. So what we do here is, what my teams do, we provide the infrastructure itself, the ways to provision the infrastructure, the ways to interact with the infrastructure, and also a lot of things about how you actually develop your code in the moment, so not just where your code goes, but what's inside it and the processes around working with it. So there are shared libraries that we produce to sort of standardize here's how everything should work, we have an application platform ensuring that people get the things that they just need to have out of the box in the first place. So metrics, your integration with our metrics provider is going to be there, just because you're using what we're providing. And metrics, logs, all your basic observability stuff, any interactions that are going to happen with the standard systems that we use for interservice communication or interservice auth are all going to happen the same way as every other service is doing it. And so I think providing this level of leverage means that you don't need to think about that central thing. You don't need to think about that common thing at all, and I don't want you to have to know how Kubernetes works beyond some basics around here's the logistics of the deployment. You want to understand that when your software gets switched out for a newer version, you want to understand some things about the dynamics and the timing around that for sure. But other than that, there's a lot of abstraction that we can safely provide and a lot of detail that we can safely hide to reduce the level of complexity for developers as they work on what they work on. And so anything cloud, anything Linux-based cloud is kind of my side of the shop.

RD Anything that isn't the business logic of the individual programs. 

CW Exactly. There's the product itself, and that's not what my teams are doing. We're trying to do everything around it so that the product development teams can really, really hone their focus on specifically just that. It's definitely something we haven't perfected and it's always a work in progress, but that's definitely the aspiration and we're always checking our priorities on, “Okay, how are we going to do this, how are we going to achieve this thing?” and making sure that we're always doing our very best to prioritize, allowing all of them to focus on that. 

RD And I think one of the things I like about platform engineering is that it feels like this big collaboration in the industry. There's a lot of open sourcing of tools. Whatever company will be like, “Hey, I had this chaos engineering thing. I'm going to open source it. Everybody use it.” Do you use a lot of open source, and do you open source anything? 

CW Yeah. Okay, open source, of course, is, I mean, it's impossible, I think, to be in this industry and not engage in that community somehow. So we definitely use a lot of open source things. We certainly use a lot of managed versions of open source things. Amazon Web Services is our cloud provider, and there was a time when we home-rolled our Kubernetes, though not now. So whether it's Kubernetes or Apache Kafka or whatever it may be, and certainly many other things along the tool chain other than what I've just named. It's countless, it's impossible to enumerate. We don't currently open source anything ourselves, although there are plenty of folks across the teams who have made pull requests to open source things. We find bugs, you find an open issue on GitHub, it's been open for a year and it's driving you crazy because that's the bug that's bothering us. Well, we're empowered to go make a pull request and then empowered also to wait and hope that it actually gets merged to main for us to utilize in a soon-to-come release. That waiting game's never too fun, but we definitely contribute here and there as we see fit.

RD With platform engineering, obviously it's not a solved problem, you're always running into new things. What do you think the big challenges for platform engineering will be in the future?

CW I think the future challenges are a lot of the same challenges that people currently encounter with platform engineering, which is complexity. It's always about complexity and scale. I think that it's very easy to end up with a lot of things that have a lot of integrations with each other and then you've got all this glue, all this glue you have to manage. And that's a lot. So that's one, integrations and how you actually design and architect the sort of things that you compose into a product for your end user, which is your engineering staff. And then I think there's also choosing the right levels of abstraction for people, and there is also a continuous challenge of how much freedom of choice do we afford to the product engineers, and about what? Where should we be opinionated and standardized and where should we not? And I think that that's an interesting tension that's never exactly solved. There are things where it's easier to say, “There should be no degrees of freedom in this,” but then there's this big messy middle of sort of a slippery slope. Engineers love to be creative and they often have strong opinions, and when they're told, “Hey, use this thing,” and they think, “Ugh, I could build this better myself,” and maybe they could, but with what timeline and for whom and so forth. And so, unfortunately, sometimes you have to figure out how to make using something in their day to day more appetizing, even if it's not really all that appetizing to them in the moment. 

RD Software developers, they want to build software. I wonder if you ever run into people who rankle at the level of abstraction you give them, if they're like, “No, I want this at a lower level. You're hiding too much from me.” 

CW That's not the most common problem, actually, I have, but I've run into it before where they will say things like, “You think I'm not smart enough to understand this?” is sort of the attitude almost.

RD You dare insult me? 

CW Right. And it's not an insult, I just want you to use all those amazing brains somewhere else and not think about a pipeline. You have better things to think about. Or just be on my team and build pipelines. It's fine, you can join us. 

RD So what is the most common problem you run into or most pushback?

CW Let's see. I think at this stage in our company's maturity for platform engineering, it's that we're following the 80/20 rule and we probably always will to some degree. And so when we look at our language ecosystems that are alive and well here, some of them get a lot more formal support than others and some of them get a lot more business priority for formal support than others, and that's so tricky. We could try to spread our focus across all these different things, or we could try to nail down the main one. Most of our back end is Java, we’re a big Java shop, and so if my teams were going to spend all their cycles working on one of the languages that we have three services written in, that would be a little bit of a tough sell as far as how we talk about where we're going to make sure we're doing things right as an initial investment. So we're solving for the biggest cases first, which means some communities feel underserved, and formally they are kind of underserved and it feels bad as a product owner to have to disappoint customers and leave them in a state where they're like, “Well, we didn't get that thing that the Java people got.” And so it takes time and it takes time to get there. 

RD I mean, obviously platform engineering, you're developing for the entire product across an engineering organization. How do you get buy-in and alignment from everybody who's working or is affected by your platform engineering efforts?

CW It's a combination of top-down and bottom-up, I think. And I'll speak a little bit to the top-down and how I think top-down can sort of evolve into more of a cultural vibe than a top-down feeling, which honestly, it sounds so simple, but it's so effective. It's really organizational alignment around goals. What are we going to highlight as our priorities for how we work together? And so understanding where is the company going and where does the company want to go, and then how does specifically engineering support that? And so if we have specific qualitative goals, and quantitative, frankly, around what kind of behaviors should we be exhibiting, how are we going to measure these, and so forth, then a team like mine can come in and provide ways to measure how are we doing and also anchor our KPIs for the things that we build for the roadmap to support those goals directly. So for example, let's make sure that our security hygiene is impeccable. Let's make sure that we're resolving CVEs inside of SLA. How do we ensure that we're doing that? We measure it, we provide that visibility to everybody across the organization for each one of their code bases, and also we build tools and processes that make it as close to automatic as possible for them to keep up with it. We can't make it automatic, it's not quite possible, but we can do things like automatically update base images. If we've got Docker base images across the company and my team provides a fresh patched one every so many days, we can make sure that that's in everybody's main branch of their repo because we hold the keys to that platform. So stuff like that, aligning our KPIs with where the company wants to go and making sure that people have visibility into that. 
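
To make the idea of measuring CVE resolution against an SLA a little more concrete, here is a minimal Python sketch of a per-repository score. It is not CLEAR's implementation and nothing in the episode describes one; the `Finding` record, the `SLA_DAYS` windows, the repository names, and the placeholder CVE IDs are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical SLA windows (days allowed to resolve), keyed by severity.
# These numbers are illustrative assumptions, not from the episode.
SLA_DAYS = {"critical": 7, "high": 30, "medium": 90}


@dataclass
class Finding:
    repo: str                         # hypothetical repository name
    cve_id: str                       # placeholder identifier
    severity: str                     # must be a key of SLA_DAYS
    detected: date                    # when the finding was first reported
    resolved: Optional[date] = None   # None means the finding is still open


def within_sla(finding: Finding, today: date) -> bool:
    """True if the finding was (or still can be) resolved by its deadline."""
    deadline = finding.detected + timedelta(days=SLA_DAYS[finding.severity])
    closed_on = finding.resolved or today  # open findings are judged as of today
    return closed_on <= deadline


def sla_scores(findings, today: date) -> dict:
    """Return {repo: fraction of that repo's findings that are within SLA}."""
    counts = {}
    for f in findings:
        ok, total = counts.get(f.repo, (0, 0))
        counts[f.repo] = (ok + int(within_sla(f, today)), total + 1)
    return {repo: ok / total for repo, (ok, total) in counts.items()}


if __name__ == "__main__":
    today = date(2024, 1, 31)
    findings = [
        Finding("payments", "CVE-0000-0001", "critical", date(2024, 1, 1), date(2024, 1, 5)),
        Finding("payments", "CVE-0000-0002", "critical", date(2024, 1, 1)),
        Finding("search", "CVE-0000-0003", "medium", date(2024, 1, 1)),
    ]
    for repo, score in sla_scores(findings, today).items():
        print(f"{repo}: {score:.0%} of findings within SLA")
```

A score like this is the kind of number a shared scorecard could show for each code base; a real setup would pull findings from a scanner and would likely weight severities differently.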

RD That's interesting, the top-down approach, the top-down power. At my last job, I ended up reporting to the CTO and I saw the power of what I call the ‘executive laser beam.’ You could say, “This is something that I see as a problem,” and they're like, “Well, I'm going to add it to all the directors’ goals” for the next performance review cycle, and it's like, “Whoa, okay.”

CW It helps, it helps a lot. And I think also by making it visible in a place that it's one thing if it's somebody's goal and they're looking at it in their private browser tab or whatever, but to have it in an interface –we're using Cortex developer portal– to have it in a place that's shared that people look at actually on a weekly cadence. We have weekly business reviews for each of the pillars of the organization and they're reviewing their scores. How are we doing against these different things we're measuring ourselves on, and it empowers engineering leaders to have tough conversations sometimes with their product partners to say, “Hey, look, we're slipping on this thing that we agreed is really important, even though it doesn't always have a direct impact on how the end user experience is with our product.” So very powerful. 

RD Very powerful indeed.

[music plays]

RD Well, it is that time of the show again, ladies and gentlemen, where we shout out somebody who came onto Stack Overflow, dropped a little knowledge and helped out the community. Today we're shouting out the winner of a Populist Badge– somebody who came on, dropped an answer that was so good it got a higher score than the accepted answer. So congratulations to Patrick Pijnappel for dropping an answer on: “Redirect all output to file using Bash on Linux?” If you're curious about that, we have an answer for you. I'm Ryan Donovan. I edit the blog, host the podcast here at Stack Overflow. If you liked what you heard today, you can drop a rating and review. It really helps. And if you want to reach out to us, you can find us at podcast@stackoverflow.com. 

CW And I'm Caitlin Weaver. I'm a Senior Engineering Manager at CLEAR, and you can find me on LinkedIn.

RD All right. Well, thank you very much, and we'll talk to you next time.

[outro music plays]