The Stack Overflow Podcast

He helped create Jira. Now he's searching for meaningful engineering metrics

Episode Summary

Dylan Etkin, founder and CEO of Sleuth, joins Ryan to talk all things engineering efficiency, DORA metrics, continuous delivery, and how his psychology degree has proven useful in his work as an engineering manager and startup founder.

Episode Notes

Sleuth helps engineering teams systematically improve efficiency by tracking speed and release quality, preventing slowdowns and bottlenecks, and removing toil and unnecessary friction. Try it for free or see how teams are using Sleuth. Interested in the automations they offer for teams? Check out their public marketplace.

Dylan was an original architect on Jira, so he’s not exactly new to issue- and project-tracking software.

DevOps Research and Assessment (DORA) is a research program that tries to understand what drives successful software delivery and operations performance. 

According to Dylan, one thing the best development teams have in common is their culture of continuous learning.

Connect with Dylan on LinkedIn.

Episode Transcription

[intro music plays]

Ben Popper 70% of the Fortune 500 use Pluralsight to upskill their workforce. Now you can take the same courses to boost your dev skills. Start a free trial today. Visit pluralsight.com/stack to learn more. 

Ryan Donovan Hello everybody, and welcome to the Stack Overflow Podcast, a place to talk about all things software and technology. My name is Ryan Donovan, I edit the blog at Stack Overflow, and my guest today is Dylan Etkin, CEO and co-founder of Sleuth. We'll be talking about all things engineering efficiency, DORA metrics, continuous delivery, all the fun stuff. And he's also an original architect of Jira, so maybe we'll get mad about that too. Welcome to the program, Dylan. 

Dylan Etkin Thank you so much, and hello to your audience. I'm excited to be here. 

RD So at the top of these shows, we like to find out a little bit about our guests. How did you get into technology, and what are you doing in the role you're in now? 

DE So I have been in the tech world for– oh man, I'm aging myself now, but probably about 25 years or so. I started working as a developer long ago. Funny enough, I had a degree in psychology and then realized that wasn't what I wanted to do and took a hard left turn into computer science. I would probably say that, with years as an engineering manager and now as CEO of a small startup, my psychology degree actually came in pretty handy. So for myself, I really did a number of things, but probably the most exciting was that I connected with a small startup called Atlassian when there were basically 20 people there. I had moved to Sydney with my family and I knew of two organizations that were there: Atlassian and ThoughtWorks, and ended up working at Atlassian. And who knew, I thought I joined a startup and it turns out I joined a global behemoth, but it was a great opportunity to learn tons of things. I did engineering there, was an architect, like you say, and ran some engineering teams for them, and I've been running my own startup for the last four years or so at Sleuth. I just had my Sleutheversary, as we like to call it. 

RD Congratulations. 

DE Thank you. 

RD So there's obviously a lot of focus, I would say in the last few years but probably forever, on making engineering teams efficient and tracking that efficiency. A few years ago, I think it was Google that came up with the DORA metrics as a way to sort of measure high performing teams. What do you think about the DORA metrics? 

DE Well, I'm a huge fan, because I run a company where we've actually built something to help track those. But to take it maybe a level up, like you say, engineering efficiency isn't new. I always like to joke that back in the punch card days, if you had fewer punch cards, you were more efficient. But the fallacy of that is pretty obvious to everybody, and I think a lot of the measures that we have tried in the world of software engineering have often been wrong, obviously wrong, specifically to the developers who are doing the work. And one of the great things about those DORA metrics is that there's a lot of research behind them correlating them to organizational performance, and they take something that is very complicated, this idea of concept all the way through to reliable launch, and boil it down into numbers that we can reason about, and you can do things and measure whether you've moved those numbers in some direction or not, which is a very powerful change to how we've done these things in the past.

RD It is interesting. I've seen kind of a shift away from arguing about individual developer efficiency toward team and now organization efficiency. And like you said, there are wrong-headed ways to do it, like lines of code written or something like that. 

DE Number of pull requests, exactly. 

RD Sure. Of the metrics that are thrown around, do you think there's one that's more important than the others? 

DE The one thing I will say first is that you're right that the team focus is super important, or at least from our perspective over here at Sleuth, we think that software is a team sport and that when you focus on individuals and you try and stack rank based off of these sorts of things, all sorts of nightmares come about. You’ll just not get team buy-in and it's not productive. There are other ways to understand if people are performing well or performing poorly. But if we accept that DORA metrics are a team measure and that you can make a difference on those, I would say that the change lead time is where we see a lot of organizations just starting. So if you are trying to get a baseline and understand how well am I doing, how well am I doing in relation to other teams inside of my organization and to teams outside of my organization, change lead time is just a really easy to understand measure. You're talking about: how long am I spending coding? When I put up a code review, how long do I wait for somebody to actually start to look at that? Once they've done so, how long does it take them to complete it? And then once I've merged, how long does it take for things to get into a specific environment? And those tend to be pretty powerful. People have a gut feel of how long those are taking, but they're continuously surprised by the real answers because there's very often something going wrong there that they didn't know about. 
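
For listeners who want to make that concrete, here's a minimal sketch of the change lead time breakdown Dylan describes, assuming hypothetical pull request timestamps. The field names are illustrative, not Sleuth's or any vendor's API:

```typescript
// Hypothetical timestamps for one change; the field names are assumptions for illustration.
interface PullRequestTimes {
  firstCommitAt: Date;      // started coding
  reviewRequestedAt: Date;  // put up the code review
  reviewStartedAt: Date;    // a reviewer actually looked at it
  mergedAt: Date;           // review completed and merged
  deployedAt: Date;         // reached the target environment
}

const hours = (from: Date, to: Date) =>
  (to.getTime() - from.getTime()) / (1000 * 60 * 60);

// Break change lead time into the segments described above.
function leadTimeBreakdown(pr: PullRequestTimes) {
  return {
    coding: hours(pr.firstCommitAt, pr.reviewRequestedAt),
    reviewPickup: hours(pr.reviewRequestedAt, pr.reviewStartedAt),
    review: hours(pr.reviewStartedAt, pr.mergedAt),
    deploy: hours(pr.mergedAt, pr.deployedAt),
    total: hours(pr.firstCommitAt, pr.deployedAt),
  };
}
```

Teams often surface exactly the surprise Dylan mentions here: the `reviewPickup` segment, the waiting rather than the working, tends to dominate the total.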

RD Right. And change lead time is basically from report to deploy, right?

DE There's a little bit of a debate, I think, about how that's interpreted. In the DORA report, I think they talk a lot about you merge and then it's deployed. How we see a lot of folks using it more in practice is understanding that from concept through to reliable launch. And another place where probably we differ or the industry tends to differ from the academic side of it is that the academic side is very specific about production and production only. Almost no team I have ever worked with just focuses on production, so the idea of merging a pull request and then it instantly makes it across to everything, that's not very realistic. And a lot of teams for good reasons can't necessarily measure their developer efficiency by when it gets to production, but they have many pre-production environments where they might be able to quantify their developer flow and then understand that there's a certain lag from a pre-production environment to a production environment. 

RD Sure. There's all the post-coding work: the build time, the deploy. So continuous integration and continuous deployment have been big buzzwords for the past couple of years. Are they the same thing?

DE They're very close. I would say that– and I always have to look it up, I always forget– continuous deployment is something that we don't see a lot of teams doing. Continuous delivery tends to be what teams tend to focus on. So it is rare to see a team that says every commit goes to production. What we generally see is that really high-performing teams are deploying multiple times a day to their production environments, but they're doing it through pull requests, or if they have a large enough team, they have to have a merge train where they're bulking a couple of pull requests out at any given time. But they're absolutely striving for that holy grail, which is smaller batch sizes, so that they can reason about the blast radius of what it is that they're doing and deliver reliably. 

RD With Jira I feel like that was sort of tracking the pre-delivery, the coding portion of the process. Since your time building Jira, do you think that process has changed much for better or worse?

DE I think I halfway agree with you that it's capturing the coding side. I think Jira does a great job of capturing the planning side, and it has a real part to play in the coding aspect of it, but it's almost where ideas go to germinate and to be turned from something high level to something specific and then broken down into things that can be coded and then there's that element of helping shepherd things across to finished code. But a lot of the folks that we work with, and I think a lot of folks in your audience would probably agree that Jira is kind of bad at reflecting the real state of where code is at. So it's this strange collection of things that are not necessarily reflecting reality. It was this thing, this issue, we were going to work on it, it looks like it got put into this state, oh, but it's actually been deployed for three days or four weeks, or actually we decided we were never going to do it. It's a little out of sync with reality, and so from the coding side of things I think it does a not great job of reflecting that, but it does a really great job with the high level planning and communication with the rest of your team. 

RD That's a fair point. My last position was my first encounter with story points, and those are supposed to be a tool for estimation and planning. Whether those story points reflect the actual development is another issue, but the story points attached to an epic always made it into management documents like, “This is how people are doing. This is sort of tracking the performance.” Do you think that is an intended use case, or is that metrics gone wild? 

DE I would say that it's a little more metrics gone wild. Being the age that I am, I remember the agile revolution, and if I remember correctly, the intent was for us to be able to talk about relative sizes and to instigate a conversation amongst developers to make sure that we were breaking things down. And honestly, interestingly enough, I think it was the advent of this idea of batch size, which DevOps has done a much better job of quantifying. It was trying to say, “If you think this is 24 jelly beans, that's far too many jelly beans. Can you re-scope this to make it four?” We used to say, “Hey, we're maxing out at eight,” and there was always the argument of ‘is it jelly beans or is it hours’ and guess what? It was always hours. But all of that was to say: can we do smaller increments? Can we reason about it so that you don't get stuck on something for five days without any help, and so that you can ship value to customers faster? And using those things as a measure for management is another reason why the engineering efficiency market has a lot of past baggage to jettison.

RD And along with measuring engineering efficiency, there's also been this other term I've heard in the last couple of years: value stream management. What is that, and what's the difference between that and engineering efficiency, if any? 

DE I think a lot of people just call it value stream. There's an older topic which was value stream, and I don't think that quite equates to the new term. I've also heard people call it engineering allocation, and the idea is really where we are spending our time. So DORA metrics do a great job of telling you how effectively you're working on the things that you're working on now. My PM said a thing once which I think was great, which was that DORA metrics will never tell you whether you're on track or not. So you could be working really well and really effectively on all the wrong things and spending all of your time on stuff that doesn't matter to your business. Now, if you pair that up with something like engineering allocation or value stream, you can get more of a sense of whether you're working on the right things and whether those things are shipping or not. Think of it– and I'm sure this isn't new to your audience– as this idea of keep the lights on, or KTLO. So you're going to have some sort of mixture of feature work, KTLO, support, bug fixes, maybe infrastructure or tech debt payoff, and generally speaking, from an engineering organization perspective, you want to align that with the business needs. You want to say, “Hey, right now we're happy to charge up more tech debt, so we're going to cap out at 15% tech debt, and we're going to make sure that we're focusing on feature work for 60% of what we're trying to do.” But understanding those engineering allocations of where people are actually spending their time, not just where Jira is telling you they're spending their time, is a little bit of a hard problem.
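
A rough sketch of that allocation idea, with made-up categories and numbers purely for illustration. Nothing here reflects how Sleuth or any tool actually computes this:

```typescript
// Bucket completed work by category and compare against business targets.
type Category = "feature" | "ktlo" | "support" | "bugfix" | "techDebt";

// Hypothetical completed work, measured in developer-days.
const completed: { category: Category; devDays: number }[] = [
  { category: "feature", devDays: 30 },
  { category: "ktlo", devDays: 10 },
  { category: "techDebt", devDays: 12 },
  { category: "support", devDays: 5 },
  { category: "bugfix", devDays: 3 },
];

const total = completed.reduce((sum, item) => sum + item.devDays, 0);

// Percentage of engineering time spent per category.
const allocation: Record<string, number> = {};
for (const item of completed) {
  allocation[item.category] =
    (allocation[item.category] ?? 0) + (item.devDays / total) * 100;
}

// Flag when tech debt exceeds the agreed 15% cap from the example above.
if ((allocation.techDebt ?? 0) > 15) {
  console.warn(`Tech debt is at ${allocation.techDebt.toFixed(0)}%, above the 15% cap.`);
}
```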

RD I’ve also been in software for a while and I wonder if this problem has gotten harder because we're no longer actually shipping on a regular cadence. You don't have a disc that goes out every six months or a download that goes up every three months or whatever. Do you think that the expectation of continuous fixes and feature delivery has made tracking performance harder?

DE I don't. I actually think it's made it easier. Trying to reason about what changed in a three or six month increment is really difficult. There's just so many things that have changed at any given time. There's so much ‘why’ behind the things that have changed that it's really hard to disentangle and learn anything from that big blob of change all at once. So the idea of breaking it into tiny little pieces that are shipping in smaller increments I think is a lot easier to reason about, and honestly, it's easier to measure. So for our product, we tie into all of the real tools that you're using today and we look at your pull requests and we look at your issues and we look at your deployment system and all of that, and I will fully own the fact that if you're a team that's shipping once every three months or once every six months, we struggle to get any level of signal out of the noise that you're generating. 

RD Interesting. We did a post with Charity Majors of Honeycomb talking about getting faster and faster feedback loops so you can figure out what's wrong sooner. 

DE I think it's more satisfying for the individuals that are working in that manner as well. It's better for your customers too. We had an instance just the other day. We did this customer interview, somebody was giving us some feedback on this new automations marketplace that we have and they were like, “I tried to install 15 and it was just obnoxious that I had to keep selecting this thing.” And we were like, “That's a good point.” And one of our developers heard that and was like, “I've got a solution. I'm going to put a little checkbox there, keep you on the same screen, whatever.” It was shipped the next day and we could turn around to that customer and say, “Hey, remember how you said that was annoying? It shouldn't be anymore.” And you can win customers for life when you can move at that sort of speed. 

RD And they don't have to do any legwork on their end. It's just changed and it's better. So I want to switch up a little bit. It's interesting that you were a psychology major. What insights into managing engineering teams do you think you've got because of your psychology background? 

DE I mean, the psychology background just never hurts. It's good to have a basis for the fact that we're all squishy humans and that emotions play into things, and probably for myself, being an engineer and being a little on the weirdo engineer side of things, it helps you get into the mindset of the people that you're working with as an engineering manager. At the end of the day, as an engineering manager or a CEO, you are trying to create an environment where you can get the best out of the people that you work with, and that means being mindful and paying attention to who these people are, what motivates them, how they like to work; not everybody is the same. There are maybe a couple of archetypes of people, so over time you can recognize patterns and the like, but there's no one size fits all, and if you do want to get the very best out of your people, you have to understand where their strengths and weaknesses lie and work with them to put them in a position where they can do the best work of their life. 

RD I actually think that's one of the better ones– that people are these squishy, emotional creatures. There seems to be a new recognition in business that we're in this post-Fordist realm where people aren't just parts of an assembly line. But I've also heard that there's a hard transition from engineer to engineering manager. Why do you think that is? 

DE Ooh, yeah. I would agree with that completely. I've taken a number of folks who were amazing ICs and put them in a leadership position. It's just such a different experience. There are a number of different things that make that difficult. First off, your sense of satisfaction in how the job is actually getting completed changes overnight. I always like to tell this story: if you're an amazing IC– let's say a normal IC does a three. I don't care what that number is, but they do a three. You're so good that you do a five, and maybe if you don't sleep for two nights in a row you can get that up to a six, but it's humanly impossible to get up to a seven. And you go home every day satisfied that you did a five, and that feels good to you because you made a difference. Now you start managing people. Let's say you have five people that are basically doing threes. The cool thing is that you get to reason about 15, you know what I mean? You get to think at the level of 15, but you're not producing that, and your satisfaction has to come from the fact that your team is doing a 15. And if you're really good at what you do, maybe you move a couple of those people up to a four, and now your team of five is producing a 20, and wow, that's a huge difference. But that's a change. That's an incredible mental shift, and just because you were good at motivating yourself to get to a five day after day does not mean that you are good at motivating a team of five people to move from a three to a four, or even stay at a three.

RD Right, you're not doing the work yourself. You can't push yourself a little harder. You have to figure out how it works with the team. So if you're looking to encourage teams to go from a 15 to a 16, what are some of the levers you can pull? 

DE There's so many great things out there. It's really about culture, but then there's a lot of tooling that exists out in the world that can help enforce culture. So DORA and DevOps are a great way to think about structuring a team because you're empowering individuals. Having a very quick planning process means that you can be a little ad hoc but cater to the individuals. There's a lot of tools out there. Ours is one of those, too, where you can set certain guardrails, things like, “Hey, we only open a pull request when there's an issue key reference,” or, “We want to do Slack-based approvals when we move some change from a staging environment to a production environment.” There's tools that will help you set guardrails and take some of the best practices that the best high-performing teams are using today and implement those for your team. And that tooling and that sort of guardrail can help set culture and help explain to your team that this is how we, as an organization or as a team, work. I’ll say one more thing on that too, which is that it's about continuous learning. I think the best teams out there have this mindset of, “We're going to continuously improve. We're going to continuously learn,” and if your group of people has that mindset, there's any number of tools or processes that you can use to do that, but the mindset is the key: being bought into that and not saying that we've always done it this way and we're always going to do it that way.
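
As an illustration of the first guardrail Dylan mentions, here's a minimal sketch of a CI check that fails when a pull request doesn't reference an issue key. The Jira-style key pattern and the environment variable names are assumptions for the sketch, not any particular tool's convention:

```typescript
// A Jira-style issue key looks like ABC-123: an uppercase project key, a dash, a number.
const ISSUE_KEY = /\b[A-Z][A-Z0-9]+-\d+\b/;

function hasIssueKey(prTitle: string, branchName: string): boolean {
  return ISSUE_KEY.test(prTitle) || ISSUE_KEY.test(branchName);
}

// Run in CI; PR_TITLE and BRANCH_NAME are assumed to be provided by the pipeline.
if (!hasIssueKey(process.env.PR_TITLE ?? "", process.env.BRANCH_NAME ?? "")) {
  console.error("Guardrail: pull requests must reference an issue key (e.g. ABC-123).");
  process.exit(1); // fail the check so the guardrail is enforced, not just suggested
}
```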

RD Do you think some of the new AI tooling has a place in putting those guardrails in culture? 

DE I'm trying to understand exactly what the impact of those things is going to be, if I'm being completely honest with you. I've been on a voracious learning journey about what is the art of possible with these tools, and what is the art of reality with these tools today and what we could potentially offer tomorrow. I would say that the jury is still out on those things for me. It seems like the thing that is most effective and most impactful right now is suggesting code and removing the toil of having to stop, go into your browser, Google a thing, take a snippet, and put it into your IDE. Now that's just in the flow. But I would also say that there's a fair bit of early research out there saying that it doesn't make a huge impact on overall efficiency. We've got folks out there that are measuring the DORA metrics and showing that the folks who have adopted these tools are not much more efficient than those who have not. So I think the jury's still out. 

RD Obviously we've heard that they're going to make everybody much more efficient, but do you know what the sort of slowdown is from generating a block of code to production? Because I feel like if you have a block of code in front of you, if it's all correct, you just put it into production. So obviously it's not going to be all correct. 

DE Nowadays, I think probably the issue is that it's just like any other tool. I liken it to back in the 2000s when we had these IDEs. If you were using Java, Java has a lot of boilerplate, and then some of these IDEs just let you say, “Boom, generate the boilerplate,” and I didn't have to type a bunch of junk. It seems like we've taken that up five or six notches where you're like, “I want to do something with the API,” I say it in a code comment, and I get more or less the code there, but I still have to look and go, “We don't get the project that way, we get it this way, so let me change that. It seemed to suggest this thing, but actually I want to prorate the thing, so let me just change this up a little bit,” so there's still some human interaction side. But I was talking to somebody at a conference a while ago, somebody who's at a VC that's very AI-oriented, and she was convinced that there are some big models out there that are able to do a mid-level engineer’s work right now today. That to me sounds a little unbelievable, but who knows? Maybe it'll move in a leap instead of an increment. 

RD Maybe. I've talked to folks and I've heard it's good at scaffolding, but it's not going to give you perfect code. 

DE Yeah. I don't know about you, but I would not have imagined that we would have been this close to where we are right now maybe three years ago. So the folks that are at the forefront of this are leaping us ahead. And I think that every startup or every organization right now is asking a very hard question of themselves, which is, “How do we bring this new tool to bear on the problems that our users have in a way that is impactful and not just shiny?” Because the shiny is almost obnoxious and takes away from some of the efficacy of some of these other things. 

RD With anything that's shiny, you hope that there's something useful behind it. 

DE I think this one has staying power, obviously, because we're learning what the art of possible is and it's evolving every day. But you know that 90% of the things that people are trying with it right now are going to be in the garbage bin in a year or two years, and the other 10% is going to be where the real value is at. 

RD Got to love a new bubble, huh? 

DE It's always something. Back in the day at Atlassian I ran the Bitbucket team and for a while, Git and GitHub, the idea of pull requests and whatever, that was the bubble, and now it's just what we all use to do things. It changes. 

RD So if not AI, maybe a lower-level thing will help performance, something like better automation, right?

DE Absolutely. I have been a huge believer in automations transforming the software industry from day one. So if we think about the things that have removed developer toil and allowed us to focus on the higher-order things that we're trying to solve, CI/CD is a great example. When I started in the industry, it was a little unusual to write unit tests and do CI, because you had to run it all on your local machine. It's just a given now that you would not start a project without writing tests and having it in some sort of CI/CD system. Similarly with deployments: they used to be a nightmarish thing that only happened every so often. Now we've moved to a place where we can automate those and make them a non-event. I will argue that at this point we have built automations on top of each other such that we're in a place where we can take a lot of the best practices that teams are using today to create those guardrails, to create that culture, to create a DevOps culture, and we can delegate a lot of that stuff to automations as well. So when I get excited about how we move the industry forward next, AI might be a part of some of those automations, but to me it's about continuously delegating toil to the robots, where it belongs, and giving humans, and developers specifically, more time to focus on the real work that they're trying to do.

RD Can we automate an efficiency process for teams, or is that a bridge too far?

DE I think to a certain degree we can. There are things– I brought this up earlier in the podcast– where you're saying, “I want all of my developers, when something has gone out to staging, to do a quick smoke check and ask themselves, ‘Is this working the way that I wanted it to work, and can I very quickly verify that and then get it off into customers’ hands out in production?’” And I don't want to leave the tools that I'm using; I want to do that in Slack. We can automate that process. There's a lot of toil in there. If we said we wanted to do that on an individual, item-by-item basis, you're going to spend a couple of hours every time you do it: tracking down who you should mention in Slack, waiting for them to respond, and all these things. That is something that belongs in the robots' hands, and I think once you've done that, you haven't completely handed efficiency off to the robots, but you have said that that, plus 15 other things, is the framework in which we're going to work, and that is setting down a very efficient and effective way of working. So I do think we're very close to that.
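
To make that concrete, here's a minimal sketch of the Slack-based smoke-check nudge Dylan describes, assuming a deploy webhook supplies the change and its author's Slack user ID. chat.postMessage is Slack's real Web API method, but everything else here is illustrative rather than how Sleuth implements it:

```typescript
// Assumed shape of what a staging-deploy webhook would hand us.
interface StagingDeploy {
  prTitle: string;
  prUrl: string;
  authorSlackId: string; // e.g. "U0123456789"; passing a user ID opens a DM
}

async function requestSmokeCheck(deploy: StagingDeploy, botToken: string) {
  // Post a direct message asking the author to verify their change on staging.
  const res = await fetch("https://slack.com/api/chat.postMessage", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${botToken}`,
      "Content-Type": "application/json; charset=utf-8",
    },
    body: JSON.stringify({
      channel: deploy.authorSlackId,
      text:
        `Your change "${deploy.prTitle}" (${deploy.prUrl}) just hit staging. ` +
        `Does it work the way you wanted? Reply to approve it for production.`,
    }),
  });
  const body = await res.json();
  if (!body.ok) throw new Error(`Slack API error: ${body.error}`);
}
```

The point of the automation is exactly the toil Dylan names: the robot figures out who to mention and when, so the human only answers the question.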

[music plays]

RD Well, this is the end of the show. As we do, I'm going to shout out a badge winner. Today, I'm going to shout out a Stellar Question winner. Rynop won a Stellar Question badge for “How do you JSON.stringify an ES6 Map?”, which means at least 100 users saved that question for later. My name is Ryan Donovan, and I edit the blog here at Stack Overflow. You can find it at stackoverflow.blog. And if you want to find me on social media, I am @RThorDonovan on Twitter. 
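
For anyone curious about that badge-winning question: JSON.stringify skips a Map's entries, so you have to convert the Map first. A quick illustration (the example values are ours, not from the question):

```typescript
// JSON.stringify serializes a Map as "{}" because its entries aren't plain properties.
const scores = new Map<string, number>([["dora", 4], ["sleuth", 5]]);
console.log(JSON.stringify(scores)); // {}

// Option 1: convert to a plain object (works when the keys are strings).
console.log(JSON.stringify(Object.fromEntries(scores))); // {"dora":4,"sleuth":5}

// Option 2: serialize the entries as an array, which round-trips back into a Map.
const json = JSON.stringify([...scores]); // [["dora",4],["sleuth",5]]
const restored = new Map<string, number>(JSON.parse(json));
console.log(restored.get("sleuth")); // 5
```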

DE And I'm Dylan Etkin. I am co-founder and CEO at Sleuth. If you are interested in learning more about engineering efficiency or want to measure it for your team, check us out at sleuth.io. We have a link to the way that we sleuth inside of Sleuth, so you can check out our live demo. And if you're interested in some of the automations that we offer up for teams, you can browse our public marketplace at marketplace.sleuth.io. 

RD All right. Well, thank you very much for listening, and give a like and subscribe. It really helps.

[outro music plays]