The Stack Overflow Podcast

How to convince your CTO it's worth paying down tech debt

Episode Summary

On this episode: Matt Van Itallie, Founder and CEO at Sema, a company that assesses code to improve outcomes for users, companies, and developers. Plus, friend of the show and erstwhile cohost Cassidy Williams joins the conversation.

Episode Notes

Sema’s AI code monitor helps companies manage the risks and capture the benefits of AI in the software development lifecycle. Learn how it works here.

Connect with Matt on LinkedIn.

Erstwhile podcast cohost Cassidy Williams is the CTO of Contenda. 

Shoutout to Stack Overflow user Jim, who earned a Stellar Question badge with Docker cannot start on Windows, a question (well, more of a statement) that’s helped 1.1 million people and counting.

Episode Transcription

[intro music plays]

Ben Popper Supercharge your AI development with Intel’s Edge AI resources at intel.com/edgeai. Access open source code snippets and guides for popular models like YOLOv8 and PaDiM. Visit intel.com/edgeai for seamless deployment. 

BP Hello, everybody. Welcome back to the Stack Overflow Podcast, a place to talk all things software and technology. I'm your host, Ben Popper, Director of Content here at Stack Overflow, joined as I often am by my colleague and collaborator, Ryan Donovan, Editor of our blog, and for the first time in a long time, glad to have her back, Cassidy Williams, CTO over at Contenda and a long time contributor to Stack Overflow, both the newsletter and the podcast. How are you doing, Cassidy?

Cassidy Williams Hello! I'm glad to be back. 

BP It's great to have you. So today we are lucky to have Matt Van Itallie, co-founder and CEO over at Sema. We are going to be chatting about how to measure developer productivity and skills, how within a big company, an engineering leader or CTO can make their case to the board or to the rest of the senior leadership team that this is why we need to invest in certain things. Maybe it's tech debt, maybe it's something else. And then we're going to be chatting about the topic du jour– AI code gen, the pros and cons, the things you need to look out for, and how you can measure these things successfully. So without further ado, Matt, welcome to the Stack Overflow Podcast.

Matt Van Itallie I'm so excited to be here. Thank you, Cassidy. Thank you, Ryan. Thank you, Ben. I'm so psyched. 

BP So let's just start out real quick. Who are you, how'd you get into the world of software and technology, and what led you to become a founder and leader of your own company? 

MV Sure. So I learned to code in BASIC on a Commodore 64 with cassette tape memory– I'll show my age. Yes, Ryan, absolutely, that little turtle device. My parents were both coders, and my mom was a coder turned math teacher. To some extent it was in my DNA to found a company that treats code as data, which is ultimately what Sema does. I was on the business side of technology for many years, both in government, working with school districts, and in the private sector, and I really became obsessed with how we can make code, and engineering generally, more understandable to the rest of the organization. Paul Ford wrote this great piece in 2015, which was one of the inspirations for starting Sema: that the man in the taupe blazer might show up and say some very hard to understand things about your code, and you have to spend a lot of money fixing it. And I kept looking at things like Salesforce and marketing automation software, which are useful for individual contributors but can also provide an executive-level, C-suite-level view of marketing and sales, et cetera. And it drove me nuts that there wasn't an equivalent tool to help the rest of the C-suite and boards of directors understand code as well. I spent about two years researching it, read a lot of fun academic papers, and was very fortunate to have a great researcher agree to out-license to us. We're a spin-out from the University of Michigan, and Sema was born. 

Ryan Donovan So you wrote an article for us about a year and a half ago putting together a sort of matrix of skills, quantifying exactly what a junior, mid-level, or senior developer is. I thought that was super interesting, because everybody is trying to figure out what that means. So can you tell us where that idea came from and how your thinking has evolved from there? 

MV So despite coming from such a quant-y background, and despite having the desire for a quantitative understanding of code, it is incredibly clear to everyone on this call, myself included, that code is a craft. Code is not assembly line work; code is not just doing the same thing over and over again. There's so much nuance, so much judgment, so much opportunity to learn and grow, that when you think about what it means to advance one's skills and advance one's career, it's really critical to approach it from a craftsperson's perspective rather than some kind of rote ‘you do these steps, you then advance’ kind of thing. So that was definitely the origin of the piece. I love the question of what's different. I'd say two things have really shaped my thinking, two changes since then in particular. One is around this little thing called generative AI, which you might have heard of. Can I joke like this, or are jokes banned on this podcast?

BP We didn't laugh, but I thought that was a joke and it was appropriate for the podcast. 

MV You were also appropriate in not laughing, because that is the quality of my joke. So Gen AI, for sure, but also really spending time thinking about management pathways– and maybe I'll do that second one first. I'm such a believer in experimentation in almost everything, and especially in careers. Not only should one, over the course of one's career, be experimenting with the size and type of company or organization, but I also really encourage everyone to try at least a little bit of time managing. It could be as a team lead, it could be as an architect or a thought leader, but boy, oh boy, do I recommend it. Literally the worst case is that you realize you hate it, you learn how hard it is for someone to be a manager or a team lead, and you show a little bit more empathy. Best case, you realize you might want to incorporate that into your path forward– whether you want to continue to be an individual contributor but also do other things, or maybe you actually want to do management. So reflecting, if I were to write that piece again, I would definitely encourage people to be testing the waters with where their skill and interest lie on the management and leadership side. And I said that very intentionally, listeners: start with interest and try to assess interest, and then we can talk about skill, because if you find it interesting, the door's open. So don't say, “Oh, I can't do it. I'll never understand this.” No. If you find it interesting, or you're exploring whether you find it interesting, that's the first place, and then skill comes. So that's one thing that's really different, and of course, what it means to be a developer in the age of generative AI is a fundamentally different question, and an exciting one for the future.

CW That point you made about having interest, something I always tell people to do is just sit down and say, “What do you like? What don't you like? What are you good at? What aren't you good at?” Just write these things down, because as you start to actually dedicate time to writing these things down, you really start to see the patterns in there where sometimes it's just like, “I don't like this one programming language and I'm not that good at it,” so don't apply for jobs that use it. That just saves you a bunch of time. And that's a small example, but it kind of goes out to the types of companies that you want to work for, or the types of work that you do want to do, or the types of teams you want to be on, the type of life you want to lead. Having that actually written down and doing that assessment for yourself is so important and I encourage everyone to do it.

BP It's really interesting to hear you say that. I'm going to give a metaphor from something I've been seeing recently. People work their whole lives to be professional football players– we just watched the NFL– and they go through high school and college and get to an elite level, and then they get into the NFL and kind of just wash out after two or three years. And then a lot of these people have been coming into the UFC– I watch a lot of fighting also– and they perform really well there. It's a totally different skill set, but they have all these exceptional athletic abilities, and maybe they learned they like something else. They could have stuck it out in the NFL for another three or four years, grinding away without getting a lot of play time or a lot of money. But if you figure out what you're passionate about and you already have a certain set of skills– maybe it's your programming language, or management versus IC work– you'd be evaluated as far more skilled, useful, successful, and valuable if you figure out where you fit best.

MV Well said. 

CW Just to play on that metaphor, it's so true. I was talking to a friend of mine at the Super Bowl, and his cousin was an NFL player and now he's a farmer. After being on a very winning team, he decided to use his winnings to buy a farm, and he wrangles cows now. So there are ways to use your skills in a lot of different ways; you've just got to figure out how you want to do it. 

BP For programmers it's different, because I think your skill set can usually last you 20 or 30 years. For an NFL player, you've usually got a one to five year window to either make a bunch of money or realize you're going to be successful at this long term, so you've got to make your bets wisely there. So for a second, let's move over to Sema. Matt, you ran me through some interesting examples of how you would go to Acme company and the kind of report you would give them– how that would be useful for the company, but especially for the engineering leaders inside to make their case when they want to say, “Listen, we're going to need a bunch of money this year for the refactoring. I know you don't know what that is or why paying this tech debt down is important when it's not adding to revenue, but I'm going to use this thing that Sema gave me to help make the case.” Can you walk our listeners through that? Because I think everybody who listens to this is either in the business of developing themselves, managing engineers, or working at a business with a lot of software folks and wanting to understand their perspective as well. 

MV Absolutely. I'm going to answer that with a quick detour about one of my favorite topics: logic models. If you haven't spent any time thinking about them, I highly recommend it. I recommend Wikipedia, which has a great article on this. The idea of a logic model is a way to make sense of a system, and I think, tied to Cassidy's great previous point, it requires you to write things down to get clear in your thinking. A logic model, and I have a slightly modified version I'm going to use, has four parts: resources, activities, outputs, and outcomes. Resources are people, money, things that could be used or put to work. Activities are things that are done– straightforward. Outputs versus outcomes is the key. Outputs are things that result from these resources applied to the activities, and they could be an extremely long list. Outcomes are the results that are incredibly important. In my opinion, the right way to use logic models is to be brutally fierce and consistent about what the outcomes are. That's a judgment about understanding what's going on in the organization, to make sure you only have a very short list of the most meaningful parts. It forces discipline, so if you ever look at a logic model with more than four outcomes, it's a warning flag that you need to cull. So Sema, we've been working on nonfunctional requirements since we got started; I've been thinking about this since 2015. Another way of thinking about nonfunctional requirements is code-base health. Another way of thinking about it is security debt, code quality debt, intellectual property risk debt from how you're managing third party licenses, process debt from the inconsistency of the development process. We care about these things deeply. We think about them all the time. But nonfunctional requirements are an output, not an outcome, to organizations. 
As much as I love them, and as much as I know listeners love getting code right, to make getting code right and getting the process right make sense to a nontechnical audience, and especially to a business-savvy audience, you have to put that in the context of real organizational outcomes– I'd say business outcomes, but it could be a nonprofit or a government agency, so let's say organizational. And so for anyone who's making the case– and you're never too early in your career to make the case, because no matter what your career path is, being able to be persuasive about what you need is critical– I really encourage you to frame things in terms of the organizational outcome, and then, if you're talking about technology, how it achieves the organizational outcome. So Ben, now to your question: let's take testing, and let's take the presence or absence of unit testing. We help companies in technical due diligence– we've looked at a trillion dollars' worth of organizations, deciding if they should get purchased or not. I'm absolutely honored to be in that seat. When we're looking at a company during diligence, one of our questions is, “What is your current level of testing, do you believe that more testing is necessary, and what is the business or technology reason to invest more in unit testing?” A business reason would be that we have undetected internal bugs that are showing up for users. We have outages, we have things that are affecting the customer experience, so our net promoter score, our NPS, is lower. That's always the clearest way to get people's attention. You could even say that our throughput is lower because we're not able to deliver our road map, that our feature delivery is slower because not having tests is holding us back. So go that extra step and think empathetically: “We're here to put code to work. 
How does my proposal put the code to work better?” Yes, of course, helping improve the day to day matters for engineers, but it's even more persuasive if you can frame it in terms of what the organization needs.

RD I think, Cassidy, you added a link to the newsletter this week that was about unit testing and how 100% coverage isn't always the best idea. 

CW Well, because a lot of times, if you go for 100% of everything, eventually you're just writing tests, or just writing something, to have the coverage– not to actually draw meaning out of it or to assess that something is wrong.

RD Right, not doing it for a business reason. 

CW It's like having a GitHub streak but all you do is edit a character on your readme or something just to keep that sweet, sweet green square. There comes a point where the number doesn't matter. 
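
The coverage-gaming pattern Cassidy describes can be sketched in a few lines of Python (the function and its business rule are hypothetical, just for illustration): a test that merely executes code earns coverage credit but can never fail, while a test that asserts behavior actually catches bugs.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical business rule: reduce price by a percentage."""
    return price * (1 - percent / 100)

def test_discount_for_coverage_only():
    # Runs every line of apply_discount, so coverage tools count it,
    # but there is no assertion -- this test can never fail.
    apply_discount(100.0, 25)

def test_discount_checks_behavior():
    # Asserts the actual rule: 25% off 100.0 is 75.0.
    assert apply_discount(100.0, 25) == 75.0
```

Both tests push the coverage number up identically; only the second one would catch a bug in the discount math.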

BP I think it's so interesting also because, if I think back on this, if you were buying a reasonably large company– and you said you've worked on due diligence for trillions of dollars' worth of deals– how, in any reasonable time frame, would you go about evaluating code quality? You're not going to assign a huge number of your engineers to pore over their code base and come back and tell you what they think. It might be built in a completely different language or structured completely differently, so that's a big unknown when you're acquiring another software company, and it would be nice to have some benchmarks and some useful metrics for it.

MV No surprise, I think that's a good point. But back to one of the original analogies, Salesforce: if you were thinking about buying a company and doing a sales diligence, you would interview salespeople, you would talk to some customers, you'd look at some contracts, but you would also look at the sales data– you'd get an export of the CRM. And the idea of trying to understand what's going on in sales with qualitative interviews, no matter how rigorous they are, is still going to miss the boat without the quantitative. The state of the art of diligence in 2024 is that a code scan almost always at least covers the open source legal risk– copyleft or copyleft-limited licenses, GPL and such. But of course we think there are many situations where scanning all of that code to get the quantitative to match with the qualitative is quite important.

RD So we talked to a company for a blog post recently, and they said the folks who can quantitatively, automatically measure code quality will have sort of won the game for Gen AI. Do you think it's possible to quantitatively measure code quality, or does it require somebody sniffing around for code smells?

MV Yes is the answer, and I think it requires both. It's going to take an awful lot to say that coders should be removed from the process of evaluating whether or not code is right. Can you automatically run linter checks? Of course. Can you check for CVEs and SAST and DAST security warnings? Of course. But is this the right way to solve the problem? Is this the right way in this context? Are we missing some hidden secondary consequence? However the code was created– whether a human wrote it, or it came from an open source library, or it was Gen AI– the human must be in the loop. Which is why, maybe I'm segueing, I'm both incredibly bullish about the power of Gen AI in the SDLC, but also as bullish, even more bullish, about developers' critical role in making sure that that code is right.
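
Matt's split between checks that can be automated and judgment that can't can be made concrete. Here's a minimal sketch using only Python's standard `ast` module; the rule and the 25-line threshold are illustrative, not taken from any particular linter:

```python
import ast

def flag_long_functions(source: str, max_lines: int = 25) -> list[str]:
    """Return names of functions whose definitions exceed max_lines lines.

    A toy quantitative quality check: trivial to automate, but it says
    nothing about whether the code solves the right problem -- that
    judgment still needs a human in the loop.
    """
    tree = ast.parse(source)
    flagged = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                flagged.append(node.name)
    return flagged

code = "def tiny():\n    return 1\n"
print(flag_long_functions(code))  # -> []
```

Metrics like this one give the quantitative layer; whether a long function is actually a problem in context is exactly the kind of secondary-consequence question that stays with the reviewer.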

CW Yeah. We've all seen the memes and stuff too where it's just like, “Don't touch this file. Anybody who tries to refactor this file increases the number of hours wasted on it.” You have to have understanding, and it becomes a blocker if there's that file or that chunk of the code base that can't be touched and that you have to work around. 

MV For sure. Well said. 

BP So let's move over then to code gen. You hear statistics thrown around from some of the providers of these code gen tools, folks who have visibility into billions of repos saying that 30-40-50-60-70% now of all code being written inside of companies has a little bit of AI auto-gen code included in it. Maybe it's just closing the paragraph or adding a tiny little bit of this or that, but I think you have maybe a unique perspective having gone through the due diligence. What are the pros and cons there? Obviously maybe there's pros in terms of productivity– keeping developers in flow state, keeping them feeling happy. Maybe you can speak to a few others you notice. And then the downsides I think really do have to do more with the security, the licensing, and the legal implications, so would love to hear your thoughts.

MV Exactly. And just to give a little bit of bio on this: after working on other elements of nonfunctional code quality since the beginning of the business, last year we started building a Gen AI compliance and productivity tool at the behest of some of our major clients– so helping navigate this transition, not just diligence in general but Gen AI in particular. So Ben, that high-level view you shared is absolutely right. Let me just start from our most core principle: the absolute biggest risk of generative AI in the software development life cycle is not using generative AI in the software development life cycle. The possibilities of improving developer flow, of increasing developer productivity so that you can deliver the road map faster, are so extraordinary that we would consider it a critical risk if an organization isn't ready to use this in production in the next 12 months. Now, some organizations are at a stage where, based on their regulation, based on their stage, they can be using it in production right now. Other folks– if you're in pharma selling in Europe, you absolutely have to comply with the EU AI Act. So it doesn't have to be tomorrow, but I'd say 12 months from now, any of your listeners should be contributing to getting their organization to the point where it's in production, because if you don't, your competitor organizations will, and it is tremendously advantageous. The mental model I would encourage everybody to use for generative AI is open source. Open source is code you didn't write, but using it can help you code faster and can help you focus on the parts of the problem that you and your colleagues are uniquely equipped to solve. I'd go so far as to say that it would be really, really unusual not to use open source and to write it yourself if an open source library exists. But open source comes with risks. 
It comes with legal risk, comes with security risk, comes with version risk, comes with knowledge risk, and so using that prism– with one really big distinction that we'll talk about in a second– is absolutely the way to think about it. I know some folks are having a little bit of an identity crisis. We're working on a Gen AI tool to help manage generative AI and help increase its throughput, and one of our engineers, months in, came to us and said, “Is it okay that I'm using Gen AI? Because it feels like, I don't know, is that really what it means to be a developer?” And I said, “Do you use open source?” Well, one, thank you for telling me. 

BP Have you ever copy/pasted from Stack Overflow? Just curious. 

MV Exactly. But we're in this mental change where people just have to wrap their heads around it. And I swear, everyone, you can have my email address– I will tell whoever you're working with that the ethos of being a developer includes reuse. You're standing on the shoulders of giants. It really is okay, but you have to manage the risks the right way. So we've talked about the huge, huge benefits, and the huge risk if you don't use it– now let's talk about some of the practical risks. Number one, code written purely by generative AI is demonstrated to be less secure. There was an initial Stanford study, I'm sure you've seen it, but also a really cool one from Cornell University in October, and the data still seems to hold up. The good news is that it's a really solvable problem. Whatever SAST or DAST tooling you're using, whatever CVE detection you're using, it's a must-have for your generative AI pipeline, so to speak, on top of everything else. The second risk is legal risk, and this one is, I think, intriguing and a little bit scary, but also very solvable. I'm going to describe what the legal risk is not first– this is Sema's position. The legal risk in generative AI is not the risk that, because you're using a generative AI tool that was trained on protected materials, such as copyleft materials, the creators of the training material will come after your organization. Sema assesses that risk to be quite low. Now, we always recommend being a good open source partner, especially relative to the size and stage of your organization, but the legal risk from Gen AI is not from creators of the training data coming after your organization. Happy to talk more if you want to go into details. The risk is that the code you are writing with generative AI may not get copyright protection and may not get other forms of legal protection. 

BP Can you define copyleft? Just a simple layperson's definition for folks who are listening but maybe didn't know that term earlier on.

MV Absolutely. Let me dig into copyleft just very, very briefly. When you're using open source code, you can download it from GitHub, the most common place, and it looks like you are able to use it– it's publicly accessible. But all open source code, all code, comes with licenses, and it's not always obvious until you go looking what the actual license type is. Some licenses are very permissive– as the name suggests, you can do whatever the heck you want. Some licenses are very restrictive; you might have to pay for a commercial license. But the most restrictive, in the sense of limiting what you can do with it, is a copyleft license. What copyleft licenses say is that if you use this code, and if you share this code with your users– the term of art is ‘distributed’– you are supposed to give away your code for free, just like we, the creators of the GPL-licensed code, give away our code for free. And this creates a meaningful compliance risk. If you haven't gone through a copyleft review or an open source legal review, it will happen to you at some point in your career: you have to go through and see the full provenance of your code's open source and what the licenses are. And if you're using copyleft code in the wrong way, you'll need to replace it or find a different library that handles it in line with the license. So, 30 seconds of law: in order to get copyright protection– which, of the legal protections, is an especially important one for software companies– a work needs to be human-created. The work of machines does not get copyright protection. The court systems, including in the United States but also elsewhere, have already determined that if a work was entirely created by generative AI, including through prompt engineering, it will not get copyright protection. 
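
A first-pass version of the license review Matt describes can be automated. This is a minimal sketch with an illustrative keyword list and made-up dependency data– real diligence needs SPDX-aware scanners and legal review, not substring matching:

```python
# Keyword lists are illustrative only; real license identification
# should use SPDX identifiers and legal-grade tooling.
COPYLEFT = ("GPL", "AGPL", "LGPL", "MPL", "EPL")
PERMISSIVE = ("MIT", "BSD", "APACHE", "ISC")

def classify_license(license_text: str) -> str:
    """Rough first-pass triage of a dependency's declared license string."""
    text = license_text.upper()
    if any(keyword in text for keyword in COPYLEFT):
        return "copyleft: review distribution obligations"
    if any(keyword in text for keyword in PERMISSIVE):
        return "permissive"
    return "unknown: manual review needed"

# Hypothetical dependency manifest, for illustration.
deps = {"some-http-lib": "Apache 2.0", "some-cli-lib": "GNU GPL v3"}
for name, declared in deps.items():
    print(f"{name}: {classify_license(declared)}")
```

Here the GPL-licensed dependency gets flagged for a distribution-obligations review while the Apache-licensed one passes; anything with no recognizable license string falls through to manual review.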

BP Fascinating. I didn't know that. 

MV And so it's not enough to be using a tool to produce the results. They have also said– the US Copyright Office among others– that we don't know exactly where the line is going to be drawn. Clearly, if a work is 100% human, that's fine for copyright, and if the work is 100% Gen AI, it's not. By definition, a line is going to be drawn somewhere, and so this is the most important risk. That's the interesting/slightly scary thing. The good news is that the remediation and prevention is very straightforward, and I'm going to tell you two magic words: pure and blended. Pure– in this case, pure Gen AI code– is code that you copy/paste in untouched from a generative AI tool. Blended means you've modified it. Almost all of the risks– we talked about security, we talked about intellectual property, and I'd also add maintainability– point in favor of blending the code: making sure you understand it, testing it for correctness, putting it through security, making your own meaningful contributions to it. Every single one of those cuts in favor of blending the code, of modifying it. And if I go back to the open source analogy, that's the one place where it breaks down, and it's fundamental to get this distinction. Open source, as best as you can, you don't want to touch. It is the right license as is. You create license risk if you modify it, you create security risk if you modify it, especially with large libraries. The opposite is true here. So any engineer, any engineering manager, anybody, just hear us out: treat Gen AI as the opposite of open source. Lean into editing it, modify it, for all these reasons, and frankly, Sema's challenge for the next two years is helping 30 million engineers understand that not only is it okay to modify, it's so much better than the alternative. Boy, that was a speech. I know I just went on, but I'm so passionate about this. Please jump in. 

CW It's real though, because there are so many things where large language models are very good at determining what should look right. And they are getting really, really good, especially at generating code and more precise things, but at the same time, they're not perfect. And for all the reasons you said, and just for accuracy's sake, when you use a tool– whether you're copying and pasting it or generating it or using open source– you've got to know how it works to be able to work with it. And if you blindly use generative AI and don't really know how it works and don't edit it, you're going to run into legal trouble, you're going to run into trouble with understanding, and it might just not be fully accurate. 

RD I wonder if some of the legal squeamishness around machine-created work comes from the nondeterministic nature of LLMs. If you have a digital camera with a filter on it and you take a photo, it is essentially machine-created with a human aiming it, but it's going to do the same thing every time. You can copyright the result of it. With an LLM, the machine has a lot of the creation baked in. 

MV Exactly right. And there are court cases that say a camera is creative, but a Xerox machine is not. The law is full of judgment calls like that, and you've said it very well, Ryan. 

BP So Matt, let me ask you, what are you looking forward to for the next year? What are you excited about? And are there any sort of interesting challenges or potential obstacles that you would see maybe arising for a company in the world of Gen AI and specifically code gen? Excited about, nervous about, think people should be paying attention to. 

MV Sure. I am incredibly excited about what generative AI will bring to developers’ quality of life and to organizations’ ability to solve new problems faster and to solve the same problems with higher precision, more customized. I am incredibly, incredibly excited about it. I'm a little bit nervous about the path of implementation. This is so powerful and has such potential that organizations, very understandably, may be quick to mandate and to set aggressive targets for Gen AI usage. And I just can't stress enough that this has to be done with developers, not to developers. The change management to bring in a new tool, no matter how powerful it is, must be done together. And so I'm nervous– I totally understand why, but I'm nervous– about organizations forcing it on their developers rather than engaging with them on how to do this in the right way, a way that respects the craft and respects our teams. I am excited about helping organizations get that right, and excited for the folks who get it right, but it's just so tempting to say, “Just use it,” and that's just not how our field works. Some of the risks associated with generative AI are general: it's a general risk that the code might be incorrect, as Cassidy said, and it's a general risk that the code is less secure. There are also jurisdiction-specific risks. Copyleft, if you're using it, carries the same risk no matter where you are. But the EU has different laws about code than the United States has, and we're almost finished with a study of federal and state level regulations of Gen AI in the United States. There are 862 different bills and laws in play right now affecting folks in the United States– not in every state, obviously, and they're different– so it's a little bit geeky, but I would really recommend that your legal team or compliance team have some way of being informed about the regulations that are changing. And it's different, obviously, than open source. 
Open source, you just look at the license and you can see. You need to find a mechanism to stay up to date when these rules and laws go from potential to actual.

BP Right.

[music plays]

BP All right, everyone. It is that time of the show. We want to shout out someone who came on Stack Overflow and helped to spread a little knowledge around the community. To Jim, awarded five hours ago, a Stellar Question Badge. It's been saved by 100 users. “Docker can't start on Windows.” Well, 1.1 million people have had this problem, and Jim has a solution for you. So thanks, Jim. As always, I am Ben Popper. I am the Director of Content here at Stack Overflow. You can find me on X @BenPopper. You can always email us with questions or suggestions: podcast@stackoverflow.com. We love to have people write in and have you on the show. And if you enjoyed what you heard today, why don't you leave us a rating and a review on the program. It really helps. 

RD I'm Ryan Donovan. I edit the blog here at Stack Overflow. You can read it at stackoverflow.blog, and you can reach out to me @RThorDonovan on X. 

CW I'm Cassidy Williams, I'm CTO over at Contenda. We're building Brainstory at brainstory.ai. You can find me @Cassidoo on most things.

MV And I'm Matt Van Itallie, founder and CEO of Sema. And I had such a fun time, thank you all so much for having me. You can find me at mvi@semasoftware.com. 

BP Awesome. 

CW Thank you, Matt. 

BP All right, everybody. Thanks for listening, and we will talk to you soon.

[outro music plays]