The Stack Overflow Podcast

The new pair programming: an AI agent that cleans your code as you write

Episode Summary

Ben welcomes Sonar CEO Tariq Shaukat for a conversation about AI coding tools’ potential to boost developer productivity—and how to balance those potential gains against code quality and security concerns. They talk about Sonar’s origins as an open-source code quality tool, the excellent reasons to embrace a “clean as you code” philosophy, and how to determine where AI coding tools can be helpful and where they can’t (yet).

Episode Notes

Tariq Shaukat, the former president of Google Cloud and Bumble, is the CEO of Sonar. Follow him on LinkedIn.

Sonar offers code quality and security solutions that help developers write clean code and remediate existing code organically. Their product SonarQube helps devs ensure the quality and security of AI-generated code.

Watch Olivier Gaudin, founder of Sonar, explain why clean code is the foundation for well-functioning dev teams.

Stack Overflow user Ogglas earned a Populist badge by explaining How to access the appsettings in Blazor WebAssembly.

Episode Transcription

[intro music plays]

Ben Popper Ready to transform the way you manage agreements? Unlock higher efficiency and performance at Docusign Discover on November 20th. With APIs and tools that span the entire contract lifecycle, Docusign Discover equips you to integrate, automate, and optimize your agreements. Register for free at developers.docusign.com. 

BP Hello, everybody. Welcome back to the Stack Overflow Podcast, a place to talk all things software and technology. I am Ben Popper, Director of Content here at Stack Overflow. Today, I am joined by Tariq Shaukat, who is the CEO over at Sonar. As you know on this show, we've talked a lot recently about AI-generated code and all the assistance these amazing new Gen AI systems can provide to developers. There was a ton of feedback on it in our 2024 Developer Survey, with an increasing number of people turning to these tools but not necessarily trusting them or the output they produce. And so today we're going to be talking about how this can be a boost, perhaps, to developers’ productivity, but also how to ensure that the code you're getting from the AI is high quality, is secure, and ends up adding to what you're doing overall as opposed to creating more work. Tariq, welcome to the Stack Overflow Podcast. 

Tariq Shaukat Thank you very much. It's great to be here, Ben. 

BP So tell folks really quickly, I know you have a background in mechanical engineering, then you spent some time at Google Cloud, you worked at a very well-known dating app. So quite a varied set of skills, but tell folks a little bit how you got into software and technology and what it is that brought you to Sonar.

TS Thanks, and again, thanks for the invitation to be here. Being an avid listener of your podcast, I anticipated this question. I was trying to remember what was the first computer I used and I think it was a Texas Instruments TI-99/4A way back in the early 80’s, so I think that's dating me and I probably shouldn't have said this on air, and my first computer was a Commodore VIC-20. Everyone talks about the Commodore 64, but it was the VIC-20, the version before that. And so I've been kind of obsessed from a very young age with technology, computer software, all of that, and really when I went to college, as you said, I did mechanical engineering, but with a real emphasis on robotics, so there was a lot of software development involved with that. I’m kind of self-taught, I wouldn't even call myself a developer per se. I'm sure that that would be an insult to most of the developers listening to this podcast, but I can write code and I sort of have taught myself how to do that. As you mentioned, I joined Google in the early days of Google Cloud as President of the Cloud Group when Diane Greene was there and then with Thomas Kurian as well, and then made the move over to Bumble as President of Bumble, building and shipping various mobile apps, including the dating apps that everyone knows and loves. And about a year ago I joined Sonar, and I have this sort of history of going back and forth from consumer to enterprise and people who use technology and people who build technology, and I find it personally very fulfilling and it gives you a very different perspective on the technology world, I find. If you have to actually use the technology in a Fortune 500 company, then you understand what you should be building. And one of the things that really appealed to me about Sonar is, at Bumble we had hundreds of engineers and I was always worried about the productivity of the engineering team and how we really make sure that life is good for the developers and that they're spending less time on stuff that they don't really want to spend time on and is not really adding value to them or to the company, and how you spend more time on building. When you build a mobile app and you ship through the app stores, if you have quality issues, it really hurts because there's this whole app review process, things like that, so any crash rate issues or any downtime that you have in your app really becomes a big problem. So we had an increasing level of focus on quality and security– of course, there's always an emphasis. And that's really what brought me to Sonar. I started talking to our founder, Olivier, who founded the company 15 years ago as an open source project, and it really just resonated– the value that we're creating and the developer-first approach that we take. 

BP That's interesting. I didn't know that the history went so far back. So tell me what's the origin. 15 years ago, not many people were talking about LLMs and their ability to help us produce code. What was the origin as an open source project and how did that evolve into what Sonar is today? 

TS Yes. Olivier was, until recently, my Co-CEO at Sonar. He's still involved with our next generation efforts inside of Sonar, but he was an engineering manager at, I think, JP Morgan at the time, and he was struggling, both himself and his team, with how to actually continuously write high-quality, secure code. And the big insight he had was that he and his co-founders wanted to provide tools to help developers just get better at their craft. It was really focused on that. I'm not sure he ever thought that it would be a business or that it would be around and thriving 15 years later, but they basically quit their jobs and started this open source project building code analyzers. The initial version was for Java and they made it available in the open source community and it just kind of took off virally from there. And fast forward, a couple of years later they started getting requests from companies of, “Hey, could you add this feature, this feature, this feature?” So it started turning into a business. And if you fast forward to today, we cover 30-ish different programming languages, everything from Java, where we started, and Python and C++, to COBOL and ABAP for SAP developers. We integrate with multiple different IDEs and DevOps environments, so it's really become a platform for software developers, but still very much rooted in that open source foundation that we have. 

BP That's so cool. So you mentioned, one, enjoying going back and forth between enterprise and consumer. Is Sonar a tool that sort of sits in the middle, because in some ways maybe there's a bottom-up adoption process of a team or a department within a company that wants to adopt some of these tools so that they can be more productive but ensure code quality? On the other hand, I could see this being top-down. The head of JP Morgan or NASA or whoever it is might say, “I'm reading all about AI code generation and I want to unleash these tools inside my organization, but not if it means bugs and security risks and headaches are going to come back to bite me next quarter.” So talk a little bit about that. Is Sonar adoption fed from both directions? 

TS I can look at my career, this consumer and enterprise thing, and say it keeps me interested, which is what I sort of solved for, and in 20/20 hindsight, I can rationalize it. And for sure, I'm finding a lot of lessons to pull in from the consumer world because developers are people too, for lack of a better way of saying it, and almost all of our customer base, we have 7 million developers who use the platform today and 400,000 organizations, and I had our team recently go back and look at our largest accounts and how they started, and in almost every case they started with people using our open source platform and then calling and saying, “Hey, I need a $1,500 license,” something like that. We sell packages by lines of code. So it really was an individual developer or small dev team who said, “I need access to certain languages, certain security rules, certain security analysis,” things like that, and it really kind of snowballs from there. And so almost all of our customer base starts with that individual developer, the adopt, land, and expand– in many cases adopting open source and then we land a commercial contract with them and then at some point a CTO or a CIO or somebody says, “Yes, we need to have a standard platform for checking code quality and security. Oh, wow, my developers are already using this, let me go with something that our developers use.” One of the things we think differentiates us in the market is that we are very maniacally focused on how we make tools that developers actually want to use. There's no end of security scanning tools and things like that out there, but generally speaking, what we hear from customers is that it just grows the backlog. You can't keep up and developers almost tune out the noise. The signal to noise ratio is not terribly high. And we pride ourselves on that we may not be the tool that everybody loves, but at least the development team really believes that when we show them an issue, it is a real issue more times than not. Everyone has false positives, but we try and minimize the false positives and maximize the true positive rate. 

BP So one of the things that I had seen when I looked up Olivier and checked out the website was this idea of ‘clean as you code.’ So can you talk me a little bit through how this works? And you just mentioned false positive, so somebody is working and as they're doing it, Sonar in the background is inspecting it, and then it's sending them a message saying, “This might raise an issue in the future” or “You're lacking test coverage here” or “This doesn't fit within our standard best practices.” What kind of messages is it sending and how is it doing that kind of real time, almost pair programming feedback? 

TS So we have basically three products. We have the product that most people know and is the brand that we're most known for, called SonarQube, which is really a self-managed code analysis and workflow system, and I'll come back to that in a second. There's a cloud version of this, a SaaS version that you can use called SonarCloud, and then there's SonarLint, which is the IDE extension for SonarQube and SonarCloud, a plugin that you use right in the IDE. And our basic thesis is that there are a lot of observability tools out there that will help you find and fix issues at runtime, but it gets really expensive to do that, and the earlier you can find and fix issues, the cheaper it's going to be, the more productive you're going to be, and the higher customer satisfaction you're going to have. And so a lot of what we have focused on is how we push this analysis to the earliest possible stages. And so to your point, if you have SonarLint connected into SonarQube, you can actually see these issues surface as you're writing code, and it'll say, “Hey, this may be an issue. Take a look at this,” et cetera. And then as you issue your pull request, you can have a more thorough analysis done and it'll give you a list of issues with, “Here's the severity we think this issue has.” And to the point on ‘clean as you code,’ you can configure quality profiles and quality gates. So we've got a version that we call ‘clean as you code,’ which is super strict and you cannot move forward with your build unless you resolve these issues. And our developers use this in-house. They love it, but it's not for everybody, because different companies have different policies. What we think is, if you use that quality profile, you're going to prevent any issues from accumulating later on. And so you reduce tech debt, you reduce all the bug fixing and remediation and things like that that kind of plague the development world. But really any company can configure their own quality profiles, and one popular one, as you mentioned, is around test coverage– just, “Hey, you can't move forward if test coverage is not above X.” So we'll give you all the quality issues and security issues we see and you can put the gate in to say, “Let's not move forward without passing this check on code coverage,” or, “All high severity issues must be resolved, but not low severity issues,” or things like that.
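
To make the quality gate idea concrete, here is a minimal sketch in Python of the kind of check Tariq describes. It is purely illustrative: the metric names, thresholds, and structure are assumptions for this example, not Sonar's actual configuration format or API. The gate looks only at new code and blocks the build when coverage or severity thresholds aren't met.

# Minimal "quality gate" sketch: evaluate metrics for *new* code against
# configurable thresholds and decide whether the build may proceed.
# Metric names and thresholds here are illustrative, not Sonar's real schema.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NewCodeMetrics:
    coverage_pct: float          # test coverage measured on new/changed code only
    high_severity_issues: int    # issues the analyzer rates as high severity
    low_severity_issues: int


@dataclass
class QualityGate:
    min_coverage_pct: float = 80.0
    max_high_severity: int = 0               # "clean as you code": no new high-severity issues
    max_low_severity: Optional[int] = None   # None means: don't gate on low-severity issues

    def evaluate(self, m: NewCodeMetrics) -> List[str]:
        """Return the list of failure reasons; an empty list means the gate passes."""
        failures = []
        if m.coverage_pct < self.min_coverage_pct:
            failures.append(f"coverage {m.coverage_pct:.1f}% is below {self.min_coverage_pct:.1f}%")
        if m.high_severity_issues > self.max_high_severity:
            failures.append(f"{m.high_severity_issues} high-severity issue(s), max allowed is {self.max_high_severity}")
        if self.max_low_severity is not None and m.low_severity_issues > self.max_low_severity:
            failures.append(f"{m.low_severity_issues} low-severity issue(s), max allowed is {self.max_low_severity}")
        return failures


# Example: a pull-request build that stops when the gate fails.
gate = QualityGate(min_coverage_pct=80.0, max_high_severity=0)
metrics = NewCodeMetrics(coverage_pct=74.0, high_severity_issues=1, low_severity_issues=3)
failures = gate.evaluate(metrics)
if failures:
    raise SystemExit("Quality gate failed: " + "; ".join(failures))
print("Quality gate passed; build may proceed.")

A strict ‘clean as you code’ profile would tighten every threshold, while a looser profile for a low-risk internal tool might only gate on high-severity issues, mirroring the per-team configuration described above.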

BP It's interesting to hear you say that. I think that there's a duality to being a software developer, which is to say, folks often talk about the burnout that comes with technical debt and the fact that a lot of the work ends up being looking at tickets or refactoring or trying to convince the folks on the revenue-generating side that you need extra headcount to address these issues before you can go out. But then the other side of it is, how do you get in that flow state where you feel creative and you execute on the business logic or you come up with something brand new that actually unlocks some kind of new functionality? And so I could see, as you mentioned, maybe there is a smaller startup where that level of quality control feels like it's hemming them in and they want to push a little bit in terms of getting stuff out before they then do the checking. The thing that is really fascinating to me is this question of code quality and whether there is an asymptotic end to it, because the idea that you could write more code because of these AI services is interesting, but at the end of the day, more code doesn't mean that it's better code or that you're coming up with a better solution to the business problem at hand or to the product-market fit in terms of how many consumers are going to recommend your app. But it does seem sort of difficult to argue with the idea that, of whatever code you write, if when it's done it's super robust, it's doubly redundant, and it's completely covered by your tests, then you as the developer are going to have a better experience over the next year and so is the user. And so, to get back to mechanical engineering (and I know you have NASA as a client), if you can take the same amount of time as before to write code that's been more thoroughly checked and test-covered and will give you fewer headaches in the future, then that's inarguably a better thing, whereas more lines of code may not be a better thing. It kind of depends on the situation. 

TS It goes back to the old quality, not quantity thing, and I think that applies kind of universally. One thing that we very much see in our customers, and there have been some research papers published by Google's Developer Experience team and others recently that kind of indicate this as well, is this broken windows theory of software development that I find really interesting, which is that if you signal to your developers that you care about quality and you care about security, because you've got tools and you've got checks and you've got some level of rigor in the process, that actually leads to higher quality and security, but it also leads to higher developer happiness and productivity. It just creates a better environment for the developer to operate in. It's not one size fits all. If you are at a large bank and you're building the wire transfer system, okay, you really can't have that go down. It will cost you tens of millions of dollars if that goes down, so maybe your quality profiles and quality gates there need to be very different. If it's an HR system internally that you're writing, maybe there's a slightly different set of quality profiles. And if it's a mobile app that you can just sort of ship quickly, or a web app where you can ship changes very quickly and the impact is low, you can have a much different view of quality. So I think this idea of let's actually tune the way that our workflow works based on the criticality of the app is really important, and I think the same is true with AI code generation. A lot of the real success stories that I hear about and see are really not for simple apps, but for apps that have relatively little downside if it goes down for a couple of minutes or there's a minor security breach or something like that. I think JP Morgan's AI team just put out some research on hallucinations in banking code, and I think it's probably obvious to say that it's a much bigger deal in banking code than it would be in my kid's web app or something like that. So quality is going to matter more in certain applications. 

BP It's interesting to think about what different companies demand and how they would approach this. I was speaking recently with someone who was formerly at SpaceX and Starlink, and those companies obviously achieved some incredible results, but he also mentioned that the mentality that it's okay to fail comes from this idea that we're going to launch some rockets and some of them are going to crash and that's going to teach us something, and that that even trickled all the way down to the software, that the push from above was to work as hard as you can, be passionate about this, and if you make mistakes and there's a bug or the site goes down, that's okay. That's kind of the ethos within the company. We'd rather be creating new things and learning from our mistakes. So does Sonar have its own proprietary or internal AI system that you developed to do code checking, or are you calling on some of the frontier models for this work?

TS Actually, before I answer that, to your last point, what we see actually generating a lot of value for developers is what we've called inside of Sonar ‘learn as you code.’ The idea of a black box system that's just telling you to fix these issues, but we're not going to explain to you why and we're not going to help you fix it, is much, much less interesting than the idea of, “Hey, there's a mistake or there's an issue here. Here's what we think is causing the issue. Here's how you can avoid the issue in the future.” And that's actually one of the most sought-after features, both by individual developers and by companies who are struggling with this idea of how you take your junior engineers and make them senior engineers over time. 

BP The AI as a mentor pair programmer, trying to level you up. 

TS Exactly. So then to your question, as you mentioned earlier, 15 years ago there was AI, but it was a different type of AI. It wasn't generative AI, it was the more statistical reinforcement learning, et cetera, and even that 15 years ago was pretty rudimentary. The core of our system is really a deterministic, rules-based system (to oversimplify) that has 5,000 or so different scenarios, depending on the language and what have you, that we look at. And so it's a very thorough algorithmic review of your code base to identify issues. What we are seeing now with all the advances in AI is that we can do a couple of things. One of them is that there are some problems that lend themselves to super deterministic systems, and there are others where there are gray areas, and actually the generative AI type of approach, the more reasoning type of approach, is better at solving those issues. So it's actually expanding the universe of problems that we can look at. We're still using the rules-based system for a large number of problems, because we actually think it works really well. And not everything is going to become a generative AI problem, and we're supplementing it with new approaches that will help us cover issues you couldn't identify before. So that's one piece, but then the other part that is really important is that in the past, we've been able to identify an issue for you, tell you that we think it's this type of issue, here's the rule it triggered, here's an explanation of why we think it's an issue, and give you some learning. Now what we can start to do is connect remediation to identification of the issue. And this is, I think, one of the more exciting use cases that I've seen around generative AI. Of course, writing new code is always going to be exciting and interesting, and there are people doing great work there, but Stripe put out a study a couple of years ago that said that something like 40 to 50 percent of a developer's time is spent on toil work– doing debugging, refactoring, documentation, et cetera. And all of these issues are areas that we think Gen AI can really be helpful in, and so for our purposes, we've got all this context about our analysis of your code and what the issue is, and that then leads into how we help create a fix that we can suggest to the developer. Again, we don't really believe in black boxes, so this is suggested to the developer and we let them decide whether that's the right fix or not. And so we call this AI Code Fix, to be super literal from a naming standpoint, and there we rely on external foundation models: OpenAI's models, Claude 3.5, that sort of thing. 
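
As a rough illustration of how a deterministic rule can be coupled with a suggested remediation, here is a toy sketch in Python using the standard library's ast module. It is an assumption-laden example, not Sonar's analyzers or the AI Code Fix pipeline: the rule ID, the finding format, and the fix text are all made up for illustration. It flags one classic issue, a mutable default argument, and attaches an explanation plus a proposed fix that a developer could accept or reject.

# Toy rule-based check: flag mutable default arguments in Python functions
# and pair each finding with an explanation and a suggested fix.
# Illustrative only; not how Sonar's analyzers are implemented.
import ast

RULE_ID = "toy:S0001"  # hypothetical rule identifier, not a real Sonar rule


def check_mutable_defaults(source: str, filename: str = "<memory>") -> list:
    """Return a list of findings, each with a message, an explanation, and a suggested fix."""
    findings = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for default in node.args.defaults:
                if isinstance(default, (ast.List, ast.Dict, ast.Set)):
                    findings.append({
                        "rule": RULE_ID,
                        "line": default.lineno,
                        "message": f"Function '{node.name}' uses a mutable default argument.",
                        "why": "Default values are evaluated once at definition time, so the same "
                               "object is shared across calls and can leak state between them.",
                        "suggested_fix": "Default to None and create the object inside the function "
                                         "body, e.g. 'items = items if items is not None else []'.",
                    })
    return findings


sample = (
    "def add_item(item, items=[]):\n"
    "    items.append(item)\n"
    "    return items\n"
)
for finding in check_mutable_defaults(sample):
    print(f"{finding['rule']} (line {finding['line']}): {finding['message']}")
    print("  why:", finding["why"])
    print("  suggested fix:", finding["suggested_fix"])

In the flow Tariq describes, the remediation step is where the external foundation models come in: the deterministic finding and its surrounding code context inform a suggested patch, and the developer decides whether to accept it.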

BP To think back 15 years ago, what you're saying makes a lot of sense. Back then it would be much easier to write a set of rules for each language that say, “When you see this, or when this exception is raised, you know what to do.” Just let the developer know as they're writing, and that removes that later toil: instead of it being a big spaghetti mess that you’ve got to unwind, you're fixing these problems as you go along, which is nice. Now we have this incredible new capability of AI that is probably the thing that makes me the most interested to be working in this area. Another thing that gets me excited is when you look at something like AlphaFold coming up with novel ideas, or you look at the recent Strawberry release from OpenAI, where you can watch an AI system go through this intricate and lengthy chain of reasoning, trying 17 different ways to get to the bottom of a cipher, looking and seeing when it's made its own mistakes, circling back, coming up with a second idea, testing its hypothesis, and eventually getting you there. And to your point, so now you have sort of two ways of working with AI, almost like that thinking slow and thinking fast. There's our rules-based way that's going to be giving you your basic, “We're following along. We're trying to teach and we're catching your stuff,” but there's also a, “Man, this is kind of a thorny problem. How are we going to untangle this?” In the background, this set of AI agents is going to read some documentation and go check up on some new APIs and maybe come back to you with a couple of ideas for a solution you could implement, which is kind of astounding, and maybe not every developer will feel comfortable with it suggesting so much of their work, but maybe some of them will feel like it brings them good ideas that they can then execute on. 

TS And I think a lot of our focus is on how do you reduce that toil and really let developers focus. There's a lot of people talking about, “Oh, they can focus on architecture, et cetera.” The developers I talk to actually want to code in addition to doing that. But to your point, really thinking about how you free up time so they can do their best work and get the most satisfaction and impact is really important. The other part that we hear time and again with companies implementing generative AI is that you need what we're calling a ‘trust but verify’ approach. You can trust the OpenAI models, you can trust Claude, you can trust any of these systems, but if you're working in NASA or you're working in the banking system or you're working at a retailer and you're working on the checkout system, you need assurance that the code you're writing is actually good code. And so we're kind of used in two modes inside of companies. One of them is helping developers catch and fix issues as early as possible so that they don't have as much rework that needs to get done later on, but then secondarily, it is assurance for the companies that this code is being written in the way that we like. Some of these large banks are basically software companies. They've got tens of thousands of developers and they need to be able to say to their regulators and to their boards and to their CIOs and whatever that yes, we have the quality controls and assurance in place. This is something that with Gen AI we are hearing is becoming a problem. Hallucinations are, in many ways, a feature of the Gen AI systems and not a bug, meaning that the way the math works is that they will generate some issues that are incorrect, some code that is incorrect. And the real question is how do you couple the benefit of that with the sort of systems that actually check it and can help you find and fix problems as they occur. 

BP That's one of my favorite points from Andrej Karpathy's essays: that these are dream machines. I don't know why you think you should use these as a search engine or to generate completely secure code. They were created to make things up. That's what they were designed to do, and now we're trying to shoehorn them into these new use cases. But to your point, you can use them in a workflow with different kinds of agents or more deterministic AI, and in that way you can maybe tap their potential while also reducing the downsides. So just as we wrap up here, what are you looking forward to over the next 6 to 12 months? Are there things coming on board in terms of capabilities or is this just about expanding the business? What are you thinking about and planning to work on in the year to come? 

TS It's an amazing time. I think you said this earlier: it's an amazing time to be in the software development world because there's so much change happening now. I wake up every morning and look at all the startups that have been funded and it's sort of mind-blowing. And for us, we have these three parts of our business. The first is how you identify issues, and we're applying AI to that, but also continuing the work we do on the deterministic side. That's one core area. I'm very, very excited about the second, this idea of how you remediate the issues that are found faster and better. And what's going on in the AI agent world I think is going to be really revolutionary, perhaps not on writing new code or building a new app (it might be, I'm actually not super deep there), but when we find an issue, how do we actually go through a process, using a combination of generative AI, reinforcement learning, et cetera, these reasoning chains, to help you fix these issues or at least propose fixes for developers? And I see a lot of exciting work happening in that area. And then the third piece is the most boring part to talk about, but how do you actually make Gen AI coding ready for primetime? One CTO of a large bank told me that they're having an outage a week that they're root-causing back to a generative AI model. And that's not a Gen AI problem, it's a failure of systems and processes. I don't know any developer who grew up wanting to be a copy editor for AI-written code. And so they need systems and tools to help them get the full potential. So we're investing really in all three of those areas, and I think six months from now it's going to look very different than it does today. 

BP It's interesting, because over the last two years of developer surveys, I think we've seen the mentality among software developers shift kind of in this direction. Last year they were almost exclusively using it for code generation. They weren’t sure if they wanted to trust it, but it would write things and they would look at it. This year that's still the number one thing, but the area where everybody sees it heading is testing and documentation, which to a large degree is what you're talking about. How can it help me get rid of that 40 percent of toil work and focus on the stuff that keeps developers happy, to your point, the stuff they enjoy and feel agency over and are creating, and then make sure that as you go through that workflow and take on LLMs as assistants, you have some processes and procedures in place so that you're not getting an outage a week that you're hearing about from your boss, because I'm sure that's not fun. 

TS You're definitely hearing about it if that's happening. In many ways I think the easiest source of value from the Gen AI tools is going to be to eliminate some of the easy work. And there are all sorts of questions about what happens and how you learn to be a senior engineer if you never do the junior engineer work and things like that, so I think there are problems that come with this, but you can solve some of that, and then you can take care of some of that grunt work– the work that nobody really enjoys doing but needs to get done. And just by doing that, you're going to improve developer productivity and happiness immeasurably, or hopefully you can measure it, but immensely is a better way of saying it. 

BP Exactly.

[music plays]

BP All right. I want to take us to that time of the show where we shout out a user who came on Stack Overflow, somebody who contributed a little bit of curiosity or knowledge. Awarded two hours ago to Ogglas for “How to access the appsettings in Blazor WebAssembly”: a Populist badge, which you earn when you come along with a better answer than the accepted answer. So thank you for that, Ogglas, and congrats on your badge. As always, I am Ben Popper. I am the Director of Content here at Stack Overflow. You can find me on X @BenPopper. If you'd like to come on the show as a guest or you want to hear us talk about a certain topic, hit me up with an email, podcast@stackoverflow.com. And if you enjoyed today's conversation, do me a favor, subscribe and leave us a rating and a review. Tariq, just say your full name, your title, where you want to be found on the internet if you have social media or LinkedIn, and then the best place for listeners to go to learn more about Sonar or whichever particular product you want to call out.

TS Sure. I'm Tariq Shaukat, CEO of Sonar. You can find me pretty easily on LinkedIn, @TariqShaukat on X, and we are at sonarsource.com. 

BP Awesome.

[outro music plays]