The Stack Overflow Podcast

From bugs to performance to perfection: pushing code quality in mobile apps

Episode Summary

Ben and Ryan talk all things mobile app development with Kenny Johnston, Chief Product Officer at Instabug. They explore what’s unique about mobile observability, how AI tools can reduce developer toil, and why user experience matters so much for app quality.

Episode Notes

Instabug helps developers monitor, prioritize, and debug performance and stability issues throughout the mobile app development lifecycle. Get started with their docs.

Connect with Kenny on LinkedIn

Stack Overflow user itoctopus earned a Populist badge by explaining how to Break huge URLs so they don't overflow.

Some great excerpts from today’s episode: 

On why they built a lean, mean SDK: “Nowadays mobile developers spend a lot of time thinking about SDK bloat and how much they're taxing their app’s performance just from the SDKs they’re including. We spent a lot of time and a lot of effort making sure that our SDK has a really minimal performance impact. You can't do this without any performance impact, but we keep the SDK's own footprint as small as possible. A lot of that has to do with the way in which, from years of experience, we capture the information and offload certain information to storage for when we have network connectivity and bandwidth later, so that we're not constantly eating network.”

On the future of self-fixing code and mobile app development: “Our belief is that the place where we're going to see this kind of auto-fixing of code, auto-healing of code, is probably going to be mobile first. So we're invested heavily in seeing that reality. You can think of it as straightforward as crashes, for example. There's a known set of crash error codes, and so there's a known set of crash behaviors. So it's pretty easy for us, and that was what our SmartResolve 1.0 was: getting to, ‘Hey, this is generally how you should solve these types of crashes.’ Our 1.0 version is not giving you code suggestions, but it's at least giving you known best practices from places like Stack Overflow and others that have content about how to solve these types of problems.”

On using AI models to spot UI issues: “We think that there are a lot less deterministic ways to spot a frustration signal. So the thing we're working on is on-device models for your users’ behavior that will allow our SDK to capture a frustration signal that nobody else has. Maybe today when I open my banking app, I usually look at page one and then do a transfer, check out my balance, and now I'm doing this weird swiping behavior because something's not working well. A model could spot that. It wouldn't be reported as a bug, but a model could spot that.”

Episode Transcription

[intro music plays]

Ben Popper Maximize cloud efficiency with DoiT, an AWS Premier Partner. Let DoiT guide you from cloud planning to production. With over 2,000 AWS customer launches and more than 400 AWS certifications, DoiT empowers you to get the most from your cloud investment. Learn more at DoiT.com. DoiT.

BP Hello, everybody. Welcome back to the Stack Overflow Podcast, a place to talk all things software and technology. I am Ben Popper, Director of Content here at Stack Overflow, joined by my comrade in arms, Ryan Donovan. 

Ryan Donovan Yes, sir. 

BP Ryan, I think you and I have had some fun over the years. If you want to get a developer talking, you say, “What is the weirdest bug you ever found?” Everybody's got a story, good to talk about at the bar, a fun little detective mystery novel that you can write for someone. And in our recent developer survey, folks were talking about what they loved and what they didn't love about working as a developer. The number one thing they hate is technical debt. Another of the top things they hate is the time they spend on toil work like debugging. The number one thing they love is clean code and a good working environment. So we are lucky today to get the chance to talk with Kenny Johnston. He is the Chief Product Officer over at Instabug, has had some time in the bare metal world of Rackspace, some time at GitLab, and is now working at the intersection of AI and programming. So Kenny, welcome to the Stack Overflow Podcast. 

Kenny Johnston Thanks for having me. Excited. 

BP So just real quick, tell folks what brought you into the world of software and technology, a little bit about your journey through the different eras, and how you ended up in the role you're at today. 

KJ Sure. I'm a developer by trade. I started out doing development when I was in high school and college as a side job, graduated with a degree in computer engineering, spent some time doing a totally different career in electoral campaigns and politics that we can talk about on a different podcast, but I ended up finding my way back into technology and immediately started into product management. So I was a product manager at HP and then at Rackspace, and in both of those I was working on OpenStack. So I kind of started my tech and product management career in open source technology, including at Rackspace managing our services for OpenStack, Ceph, and Kubernetes, and then transitioned from Rackspace to GitLab, where it was much more directly developer tools. At GitLab, I covered the whole ops side of the DevOps tool chain, so everything from CI/CD to release and deployment to monitoring and infrastructure management. I've always been kind of more familiar, especially from my Rackspace days, with the infrastructure and ops side of things. And one of the things that became very clear to me during my time at GitLab is that one of the most essential digital experiences any company delivers is a mobile app, and most of these tool chains aren't really built for the uniqueness of what a mobile developer does. So I spent some time learning about that and kind of fell in love with the company I work at now. Instabug is really focused on, yes, reporting and fixing bugs from mobile apps, but more on the things that are difficult in a mobile developer’s workflow that their standard tooling may or may not accommodate, and making sure that we provide kind of best-in-breed for those mobile teams that are oftentimes given whatever tooling every other type of developer gets to utilize and told to make do. So I've been with Instabug for about two years, and we've really focused on expanding our portfolio, not just finding crashes and helping teams fix performance problems, but also things like the real difficulties in the release process for mobile apps, or this really unique part of mobile apps where you get feedback from places like the app store that's really hard to debug. So we've been focusing a lot on the whole lifecycle of a mobile app and the journey of a mobile app developer. In the last year and a half, we've been really focused on AI and how AI can help with specifically the pain points around what you described– the kind of toil work of fixing crashes across multiple different app versions that are impacting a very small fraction of your users, or tackling really tricky performance problems that are really hard to reproduce. That's kind of what I've been doing in my career in tech. 

RD Like Ben said at the beginning, we love to talk to developers and find their best bug story. And I think a lot of that is because sometimes, from crash reports or whatever, you get whatever information is available. Sometimes it's cryptic dumps of memory, sometimes you've put breakpoints in at the right places, you've got logging, whatever. What was the sort of ‘aha’ moment with Instabug for how you could reduce the toil for this detective work?

KJ Instabug started 11 years ago as a bug reporting tool– the ‘shake your phone to report a bug’ experience– and this was because, uniquely in the development lifecycle for mobile, there's a lot of manual QA. You want to make sure before you ship it to the store for approval that it's as buttoned up as possible, because if you miss a window with the store, it's a three or four-day delay and you've missed the train. So lots of mobile teams naturally spent a lot of time having QA, whether it's a developer on the team or a third party QA service, reviewing the app, making sure it's functional, and reporting bugs. That used to happen with information in an Excel sheet or a Jira ticket where you don't get any of the details that a developer would actually need to reproduce or debug it. So Instabug started with just, “Hey, when you report that bug, let's capture as much information as we can about what was going on in that device and store that in a way that the developer can understand it and make a decision and reproduce it rather quickly.” We were capturing things like the reproduction steps and the state of the device– the orientation, the battery state, the network connectivity– all the things that tend to be the cause of bugs. We then took that same methodology and applied it to crash reporting, where now it's a lot more about the patterns, because in mobile development and crash reporting, your software is running in a very heterogeneous environment. So it's really about, “Oh, this crash happens to happen on devices that are on this network, or where the battery was in a low state, or where the user took this set of steps,” and Instabug is built to group crashes in a way that helps you actually get to a root cause that you can reproduce so that you can get to fixing that crash quickly. So those ‘aha’ moments are really, I think, different than other crash reporting or error reporting tools. With mobile, you really have to take into account the fact that these patterns are oftentimes more about the environment the software is running in, which in a web app is typically more controlled, but in this case it's uncontrolled, so you're really trying to create a situation where the developer has some confidence in what the root cause was and can reproduce it. That's the whole name of the game. So I think that's the real ‘aha’ moment for Instabug. Our big differentiator is we do that without all the breadcrumbing. Our SDK naturally captures both performance data and crash data and gives you understandable reproduction steps without a developer having to have instrumented something first and then been like, “Oh shoot, I have to go perform a release to be able to reproduce this crash.”
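To make the ‘shake to report’ idea concrete, here is a minimal sketch of what capturing device state on a shake gesture can look like on iOS. All the type and method names defined below are illustrative, not Instabug's actual SDK; only the UIKit calls are real.

    import UIKit

    // Hypothetical snapshot of the environment details that tend to cause bugs:
    // orientation, battery, OS version, device model.
    struct DeviceSnapshot {
        let orientation: UIDeviceOrientation
        let batteryLevel: Float          // -1.0 if unknown
        let batteryState: UIDevice.BatteryState
        let osVersion: String
        let model: String
    }

    final class BugReporter {
        static let shared = BugReporter()

        func captureSnapshot() -> DeviceSnapshot {
            let device = UIDevice.current
            device.isBatteryMonitoringEnabled = true
            return DeviceSnapshot(
                orientation: device.orientation,
                batteryLevel: device.batteryLevel,
                batteryState: device.batteryState,
                osVersion: device.systemVersion,
                model: device.model
            )
        }
    }

    // Hooking the shake gesture: UIKit delivers it as a motion event.
    class ReportableViewController: UIViewController {
        override var canBecomeFirstResponder: Bool { true }

        override func motionEnded(_ motion: UIEvent.EventSubtype, with event: UIEvent?) {
            guard motion == .motionShake else { return }
            let snapshot = BugReporter.shared.captureSnapshot()
            // A real tool would attach a screenshot, logs, and repro steps here.
            print("Bug report started with device state: \(snapshot)")
        }
    }

The point of the sketch is that the report is cheap for the tester (one shake) but arrives with the environmental context a developer actually needs to reproduce the bug.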

BP I think one of the most interesting things about mobile observability, as you mentioned, is the spectrum of environments in which your software might be running. Can you talk a little bit about how you handle the fact that the app could be running on an eight-year-old Android phone or a brand new iPhone? The issue might be generated by the battery or the cracked screen as much as the software. 

KJ A cracked screen we can't necessarily detect, but a battery we can, and the older devices. So one of the things that Instabug is really great at is identifying the root cause– we call them patterns. We capture all of this information, and when you get a crash that has, let's say, 10,000 occurrences, we can help you spotlight the patterns: “Oh, it happens to be happening on Android devices that look like this, or that are this old.” The other really important thing to remember, and we take this very seriously as a company, is that we're here to make your mobile app higher quality, so we have to be very careful that our SDK is optimized to not have any performance impact, particularly on lower end devices. Nowadays, mobile developers spend a lot of time thinking about SDK bloat and how much they're taxing their app’s performance just from the SDKs they're including. We spend a lot of time and a lot of effort making sure that our SDK is very lean. Obviously it has some performance impact– you can't do this without any– but we make sure the SDK itself keeps that impact really minimal. A lot of that has to do with the way in which, just from years of experience, we capture the information and offload certain information to storage for when we have network connectivity and bandwidth later, so that we're not constantly eating up the network, for example. So we do a lot to optimize. The place where we've spent the most time and energy as a company and an engineering organization is in SDK optimization, for sure. 
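Here is a hedged sketch of the ‘offload to storage, upload when connectivity allows’ pattern Kenny describes, assuming a simple file-backed queue and Apple's Network framework path monitor. None of these names come from Instabug; they only illustrate the idea.

    import Foundation
    import Network

    final class DeferredUploader {
        private let queueDir: URL
        private let monitor = NWPathMonitor()

        init() {
            queueDir = FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask)[0]
                .appendingPathComponent("pending-events", isDirectory: true)
            try? FileManager.default.createDirectory(at: queueDir, withIntermediateDirectories: true)

            // Flush only when the network is available and neither constrained
            // (Low Data Mode) nor expensive (cellular), so the SDK isn't
            // constantly eating the user's bandwidth.
            monitor.pathUpdateHandler = { [weak self] path in
                if path.status == .satisfied && !path.isConstrained && !path.isExpensive {
                    self?.flush()
                }
            }
            monitor.start(queue: DispatchQueue(label: "uploader.network"))
        }

        // Capture is cheap: serialize and write to disk, no network involved.
        func record(_ event: [String: String]) {
            let file = queueDir.appendingPathComponent(UUID().uuidString + ".json")
            if let data = try? JSONEncoder().encode(event) {
                try? data.write(to: file, options: .atomic)
            }
        }

        private func flush() {
            let files = (try? FileManager.default.contentsOfDirectory(at: queueDir, includingPropertiesForKeys: nil)) ?? []
            for file in files {
                // A real SDK would batch, compress, and POST; here we just
                // pretend the upload succeeded and clear the file.
                try? FileManager.default.removeItem(at: file)
            }
        }
    }

The design choice the sketch illustrates: the hot path (recording an event) touches only local storage, and all network cost is deferred to moments the device can afford it.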

BP That's a great point. Ryan, I'll let you ask a question after this, but just that there's that double-edged sword that not only are you thinking about all these different variables of the device, but the device is also not hardwired, and so you want to be really considerate of heat and battery usage and all those other things. 

RD I wonder if you're able to kind of dogfood your optimization. Are you able to test your own SDK and software with the Instabug software? 

KJ Yeah. We have quality SLAs for our SDK, so we're constantly measuring whether our SDK is the cause of any crashes and whatnot– maybe it's something like six nines. But we're constantly observing. Our system routes crashes that were caused by us to a specific instance that we then monitor on behalf of all of our customers. 

BP So we've talked a bit about how you got into this game, and we've talked about what's unique about mobile observability and also some of the areas you focus on. Let's talk about the AI element of it. Did you come to this company before that was really an emphasis? Has that always been an emphasis, and in what way does it play a role in helping to ensure apps are bug-free? 

KJ So I came to the company slightly before the OpenAI Cambrian explosion of AI, but I think there has always been this notion, and I believe in this– I'm not a big believer that AI is just going to magically take over every developer's job. That's certainly not the case. I think it will largely, on the margin, help developers reduce the amount of toil work and maintenance work they have. I'm not sure that it's ready for primetime to do that job in a web app, where a lot of the cause of what you might call an ‘issue’ might actually be the infrastructure it's running on. A lot of times, just from my experience at GitLab, the solution to a problem is an infrastructure change or a scaling change or things like that. It's hard, I think, for AI to reason over multiple scopes of code, where one might be a Terraform configuration, another might be the Ruby back end, and another might be a front end. The great thing about mobile, and the reason why I think of all the places where we're going to see AI materially improve the toil burden on software it's going to be in mobile, is that it's a controlled platform run by another operating system, and the back end code is in with the front end code, so it's all one package. That means AI can very straightforwardly reason over the entirety of the software that's getting run. So our belief is that the place where we're going to see this kind of auto-fixing of code, auto-healing of code, is probably going to be mobile first, so we're invested heavily in seeing that reality. You can think of it as straightforward as crashes, for example. There's a known set of crash error codes, and so there's a known set of crash behaviors. So it's pretty easy for us, and that was what our SmartResolve 1.0 was: getting to, “Hey, this is generally how you should solve these types of crashes.” Our 1.0 version was not giving you code suggestions, but it was at least surfacing known best practices from places like Stack Overflow and others that have content about how to solve these types of problems, so we can point developers in the right direction. Our 2.0, and the place where we've spent a lot of time, is in grabbing the right pieces of code to give context– the typical RAG problem of how you give the right context to a generator so it can produce an actual code suggestion for the fix. So SmartResolve 2.0, the place where we've been investing in the last year, has really been about optimizing our retrieval pipeline for the right pieces of code to give a model– at this point we're using public models– so it suggests a fix that's more accurate. And we found that that code retrieval part was difficult. It took us a lot of time to figure out how to get the exact right portions of code through that retrieval process. 
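As a rough illustration of the SmartResolve 1.0 idea– mapping a known, finite set of crash signatures to curated best-practice guidance before attempting any code generation– here is a hypothetical sketch. The table entries, summaries, and links are examples, not Instabug's actual data.

    // Crash types form a small, well-known set, so they can key a lookup table.
    enum CrashKind: String {
        case badAccess = "EXC_BAD_ACCESS"                   // memory access violations
        case invalidArgument = "NSInvalidArgumentException" // bad value passed to an API
    }

    struct Remediation {
        let summary: String
        let reference: String  // e.g. community content with known best practices
    }

    // Illustrative mapping: crash signature -> known guidance.
    let knownRemediations: [CrashKind: Remediation] = [
        .badAccess: Remediation(
            summary: "Usually a dangling pointer or over-released object; audit object lifetimes.",
            reference: "https://stackoverflow.com/questions/tagged/exc-bad-access"
        ),
        .invalidArgument: Remediation(
            summary: "An API received an unexpected value; validate inputs at the call site.",
            reference: "https://stackoverflow.com/questions/tagged/nsinvalidargumentexception"
        ),
    ]

    func suggestFix(for signature: String) -> Remediation? {
        guard let kind = CrashKind(rawValue: signature) else { return nil }
        return knownRemediations[kind]
    }

No model is needed for this 1.0-style step; the value is routing a developer to vetted guidance instantly, with code generation layered on afterward.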

RD It's interesting that you say you've got 10,000 crashes and they all share similarities, and you could look up a fix, but there must be an extra complication in applying that fix to an arbitrary codebase. What have been the challenges of getting that fix to merge with any given person's codebase?

KJ Oh, sorry– it's not about the fixes for any given person. The challenge has been indexing their codebase so that we can make a smart suggestion relative to their codebase. I'm saying indexing, but we're not indexing the whole codebase; we can target, based on what we know from the frames in the crash, the functions that are likely going to be impacted. So let me give the model those functions, and then it can make a more reasonable assertion. A lot of that was re-ranking on our retrieval to improve accuracy. With some things in Swift and iOS in particular, it's hard to reason out what exactly is happening in a function when you give a whole file, so we select specific parts of the function to highlight. So it's more about how you build the embeddings for the code itself, so that when we retrieve from that code, we're giving the model more signal. Actually, when we have the perfect ‘this is the exact code we know we need to change,’ the generation part is pretty good. It's been more the retrieval part that's been difficult. That's where we've spent the bulk of the last six months improving. 
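A simplified sketch of that retrieval step, under the assumption that crash stack frames can be matched against pre-indexed function bodies. A real system would use embeddings and learned re-ranking; naive symbol overlap stands in for both here, and every name is hypothetical.

    // A function-level index entry: the symbol plus just the function body,
    // not the whole file, so the model gets a tight context window.
    struct IndexedFunction {
        let symbol: String     // e.g. "PaymentsViewController.submitTransfer()"
        let source: String
    }

    // Given the symbols from the crash's stack frames, return the functions
    // most likely involved, best matches first.
    func retrieveContext(crashFrames: [String],
                         index: [IndexedFunction],
                         topK: Int = 5) -> [IndexedFunction] {
        let frameTokens = Set(crashFrames.flatMap { $0.split(separator: ".").map(String.init) })
        let scored = index.map { fn -> (IndexedFunction, Int) in
            let fnTokens = Set(fn.symbol.split(separator: ".").map(String.init))
            return (fn, fnTokens.intersection(frameTokens).count)
        }
        // "Re-ranking" in miniature: drop non-matches, then order by score.
        return scored
            .filter { $0.1 > 0 }
            .sorted { $0.1 > $1.1 }
            .prefix(topK)
            .map { $0.0 }
    }

The shape matters more than the scoring: the crash frames narrow a whole codebase down to a handful of function bodies, and only those are handed to the generator.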

BP It's interesting that you say you rely on publicly available models, but you also have a specialization– let's talk Android and iOS, and they have these peculiarities, these particularities. Do you think over time, as you do this and you're able to get signal of, “Yes, you fixed the problem for me,” or, “No, you didn't fix the problem for me,” you could build up a dataset that would allow you to either fine-tune a model or build your own, such that this is a model specializing in mobile bug detection– that's all it does– just like there's now a model that specializes in geometry or protein folding or whatever? Or do you think, like you said earlier, relying on other people's models is the way to go because they invest so much to create them, and then, to your point, doing the sort of last mile work of the RAG to make sure that it does it well?

KJ So there's two things there. One, I think that over time we will build fine-tuned and custom models for the code generation part, let's call it that. And the race there is, “Isn't Microsoft or Google going to be better at code generation than Instabug?” Very likely, so I think it's probably going to land in the fine-tuning sense, but I think we can apply a certain amount of fine-tuning. The real place where I think it's very useful is the words you used, which is: how do you spot a bug? So one of the things we're actively working on, beyond just our SmartResolve 2.0, is the ability to– the way that I think of the evolution of mobile software is that we were always trying to push the boundary of quality. First, it was that you just need a crash-free app. If your app is crash free, that's great. Well, everybody has a crash-free app, so now it has to be a really performant app. We're always going to be pushing the boundary of what a quality app is and what a high experience app is, and it's going to be important that we find new– I call them frustration signals. What are the things that users are not having a good time with that we're not even observing? Instabug has an example of this today that we do that no other SDK does. It's called a ‘frustrated restart.’ You do this all the time on your phone where the app is not responding– it hasn't crashed yet, hasn't caused an ANR, the operating system hasn't killed it– but you pull up your tray, you close the app, and you open it right back up. That's you expressing some frustration; something wasn't happening right in the app. Developers want to know about it. We report that to developers: “This is happening a lot. It happens to happen on this screen with devices that look like this. Let's help you debug it.” But those are all deterministic ways of spotting a frustration signal. We think that there are a lot less deterministic ways to spot a frustration signal. So the thing we're working on is on-device models for your users’ behavior that will allow our SDK to capture a frustration signal that nobody else has. Maybe today when I open my banking app, I usually look at page one and then do a transfer or check out my balance, and now I'm doing this weird swiping behavior because something's not working well. A model could spot that. It wouldn't be reported as a bug, but a model could spot that, and then you're getting these richer, third-level frustration signals, let's call them. We actually have something like that today that's not using an on-device model but lets our customers spot UI issues. This is something that AI is really good at. Let's say there's spillage out of a container, or there's misalignment of text, or the text coloration is off so that it's hard to actually read the text. We can spot those. These are things that even in a pre-production testing environment a QA team member would never report as a bug. They'd probably just be like, “Too much effort to report. I'm supposed to test functionality.” But we can capture those automatically and report, “Hey, we spotted 15 UI issues in your pre-production build. Make sure you fix those before you push to production.” 
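The ‘frustrated restart’ signal is deterministic enough to sketch. Assuming we persist a timestamp whenever the app heads to the background and compare it on the next cold launch, a quick relaunch suggests the user swipe-killed and reopened the app. The threshold and key names below are made up for illustration, not taken from Instabug.

    import Foundation

    enum FrustrationDetector {
        static let lastBackgroundedKey = "lastBackgroundedAt"
        static let restartWindow: TimeInterval = 30  // illustrative threshold

        // Call from applicationDidEnterBackground / sceneDidEnterBackground.
        static func noteBackgrounded() {
            UserDefaults.standard.set(Date().timeIntervalSince1970, forKey: lastBackgroundedKey)
        }

        // Call early in a fresh launch. Because this only runs after the process
        // died and restarted, a short gap since backgrounding suggests the user
        // killed and immediately reopened the app.
        static func checkForFrustratedRestart() -> Bool {
            let last = UserDefaults.standard.double(forKey: lastBackgroundedKey)
            guard last > 0 else { return false }
            let gap = Date().timeIntervalSince1970 - last
            // A real implementation would also distinguish user kills from OS
            // evictions and crashes before reporting the signal.
            return gap < restartWindow
        }
    }

Paired with screen and device metadata at report time, this turns an invisible moment of user frustration into something a team can aggregate and debug.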

RD I wonder what the limits, the sort of possibilities of the frustration signals are. If I lock my phone and I throw it across the room, can you capture the gesture data? Or if I'm cursing at it, can you capture that? 

BP Right. You don't know if it's me using the phone or my kids, so it's going to be tough to understand why. 

KJ I think iOS I believe has some– 

BP Yeah, fall detection.

KJ No, it's facial detection. It's looking at your face, “Can I tell, are you making a frustrated face or a happy face?” You can think about it evolving into– the race is on for the best quality mobile app. Everybody is competing with everybody else whether you're in the same industry or not, and I think the bar is just going to keep pushing higher and higher. 

RD You get the Neuralink signals. 

BP Exactly. What you said is really interesting. This behavior is common: something is annoying me about the app, so I'm dismissing it and restarting it, and that's the opening page of that mystery novel we talked about. There wasn't something clear that you could see– there wasn't a crash or whatever– but for some reason I had an unsatisfactory experience, and so now you want to get on-device with something that has some intelligence and can report anomalous behavior back to you, right? 

KJ Exactly. The bar for mobile apps keeps getting higher and higher. When I use my banking app, I use it right before or after I use Pinterest, and so my expectation of how that app is going to perform is based on how well Pinterest performs, not how well some other banking app that I don't even have installed on my phone performs. So the unique thing about mobile is that you've got 100 percent of a user's attention– it's in their pocket– and the performance expectations are horrible. As a developer, you should just think, “Man, the expectation bar is so high. I have to put in so much effort.” Top mobile teams over and over again tell us that performance and quality is the best feature. It's the thing that drives retention and engagement and keeps people from uninstalling their app. And so we're just seeing more and more of this theme from mobile teams: they've recognized this fact, and they're setting themselves up with expectations of an app quality that's on par with the best apps in the world. It's not only Facebook that can build the best app. And so what we're trying to do as a company is give developers, mobile developers in particular, the tools to respond to that pressure, because it's a lot of pressure for mobile developers.

RD So what do you think the future– do you think phones will increase performance before people figure out how to optimize them down? Will that performance pressure alleviate as the hardware gets better? 

KJ It hasn't so far. The trend is, yes, phones got more performant and didn't crash as much, so people were satisfied, but there's still the next barrier. I don't think this is the kind of thing where we reach some magical asymptote where everyone is satisfied with software in its current state and nobody really cares about software quality anymore. So I think there will always be the push for a better experience: a more performant experience, a less frustrating one. It went from stable to performant. Maybe ‘less frustrating’ or ‘a satisfying experience’ is the next word we're going to use, but I don't think there's any end to that. 

BP I think maybe an interesting way to think about it would be that expectations reset as you change from one medium or one modality to another. Okay, I have some expectations about desktop. Now the mobile revolution happens– I don't have any expectations, and early mobile apps are pretty jank. Now, all these years later, the pressure is on to have the mobile app that you will keep on your home screen and open every day. And when I use an AI-powered chatbot, I expect it often to be pretty jank, and in the future, I'll expect it to be a lot more savvy and responsive.

[music plays]

BP All right, everybody. Thank you so much for listening. I want to take us out and give a shout out to someone who came on Stack Overflow and spread a little knowledge or curiosity. Awarded five hours ago to itoctopus, “How to break up huge URLs so they don't overflow.” A Populist badge was awarded because itoctopus gave an answer that was so good that it got more votes than the accepted answer. Appreciate you sharing a little knowledge with our community. As always, I am Ben Popper. I'm the Director of Content here at Stack Overflow. Find me on X @BenPopper. If you want to hear us talk about something or you want to come on the show as a guest, shoot us an email, podcast@stackoverflow.com. And as always, if you enjoyed the program, do me a favor, subscribe or leave us a rating and a review. 

RD I'm Ryan Donovan. I edit the blog here at Stack Overflow. You can find it at stackoverflow.blog. And if you want to reach out to me with hot tips, cold gripes, whatever, you can find me on LinkedIn. 

KJ I'm Kenny Johnston, the CPO at Instabug. You can find me on X @KenCJohnston and come check out Instabug at instabug.com. We have a full developer sandbox to give you a sense of what it's like. 

BP Sweet. All right, everybody. Thanks for listening, and we will talk to you soon.

[outro music plays]