Syed Hamid, founder and CEO of no-code test automation platform Sofy, joins Ben and Ryan to talk about scriptless automation, why his platform targets mobile app developers, and what he learned in nearly two decades at Microsoft.
Sofy is a no-code test automation platform for mobile apps. SofySense is their OpenAI-powered AI assistant. See what they’re up to on their blog or check out their open roles.
One of the biggest challenges in testing is deciding whether to use mock or live data.
Interested in reading about how Stack Overflow is building up our test coverage?
Syed is on LinkedIn.
Congrats to Lifeboat badge winner Todd A. Jacobs for interceding between the question How can I check whether a string is an integer in Ruby? and the relentless march of time.
[intro music plays]
Ben Popper Hello, everybody. Welcome back to the Stack Overflow Podcast, a place to talk all things software and technology. I am Ben Popper, Director of Content here at Stack Overflow, joined as I often am by my colleague and collaborator, the Editor of our blog, the illustrious newsletter manager, the experienced technical writer, Ryan Donovan.
Ryan Donovan Oh, thank you, thank you. Glad to be here.
BP So Ryan, a couple of topics that have come up recently on the pod, and that are buzzing around in the discourse for software developers in general, are AI and ML and the degree to which they can help with writing or testing code. We recently had the folks from Codium on to chat about this, and our guest today has ideas in a similar vein, probably with his own approach, and I'm excited to hear about it. So from your perspective, Ryan, writing tests is something no developer really wants to do, and saying, “I got to this level of test coverage just so I could pass,” usually means you write bad tests. So maybe this is one of those areas where automation won't eliminate jobs, it'll just make everybody's lives better. I don't know, what do you think?
RD Yeah, I mean, I remember the days when there was a lot of manual testing and we had a whole team of QA folks going through the program and clicking buttons. So I think the less contact you can have between the developer and the actual testing the better. It's repeatable, it’s reliable.
BP Yeah. All right, so without further ado, we'd like to welcome Syed Hamid, founder and CEO at Sofy, onto the program.
Syed Hamid Thank you for having me, Ben and Ryan. Nice meeting you.
BP Nice to meet you too. So Syed, for folks who don't know, tell us a little bit about your background. I know you spent two decades at Microsoft and were an engineering manager there. They are maybe the top dog again; history has a way of repeating and rhyming and going in circles, and when it comes to AI, their partnership with OpenAI, and all the things they're pushing out at a tremendous pace, they're in a strong position. Talk to us a little bit about your time there, what you learned, what it was like working there, and maybe reflect a little on how they ended up in the pole position.
SH Yeah, so I actually started at Microsoft right out of college back in ‘97. I started as a software test engineer and grew through the ranks to be part of the eight-member leadership team responsible for all the developer and tester activities within the company, and I worked in six or seven different divisions, from MSN narrowband and broadband back in the day to, most recently, compilers and languages for the Dynamics product line. And I saw this problem firsthand, how difficult it was to just test and release, and to do that on a day-to-day basis. It was frustrating, and I saw it as an opportunity because it's a much bigger problem than Microsoft alone; it's an industry problem. So finally I decided to jump ship and I started Sofy.
RD So tell us a little bit, starting at the beginning, about automating testing. I know people write test scripts. You want to do scriptless testing. How does that work?
SH So before I go into that one, I think it's important to look at a day in the life of a QA engineer and how people actually have to test. Testing itself comes a little later, but look at the whole flow: there's the CI/CD integration, setting up an environment, using different frameworks like Appium and Selenium to create the automation, and then the fourth element, executing at scale. All four of those components create a lot of friction. So how I see the problem is not just having scriptless automation to create tests, but actually transforming the user experience, in this case the QA experience, from setting up the environment through running it at scale. Today, as you know, people write a bunch of scripts for the CI/CD integration, more scripts to set up environments and devices, and then write code on top of that. That's where we come into the picture. We've taken a slightly different approach and looked at the progression of software testing: you have manual testing, then automated testing, and now no-code testing. How we see the future is intelligent testing, and what intelligent testing means here is the software's ability to identify for itself how to test. We've reached a point where we think the technology can generate the tests and execute them. That's our approach at the 10,000-foot level.
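For context on what “scriptless” replaces, a traditional hand-scripted mobile test looks roughly like the minimal sketch below, where every step is coded against a framework like Appium. This is a generic illustration assuming the Appium Python client and a local Appium server; the device name, app path, and element IDs are hypothetical.

```python
# Minimal sketch of a hand-scripted Appium test (Python client).
# Server URL, capabilities, and element IDs are illustrative placeholders.
from appium import webdriver
from appium.options.android import UiAutomator2Options
from appium.webdriver.common.appiumby import AppiumBy

options = UiAutomator2Options()
options.platform_name = "Android"
options.device_name = "Pixel_7_Emulator"       # hypothetical device
options.app = "/path/to/app-debug.apk"         # hypothetical build artifact

driver = webdriver.Remote("http://localhost:4723", options=options)
try:
    # Every interaction is written by hand: find an element, act, assert.
    driver.find_element(AppiumBy.ACCESSIBILITY_ID, "login_button").click()
    driver.find_element(AppiumBy.ID, "com.example:id/username").send_keys("test-user")
    driver.find_element(AppiumBy.ID, "com.example:id/password").send_keys("secret")
    driver.find_element(AppiumBy.ACCESSIBILITY_ID, "submit").click()
    assert driver.find_element(AppiumBy.ACCESSIBILITY_ID, "home_screen").is_displayed()
finally:
    driver.quit()
```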
BP It's really interesting. I had mentioned before the show that Ryan and I are working on a piece; it's something along the lines of, “Self-healing code is the future of software development.” The idea is that there have always been things you might add to a CI/CD pipeline or a GitHub Action, like you said, scripts that were useful for this. But now, with the entrance of really powerful AI that is good at reading and understanding code, creating its own, and looking at a specific codebase to understand the dependencies, the packages, even the style of the code, it seems like what you're suggesting, and what we've been having really interesting conversations with people about, is that as you're writing the code, or after you've written it but before it's in prod, you're able to run things that say, “This is a good test. This looks like an area that needs to be shaped up.” Basically adding pull requests that you can then review, which would help automate some of what you were describing as quite burdensome, that overhead of several steps before you can even get to the testing.
SH It actually goes even further than that. We have a product that we announced a month back. I can now look at your functional spec or a story in your Confluence page, generate the test cases right out of it, and map them to what I have tested before. It's amazing what we can accomplish today. Obviously it's not going to fully replace people, but just as you mentioned, it's going to augment them. So imagine a world where, if I'm a product manager or a program manager and I write my story, the system is smart enough, because it has already tested your product or your software, to bring those things together. We have actually been able to demonstrate that, and it has resonated quite well. One of the key capabilities OpenAI brings is that I can take any software, because at the end of the day it's a state diagram, build a model of it, and start mapping it not only to the source code you mentioned, “Hey, for a given product change, tell me what's impacted,” but also to a new feature, auto-create the tests, what we call test case generation, and execute them seamlessly.
BP Right. Well, if you're not already saying ‘generative’, you have to add it so that your company will be highly valued.
SH That's a good one. We were in this space before the word ‘generative’ became as common as it is now.
RD It's interesting, generating the test case off of the functional specs. Obviously somebody can write janky code that's supposed to do the thing but doesn't do the thing, so if you're just looking at the code, you're like, “Well, it did the thing badly, exactly like the code says it should.” Do you think there will come a time when people just write functional specs and the AI does the rest?
SH I think we are going down that path, but we're still far from it. And what are the challenges? There are three types. One you already highlighted: if somebody writes bad code, you can't figure it out, and the functional spec has the same problem; if somebody writes a bad one, you'll get a bad outcome. Another thing we see, especially in legacy or older products where people are going through a digital transformation, is that a lot of things are not documented. So you get partial information, and you may cause more damage than you solve. And the third problem, especially in line-of-business applications, is that there's a lot of dependency on different types of test data that determine how things move. But regardless, I think we are going down that path, and I do believe that in the next couple of years there will be a lot more advancement than we can imagine now, because right now people are building all these things on top of OpenAI.
BP Yeah. There was the demo; they haven't released the multimodal model yet, but I draw a sketch of my website on a napkin, I take a picture, and it spits out the HTML. That's a very toy example, but like you said, somebody way more advanced who's writing a story or a functional outline might be able to get something that's a bit more tangible.
SH Yeah. I think the biggest challenge we'll have, and we are already seeing it, is how do you get to the baseline? What I mean by baseline is, suppose you're testing a mobile application. You have to know the full story, all the artifacts of the product: menus, items, transactions, all those things. Once you get to that baseline and build your model, then you can start mapping it to more and more other aspects, which helps us generate the test cases and execute them. By the way, related to that, today we can take any contextual text, simply saying, “Log in using Office 365, do this,” and convert that into Appium code out of the box through OpenAI. So the building blocks are already there: generating test cases is there, the ability to generate code is there. It's the ability to stitch these together into a seamless experience that will transform the QA experience.
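To give a sense of what that “plain text to Appium code” step might look like in the simplest case, here is a minimal sketch using the OpenAI chat completions API. This is not Sofy's implementation; the prompt, model name, and overall wiring are illustrative assumptions.

```python
# Illustrative sketch: turn a natural-language test step into Appium (Python) code.
# Not Sofy's implementation; the model name and prompt are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

step = "Log in using Office 365, then open the Settings screen."

resp = client.chat.completions.create(
    model="gpt-4o",  # hypothetical model choice
    messages=[
        {"role": "system",
         "content": "You generate Appium Python client code for mobile UI test steps. "
                    "Return only code, using AppiumBy locators."},
        {"role": "user", "content": step},
    ],
)

generated_test_code = resp.choices[0].message.content
print(generated_test_code)  # review before executing against a real device
```

In practice the generated code would still need to be validated against a real model of the app (known screens and elements) before it is trusted to run, which is the “baseline” point made above.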
BP Yeah, I had an experience myself, as a pretty unskilled coder, trying to do these things, and exactly like you said, the hard part was stitching together the front end code with the back end database and the environment where I was running the Node.js application. What I started to realize was that it's the cloud providers that bring all this together, so that as you're talking to the service, you're saying, like you said, “I need a login page here. I need a little bit of OAuth here. I need a database of this size. I need it to be able to scale depending on the amount of demand. And this is how I'd like to go about my front end design.” And if those things are natively integrated and it's clear to the AI how to connect those pieces, it becomes incredibly powerful, because that's where a lot of the friction is right now.
SH Exactly, and that's exactly what our approach has been. And Ryan, to your earlier question about how I see scriptless evolving, that is the next generation of it. Scriptless is very focused on creating tests; it's not about generating, executing, and bringing the reporting together. Another thing we have seen which is very powerful is the ability to understand your quality report. What do I mean by that? Today, if you have a mobile application, you have to look at visual quality separately, performance data separately, and testing metrics separately. Now, with OpenAI, you can just ask the question, “Hey, I'm releasing an application into this particular country. What do I need to do? Hey, which step is taking the longest and what are the preceding steps?” So you can start analyzing these results in a unique way that's consumable by anyone on the team instead of writing custom reports.
RD We did a couple of pieces on testing on the blog a while back talking about how the best tests are deterministic, that they have a functional core. On a scriptless level, how do you handle the things that aren't deterministic, say data from a database, or the time, or things that will vary every run?
SH Yeah. That's one of the challenges. We see two kinds of non-deterministic tests. One is when you have dynamic content, and the other is when you have a hybrid or dynamic app that doesn't expose all of its controls. So one of the things that we, and other companies, have built is different modules for this. You can analyze the image, for example for a one-time password flow. You can take a three-pronged approach: analyze the image and understand its constructs, apply certain data sets that are pre-configured, or generate the data on the fly, and then bring those together to handle the non-determinism. So I totally agree, it's a challenge. The more apps become dynamic, the more challenging it becomes, but there are a few techniques people have been using to address it.
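As one concrete illustration of the “analyze the image” prong, a test can screenshot the screen, OCR the one-time password, and type it back in. This is a generic sketch, not Sofy's module; it assumes an existing Appium Python client session plus pytesseract, and the element IDs are hypothetical.

```python
# Generic sketch: handle a non-deterministic one-time password by OCR'ing the screen.
# Assumes an existing Appium `driver` session; element IDs are hypothetical.
import re

import pytesseract
from PIL import Image
from appium.webdriver.common.appiumby import AppiumBy

def read_otp_from_screen(driver, screenshot_path="otp_screen.png"):
    """Screenshot the current screen and pull a 6-digit code out of it via OCR."""
    driver.get_screenshot_as_file(screenshot_path)
    text = pytesseract.image_to_string(Image.open(screenshot_path))
    match = re.search(r"\b\d{6}\b", text)
    if not match:
        raise AssertionError("No 6-digit OTP found on screen")
    return match.group(0)

def enter_otp(driver):
    otp = read_otp_from_screen(driver)  # value differs on every run
    driver.find_element(AppiumBy.ID, "com.example:id/otp_field").send_keys(otp)
    driver.find_element(AppiumBy.ACCESSIBILITY_ID, "verify").click()
```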
RD Do you recommend or do you use mock data or fake data at all?
SH We don't. We actually have our own patents around this. What we've seen is that it depends on what kind of testing you're doing: for unit testing, mocks are reasonable. But since we address more of a QA persona, we believe mocks can give you a lot of, “Yay, things look good,” when they actually aren't. So we have not followed the mock approach.
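To make that “mocks can look green while reality is broken” point concrete, here is a tiny generic sketch (not Sofy code): the mocked test passes regardless of what the real backend does, which is exactly the gap that end-to-end runs against live systems are meant to close. The endpoint URL and helper function are hypothetical.

```python
# Generic sketch of the mock-vs-live gap; endpoint and helper are hypothetical.
from unittest.mock import MagicMock, patch

import requests

API_URL = "https://api.example.com/v1/profile"  # hypothetical endpoint

def fetch_profile_name(user_id: str) -> str:
    resp = requests.get(f"{API_URL}/{user_id}", timeout=10)
    resp.raise_for_status()
    return resp.json()["name"]

def test_fetch_profile_name_mocked():
    # Passes even if the real API is down, renamed the field, or changed auth:
    # the mock only proves the code handles the response we invented.
    fake = MagicMock()
    fake.json.return_value = {"name": "Ada"}
    with patch("requests.get", return_value=fake):
        assert fetch_profile_name("123") == "Ada"
```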
BP I noticed that you seem to focus on mobile. When I go to your website, that seems to be what you're calling out the most, along with a lot of big-name clients talking about the apps you've helped them with. Is there a reason you're particularly focused on the mobile space? And after that, maybe talk to us a little bit about your tech stack. You mentioned you got started a few years before the recent revolution, but it sounds like you've also learned how to tap into OpenAI and things like that as of late. So why mobile, and what's the tech stack behind all this?
SH Yeah, so when we started the company we had to focus on certain things, and we realized that mobile was the bigger problem. Every company, especially B2C and others, has to have a mobile presence, and the last few reports I had seen showed that it was growing much faster; people spend more and more time on mobile, both in the mobile browser and in mobile apps. So we support both the mobile browser and mobile apps, but we also noticed that one of the pain points people have on mobile is that in order to test it, you have to call certain APIs, as Ryan mentioned, to retrieve data and perform actions. So we actually cover mobile browser, mobile apps, and APIs as part of one suite. We felt that space was growing fast and was a pain point we wanted to address, so that was the intent.
BP I guess now I would love to know a little bit about your tech stack. What did you decide to work with and how has that evolved over the few years you've been in business, especially as kind of a seismic change has occurred with the latest wave of generative AI stuff?
SH Yeah, so we have three core components in the product: the presentation layer, the back end, and our own machine learning models, so those are the three broader categories. For the back end we run on Azure, and the reason we use Azure is that we needed redundancy across multiple environments, and also we work with enterprises, so the compliance and SSO support make things much easier and simpler. We were also part of the Microsoft startup ecosystem. Our front end is Node.js and Angular, along with other pieces we use. And for machine learning, we have our own models that we've created. We use around seven or eight different algorithms, from OCR to models that predict what elements are on the screen. So it's a combination of things. One other thing we do is let you access any device, Android or iOS, in under 10 seconds from anywhere in the world, as if you were using it physically. That requires a lot of optimization, so we have a whole network infrastructure layer that drives that efficiency.
RD Yeah, I remember a friend of mine was running his own mobile game company and he would go to conferences just to pick up test devices.
SH Yep.
RD Yeah. So do you allow testing on actual devices, or virtually?
SH We actually have real devices in our data center in Seattle. And one of the things we do very uniquely, which is what we were talking about with how we train our model, is that when you go to any device and start playing with the application, under the hood we are training on it. Where you're clicking, what you're doing, all of that feeds into our machine learning system, and that's how we're able to generate test cases with high accuracy and run them faster.
BP Interesting. So you mentioned that you worked at Microsoft and that this startup came up through the Microsoft ecosystem. Tell us a little bit: were there things you learned there, principles for the way you organize teams or think about software engineering, or something that came out of having a sort of pseudo-partnership with them going through their accelerator, that have played a big role in what Sofy is?
SH I joined Microsoft right out of college, so I think I should write a book on failures, literally, because a couple of foundational things came out of them. Number one is the realization that a founder doesn't know what they don't know. That's a very simple phrase, but it's very significant. As an example, I'm a solo founder, so when I started I had no idea about sales and marketing. Knowing what you don't know is very important so that you surround yourself with people who complement your skillset; as one person, one founder, you can't bring everything to the table. And unfortunately, that was the case in the early days, to a certain extent. So that's one foundational principle. Two is that, as I spent more time at Microsoft, my understanding of the ecosystem was through Microsoft's lens, and the industry is far bigger than just Microsoft. That was a big learning; it literally took me a year to really figure out where things stood. And in the 2015/16/17 timeframe, Microsoft wasn't in the mobile space, so I was doing something completely different. So learning the industry's problems was a really big lesson for me. And the third one is that it's all about people, and that's one thing Microsoft really helped me with. Microsoft was very focused on people development and on the idea that you're only as successful as your team. Micro or macro market conditions aside, it's the strength of your people that makes or breaks the company. So those are the three principles I now try to use, going from managing several hundred people at Microsoft to growing a team from one, end to end, and really rolling up my sleeves and doing it. It's a different mindset, to say the least.
BP Cool.
[music plays]
BP All right, everybody. It is that time of the show. Let's shout out a Stack Overflow user who came onto the site, shared a little bit of knowledge, and helped to rescue a question from the dustbin of history. A Lifeboat Badge was awarded to Todd A. Jacobs for coming to the rescue of a question and providing a great answer: “How can I check whether a string is an integer in Ruby?” Todd has got you covered and has helped over 17,000 people, so we appreciate it, Todd. I am Ben Popper, the Director of Content here at Stack Overflow. Find me on Twitter @BenPopper. Email us with questions or suggestions at podcast@stackoverflow.com. And if you like the show, leave us a rating and a review. It really helps.
RD I'm Ryan Donovan. I edit the blog here at Stack Overflow. You can find the blog at stackoverflow.blog. If you want to reach out to me, you can find me on Twitter @RThorDonovan.
SH I'm Syed Hamid, founder and CEO of Sofy.ai, a no-code test automation platform for mobile apps. Try us out at sofy.ai and get your mobile app tested by AI.
BP Very cool. All right, everybody. Thanks for listening and we will talk to you soon.
[outro music plays]