Serial entrepreneur Varun Ganapathi joins the home team for a conversation about the intersection of physics, machine learning, and AI. He offers some recommendations for developers looking to get started in the ML/AI space and shares his own path from academia to entrepreneurship. Plus: How an early lack of Nintendo motivated Varun to learn to program.
Varun is the cofounder and CTO of AKASA, which develops purpose-built AI and automation solutions for the healthcare industry.
Building a physics simulator for a robot helicopter as a student at Stanford helped Varun connect his interests in physics, machine learning, and AI. Check out that project here. His instructor? Andrew Ng.
Along with Ng, Varun was lucky to connect with some brilliant AI folks during his time at Stanford, like Jeffrey Dean, Head of Google AI; Daphne Koller, cofounder of Coursera; and Sebastian Thrun, cofounder of Udacity.
When Varun earned his PhD in computer science and AI, Koller and Thrun served as his advisors. You can read their work here.
In 2017, Udacity acquired Varun’s startup, CloudLabs, the company behind Terminal.
Connect with Varun on LinkedIn.
Today’s Lifeboat badge goes to user John Woo for their answer to the question Update the row that has the current highest (maximum) value of one field.
[intro music plays]
Ben Popper Gatsby is the fastest front end for the headless web. If your goal is building highly performant, content rich websites, you need to build with Gatsby. Go to gatsby.dev/stackoverflow to launch your first Gatsby site in minutes and experience the speed. That's gatsby.dev/stackoverflow. Head on over, use that link, let them know we sent you, and help out the show.
BP Hello, everybody. Welcome back to the Stack Overflow Podcast, a place to talk all things software and technology. I am Ben Popper, Director of Content here at Stack Overflow, joined as I often am by my wonderful colleagues and co-hosts, Cassidy Williams and Matt Kiernander. Hey, y'all.
Cassidy Williams Hello.
Matt Kiernander Hello.
BP So we're going to have a fun conversation today. We're chatting with a serial entrepreneur, Varun Ganapathi. He is the CTO and co-founder of a company called AKASA. He has created and sold a few other companies and he's deep into the world of machine learning, which is involved in so much of what goes on in the software industry these days. His particular focus is trying to use it to fix healthcare, which as an American the system has felt very broken to me for many years so I'm interested to hear what he's trying to do. Varun, welcome to the show.
Varun Ganapathi Thanks for having me.
BP So we usually ask people to start off by just telling us a little bit about how'd you get into the world of software and technology?
BP Can I find those on the internet archive? Can we check out some of your work? I'll put them in the show notes.
CW That makes a significant difference.
VG I always forget to say that. It's a radio controlled helicopter, like a drone basically, but like a very big drone, like one and a half meters long.
BP Yeah, it sounds a lot more exciting if there was a pilot in there and you were all standing right next to a big helicopter. Got it, a drone.
CW That would be much more risky.
VG Yes. We did still stay a very long distance away from it though because anything could happen. But that was back in 2003 and that was before drones were I think a thing. I don't know. I never had seen drones commonly before that.
BP Yeah, way before. 2013 is when DJI released their first little Phantom drone that was super easy to fly and that's when they started taking off. So you were 10 years ahead of the curve on consumer drones just exploding into something you might put in your back pocket.
VG Yeah, it was definitely cutting edge at the time I thought. And what was hard about it actually was that because it was a helicopter with a single rotor, it's sort of a lot harder to actually fly than a four blade quadcopter which the drones are based on for various reasons. Yeah, that was a really awesome project and then I got extremely excited about machine learning because it was really fun. I was also very happy that I could use my physics knowledge in that project because when we were building the physics simulator, I actually could blend ML and physics because the model was physically derived, but the parameters of the model were trained by machine learning. So it actually would use real data of the helicopter flying and it adjusted those parameters to make the simulated helicopter behave like the real one so that when we trained on it it would actually be realistic. And so that was the bridge moment. That's when I got very excited about machine learning. I was walking down the street on University Ave and I saw a car that said ‘Googler’ on the license plate. This guy came out of it and I just walked up to him and I said, “I really want to work at Google. I'm very excited about machine learning. Can I get a job at Google?” And it turned out to be Jeff Dean, and he said, “Sure, just email me at email@example.com.” And I had no idea who he was or anything, I think this was back in the summer of 2003. And so I just emailed him my resume and then I got a job at Google and I worked there the following school year. That next quarter I basically worked at Google and took classes at Stanford at the same time, but that was a really awesome project where I got way more excited about machine learning and in computer science more generally and that and the helicopter project and taking a class from Daphne Koller at Stanford is what started my complete switch into [computer science]. 
At that time I took this really great class called Probabilistic Graphical Models that she taught and I was taking it at the same time as I was taking statistical mechanics. And it was just so cool how the ideas from physics about entropy manifest as entropy in the information theory sense in machine learning. And so for me it was a mind blowing quarter where everything sort of came together and I was like, “Oh, these models we're using in physics are actually what they use to create Markov random fields in AI.” And so everything came together and I was like, “Okay, cool. I'm going to do a PhD in artificial intelligence,” and that was sort of how I went from physics to AI in a very long story.
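The blend of physics and machine learning Varun describes, a physically derived simulator whose parameters are fit to real data, can be sketched in miniature. Everything below is invented for illustration: the helicopter is reduced to an object falling with linear drag, the unknown drag coefficient `k` stands in for the model parameters, and a crude grid search stands in for the actual learning procedure.

```python
def simulate(k, v0=0.0, steps=50, dt=0.1, g=9.8):
    """Physics-derived model: velocity under gravity with linear drag, dv/dt = g - k*v."""
    v, trace = v0, []
    for _ in range(steps):
        v += (g - k * v) * dt
        trace.append(v)
    return trace

def fit_drag(observed, candidates):
    """Pick the drag coefficient whose simulated trajectory best matches
    the observed one (sum of squared errors) -- the 'training' step."""
    def error(k):
        return sum((s - o) ** 2 for s, o in zip(simulate(k), observed))
    return min(candidates, key=error)

observed = simulate(0.5)                  # pretend this came from real flight logs
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]
best_k = fit_drag(observed, candidates)   # recovers the coefficient that generated the data
```

The structure of the model comes from physics; only its free parameter is tuned from data, so the tuned simulator behaves like the "real" system it was fit to.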
CW Yeah. You stumbled upon some of the biggest names in AI. For those who don't know those names, Andrew Ng, you can find a bunch of free courses on artificial intelligence because he made them. He's very, very big in the AI space. Jeff Dean, I think he runs Google AI– big deal. And then Daphne Koller, I think she works in the Stanford AI lab and she founded Coursera.
VG Coursera, and then also now Insitro.
CW Anyway, lots of very big names to stumble upon and learn from.
VG Yeah, one more name I would say was also very influential: during that helicopter project we went to DC to demo it for a grant and Sebastian Thrun was there as well and so I met him for the first time, and he and Daphne ended up being my PhD advisors at Stanford. So Sebastian Thrun from the DARPA Grand Challenge, and he started Udacity. Yeah, I got extremely lucky to be at the right place at the right time and just met all these awesome people and learned from them.
BP That's incredible. We just did this amazing collision of things where physics and machine learning came together, you happened to be in Stanford and Silicon Valley when a lot of these big names were figuring out what they wanted to work on, and we were getting to that tipping point where ideas about AI that had been dormant for a long time were beginning to come to life again thanks to the amount of processing power. You went from there to create a couple of companies. Can we touch on those briefly, the origin of each and why you decided to sell them and move on? And then we can transition to what you're doing these days.
VG Okay. So I worked at Google on the Google Print Project where they were scanning all the world's books and I used machine learning to help automate that process of reading the pages, extracting tables of contents, things like that. So that got me very interested in computer vision. When I went back to Stanford to do my PhD I focused a lot on machine learning theory and computer vision. And my PhD thesis with Daphne and Sebastian was on real time motion capture from depth cameras. So what I wanted to build was a camera, and this is a funny story, I was like, “I want to learn how to dance, so maybe I could make a computer program that would record how people do various moves and then teach other people how to do that because they could watch you.” Sort of like Dance Dance Revolution on the Kinect way before, like in 2008. And so my PhD thesis was how to build that. It was funny, when I started the project, these depth sensors cost $5,000 and so people have been trying to do things with computer vision with RGB cameras for a long time and now it's finally working, but back then people just tried for a long time and there were some limitations. You just couldn't break through certain boundaries. And so there were these cameras that basically will measure not only color but distance to an object using time of flight of light. And so I used that sensor and it was very noisy data, very low resolution, to try to build an algorithm that would in real time be able to detect a person's motion in front of a camera and use it to do other things. And so I started working on that and as I mentioned, this was in 2009, 2008, what happened was that essentially, I thought to myself, when I build these algorithms eventually these sensors will become cheap and then these algorithms will be really useful. I did not anticipate how quickly that would occur. 
Literally a year later the sensor went from being $5,000 to $50, like bill of materials, and then Microsoft announced they were going to use it for the Xbox Kinect to build a system that you could interact with the computer with. We decided to found a company based on the technology, to figure out how to commercialize it, and this was before anything had come out yet, and so we started a company called Numovis and we started to go raise money. We went to maybe like two VCs, had a Series A offer, and we also had a presentation at Google where we wanted to have them be a partner. We thought they could use this technology to power some product, and we did our demo, and then at the end of the conversation they just said, “Would you like to be part of Google?” And so it was me and three other people from the lab. And to answer your question, it was a tough decision. I literally was still a PhD student at Stanford and I had just founded the company. I think I had just left or was on the verge of leaving. The other founders just said, “Let's just do this. It's extremely low risk, a very big return for a very small amount of time spent.” And PhD students make $35,000 a year.
CW In the Bay Area.
VG Yeah, so it was quite a big jump in standard of living from that and so we decided to take the offer. I was always a little conflicted to be honest, but the other three people really wanted to do it so we did it. And so that's how that company ended up getting sold. I also started another mini company during my PhD. I wouldn't call it a company, it was really more of a hobby. I was really into computational photography. So a friend and I wrote an iPhone app called Pro HDR. It was one of the first, if not the first HDR app for the iPhone, or maybe it was within a month of the first app to come out, and we basically thought to ourselves, “How do we take computational photography techniques and just put them onto the phone,” because now phones, cameras, and the internet had all combined into one. And so we thought we could take these algorithms that people normally have to run offline after they take the pictures and just put them on the phone. And so we built an algorithm that would take two pictures or three pictures very rapidly and then blend them together into one. It's called High Dynamic Range photography and this was before it existed at all. Now it's a feature on every iPhone and I think every Android even, but back then, this was December 2009, early 2010, it had not existed yet.
BP So you take three pictures and the one where my eyes are open is the one you use along with the one where my hand looks good. Is that what you're talking about?
VG Sort of. You're exactly right. And you could do that with the technology, but the general common use case is, a lot of situations will have something that's very bright and then something that's a lot darker. So for instance, if you're standing in front of a sunset, the background will be a lot brighter than the foreground. And what will happen normally is if you take a normal picture, you'll either say, “I'm going to expose the picture such that the foreground or the darker subjects are well lit,” but then you're going to have completely whited-out background. Essentially everything that's bright will have completely no detail, it'll just be white. Or you could do the reverse. The things that are bright will be well exposed, but the things that are dark, you just can't see them at all. And so what it does is it just takes both of those pictures, the dark one, and the light one, aligns them very rapidly together because you're doing this with your hand moving and that's the hard part, your hands are moving as you take the picture. It seems like not a big deal, but the slight vibration of your hand moving will cause a lot of ghosting if you just take the two pictures and just directly blend them. So you align the two pictures very rapidly, and then you do what's called tone mapping. You basically produce one pixel for every pixel in the image where it chooses the best one essentially. It chooses the better exposed source in order to produce the result. And so it lets you actually have one picture that has everything well exposed. It looks a little artificial because it is, but also a lot of people like the effect and it looks cool. And so that's basically what HDR is and that's what the app did back then.
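The per-pixel blending step Varun describes can be sketched roughly like this. The alignment step he calls the hard part is skipped entirely, and the grayscale pixel values and the well-exposedness weighting are invented for illustration; real tone mapping operators are considerably more sophisticated.

```python
def well_exposedness(value, mid=127.5):
    """Weight a pixel (0-255) by closeness to mid-gray: 1.0 at mid, 0 at the extremes."""
    return 1.0 - abs(value - mid) / mid

def fuse(dark_img, bright_img):
    """Blend two aligned exposures pixel by pixel, favoring the better-exposed source."""
    fused = []
    for d, b in zip(dark_img, bright_img):
        wd, wb = well_exposedness(d), well_exposedness(b)
        total = wd + wb
        if total == 0:                       # both pixels fully black/white: average
            fused.append((d + b) / 2)
        else:
            fused.append((wd * d + wb * b) / total)
    return fused

# A blown-out sky pixel (255) in the bright shot paired with a usable one (180)
# in the dark shot: the fused result keeps the usable pixel.
result = fuse([180, 60], [255, 140])
```

The first output pixel comes entirely from the dark exposure, because the fully saturated bright pixel gets zero weight; the second is a blend leaning toward the better-exposed bright-frame pixel.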
MK Basically every time you've tried to take a nice photo of yourself against a sunset, either you look really nice and well exposed with a pure white sky behind you, or the sky looks right and you're a silhouette. That's one of the problems this solves, correct?
VG Yes, exactly. And any other situation like that. Windows, if you're in a house and you're taking a real estate photograph and there's a window open and it's daytime you'll have exactly the same problem. Anytime where there are objects that are very different brightnesses. So we started that company, that was all before I went to Google. I was a research scientist at Google for a couple of years and worked on AI there. And I learned a lot from the people there as well about computer vision and a lot of things and how to do large scale distributed computing. So I decided to start another company. This time my goal was really to build an extremely easy way to get started in machine learning or computer science in the cloud. So the idea is that instead of having to install a bunch of stuff on your computer and get it all working, which can often sour the experience, it would be a website where you could literally create a snapshot of a working environment and share it with someone and then they could just start it up and it would just be running in your browser. The website was called terminal.com, which I got from my dad's old company, Terminal Exchange Systems, so that thread carried through. The funny thing is it was a terminal in some sense. You were just getting access to a machine in the cloud. And so it was just using the browser as a terminal, so that's why it was called that. What I ended up focusing on was education. One thing I learned is that it was a decent market but it was not a huge market. But my purpose was really that I was just excited about teaching people. My goal was just to make it easier to learn how to program. And so that company ended up being used by a lot of different education companies. Stanford used it for their deep learning class because you could get a GPU in the cloud really easily. It was sort of like Colab, if you've seen the Google product Colab.research.google.com.
It was like that but before that, and so I built that company and at some point during it I realized I was spending a lot of time not doing AI, which is the thing I was actually excited about. I was helping other people do AI, but I myself was not doing any AI or machine learning. I was creating the infrastructure that let other people do it well. We created our own notion of containers where you could create snapshots of working environments and so on. And ultimately I decided, let me just sell this company and I can go focus on machine learning, which is the thing I was actually passionate about. And so while I was running that company I'd seen all these advancements in deep learning, and a lot of things that were hard to do in my PhD were now becoming a lot easier to do. So when I was a PhD student, implementing gradient descent for a complex model was not an easy feat. You had to write the gradient yourself by hand and actually test it. You had to implement all of the things that are now completely automated. So in a lot of these modern deep learning packages like PyTorch and TensorFlow, one of the key selling points is auto-differentiation. You can just feed it the objective function, the thing you want to optimize, and it will calculate the slope and give that to you and then you can use that to optimize your model. Back in my PhD you had to write it by hand, it was very difficult and it took a long time. But now it was possible to do all of these things a lot quicker. So basically I started thinking to myself, I want to do something with AI that will make the world better, and I thought healthcare is something that really could use a lot of improvement. And so I sold the company to Udacity and I started gestating this new idea of how I can make healthcare better. A lot of companies had started in radiology and clinical stuff and I thought to myself that no one is addressing this other problem.
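The "write the gradient yourself by hand" workflow Varun contrasts with modern auto-differentiation can be sketched for a toy least-squares model. The model, data, learning rate, and iteration count below are all made up for illustration; in PyTorch or TensorFlow the `grad` function would be produced automatically.

```python
# Tiny dataset generated by the "true" relationship y = 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

def loss(w):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad(w):
    """Hand-derived derivative: d/dw mean((w*x - y)^2) = mean(2 * x * (w*x - y))."""
    return sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)

# Gradient descent: repeatedly step downhill along the hand-written gradient.
w = 0.0
for _ in range(200):
    w -= 0.05 * grad(w)

# w converges toward the true value 2.0
```

For a one-parameter model this is easy; Varun's point is that deriving and testing `grad` by hand for a complex model was the slow, error-prone part that autodiff eliminated.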
You hear about all these doctors with burnout and all of this administrative work that they have to do, all of this paperwork, and I thought back to my Google Print days and I was like, “Could we automate all of that paperwork? Could we have AI just take care of it so no one has to deal with it and no patient gets a surprise bill because there's some error that occurs?” And that was the idea, that's the company we decided to start with AKASA, using machine learning and AI to automate all of the boring stuff in healthcare so doctors and caregivers can focus on delivering care rather than paperwork. So that's sort of how I got from there to here.
CW What a journey you've had.
MK Yeah that’s a wild story.
CW That's some really cool products and companies and stuff that you've been able to work on and then just cool people to meet too. That's amazing.
MK Yeah. I was going to ask, who's going to play you in the movie? There are a lot of web developers out there who are curious about AI and machine learning but to them, it's I guess kind of like blockchain. It's a technology that they know and are somewhat familiar with but haven't actually gotten the chance to understand what it is to actually develop with AI and machine learning. So I'm very curious, can you describe for other developers out there who may not be familiar with what it takes to develop with AI or machine learning, what that kind of process is and what you'd recommend to get up and running within that space?
VG That's a great question. The first decision to be made is, do you want to build your own models and actually get really deep into it, or do you want to use AI and machine learning as a technology that you apply to whatever your domain is? And based on that decision, I would recommend different approaches. So if it's an application, which is reasonable, there's a lot of things that already are possible and we can just take advantage of them, so from that fork of, I want to use it as an application, first think about what are all the things you've seen people do with AI. So there's a lot of computer vision stuff. We can now detect people extremely easily. Optical character recognition is very well solved. We can detect objects really easily and so on. And then there's also audio. Voice recognition is now a lot better than it used to be and works really well. Basically pick the domain and say, “These are all the things.” Those are all the detection and recognition applications. And then there's generation, like you've seen things like DALL-E where they can generate images automatically from typing in sentences and so on. So I would say to look at the applications you've seen people do something with and if you want to make your own app, think about what cool app could I build that leverages those technologies. And then probably find an API that you can just call that turns it into a service or into an easy thing that you can embed into the browser that you can try out. It also depends if you're doing mobile or web. If it's mobile there's also a lot of SDKs that Apple has and Google has and so on for running it locally or in the browser or if it's as a service. And I think that's kind of how I would approach it. 
If you want to learn how to actually do the models, which of course I recommend because a lot of the time there's a model that doesn't quite do exactly what you want, you can just take it and adapt it to make it solve the problem you're actually interested in. I would recommend taking an online course in deep learning actually, and a lot of them make it extremely easy now. At Udacity you can just get an instance in the cloud as part of the course where everything will be set up for you and you can just start learning. Coursera also has courses.
BP What a world, man. Take an online course and just get some deep learning instances in the cloud spun up for nothing. That's amazing.
VG Exactly. It's amazing. And I would say learn how the models work if you're serious about it. I think people underestimate this part, but I think just remember multivariable calculus if you ever took it.
CW Just remember that good old multivariable calculus.
VG Okay, you only need the first semester of it. You don't need to know the integral part, just the gradients. Be able to understand what is a gradient in multiple dimensions, what does that mean? If you can really internalize that, everything else follows almost. Everything else is just applying that to all of these models that exist. It's a long road, it's not easy, but I think it's a very rewarding road to go down because this technology is only going to become more and more prevalent and so learning it now is probably a good idea I would say.
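The "gradient in multiple dimensions" intuition Varun points to can be made concrete with a finite-difference approximation: the gradient is just the vector of partial derivatives, each estimated by nudging one coordinate at a time. The function, point, and step size below are arbitrary examples.

```python
def numerical_gradient(f, point, h=1e-6):
    """Approximate the gradient of f: R^n -> R at `point` by central differences,
    one partial derivative per coordinate."""
    grad = []
    for i in range(len(point)):
        up, down = list(point), list(point)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

# f(x, y) = x^2 + 3y has gradient (2x, 3); at the point (1, 5) that is (2, 3).
g = numerical_gradient(lambda p: p[0] ** 2 + 3 * p[1], [1.0, 5.0])
```

Internalizing that this vector always points in the direction of steepest increase is the piece of first-semester multivariable calculus that everything else in deep learning builds on.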
BP Yeah, it's interesting the way you say to take one of these, especially ones that are out there with an easy tutorial and some of these established tools and then apply it to your scenario. My favorite one that I ever saw, and this was a while back, it’d probably be easier to do now, was a guy who went and visited his mother and father or maybe his grandparents on a rural farm in Japan. And they had a cucumber farm and he trained this AI model to sort the cucumbers as they rolled past and push the little ones over here and push the big ones over there. And it was just very simple but for them it was really amazing that this thing that they did every day where they manually sorted the cucumbers could be automated, so I thought that was kind of a fun one. And like you said, figure out where it fits in your life and use that as a practical way to learn about it.
BP All right, everybody. It is that time of the show. I'm going to shout out the winner of a lifeboat badge, someone who came on Stack Overflow and saved a question with a negative score, gave it a great answer, and now it has a positive score. Awarded 15 hours ago to John Woo. Hopefully the same John Woo who directed Face/Off and other amazing action films.
CW Yeah, I bet it's the same one.
BP It's the same guy, right? He's an amazing programmer. “Update the row that has the current highest (maximum) value of one field.” How do you do it? John Woo has the answer. All right, everybody. Thank you for listening. I am Ben Popper, Director of Content here at Stack Overflow. If you want to send us a question or suggestion, hit us up, it's firstname.lastname@example.org. And if you like the show, why don't you leave us a rating and a review. We're always listening. I got an email just the other day from Ellen. She wanted to say she really enjoyed the episode about flipping the interview questions. Cassidy, you were on that one, right? Maybe Matt too.
CW I think I was.
BP Flipping the interview questions and interviewing the interview. She liked that one. Oh, and no cold open. So another vote for no cold opens, they're not coming back. It's three to nothing. No cold opens.
CW That being said, I've been Cassidy Williams. You can find me @Cassidoo on most things. I do developer experience at Remote and OSS Capital.
MK I'm Matt Kiernander. I'm a Developer Advocate here at Stack Overflow. You can find me online, YouTube and Twitter @MattKander.
VG I'm Varun Ganapathi, CTO and co-founder at AKASA. We are hiring for all sorts of engineering roles. You may have heard of tech freezes. We are not freezing our hiring. We are really looking for people so if you're looking for a job, please head to akasa.com/jobs. And you can find me on LinkedIn. That's probably the primary place that I am. So if you want to message me, feel free to message me there or connect. And thanks for having me on the show.
BP Yeah, thanks for coming on. It was a fascinating journey you had. All right, everybody. Thank you so much for listening as always, and we will talk to you soon.
MK Thanks, bye!
[outro music plays]