The Stack Overflow Podcast

Exploring the cutting edge of privacy and encryption with Very Good Security

Episode Summary

We chat with Mahmoud Abdelkader, CEO and co-founder of Very Good Security. VGS is a data security platform that changes the way sensitive data is held by eliminating the need for customers to hold their own data. Abdelkader is also previously known for being CTO and co-founder of Balanced Payments, which was a payments platform that enabled peer-to-peer marketplace businesses to thrive. It was acquired by Stripe in 2015.

Episode Notes

We chat discrete mathematics, differential privacy, and homomorphic encryption. But don't worry, we also break it down in layman's terms.

Interested in working in security? Mahmoud will personally extend an offer to anyone who solves this puzzle.

Puzzles not your thing? You can still learn more about Very Good Security and its open positions here.

Mahmoud is on Twitter here.

Episode Transcription

Mahmoud Abdelkader VGS is a great idea to shift the data that's sensitive off of your plate so that when you do have an incident, it's not a catastrophic, in-the-news incident. It's just more like, hey, someone breached defenses, the perimeter's breached, but they didn't breach the data.

[intro music]

Ben Popper CockroachDB is the only bug you'll ever love, because it's the only one you don't have to worry about. As a low-touch SQL database that automatically handles scale, operations, and uptime, CockroachDB lets you focus on developing. Get your free cluster and a free t-shirt at cockroachlabs.com/StackOverflow.

BP Hello, everybody. Welcome to the Stack Overflow Podcast. I am Ben Popper, Director of Content here at Stack Overflow. And I have a great guest today. He is coming to us from a cybersecurity firm. And I just feel like I see this stuff all the time in the news. It's constant. It's a wave of sort of ransomware and state-level this, so we wanted to bring in an expert and chat, learn a little about his business and the tech stack that goes behind it. So welcome to the show. Tell people who you are and what it is you do.

MA So my name is Mahmoud Abdelkader. I'm the CEO of Very Good Security. And it's a pleasure to be here. Thank you so much for having me here, Ben.

BP Oh, yeah. So yeah, tell us a little bit about, sort of, your background. Like, what brought you to the role you're in now? Did you start learning this stuff early on? Was it something that came to you late?

MA So yeah, when I went to college, I studied computer engineering. But before that, I actually learned to program when I was about 14 years old. I was introduced to games called Age of Empires and Warcraft 3, and I used to actually build no-fog hacks for them. And to be able to do that, I had to learn how to basically reverse engineer the assembly, so I could figure out how it would work and how to introduce some of these things. So that's kind of how I learned to program, how I learned to be a technologist. I took a C++ course in high school. We were very fortunate to have that in my high school. But then that ultimately just led me to go to college and learn to become a computer engineer, where I really learned how the hardware and software fit together. That's really kind of my educational background.

BP Gotcha. And were you focused on security at the university level, or did that come later?

MA I actually took something called cryptology. And then I had a minor in discrete mathematics, which is the foundational mathematics behind today's cryptography and the theory behind it. But I didn't actually realize that I was going to use it at any point in my career, I just really liked it. And really only in this company has that started to make sense, right? Going back to, like, group theory, understanding all about, you know, Fermat's theorems and factoring of numbers. All that made sense at the time, but I didn't really see how applicable it was to my day-to-day. But at VGS, it's something that, you know, is very important as we think about how we unlock more value from data without having to be encrypted.

BP Yeah, it's always been sort of part of the Stack Overflow DNA. MathOverflow is one of the earliest sites created. It kind of has its whole own culture, and it's quite a large site full of pretty serious academics. So let's continue down that thread a little bit. Tell us about the business that you helped to create and that you run now, you know, what it intends to do. And then yeah, I'd love to unpack a little bit, as you're saying, where math plays a role there.

MA Yeah, interesting, actually. So you know, if you take a step back, starting a business today is just hard enough, right? But data security, compliance, and privacy regulations make it even harder to innovate. These laws are passed, and ultimately, through some chain of events, the regulation actually helps the folks who are entrenched and have deep pockets compete, more than the folks who are trying to innovate, right? As you start to innovate, you now have all these different compliance regimes to comply with and data security postures to maintain, and it becomes more and more of a barrier to entry to be able to start to—

BP Oh yeah, I've been through a SOC 2 audit. Huge amount of overhead from a lot of different teams. Yeah.

MA Right. And so some folks take compliance as a checkbox exercise. And some folks are like, well, the intent behind the compliance is really security. So, you know, you want to understand how to govern and have best-in-class security processes in place so that you can build that security DNA, right? So really, companies end up doing exactly what you're saying, right, which is a DIY solution just to go to market, or they find a vendor that gives them the ability to automate the checkbox. But really, checking the box is not the intent of these regulations. These regulations are actually trying to make us stop and think and try to understand, okay, why would we need the data, right? Like, why do you need it? Because if you do need it, you're gonna have to secure it. The point of the regulation is to get you to think twice, right? But actually, it's hurtful, even though it maybe had really good intent from the beginning. So companies end up doing exactly what you said. Like you did, you went through a SOC 2 audit. You have to go through and basically figure out, okay, do I have data security everywhere? You ultimately end up building a shadow company that has nothing to do with your core competency, and staffing it up, maintaining it, getting it done—

BP Totally. Totally. It was, like, a lot of, you know, resources from engineering, and then having to talk to legal. And then that would sort of cascade down. And now all of a sudden, HR and marketing have to go through a bunch of audits and do a bunch of stuff. They're getting Jira tickets, they don't know what it's really about. Yeah. And, you know, people are asking, oh, can we tweak this or that in product, because it's going to loop back and cause this issue with GDPR. And it, you know, takes everybody way off track from the roadmap that you had.

MA Exactly. But if you take a step back, you're like, well, why are you doing all of that? And it turns out that you're trying to unlock some value from data. Because the data was spread all over the enterprise, or whatever company you're in, you then have to bring in legal and HR, and your office administrator now has to be involved in, like, procuring laptops. And so VGS is just trying to move you away from that and say, 'Hey, listen, checkboxes are great. But when you do the checkbox, you're just building a better mousetrap. It's just better if you eliminate the mice altogether.' And you can take that approach. What can you do to minimize the creep of compliance and regulation, to avoid stifling product decisions? That's actually our biggest revenue driver, product innovation. And, you know, it turns out that if you shift the sensitive data from your custodianship to somewhere else, very similar to money today, where you shift the money from, you know, under your mattress to a bank, you can actually unlock more value from it, if and only if you invert the relationship between the data and the compute. And that's very important. And you might be like, oh, what does that mean? It's actually very important.

BP I am wondering. [Ben laughs] Yes, break it down for me.

MA Okay, so think about an Excel file. If I send you a CSV with a column of, you know, the number one repeated ten times, and I ask you to sum that up for me, typically what I'll have to do is make a copy of my file by sending it to you. So now there are two identical copies: I have a copy, you have a copy. You then open up your Microsoft Excel program, import that CSV, and run the sum function on that column to get the number 10, right? The column has the number one 10 times, so I get 10. So when we say inverting the relationship between compute and data, what we mean is, you send the sum function instead to a neutral Switzerland. And then I send my CSV to this neutral Switzerland. And then what ends up happening is, it computes, and then I get the number 10. You never saw the data, but you still were able to operate on the data, right? You're still able to provide that function there. And that's really what VGS does. It inverts the relationship that we typically have between compute and data. And we do this today with money. We don't usually go to the ATM and take money out unless we have to. Most of the time we send Venmo or Square Cash, or we go to our banks and send a Zelle or a wire. We exchange the value of money without physically moving custodianship all the time, right? And so how can we apply that same principle, that same experience, to data, which is already inherently digital, in the same way that we do it with money today? That's really the thesis behind VGS. That's what we do in a nutshell. We are effectively infrastructure for moving data, so that you can move the value and maximize it without having to have the data in your custodianship.
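
The "send the function to the data" idea can be sketched in a few lines of Python. This is only an illustration of the principle, not VGS's actual API: the `NeutralParty` class and its `upload` and `run` methods are hypothetical names invented here. The data owner uploads the column, the other party submits only the function, and only the result comes back.

```python
from typing import Callable, Dict, List


class NeutralParty:
    """Hypothetical stand-in for the 'neutral Switzerland' in the conversation."""

    def __init__(self) -> None:
        # The raw columns live only inside the neutral party.
        self._datasets: Dict[str, List[int]] = {}

    def upload(self, name: str, column: List[int]) -> None:
        """Called by the data owner: hand the column to the neutral party."""
        self._datasets[name] = list(column)

    def run(self, name: str, func: Callable[[List[int]], int]) -> int:
        """Called by the other party: submit a function, get back only its result.

        A real system would also have to sandbox `func` so it cannot leak
        the rows it sees; this sketch does not attempt that.
        """
        return func(list(self._datasets[name]))


neutral = NeutralParty()
neutral.upload("ones", [1] * 10)   # data owner sends the CSV column
result = neutral.run("ones", sum)  # other party sends the sum function
print(result)                      # prints 10
```

The caller of `run` receives the number 10 without ever holding a copy of the column, which is the inversion described above.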

BP So are you Switzerland in this equation? Are you the neutral party? Are you the one who enables some other party to be Switzerland?

MA No. So we would be Switzerland, right? The idea here is that we would govern the use of data based on the laws and the regulations that are required to be able to hold that data. So, just like your bank manages all the different acts that the Treasury Department and the Federal Reserve enact, you, as a consumer, don't care about that, right? You hold the cash, but you expect your bank to comply with it. That's really what VGS does. It just complies with all these different regulations and jurisdictional problems that come about.

BP So I have this, like, yeah, sensitive customer data that comes from, you know, clients, or I have user data, which has, you know, PII that I don't want to accidentally let go up. But I also want to interact, as you said, with that data, and I might want to do it with a bunch of different services. So if I want to use some of that client customer data through a Salesforce or an Intercom, or I want to use some of that, you know, customer data and put it into, you know, a Marketo or Iterable, how do you, you know, sort of facilitate that chain of events?

MA Yeah, that's a really interesting question. And so what we do here is by building basically an agentless VPN, right? And so what's an agentless VPN? It's basically like a web proxy. I'm sure we've all done it, where we've used an anonymizing proxy. We go to, like, the settings, go to the network settings, we go to—

BP Yeah, I have to use a VPN for work now, it's like a remote work reality.

MA Right? Right. Okay, so instead of installing a VPN, one way to do this would be, like, let's configure the proxy parameters of your system or browser or whatever application, right, to speak through a proxy that VGS provides for you. And so you get to interact with Salesforce, or with all the services that you just mentioned, using this, like, alias data. It's just a substitute. It's a synthetic substitute that preserves the properties of the underlying data. So, you know, instead of a 16-digit card number, you might have a 20-digit one, right, or a 20-character alphanumeric one, or you can even create a very similar 16-digit one, right? Or an email address. And the whole idea is that you send this fake data to Iterable, or you send that fake data to, you know, Salesforce or Stripe or whomever, through the proxy. VGS intercepts that payload, understands and sees that redacted alias, and then rewrites that payload on the fly and embeds the real sensitive data in it. And that's where our policy execution engine comes in and says, oh, are you allowed to send that data to that destination? Did you authenticate yourself properly when you did that? Is your system patched and up to date? All of these different things that we can take into account to say, hey, this is an authorized reveal of data, and a transmission. And so the good news is, because it's a VPN-like system, like a proxy, we don't have to program and hard-code integrations into, like, Salesforce. You don't need to do that. You just still use the data normally. But before we transmit it, we basically deduce the value that you're trying to extract from it, and then replace the alias with the sensitive data in line, in real time, as it goes to its destination.
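
The alias-and-reveal flow described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the `AliasVault` class, the `tok_` alias format, and the destination hostnames are invented, and the "policy" is reduced to a simple destination allow-list. It is not how VGS implements its proxy.

```python
import secrets
from typing import Dict, Set


class AliasVault:
    """Hypothetical sketch of alias storage plus an outbound rewrite step."""

    def __init__(self, allowed_destinations: Set[str]) -> None:
        self._real_by_alias: Dict[str, str] = {}
        self._allowed = allowed_destinations

    def redact(self, value: str) -> str:
        """Store the real value and hand back a meaningless alias."""
        alias = "tok_" + secrets.token_hex(8)  # made-up alias format
        self._real_by_alias[alias] = value
        return alias

    def reveal_outbound(self, payload: str, destination: str) -> str:
        """Swap aliases for real values, but only for permitted destinations."""
        if destination not in self._allowed:
            raise PermissionError(f"reveal to {destination} is not authorized")
        for alias, real in self._real_by_alias.items():
            payload = payload.replace(alias, real)
        return payload


vault = AliasVault(allowed_destinations={"api.payments.example"})
alias = vault.redact("4111111111111111")           # well-known test card number
stored_payload = f'{{"card_number": "{alias}"}}'   # what the customer's systems hold
outbound = vault.reveal_outbound(stored_payload, "api.payments.example")
```

The customer's own systems only ever store `stored_payload`, which contains the alias. The real number appears only in `outbound`, after the destination check passes.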

BP That's really interesting. So I guess, you know, if I were to play devil's advocate here, I would say, this sounds great. Let's say I've started a new company, I'm at 50 people, but I have, you know, international customers now, and some of them are financial institutions or healthcare companies. I want to scale up, but I understand that there's going to be huge overhead, you know, with SOC 2, GDPR, and you had a number of others listed on your site. I haven't personally had to contend with them, but I'm sure they're just as bad. I can, you know, get this sort of one-stop shop here, which is basically like a cloud service of sorts, I don't know if it's also local, but in essence, whenever I send data, it's going to, as you said, disguise it or change it slightly, intercept it in the middle, clean that up, and then send the real value after checking to make sure that everything has been done securely. And if there's some change to GDPR or SOC 2, it's on you, not on me, to figure out how to comply with that. I guess the question would be, right, that means I have to trust you, as opposed to doing it myself. So you've now added a security vector that's outside of your own company. And then, you know, what is the chance that when you're changing the data around, it somehow gets corrupted or becomes incorrect? And then I guess, finally, what is the latency here? Like, how much overhead does it add to do this? So I'll put those three questions to you.

MA Great question. I'm gonna address the second one in a second, and the pre-question first, which is, yes, it is a cloud-based system. That's true. But, you know, one of the major improvements that I think VGS, Very Good Security, brings to the market is the developer experience, right? My company before this, I sold it to Stripe. And one of the things that Stripe is known for, and that my previous company, Balanced, was known for, was how awesome the developer experience was, right? And so the idea is, okay, what can we do to make the developer experience when you use sensitive data very much in line with your normal developer workflows, right? We want the developer to not think too much about sensitive data, because as you know, especially at Stack Overflow, copying and pasting is exactly how most developers work. Right. And then they try something, and yeah, they ship it, and that's, like, the minimum value that they deliver. So the idea is, okay, let's not overcomplicate this. We don't need a PhD to be able to use sensitive data, we just need to be able to extract value from it. And so that's really what VGS is going to do. So we created a local replica of VGS called Satellite. Interesting name. You can literally just install it locally, and it mimics the environment that you would effectively develop against with VGS, so that you're able to understand how to do this normally, right? That's super important. But in terms of when you go to production, we don't host anything locally, right? The idea is that you're supposed to trust VGS, because we have all these different locations and regions so that we can silo your data based on, you know, basically geographical tenancy.
If you're like, I need to install it in my data center in Florida, for example, and you have European data, how do you satisfy data sovereignty laws that way? And so it becomes very difficult to deploy VGS locally. So what we say is, listen, this is not something that works locally in production, but you can have a local experience as you develop. Right. So hopefully that addressed the pre-question. And then your first question was, you know, you have to trust VGS to offload security. And to that, I say, look, we've already—that makes sense. I think that question made sense, you know, even in 2000, when the first cloud providers were coming out. Because basically, it's not a client-server model, but they were like, listen, let us host your app. And I think now that we trust, effectively, Amazon Web Services and Google Cloud and all these things, we have developed ways to basically say, listen, let's push our data to these elastic compute clouds, right, that will be able to perform operationally better and more efficiently and effectively than me building it in house. So VGS isn't for everybody. If you are determined to have your own data security, and you want to build a VGS-like clone in your own organization, I mean, there's nothing stopping you. If that's the best use of your time for your business, then that's fine, right? But the idea is to say, hey, listen, you don't go to a dentist to operate on your heart, you go to a specialist to operate on your heart, right? And so the question becomes, why don't we go and give data to the folks whose primary mission is to protect it and comply, so that you just focus on the thing that you do best?
Right. And your third question, I don't know if I answered it, was latency. Yes, that's a very important conversation. We constantly monitor that. And we have infrastructure that we've deployed to the edges, right, so that we are able to push that logic out to the edges. But that's ultimately the thing that VGS focuses on. Listen, we're an infrastructure company, we totally understand what an infrastructure company is, and we take that job really seriously.

BP I have two final questions before we wrap up. And we might be able to skip the first one easily. But I sort of remember a presentation from Apple where they were talking about, you know, the encryption that they do, which is biometric, and how they can share that with all these other providers in a way that doesn't, like, ultimately give the other provider a look at your data at all. So is that similar to what you're talking about with this neutral Switzerland in the middle? Like, Apple has—

MA So this goes back to, well, 'Did you study mathematics in college? And how does this tie back to your company?' Exactly right. So it turns out that that's called privacy-preserving analytics, or, you know, differential privacy, homomorphic encryption. There's all of these different little things that are still being developed today that allow us to, you know, figure out what's the best path for the type of analytics or processing that you're trying to do. And so the easiest way to do it is to introduce noise, and then filter based on some variables. Let's say your zip code is, like, you know, 21030, right? In this case, VGS will say, oh, it's 21035, like a very adjacent county, because you don't need that kind of specificity, right? And so the idea is that VGS is trying to understand which path to take and lets you operate. We've actually created a subset of Python that we've derived from Google's Bazel language, called Starlark. It's hermetically sealed. And we've actually modified it to comply with a subset of Python's interface. So, you know, it's called Starlark, right, after Starlark. And Starlark is actually a subset of Python. So basically, if it works in Starlark, it will work in Python. And the idea is that if you're a data scientist, and you're using it to run all your analytics and stuff, because you're mostly using something like Python, VGS can actually translate that into Starlark and give you the ability to push that logic. Again, remember that sum function with the Excel file: push that logic to the data to operate on it. And so what ends up happening is you start running all these analytics in a way that has a privacy budget. And that's how you can achieve a lot of the things that you're describing here. But obviously, we're not claiming we have homomorphic encryption, because sometimes that's computationally infeasible.
But it is an active field of research that we're looking into, to see where the advances have been made, so that we can understand at what point that makes sense.
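
To make "introduce noise" and "privacy budget" concrete, here is a minimal sketch using the textbook Laplace mechanism for a counting query. The conversation does not say which mechanism VGS uses, so this is a stand-in technique, and the `PrivateCounter` class, its method names, and the sample zip codes are all hypothetical.

```python
import random
from typing import Callable, List


class PrivateCounter:
    """Hypothetical sketch: noisy counts that draw down a fixed privacy budget."""

    def __init__(self, values: List[str], total_epsilon: float, seed: int = 0) -> None:
        self._values = values
        self._remaining = total_epsilon    # the privacy budget
        self._rng = random.Random(seed)    # seeded only to make the demo repeatable

    def noisy_count(self, predicate: Callable[[str], bool], epsilon: float) -> float:
        """Answer 'how many values match?' with Laplace noise of scale 1/epsilon."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self._remaining:
            raise RuntimeError("privacy budget exhausted")
        self._remaining -= epsilon
        true_count = sum(1 for v in self._values if predicate(v))
        scale = 1.0 / epsilon  # a count changes by at most 1 per person
        # The difference of two exponential draws is Laplace-distributed.
        noise = self._rng.expovariate(1.0 / scale) - self._rng.expovariate(1.0 / scale)
        return true_count + noise


zip_codes = ["21030", "21030", "21035", "10001", "94105"]
counter = PrivateCounter(zip_codes, total_epsilon=1.0)
first = counter.noisy_count(lambda z: z.startswith("210"), epsilon=0.5)
second = counter.noisy_count(lambda z: z == "10001", epsilon=0.5)
# A third query would raise RuntimeError because the budget of 1.0 is spent.
```

Smaller `epsilon` values add more noise per query but let the analyst ask more questions before the budget runs out.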

BP Okay, very cool. Alright. Last question before I let you go. You know, I think probably the biggest security story of the year was the SolarWinds attack, which happened sort of very early on in the supply chain and then filtered out to all these large companies and federal agencies. So, you know, obviously hindsight is 20/20. But if they had been a client of yours, would you have been able to see that in some way or prevent that in some way? Like, you know, going all the way down to these elemental levels of the software supply chain is what made it so effective. Where would a service like yours play in preventing something like that from happening?

MA It's hard to say, right, whether we would have been able to prevent that or not, because that failed on so many defenses that I think the whole point of defense in depth was violated. So the way VGS works, the idea is that even if you're hacked, what are they going to hack? Just meaningless data, right? So in that respect, maybe the credentials that they would have used to go and proxy-chain an attack to another system, that would have been prevented by something like VGS. But whether or not we would have been able to say, hey, the checksum verification of the file that you downloaded is different, that's not what we do. What we do is focus more on the individual pieces of data that are sensitive themselves. Those are the ones that we can potentially shield you from. So the idea is that VGS is a great idea to shift the data that's sensitive off of your plate so that when you do have an incident, right, it's not a catastrophic, in-the-news incident. It's just more like, hey, someone breached defenses, the perimeter's breached, but they didn't breach the data. Does that make sense?

BP Yeah, that makes a lot of sense. 

[music]

BP Alright, I am Ben Popper, Director of Content here at Stack Overflow. You can always find me on Twitter @BenPopper. Email us at podcast@stackoverflow.com, and if you like the show, do leave a rating and review, it really helps. Why don't you say who you are, what you do, and where you can be found on the internet, if you want to be found?

MA Sure! So my name is Mahmoud Abdelkader, right, and you can find me at twitter.com/mahmoudimus. And you can find me on our website, verygoodsecurity.com. We're always looking for really smart folks to come join us, so, you know, we have a puzzle that you can solve at puzzle.becomeverygood.com, and I will personally contact you if you solve that.

BP If you've run out of things to do on our Code Golf Stack Exchange, you can head on over.

MA Exactly. [Mahmoud laughs] Or just like send something to jobs@verygoodsecurity.com. Yeah.

[outro music]