The Stack Overflow Podcast

Privacy is a moving target. Here’s how engineering teams can stay on track.

Episode Summary

On this sponsored episode of the podcast, we talk with Rob Picard and Matt Cooper of Vanta, who hear one question every day: how do I keep my application in compliance? Their company makes security monitoring software that helps companies get into compliance quickly. We spoke about the shifting sands of privacy rules and regulations, tracking data flows through systems and across corporate borders, and how security automation can put up guardrails instead of gates.

Episode Notes

Ever since personal information started flowing into applications on the web, securing that information has become more and more important. General security and privacy frameworks like ISO 27001 and PCI provide guidance in securing systems. Now the law has gotten involved with the European Union’s GDPR and California’s CPRA. More laws are on the way, and these laws (and the frameworks) are changing as they meet legal challenges. With the legal landscape for privacy shifting so much, every engineer must ask: How do I keep my application in compliance?

Many security frameworks are undergoing modernization to reflect the way that distributed applications function today. And more countries and US states are passing their own privacy regulations. The privacy space is surprisingly dynamic, forcing companies to keep track of these frequent changes to stay current and compliant. Not everyone has in-house legal experts to follow the daily developments and communicate those to the engineering team.

For an engineering team just trying to understand the effort involved, it may be helpful to start by figuring out where your data flows. Tracking it between internal services may be overkill; instead, track it across corporate boundaries, from one database, cloud provider, SaaS system, or dependency to the next. Each of those should have its own data processing addendum (DPA), so plug into your procurement process to see what each piece of your stack promises on a privacy level.
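
As a rough illustration of tracking data across corporate boundaries, a vendor inventory can be as small as a list with two flags per third party. The sketch below is only one possible shape, not anything Vanta or the guests describe: the `Vendor` fields, the `vendors_needing_review` helper, and the sample vendor names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vendor:
    """One third party that sits across a corporate boundary (hypothetical model)."""
    name: str
    receives_personal_data: bool  # does personal data leave your company for this vendor?
    has_dpa: bool                 # is a data processing addendum on file?


def vendors_needing_review(vendors):
    """Return names of vendors that receive personal data but have no DPA on file."""
    return [v.name for v in vendors if v.receives_personal_data and not v.has_dpa]


# Made-up inventory, e.g. filled in during procurement.
INVENTORY = [
    Vendor("cloud-provider", receives_personal_data=True, has_dpa=True),
    Vendor("support-desk", receives_personal_data=True, has_dpa=False),
    Vendor("ci-service", receives_personal_data=False, has_dpa=False),
]

print(vendors_needing_review(INVENTORY))
```

Vendors that never receive personal data drop out of the review list, which mirrors the advice to scope the effort to data that is actually regulated.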

Your DevOps and DevSecOps teams will probably want to automate as much of the security engineering process as possible. Unfortunately, automating security is hard. The best path may not be to automate the defenses on your system; it might be better to instead automate the context that you provide to engineers. If someone wants to add a dependency, pop up a reminder that these dependencies can be fickle. Automate the boring stuff (context, reminders, to-dos) and let humans do the complex problem solving we’re so good at.
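
To make "automate the context" concrete, here is a minimal sketch of a check that writes an advisory note for a newly added dependency instead of blocking the change. Everything in it is an assumption for illustration: the `DependencyInfo` fields, the one-year staleness threshold, and the wording of the note. Fetching real package metadata and posting the note to a pull request are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional

STALE_AFTER_DAYS = 365  # assumed threshold for "not very well maintained"


@dataclass(frozen=True)
class DependencyInfo:
    """Metadata about a newly added dependency (hypothetical shape)."""
    name: str
    days_since_last_release: int
    known_vulnerabilities: int


def context_note(dep: DependencyInfo) -> Optional[str]:
    """Build an advisory note for a pull request, or None if nothing stands out.

    The note only informs the developer; it never blocks the change.
    """
    concerns = []
    if dep.days_since_last_release > STALE_AFTER_DAYS:
        concerns.append(f"last release was {dep.days_since_last_release} days ago")
    if dep.known_vulnerabilities > 0:
        concerns.append(f"known vulnerabilities: {dep.known_vulnerabilities}")
    if not concerns:
        return None
    return (
        f"Heads up on new dependency '{dep.name}': "
        + "; ".join(concerns)
        + ". This does not block the merge; ask the security team if unsure."
    )


print(context_note(DependencyInfo("example-lib", 400, 2)))
```

Returning `None` for healthy dependencies keeps the reminder quiet when there is nothing to say, so the note stays a guardrail rather than a gate.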

If you’re looking to add an in-house security expert as a service, check out Vanta.com. Their platform connects to and monitors your systems and helps you prep for compliance with one or more security frameworks. If those frameworks change, you don’t need to do anything. Vanta changes for you.

Episode Transcription

Matt Cooper Scope the data that you care about, because a lot of data flows through our systems that's frankly boring and not regulated. It doesn't matter from a regulatory perspective, because it's not personal information. At the end of the day that's what privacy is concerned about. So make sure when you're tracking down data that you have some reasonable basis to think that you have PII in these data flows before you even get started there.

[intro music plays]

Ben Popper Hello, everybody. Welcome back to the Stack Overflow Podcast. I am your host, Ben Popper, Director of Content here at Stack Overflow. And I am joined as I often am by my colleague and collaborator, Ryan Donovan. Hey, Ryan.

Ryan Donovan Hey, Ben. How’re you doing today?

BP I am okay. So we are going to be talking security and privacy today. This is something that comes up at Stack Overflow a lot because we have an audience of developers. They are hyper attuned to this stuff. They are always on the bleeding edge of poking and prodding to make sure we're keeping things shipshape, and also I think pushing from a community perspective to make sure that we're being responsible with people's data and only collecting what we need essentially to improve the community and run our business. So we have two great guests on the show today. This episode is being sponsored by Vanta, and we have Rob Picard and Matt Cooper. Welcome to you both.

MC Thank you.

Rob Picard Thank you.

BP So, Matt, let's start with you. Just tell the listeners a little bit about how you got into this world. What was your first experience with software engineering? What brought you to the job you're at today?

MC So, my name is Matt Cooper, Principal for Cyber Security and Data Privacy at Vanta. I have spent my entire career actually in some form of IT or security. I've done quite a few different things. Prior to joining Vanta, I got into an information security consulting space, so I was helping folks from a consulting perspective get ready for a lot of the common security and privacy frameworks, so things like HIPAA, SOC 2, ISO 27001. And then about a year before GDPR became effective, our practice director said, "Hey, this GDPR thing, I think this has legs. We should go figure this out because I think folks are going to need help with this." And so we essentially spun up a GDPR practice at that time. Ever since then he was right. It's been very popular from a consulting perspective and we’re helping a lot of folks with that. Long story short, we had a mutual client where we were helping them with SOC 2 readiness. They were using Vanta. I met the Vanta team and they told me, “Hey, this is a super exciting space. You should pop over here.” And the rest is history.

BP This all makes sense to me. Your clients I'm sure like you once you help them get through these things. We all resent being asked to be compliant in a million new things every year, but we certainly can use help doing it faster and more efficiently. Rob, how about you? How'd you get into the world of software and technology, and how'd you find yourself at Vanta and what do you do there these days?

RP Yeah, so I am the Security Lead at Vanta. I've been here since September of last year, so, older than half the company, that kind of thing. I actually got my start in penetration testing. I worked for a company called Matasano Security, which was kind of a penetration testing firm and ended up getting acquired. I've worked at a couple companies. More recently, I was at Robinhood for a few years as sort of an early employee on the security team there. I started a company and went through Y Combinator and did that for about a year before I realized, “Hey, Vanta is doing really well in the security space. I should go see what's going on over there.” And I cold emailed the CEO and said, “Hey, I'm shutting down my thing and you might need me over there.” So I ended up over here at Vanta and as Matt said, the rest is history. Just having a good time here.

BP Fantastic. I hope you took at least some of your Robinhood equity with you. God bless. Let's dive into it. For both of you, at a high level right now, talking about privacy frameworks and the changes, what do you see out there? And what is most challenging to the clients that come to Vanta?

MC So this year in particular there have been a number of changes to some of the major frameworks. Folks are generally pretty familiar with it, but ISO 27001, which isn't a privacy framework per se, but it's the information security framework, which is the basis of another privacy framework, the ISO 27701. That went through a pretty significant update where they added several new controls to the Annex A, or ISO 27002, depending how you think of it. They took some controls out. They consolidated some things. Really this is I think an effort to modernize the ISO framework. A lot of folks in the space, especially cloud companies, one of their complaints is they're like, “Matt, these controls are dated.” One example is that there's one control about network controls and there's like 12 controls around physical security, we're all in a cloud and we're not taking those things anyway. The core clauses of ISO are staying the same so I don't think this will necessarily be majorly impactful for businesses who are in an ISO certification scheme. They can just work in these updates as it makes sense for them. But the goal here is that the controls will be a bit more modernized, a bit more cloud friendly, and just ready for the 21st century. Moving on, PCI also went through another major framework update this year. It went from 3.2 to 4.0. I won't go into the details, PCI is fairly dense. I will say one of the most impactful changes is that service providers now are going to have to do a third party audit, so a report on compliance, or a RoC in PCI terminology. Whereas in the past, service providers that had a lower level of processing were able to self-attest, and I think just with the risks of the third party environment, that's no longer sufficient. If you want to be a service provider you're going to need to go through an actual formal third party audit. Let's move down to privacy. This is probably the most dynamic and interesting space from my perspective. 
So one of the inherent challenges here is just the rapid pace of change in the privacy space. So you have things like GDPR. It came out, it's great, but now there are lawsuits and there are interpretations and so GDPR itself is not totally constant, it's continuing to evolve. And then on the US side, or just internationally, we're continuing to see additional privacy laws either taking effect as in the case of the United States. I think most famously, California has now the CPRA. It goes into effect at the beginning of next year. Virginia and Colorado are also getting a lot of attention. They both have new privacy frameworks that are going into effect next year. And internationally there are new privacy regulations happening pretty much all the time. So that I think is one of the inherent challenges. Just how dynamic this space is, how frequently things are changing, and companies are having to keep track of these things just to stay current and stay compliant.

RD Yeah. That sounds like there's a lot. I had to deal with writing about PCI and HIPAA, GDPR, and I never really thought about all the changes in there. So for engineers trying to keep up with this, what do they have to do?

MC It depends. First off, I'll just use Vanta as an example. So Vanta's a software company. Rob is a technical engineer, he leads our security effort. I support that team on the compliance and privacy side. And that really made sense for Vanta because we are a software company to have an actual engineer overseeing the full efforts, and so much of what we do becomes technical very quickly. That being said, I think it's a lot to just simply ask engineers to read the news about GDPR every day and figure out what that means for them. And so at Vanta we have this specialist role that I'm in, and I sit as part of the legal team, and so I do think it's helpful if you have essentially a layer of translation or a layer of interpretation or someone who can go out and it's really their job to stay on top of the regulatory environment and then help to translate that into concrete requirements that an engineer can then deal with, just like they deal with any other requirement for software development. I'll turn to Rob and see if he has anything to add to that. It's my first take.

RP Yeah, it's interesting because I would say Vanta is actually kind of lucky as an example because our product deals with a lot of these things that we are going to have in-house experts like Matt who are really specialized in these things. Most startups at Vanta’s size are not going to have a dedicated privacy expert focused on how things are going internally. I think without that, you have to work with your legal team, either external counsel or internal. You have to have some interpretation of what these things mean for your company. And you have to translate that into at least high-level requirements, like, “Hey guys. We need to list our subprocessors.” That's a pretty basic requirement from a lot of these privacy frameworks. So you have to figure out what does that mean for you, what is the definition of a subprocessor for you, that sort of thing. And not everybody's going to need a full time dedicated resource to this, but it's very useful to have one I will say.

MC Rob makes an awesome point. It's a little bit different for Vanta because of the space that we're in. And I was going to add to that, I won't belabor it here, we'll talk about it I'm sure further, but I do think if you don't have that opportunity, making the effort to really understand the regulatory framework and how it applies to you can be hugely advantageous, because it's confusing. There's a lot of detail and you can't just rely on what people tell you because people misunderstand, there's different interpretations, et cetera. So I do think at the end of the day for the organization to really make the effort to understand this for themselves is well worth it.

BP This is really interesting. I want to talk a little bit about how to keep track of all those things. You mentioned subprocessors. It sounds like it would in some ways maybe be frustrating for engineers who are just focused on building product to spec and on time and on budget, making sure it works well at the memory and the speed and to all the different clients on the other end. When you're thinking about tracking data through all those subsystems, but also trying to build something that's great with APIs and microservices and talking to a million other endpoints, do you make those decisions early on in the architecture, or is that something you can evolve, Matt, to your point, as new legal cases move through or new organizations update their frameworks?

RP I can jump in on that. I think the key here is you want to plug into the procurement process. And at a larger company, you have a very robust procurement process where it's got a million checklists, legal is going to see every vendor, security's going to see every vendor, IT, enterprise, engineering, they're going to see every vendor. They might even run the process. At smaller companies you're not going to have that, but in theory, somebody has to pay somebody eventually. So you need a plugin there, and at the bare minimum, I cannot stress enough how much I am not the right person to interpret these laws, but at the bare minimum you really need to be at least tracking where your customer data is going, like which third party companies. And that can be scoped a million different ways, that can mean a million different things. Maybe it's emails, maybe it's IPs, and that's where you get into having a good lawyer or a person who is an expert in this space to interpret it. But you just need a checkbox somewhere when you buy a new tool that says, “Hey, do we need to add this to the website and maybe notify people?” Maybe you notify them every now and then, that sort of thing. You don't have to go so far beyond the maturity of the rest of your company.

MC I would add to that, I think that's definitely the right way to do it. The best way to do it is attack it in that procurement process or really upfront before you start sharing data. Understand what you're going to share, when, how it's going to be used, et cetera. But to the point Rob made, not everyone can do that. Or maybe you're coming into an organization and this is now your responsibility and they just didn't have the capability to do that in the past. And so there could be just a little bit of brute force effort with the engineering team to just come in and unwind what's already in place and just take the time to go through and understand your current data flows, maybe remap those, et cetera, and just do that work. It's just work effort at that point.

RD Yeah. I mean, with systems getting so complicated these days, like Ben said with microservices and APIs and the cloud, how do you go about tracking your data through all those subsystems? I remember companies having enough trouble figuring out what services were even running in their cloud.

MC Yeah. On the one hand, the tooling environment in this space is maturing as well. We work with a lot of SaaS companies and there are a lot of cool new products that are trying to solve these business problems. One tool I looked at was basically an API-level DLP tool which is going through and trying to use some AI or what have you to analyze data flows through API connections. Again, those aren't probably silver bullets, but there are a lot of interesting products that you can use to apply to this problem. The second part of my answer would be to scope the data that you care about, because a lot of data flows through our systems that's frankly boring and not regulated. It doesn't matter from a regulatory perspective because it's not personal information. At the end of the day, that's what privacy is concerned about. So make sure when you're tracking down data that you have some reasonable basis to think that you have PII in these data flows before you even get started there.

RP Really the things you care about are crossing corporate lines. So when data is flying between a million different AWS services or Azure, Google Cloud services, that's fine. You have to think of, “Okay, what are the different separate companies with different legal structures and environments and contracts that the data is going between?” And you have to think of the geographic, depending on your company and your customers. But you have to think, “Okay, is it going between US regions and European regions in AWS?” that sort of thing. But every microservice doesn't have to be explicitly defined as a separate subprocessor or anything like that within your company. It's like, “Hey, your cloud provider.” “Hey, your support system,” maybe. “Hey, some APIs that you use for data transformation,” that kind of stuff.

RD That's nice, that simplifies it. How about those systems that you don't control? How about all your APIs, all your tooling, all your open source dependencies?

BP Oh gosh, yeah. Your cloud providers. Geez.

RP Like I said, there's sort of these corporate boundaries that you have to really think about. From a security perspective, you have to worry about all your open source dependencies, you have to worry about a million different things that go beyond the sort of regulated privacy-focused data. But I think from all these systems you don't control, it comes down a lot of times to getting the right contracts in place. You get a DPA, a data processing addendum, in place and you make sure that these companies you're sending data to, and if you're using AWS or MongoDB or Google, they have all of these things in place. They have a very robust privacy program. So I think where you need to really focus your attention is on the smaller companies that you're using, and making sure that, “Hey, you guys are meeting the boxes here to make sure that if I'm handing you some customer data to process on my behalf, you're going to handle it with care. You're going to be a trustworthy steward of this data.” But most of your processors or most of your third parties either aren't touching data that is especially interesting, or they have extremely robust privacy programs with like hundreds of people working on it.

BP Let me throw one more. Or Matt, if you have something to say on that you can, and then I was going to throw one more wrinkle in there and ask about when your developers start asking to use different open source projects that may or may not be connected to big corporations with legal departments. But go ahead.

MC In terms of systems you don't control and third parties you don't control, I totally agree with Rob. Just doing due diligence on them. Using major players obviously can make that a bit easier. But the second thing I would say would be, keep in mind what is the objective here? Because security and data privacy can become an endless rabbit hole. You can always be more secure. You can always understand things at a higher level of detail, or almost always for most organizations. But just keep in mind the frame of like, “What are we trying to accomplish?” Personal data, we need to know where it goes because it's regulated. We need to be able to be responsive to people if they're asking, “Hey, what do you have on us and who did you send it to?” But there's a point at which you can answer that question at a human level and that might be as far as you need to go. Meaning, how much money, time, and effort you’re spending on this should be related to some sort of a business objective. So that's a frame that I often throw out there to help folks kind of simplify, because again, you can really get into the weeds and into the details and quickly feel kind of overwhelmed with the complexity.

RP Yeah. So there's a million open source projects every company uses, and I don't even know if that's that much of an exaggeration to be honest. I think it's totally fine to use open source projects. I think the key is, where is the data going when it comes to privacy. When it comes to security, we can have a whole conversation about how you evaluate the security of these and sort of box them off within your environment such that the blast radius is reduced so that if something is wrong with these and it's a poorly-maintained open source project, you one, can evaluate whether or not you should use it. Two, you can isolate it such that the blast radius isn't that bad. But when it comes to privacy, if it's open source stuff that you're hosting yourself, for the most part, it doesn't come into play aside from just being responsible with that data and making sure you're not sticking it in some system within your own environment that is open to the public and anonymous access. But I think by and large, privacy doesn't come into play too much if you're using something hosted in your own environment or some tools that you're running on your own systems. I think where it kind of blurs the lines is when you're using brand new startups, really early projects, on their own servers, and you're sending data over there, you really have to evaluate whether or not you're being responsible with your customer's data ultimately.

BP That's interesting, because that kind of presents another hurdle to newcomers as opposed to incumbents as they don't have all that stuff built up, the DPA, they're not as ready to answer those questions. So that's interesting to hear.

RP Yeah. And that's a really good point. I think luckily there's sort of a framework or a sort of canon being developed out there where it's pretty easy to go and get a DPA. If you're a newcomer, you can spend probably 500 bucks and get a DPA from a lawyer and you're going to be able to use that pretty much forever. It's going to need to be updated over time, but that's just a pretty standard document. So there's some processes you’d need to put in place, there's a little bit of that problem, but hopefully over time it sort of gets reduced.

RD We love talking about DevOps stuff here. DevOps folks, they love to automate everything, so how do we make this easier and automate security on this big complex system?

RP Yeah, automating security is hard. There's a lot of things you can do. I think the biggest thing to think about is, if you're ever inclined to tell someone not to do something because of security, you should ideally be putting in a technical control or evaluating whether or not it's really something you need to block. It's the question of, do I need to put something in place that prevents this from happening with a break-glass mechanism for when it needs to happen? Or do I just need to give somebody more context? I'll give you an example. You don't need to block somebody from adding a new NPM dependency, even though there's a lot of mistrust of the node package system amongst the security community, I think, just because it's so sprawling and it's so big and there's a lot of incentives to go and find little bugs with it and report them as CVEs. But what you can do is you can use tools that put some context in a pull request. You go and add some code, it adds a dependency, it says, “Hey, by the way, this is not very well maintained. There's some known vulnerabilities.” You as the developer are now educated enough to make this decision yourself. You can refer to your security team or other developers on your team to say, “Do I need this? Is this the right call?” But ultimately you're empowered with the context that the security person would have. So I think a lot of automating security is really just automating context at the right time, just in time right when it's the most relevant to you as the developer. In general, if you're just going to say no to something, the best way to do that is just make it almost impossible to do. Put up guardrails, not gates.

MC It's kind of a cliché, security is a journey not a destination. This is a process and it's ongoing, and so in an ongoing process, some parts of that are repetitive, boring, reminders, tedious. Those are the awesome things to automate so that you can free up the human to do that thing that we're good at, which is the complex thinking, solving the hard problems, and just free up that brain space from just remembering to run my vulnerability scan. Well, let's just automate that thing, or remembering to do some sort of an activity or exercise. Let's just take all that low hanging fruit and automate that, because that's sort of the easy button and then we can continue to mature as we go down that road.

BP So for folks who are listening, just give us a quick pitch. We're almost at the end now. How does Vanta do this is the overarching question, but why choose you? And if I'm a client who comes, let's say, given our discussion, I'm a medium-sized company, I don't have a huge legal or compliance department. I've got maybe some people who want to learn, and I'm using some cloud providers, I've built some of my own stuff, and my developers are interested in using open source projects. I come in and try to talk to you. Give me a quick sales pitch on what you think Vanta can deliver, and then we'll tell folks where they can check it out.

MC So Vanta does automation as we've talked about for a lot of these things that you need to do to maintain your security and get ready for audit. There are bits and pieces of other things that were already happening in the market, but then there are some unique value propositions as well. One thing to keep in mind, especially for folks who haven't done this before, security and compliance tends to be like an organizational set of controls. Meaning, it's all over the board in terms of what you have to gather and what you have to maintain and what you have to prove to an auditor. And a lot of times, especially in a smaller company or a technology company, you have a lead engineer and they're like, “Awesome. Firewalls, database encryption, TLS. I got that.” But then you need the board minutes, you need the background check, you need a performance review. So one of Vanta’s value propositions was taking everything that you need and putting it in a single place so that you, first off, know what you need to do, which is a huge question for a lot of folks that are new to compliance. What do I actually need to do to comply with GDPR? This question I get pretty much every day. So taking that point of view, making it clear that these are the things you need to do, giving you that single pane of glass, and then reminding you of anything you haven't done, “Hey, you haven't done this thing yet. You're not in compliance. You need to fix it,” consistently and constantly on the hour is a big part of the value prop. Last thing I'll say, this is a talk I just gave at ISACA. In my point of view, this space has now been proven to the extent that everyone will do this in the future, meaning the next three to five years. So as an analogy, back in the day people might have reviewed logs. That's crazy. No one reviews logs anymore because the volume of logging in a modern enterprise is massive. 
So you're reviewing alerts from logs that a machine is checking for you to say, “Hey, I think this is interesting. This correlates with something. Let's check that.” It’s similar in compliance. We're still kind of moving out of the spreadsheet manual era, and in the future no one is going to do that. Everyone is going to have a tool that dials it in for them, that tells them what they need to do whether it's something custom or not, just because it perfectly makes sense and it's a perfect use case for technology.

[music plays]

BP All right, everybody. It is that time of the show. I'm going to shout out the winner of a lifeboat badge and thank somebody who came on Stack Overflow and helped to contribute a little bit of knowledge. We give out a lifeboat when somebody comes and they find a question with a score of -3 or less, they give it an answer that gets a score of 20 or more, and now that question has a score of 3 or more and it's been saved from the dustbin of history. Thanks to Ghoul Ahmed, “How to detect a browser refresh in an angular project.” Appreciate the knowledge. All right, everybody. Thanks again for listening. I am Ben Popper, Director of Content here at Stack Overflow. You can always find me on Twitter @BenPopper. Email us, podcast@stackoverflow.com. Or leave us a rating and a review. It really helps.

RD I'm Ryan Donovan. I edit the blog here at Stack Overflow. You can find me on Twitter @RThorDonovan. And if you have a great idea for a blog post, please email me at pitches@stackoverflow.com.

MC Great. So I’m Matt Cooper. I can be found on LinkedIn, Twitter, in real life, @IRLCooper is what I tend to go by. Also you can find me at cooper@vanta.com and would love to talk to you if you have any interest in security or privacy.

RP I'm Rob Picard, Security Lead at Vanta. You can find Vanta at vanta.com and you can find me on Twitter @ItsRobPicard.

BP All right, y'all. Thanks for coming on. And everybody else, thanks for listening.

[outro music plays]