Sam Scott, cofounder and CTO of Oso, joins the home team to talk about what makes authorization a challenge, the difference between authentication and authorization, and what zombies taught him about web development.
Oso is authorization as a service. Check out the docs or explore use cases.
Sam’s post “Why Authorization is Hard” covered what makes authorization challenging, some approaches to solving it, and their associated tradeoffs. You can also watch Sam’s talk at PyCon US 2022. Since it’s impossible to address everything that makes authorization hard in just 5,000 words, Sam is currently at work on a follow-up article called “Why Authorization is Hard Part II.”
Sam first learned web development via Rails for Zombies, a beginner-level Rails course. In creating Oso, he tasked himself with “putting rails on authorization.”
ICYMI: Read Sam’s post about best practices for securing REST APIs or listen to his previous podcast appearance, where we talked about how Oso makes security easier for developers.
Find Sam on LinkedIn or GitHub.
Today’s Lifeboat badge winner is OscarRyz for their answer to I am trying to solve '15 puzzle', but I get 'OutOfMemoryError'.
[intro music plays]
Ben Popper This episode is sponsored by Porkbun, a refreshingly different domain name registrar. Get a free .app or .dev domain name for your next online project by visiting porkbunstack.dev and using the coupon code stackpodcast at checkout. If you're looking to create a new project and you want a free domain, make sure to use that link and coupon code and you'll let them know the podcast sent you.
BP Hello, everybody. Welcome back to the Stack Overflow Podcast, slightly gravelly baritone voice edition. I am Ben Popper, Director of Content here at Stack Overflow. I went to a friend's 40th birthday party last night, and for those of you who don't know, the wonderful James Murphy of LCD Soundsystem now has a coffee shop by day, dance club by night, with quite a good sound system so we had a fun time. I am here as I often am with my wonderful colleague and collaborator, Ryan Donovan. Hey, Ryan.
Ryan Donovan Hey, Ben. You sound extra smooth.
BP Thank you. Yeah, I'm pitching myself for the next LCD Soundsystem live tour. So back in 2021, you and I got the chance to chat with some folks at Oso and also to do a blog post with them. We're having them back on the show today. That old blog post, remind me what it was about. Maybe that'll give people a frame of reference for some of what we're going to chat about today.
RD I believe it was best practices for APIs. I think it was just auth. I think we just did a little double duty.
BP Gotcha, gotcha. I was just listening back to a podcast where we were talking about auth and whether you’re saying authentication or authorization, and we were like, “We’ve got to come up with some better nomenclature because auth is too close.” But yeah, we're happy to have Sam from Oso back on the show to talk about authorization, some of the stuff that they've written recently, and what's been going on with the company over the last year. So Sam, welcome back to the Stack Overflow Podcast.
Sam Scott Hey, thanks for having me back.
BP So originally you were slated to come on to talk about a very popular blog post you wrote: Why Authorization is So Hard. And then you were going to write Why Authorization is So Hard Part Two that actually ended up coming out a little while ago, and I think you told us it went off the rails in a different direction, or rather it became about Rails. So why is authorization so hard part two, and what does this have to do with Rails?
SS Fantastic pun and segue, by the way.
BP Thank you, thank you.
SS That was good. Many years ago, we wrote this blog post, Why Authorization is So Hard, where we basically shared the kinds of things we had heard from all the companies we had met about what they were struggling with in authorization. And we touched on things like how it's hard to know how to write your authorization logic, and how it can get complex over time. We spoke about things like how it's hard to know where to store data in your architecture, and how it's hard to do all the different enforcement things you might need to, from the database all the way to the front end. And it resonated with a ton of folks, but the natural thing that happens is everyone came and was like, “Ah, but you missed X and Y and Z, and you didn't really talk about how to do this and this.”
BP That's how developers leave comments. Yeah, for sure.
SS Yeah, exactly. I was like, “Okay, okay. We're scratching the surface, all right? It was a long enough post. It was already like 5,000 words. Give me a break.” But so we accumulated more and more things and details on some of the nuances like, it's not as simple as centralized or decentralized. Your data is a mix of both. And so I started writing a part two that sort of went into more depth in all those things and the draft got to about 10,000 words and we realized that nobody quite cared about authorization that much other than us. And so what we kind of realized was that maybe at this point people don't want more problems, more reasons why it's hard. They maybe want to be given a little bit more of what they can do about it and the solutions and things like that. And we had this kind of premise of, how do we make authorization more opinionated? How do we give people best practices? And it just reminded me of my very beginning days of web development. So probably about 10+ years ago, I had a friend who wanted to build a startup and I was the only kind of developer-adjacent person he knew so he asked me if I could help, and so I went and learned how to do web development. I did it through this amazing tutorial called Rails for Zombies. It was a Ruby on Rails tutorial where the whole thing was like building Twitter for zombies, basically. Phenomenal course, you should go and look it up. It's great. But the thing that was awesome about Rails is that it is very opinionated about how you should build a web app, and it's got a lot of built-in conventions. Model-view-controller, MVC, was this structured way to think about the different parts of what a web app has to do. And those things are stuck with me forever. I don't do much Rails development anymore. At Oso we mostly do things in Rust, but I still think about everything as MVC and so kind of the challenge we put to ourselves was what would it look like if authorization had a Rails equivalent?
RD So what would be the opinionated version? I mean, I know that one of the problems is the OAuth spec is very much not opinionated. We had a two-part blog series on OAuth and how it works and the spec itself and that's been very popular, so I'd be very interested in how you put the rails on it?
SS Yeah. So I guess one quick segue on OAuth specifically is, you were right at the beginning, auth, authentication, authorization, each of those terms are kind of not super helpful. They all cover different pieces of things. And the HTTP header, which is authorization, is primarily used for authentication. You pass in credentials with the authorization header. And so similarly, OAuth, a big part of that is around kind of a concept of delegated authorization very specifically, which is giving services consent to access your data, things like that. And then there's kind of a layer on top of it which is more for authentication, which is ways that you can implement things like single sign-on, which is like, “I give this app that I want to connect to permission to talk to GitHub to verify my identity and know things about my identity.” They're using delegated authorization there to implement an authentication layer. All of that is entirely distinct from a lot of the stuff that we talk with folks about for authorization, which is how do you build the core permissions logic into your application? So if you are building a SaaS product and you want to make sure that admins or organizations can invite users and that members are allowed to create documents and things like that, who can do what inside your app, that's the core piece of authorization that we often talk with folks about that we focus on. And OAuth is kind of somewhat overlapping but not entirely the same part of that. So putting that segue aside.
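To illustrate Sam's point that the HTTP Authorization header mostly carries credentials for authentication, here is a minimal sketch that decodes a Basic header into a username and password. The function name parse_basic_credentials and the sample credentials are made up for illustration; the sketch only establishes who the caller claims to be and says nothing about what they may do.

```python
import base64


def parse_basic_credentials(header_value: str) -> tuple[str, str]:
    """Extract (username, password) from a 'Basic ...' Authorization header value.

    Hypothetical helper for illustration: it answers "who are you?" only.
    """
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        raise ValueError("expected a Basic Authorization header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, separator, password = decoded.partition(":")
    if not separator:
        raise ValueError("malformed credentials")
    return username, password


# Build a sample header the way a client would, then parse it on the server side.
sample_header = "Basic " + base64.b64encode(b"sam:s3cret").decode("ascii")
username, password = parse_basic_credentials(sample_header)
```

Deciding whether that identified user can create a project or invite a teammate is a separate step, which is the application-level authorization the rest of the conversation focuses on.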
RD Yeah. I think the simplest differentiator I've heard is that authentication is, “Who are you?” and authorization is, “Should you be here?”
SS Yeah, exactly. Authentication– who are you, identity, stuff like that. Authorization, what can you do. But then there's things like OAuth which cross those two things in strange ways. So when we think about the application level authorization like what can you do inside of an app, we sort of have our own framework for thinking about it and it's kind of our version of MVC but for authorization. So we think of it as made up of these three things: one is the logic, two is data, and three is enforcement. So for example, suppose a user wants to come into your web application. They want to create a new project. The three pieces of that might be, maybe only admins at companies are allowed to create projects. That would be the logical piece. Who can do what inside the app? Abstractly, admins are allowed to do this thing. The data is the concrete piece. This user who is trying to do this thing, are they an admin at this particular company? That piece of data might have come from an OAuth token or a JWT token. It might live in the database. It might be something else. That's the data piece. And then the enforcement is, now you've got your logical rules plus the concrete data. You combine those together and get a decision like, “Can this person do this thing?” Enforcement is what you do with that. Do you return an error page? Do you just pretend that things don't exist? Do you log an audit event? And so on and so on.
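The logic, data, and enforcement split Sam describes can be sketched in a few lines of plain Python. This is an illustrative sketch, not Oso's API: the names RULES, ROLE_ASSIGNMENTS, is_allowed, Forbidden, and create_project, along with the sample users and organization, are all assumptions.

```python
# Logic: abstract rules about which role may perform which action.
RULES = {
    "admin": {"create_project", "invite_user"},
    "member": {"create_document"},
}

# Data: concrete facts about which user holds which role at which organization.
# In a real system this might come from a token or a database.
ROLE_ASSIGNMENTS = {
    ("alice", "acme"): "admin",
    ("bob", "acme"): "member",
}


def is_allowed(user: str, action: str, org: str) -> bool:
    """Combine the abstract rules with the concrete data into a decision."""
    role = ROLE_ASSIGNMENTS.get((user, org))
    return role is not None and action in RULES.get(role, set())


class Forbidden(Exception):
    """Raised when enforcement rejects a request."""


def create_project(user: str, org: str, name: str) -> dict:
    # Enforcement: act on the decision. This sketch raises an error; an app
    # could instead return an error page or hide the feature entirely.
    if not is_allowed(user, "create_project", org):
        raise Forbidden(f"{user} may not create projects in {org}")
    return {"org": org, "name": name, "created_by": user}
```

With this sample data, create_project("alice", "acme", "Roadmap") succeeds because alice is an admin at acme, while the same call for bob raises Forbidden.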
BP So I guess one question that comes to mind as we're talking about this is what modern architecture for software development means for all of this. Stack Overflow, when it began, was on a physical server located inside the same building where people were working. It was a grand monolith that we are still in the process of dismembering and keeping the parts we like. But when I talk to folks who are working at newer companies, there's a lot of cloud and containers and microservices, all of which are connected to each other, but not all of which they own. So when you talk about rules of logic, data, and permission, how do you make that work with a more modern sort of infrastructure stack?
SS Yeah, so there's two big trends that I think are impacting authorization right now. One is the rise of SaaS– Software as a service. Almost everything these days, especially at a startup, you consume is via a SaaS product. Everything you work on, and this is creating a really increased need for good authorization solutions, because when you build a SaaS product, you probably want organizations to only interact with their own resources. You don't want everyone to be able to do everything. And so suddenly you need some concept of multi-tenancy, which is a fundamental authorization problem. But then you have, as you saw, larger and larger companies using you, there's multiple teams, departments, even sub-companies within companies that are all using this one product and you need to be able to have that granular access, because I shouldn't be able to see things that other teams are working on and be able to see sensitive information those other teams have. Or if I'm using my HR SaaS product I shouldn't be able to see everybody's salaries. I can only see my direct reports’ salaries, things like that. And so this rise of SaaS has created a huge need for authorization. There's just more people out there who are trying to implement this kind of thing. And then as you said, there's this architectural trend where there's a big move of people towards microservices, and even not in the extreme Uber cases of microservices where you have 3000, but even two services or three services. Maybe you just have a couple of apps that you have separate backends for. The thing about authorization and any kind of distributed architecture that's challenging is that unlike many things, authorization is something that almost always has to be shared across everything inside your architecture. Because when I say, what can you do inside of an app, you can see this thing because you're a member at some organization, that's going to apply everywhere. 
It's very unlikely that you have some app that doesn't care about authorization, where it's, “Yeah, you can do anything in this one, but not in this one.” And that brings with it a lot of unique challenges, especially around those things I was just talking about. So the logic piece of who can do what needs to be shared across all these microservices. The data of what roles people have also needs to be shared across all these microservices. Enforcement needs to happen in each of these microservices. So it's not one of these really nice problems where you can just say, “Oh, we need a service to do blah. Let's stand up a service and let it do its thing.” We want a service that can do authorization, and then we need to find a way to integrate it into every service, make it so that every team can interact with it, and share data between all these services. And it just brings so many fascinating challenges with it.
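As a rough sketch of that shape, the example below has two services sharing one source of role data while each enforces the tenant boundary itself. Everything here is hypothetical (SharedAuthorizer, ProjectService, BillingService, the sample organizations), and a real deployment would reach the shared component over the network rather than through an in-process object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    id: str
    org: str  # the tenant that owns this resource


class SharedAuthorizer:
    """Hypothetical central component holding the shared role data."""

    def __init__(self, memberships: dict):
        # Maps (user, org) -> role.
        self._memberships = memberships

    def can_view(self, user: str, resource: Resource) -> bool:
        # Multi-tenancy rule: a user only sees resources owned by an
        # organization in which they hold some role.
        return (user, resource.org) in self._memberships


class ProjectService:
    def __init__(self, authorizer: SharedAuthorizer, projects: list):
        self._authorizer = authorizer
        self._projects = projects

    def list_projects(self, user: str) -> list:
        # Enforcement happens inside this service, using the shared decision.
        return [p for p in self._projects if self._authorizer.can_view(user, p)]


class BillingService:
    def __init__(self, authorizer: SharedAuthorizer, invoices: list):
        self._authorizer = authorizer
        self._invoices = invoices

    def list_invoices(self, user: str) -> list:
        return [i for i in self._invoices if self._authorizer.can_view(user, i)]


authorizer = SharedAuthorizer({("alice", "acme"): "admin", ("dana", "globex"): "member"})
projects = ProjectService(authorizer, [Resource("p1", "acme"), Resource("p2", "globex")])
billing = BillingService(authorizer, [Resource("inv1", "acme"), Resource("inv2", "globex")])
```

Because both services consult the same membership data, alice sees only acme's projects and invoices in either one, which is the consistency Sam argues a shared authorization model should provide.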
RD When you talk about the MVC model of auth, I know with that they're often separated out but not always. Do you have sort of a best practices take on whether you separate the auth logic, enforcement, and data out?
SS Yeah, it's a great question, and the short version is yes. So what we've seen is that you very much want to separate out the shared pieces. So in particular if you have that core shared authorization model of who can do what, you want that to be in a central place, and the main reason I see for that is around the customer experience that it creates. So you'll see some places, maybe they'll have each team own their own service and they'll implement authorization slightly differently, and that's when you get those kind of weird products where you go to a different screen and suddenly things work differently, or maybe the role that you had over in one place doesn't work somewhere else. And that creates a really inconsistent, bad user experience. So you're going to want to have that core role-based access control model centralized, the data around it, so what roles people have, things like that. However, what you also want, at least this is our opinion on this, is that each individual team, if they own their own services, they have that domain knowledge of that specific app and what it would mean for someone to get access and be able to do different things within it. So you want to kind of try and create this sort of balance between a shared central thing that is kind of the core structure that applies across the whole org, but then on my app, I have the specific knowledge. I'm on the project service and I know what kinds of things you can do on a project and I want to be able to control what somebody can do on a project level. I should have enough ownership that I can contribute to that central model and that I can bring my own data with it. Because if I want to go and implement a new feature, I don't want to lose control of that and need to wait for a PR, or need to wait for some other team to go and implement this authorization feature. I want to go ship my thing, I want to get it done, and I know how to do that better.
So getting that balance right is important.
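One way to picture that balance is a central registry that owns the organization-wide roles while each team registers the permissions for its own resource type. This is a sketch under stated assumptions: PolicyRegistry and its methods are invented for illustration and do not describe how Oso implements this.

```python
class PolicyRegistry:
    """Hypothetical central model: shared roles, team-contributed permissions."""

    def __init__(self, roles):
        # The core, organization-wide role names are owned centrally.
        self.roles = set(roles)
        self._grants = {}

    def register(self, resource_type: str, grants: dict) -> None:
        """Let the team that owns a resource type say what each role may do on it."""
        unknown = set(grants) - self.roles
        if unknown:
            # Teams extend the model but cannot invent their own roles.
            raise ValueError(f"unknown roles: {sorted(unknown)}")
        self._grants[resource_type] = {
            role: set(actions) for role, actions in grants.items()
        }

    def allows(self, role: str, resource_type: str, action: str) -> bool:
        return action in self._grants.get(resource_type, {}).get(role, set())


registry = PolicyRegistry(roles=["admin", "member"])

# The project team contributes its own domain knowledge without waiting on a
# central team to write the rules for it.
registry.register("project", {
    "admin": {"create", "archive", "read"},
    "member": {"read"},
})
```

The roles stay consistent across the product because they live in one place, while the project team can still ship a new action by updating its own registration.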
BP So Sam, a few weeks ago we had some folks come on the show with Cassidy Williams and myself from Cerbos, and they talked as you have about how there's just a really growing demand for this. So you're both in the same business. They have an open core model, but a lot of what you said today kind of rings true. If I'm a developer listening to this or I'm an organization who's considering how I want to handle authorization, how would you delineate between the two companies? Do you think there's a different philosophy, a different toolkit? What should people who are listening know about what the market looks like?
SS Yeah, that's a good question. I mean, there's a couple of core differences out there in how people approach those kind of logic data enforcement pieces, and a couple of core technologies that exist out there in the market and things that people have done around them. So I'll kind of go through those in turn and sort of talk about what we've done at Oso and maybe how others in the market have approached it. So starting with logic: the way that you represent who can do what inside an application, a lot of this comes down to people having something like a policy language, maybe a configuration language, or some kind of way of having a central definition of who can do what inside the application. And so for example, I believe that Cerbos has a YAML configuration template that allows you to sort of express who can do what through that. There are other people out there that are based on OPA, Open Policy Agent. They have a language called Rego that again is sort of a more general purpose policy language. It has a lot of uptake in the infrastructure world like Kubernetes, so there are folks who are building on that. And then at Oso we've basically built our own language called Polar that is a declarative language specifically built for authorization. The reason we took that approach kind of comes down to being opinionated about things. What we've seen as one of the biggest things that folks struggle with with authorization logic is where do you even begin? I have an app and I need to do some things. People don't have good mental models for what they need to go and implement, and so we've sort of provided a lot of those best practices built into the language. We have built-in primitives for things like role-based access control, for doing relationship-based access control.
Roles is like, “Admins can do these things, members can do these things.” Relationship-based access control is like, “You can edit a project if you're an admin of the organization it belongs to.” That ‘belonging to’ piece is like a relationship. Or users can edit their own data and ownership is like a relationship. You have things like attributes built in, so you can have public resources. So what we were able to do was, from probably thousands of conversations with Eng teams that we've had, boil down the common patterns we've seen and put them directly into the product to make it easier for people to use. So I think one of the big reasons people prefer us is that they can get something working faster rather than trying to figure out how to build it themselves with some of the other ones.
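The three patterns Sam lists (roles, relationships, and attributes) can be sketched in plain Python. This is deliberately not Polar syntax, and the names Project, ORG_ROLES, can_read, and can_edit, as well as the sample data, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    id: str
    org: str                 # relationship: the project belongs to this organization
    owner: str               # relationship: the user who owns this project
    is_public: bool = False  # attribute on the resource itself


# Role data: (user, org) -> role.
ORG_ROLES = {
    ("alice", "acme"): "admin",
    ("bob", "acme"): "member",
}


def can_read(user: str, project: Project) -> bool:
    # Attribute-based: anyone can read a public project.
    if project.is_public:
        return True
    # Role-based: any role in the owning organization grants read access.
    return (user, project.org) in ORG_ROLES


def can_edit(user: str, project: Project) -> bool:
    # Relationship-based: users can edit the projects they own.
    if project.owner == user:
        return True
    # Role plus relationship: admins of the organization the project
    # belongs to can edit it.
    return ORG_ROLES.get((user, project.org)) == "admin"


roadmap = Project(id="p1", org="acme", owner="bob")
handbook = Project(id="p2", org="acme", owner="alice", is_public=True)
```

Here bob can edit roadmap because he owns it, alice can edit it because she is an admin of acme, and an outside user can read handbook only because it is marked public.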
RD That seems like there's a trend, especially for small companies starting up to just offload and outsource these complicated things that are not related to the business logic. Give me some quick cloud hosting, give me some auth, give me SSO eventually. Hook me up.
SS Exactly. And honestly, we've got more to do there as well. I think we've simplified things dramatically with Polar, but we're sort of not done. So we've been implementing this workbench thing where you can sort of visually build up an authorization policy. You don't need to learn the language to begin with. Consistently, people don't want to think about the how. They kind of roughly know what features they're trying to implement. They know what their customers are trying to achieve, but they don't want to learn the details of authorization on how to achieve that. And so that's kind of what we're striving for.
BP Yeah, I think when we talked with Cerbos they mentioned Go and YAML. You mentioned those as possibilities. Y'all are in Rust, and now you have your own language. Roll something new. But right, the end user can go with the workbench, especially if they're maybe not on the Eng team.
SS Exactly.
[music plays]
BP All right, everybody. It is that time of the show. We want to shout out someone who came on Stack Overflow and helped to save some knowledge from the dustbin of history. A lifeboat badge, awarded seven hours ago to OscarRyz. Somebody was trying to solve puzzle number 15, but they were out of memory. Well, Oscar has your back with an answer. If you are listening and you enjoyed the show, I am Ben Popper, the Director of Content here. You can always follow me on Twitter @BenPopper. Email us with questions or suggestions, podcast@stackoverflow.com. And if you like what you hear, do us a favor, leave us a review or a rating. It really helps.
RD I'm Ryan Donovan. I put together the newsletter and edit the blog here at Stack Overflow. Find the blog at stackoverflow.blog. And if you want to reach out, I'm on Twitter @RThorDonovan.
SS So I'm Sam, the co-founder and CTO at Oso. I would say if you're in one of two camps you should reach out to me. If you love thinking and talking about authorization, reach out and I'll happily entertain you and talk about how Oso can solve your problems. Or if you hate authorization and would never want to touch it, you should also reach out so we can kind of help take all that stuff away from you. You can find us at osohq.com, and I'm just Sam@osohq.com, so feel free to just email me directly.
BP All right, everybody. Thanks for listening, and we will talk to you soon.
[outro music plays]