The Stack Overflow Podcast

Building an API is half the battle

Episode Summary

Marco Palladino, CTO and cofounder of Kong, joins Ryan to talk about the evolution of API protocols over time and why building the API is only half the battle.

Episode Notes

If you prefer, you can read this as a Q&A article or watch the video.

Kong is a cloud-native API platform. The first iteration of an API marketplace Marco and his colleagues built was Mashape.

Developments like GraphQL and gRPC have become critical as the number of APIs increases over time.

Find Marco on LinkedIn and Twitter.

Episode Transcription

[intro music plays]

Ryan Donovan Welcome to another Stack Overflow Q&A. I'm Ryan Donovan. I edit the blog and put together the newsletter here at Stack Overflow. I'm joined today by Marco Palladino, CTO and co-founder at Kong, and we're going to talk all about APIs.

[transition music plays]

Marco Palladino Well thank you, Ryan, for having me here. When we started our journey with APIs, we started with an idea in 2010, or 2009 even, to build an API marketplace like Amazon but for APIs. We imagined a world where APIs would be the main foundation block for every application that anybody creates in the world. In 2009 that was just about to get started, and so people were asking us, “What is an API?” And so we built our first business called Mashape, which was an API marketplace. If the world runs on APIs, then we need to have a marketplace for APIs. And then that product was the beginning of the Kong journey, because the Mashape marketplace didn't work very well for us but the technology we built was very good in this new microservices and API world. We built it for ourselves and we open sourced it, so we extracted it and we pivoted into Kong as part of a transition we made in 2015.

RD That's very much ahead of the game. You must be excited about the innovations in Jamstack these days.

MP Yeah, I mean there are innovations happening pretty much across the board. Now in my space, which is the API space, what we're looking at is APIs as fundamentally running pretty much every digital experience we can think of. 83% of the world's internet traffic today runs on APIs. APIs are powering everything, as we all know, in our daily lives, in every category and every industry that we normally interact with.

[transition music plays]

MP Well, we should think of APIs as interfaces, as user interfaces, except the user is a developer. APIs therefore are good when they are easy to use, easy to understand, not convoluted, and fundamentally they provide a nice abstraction on top of the service or the data that we want to access through the APIs. The ones that are bad are the ones that don't have any of these properties. They're ugly, they're hard to use, they're inconsistent. There is no documentation whatsoever, so it's hard to consume them and it's hard to test them and debug them, and therefore they become bad APIs. But ultimately, the criterion that separates the good APIs from the bad APIs is consumption. At the end of the day, APIs are as valuable as the consumption that we're able to create on those APIs. And if these APIs are not being consumed, it doesn't matter how good the service is or the data is that's behind that API. If the API is not being consumed, that API quite frankly is useless.

RD Do you have an opinion on the various architecture styles or frameworks like the REST versus GraphQL, or even SOAP from back in the day?

MP Yeah, so it's funny to see the evolution of API protocols over time. We started with SOAP, but some in the audience may think we started earlier than that with CORBA and so on and so forth. APIs as a concept have permeated our industry since forever. Now with SOAP APIs we had the emergence of web services for the first time. SOAP APIs were notoriously hard to use, hard to consume, very verbose, and so when mobile came out in 2007-2008 as a trend, with Steve Jobs and the iPhone and all that followed, everybody needed to create mobile applications. And it turns out that consuming RESTful APIs is a much easier endeavor, and we can also leverage most of our existing knowledge and clients to be able to do that, so we don't need to have specialized SOAP clients, for example, to consume those APIs. And the problem is, as the number of APIs increases over time, it becomes very computationally and network expensive to make lots of requests to all the RESTful APIs that we have. And so we started to see new patterns emerge, like for example, GraphQL, which allows us to essentially get multiple responses from multiple APIs in one request and one response. That allows us to save on bandwidth, which is very important, especially for mobile, and also to improve latency because we're not sending 50 requests across all the APIs but only one request. And then the GraphQL gateway is going to be responsible for aggregating all those other responses. Now GraphQL is obviously one of those trends, but we're seeing a lot more.
Internally especially we're seeing adoption of gRPC, where we want to use faster protocols that do not require computationally intensive serialization and deserialization in JSON, for example. We're also seeing events being used as a way to create asynchronous microservices by propagating state changes in the data, not via a service-to-service synchronous request, but in an asynchronous event that we can store in a log collector like Kafka. So we're seeing that APIs were SOAP only for a very long time, then REST came in, and now it is many different protocols depending on the use case and the requirements that you have.
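
As an illustration of the aggregation idea Marco describes (one request in, several backend responses combined into one response out), here is a minimal Python sketch. It is not GraphQL itself and not anything shown in the episode: the backend functions, the `RESOLVERS` table, and `aggregate` are hypothetical names, and plain functions stand in for separate REST services.

```python
from typing import Callable, Dict, List

# Hypothetical backends: each function stands in for a separate REST API.
def get_user(user_id: int) -> dict:
    return {"id": user_id, "name": "Ada"}

def get_orders(user_id: int) -> list:
    return [{"order_id": 1, "total": 30}, {"order_id": 2, "total": 12}]

def get_recommendations(user_id: int) -> list:
    return ["keyboard", "monitor"]

# Maps each field a client may request to the backend that resolves it.
RESOLVERS: Dict[str, Callable[[int], object]] = {
    "user": get_user,
    "orders": get_orders,
    "recommendations": get_recommendations,
}

def aggregate(user_id: int, fields: List[str]) -> dict:
    """Resolve every requested field server-side and return one combined response."""
    unknown = [name for name in fields if name not in RESOLVERS]
    if unknown:
        raise KeyError(f"unknown fields: {unknown}")
    # The client made one request; the fan-out to backends happens here.
    return {name: RESOLVERS[name](user_id) for name in fields}

# One client request that would otherwise have been two separate calls.
response = aggregate(42, ["user", "orders"])
```

The client names only the fields it needs, so the unrequested `recommendations` backend is never called, which is where the bandwidth and latency savings mentioned above would come from.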

[transition music plays]

MP Well, building an API is half of the job. Once we do have an API we need to expose the API and govern how we want these APIs to be consumed, either internally or externally in the organization. And so there's lots of controls that we have to build in the infrastructure for our APIs that allow us to manage access to the API or revoke access to an API, monitor and capture analytics for the API, document the API, and create an onboarding flow for the API. So all of these complementary use cases are actually critical for that API to be successful. Having an API sitting somewhere does not mean that API will be successful. And this is very important at the edge where we want to expose our API to partners, to a developer ecosystem, to mobile applications. We want that whole product journey to the API to be very nice. APIs are products in a way, right? And so we have to treat them with the same life cycle that we treat every other product. How do we version them? How do we decommission them? How do we make them better? API gateways are great at this. API management is a function that allows us to productize an API, either externally or internally, and it allows us to create all these flows and highways to the consumption of the API. Now some of these APIs are going to be consumed internally within the applications themselves, so not across different applications, but within the application itself. There we don't need to have this higher-level management of the API. What we need is faster, lower-level network management of the API, and that's where service mesh comes in. With service mesh, we can remove that extra hop in the network that we would have by having a centralized ingress.
We can remove that and go from service to service via a sidecar model in such a way that we make performance much quicker because there are fewer network hops we need to do, and it allows for a more fine-grained, lower-level management of the underlying networking. This allows us to implement zero trust. It allows us to implement observability. It allows us to implement cross-data-center and cross-cloud failovers. If you experience problems in one cloud we can automatically redirect to the other cloud. Now the reality is we need both. We need to have a service mesh to create this underlying network overlay that's secure, that's reliable, that's observable, and then some of these APIs we want to expose at the edge or to another team or another application. And that's when API management comes into the picture to provide all those other capabilities. So the way I see it, these are complementary technologies.
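
The cross-cloud failover mentioned above ("if you experience problems in one cloud we can automatically redirect to the other cloud") can be sketched in a few lines of Python. This is only an illustration under stated assumptions: in a real service mesh the sidecar proxy would make this decision, and `call_with_failover`, `cloud_a`, and `cloud_b` are hypothetical stand-ins rather than any product's API.

```python
from typing import Callable, List, Sequence

class AllUpstreamsFailed(Exception):
    """Raised when every upstream in the list has been tried and failed."""

def call_with_failover(upstreams: Sequence[Callable[[], str]]) -> str:
    """Try each upstream in order and return the first successful response."""
    errors: List[Exception] = []
    for upstream in upstreams:
        try:
            return upstream()
        except ConnectionError as exc:
            # Remember the failure and fall through to the next cloud.
            errors.append(exc)
    raise AllUpstreamsFailed(errors)

# Hypothetical upstreams: cloud A is down, cloud B is healthy.
def cloud_a() -> str:
    raise ConnectionError("cloud A unreachable")

def cloud_b() -> str:
    return "200 OK from cloud B"

result = call_with_failover([cloud_a, cloud_b])
```

The calling service never sees the failure in cloud A, which is the point of handling this in shared infrastructure rather than in each application.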

[transition music plays]

MP Yeah, as a matter of fact, APIs are the biggest attack vector for pretty much every product that anybody is creating these days. Every product runs at the end of the day on top of those APIs, so APIs become a great source of problems if we do not secure them properly. Security means many things in the world of APIs. Security means securing the protocol and the underlying transport, so we want everything to have an identity and we want everything to be encrypted over a secure connection, for example HTTPS in the case of RESTful APIs. We want to secure access to the API, so we want to make sure that we can create tiers of access for those APIs. We can assign clients and consumers to these tiers in such a way that we can control who consumes the APIs, but we can also then apply specific rules to a specific tier of consumers, such as, “This type of consumer can make x number of requests per second, but this other tier cannot,” so managing the governance of the access to the APIs. And then there is a third level of security where we are looking at all the traffic that anybody's making through our APIs and trying to identify patterns that are suspicious, such as a developer trying to send random fields to an API to see if it breaks or not. Every attacker is going to be exploring the APIs and using them in ways that were not intended in order to find a vulnerability. And so being able to detect these types of traffic patterns becomes very important to identify suspicious behavior.
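
The tiered access rule described above ("this type of consumer can make x number of requests per second, but this other tier cannot") can be sketched as a small fixed-window rate limiter in Python. The tier names, limits, and the `TieredRateLimiter` class are assumptions made for illustration. This is not how Kong or any particular gateway implements rate limiting.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Hypothetical tiers: maximum requests allowed per one-second window.
TIER_LIMITS: Dict[str, int] = {"free": 2, "partner": 5}

@dataclass
class TieredRateLimiter:
    consumer_tiers: Dict[str, str]  # consumer name -> tier name
    # consumer name -> (window start in whole seconds, requests counted so far)
    _windows: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def allow(self, consumer: str, now: float) -> bool:
        """Return True if this request fits within the consumer's tier limit."""
        tier = self.consumer_tiers.get(consumer)
        if tier is None:
            return False  # unknown consumers get no access at all
        limit = TIER_LIMITS[tier]
        window = int(now)
        start, count = self._windows.get(consumer, (window, 0))
        if start != window:
            start, count = window, 0  # a new second has begun: reset the counter
        if count >= limit:
            return False
        self._windows[consumer] = (start, count + 1)
        return True

limiter = TieredRateLimiter({"alice": "free", "acme": "partner"})
# Three requests from each consumer within the same second.
free_results = [limiter.allow("alice", now=10.0) for _ in range(3)]
partner_results = [limiter.allow("acme", now=10.0) for _ in range(3)]
```

Here the third request from the free-tier consumer is rejected while the partner-tier consumer is still within its limit. Passing `now` explicitly keeps the sketch deterministic. A production limiter would read a clock and would likely use a sliding window or token bucket instead.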

[transition music plays]

MP So I've seen it all. There’s different types of APIs. There are APIs that are high frequency so there's lots of value to those APIs, but fundamentally each response is not as valuable so we can afford to lose some of that traffic because it doesn't really matter. For example, I'm sure that Twitter has lots of API requests whenever somebody wants to open a tweet or send a new tweet. It's not a big deal if somebody cannot see a tweet; they can just retry. So there is high volume but low value for each transaction, so to speak. And then there are low volume but high value transactions in certain use cases, for example, when we send a tax return using one of those tax return services. We are never going to use that app and that service throughout the year except that one time when we submit our return, and that request happens once a year for each user but it's very high value. So in my experience working with enterprise organizations and customers, Kong today is the most adopted API gateway in the world in the open source community, but we also work with great enterprise organizations around the world that are building their API infrastructure on our technology. And I'm seeing all of these use cases so it's very hard to pinpoint a specific one, but I've seen responses of gigabytes of data. So you make one request, you get gigs back, you get this huge response back. And I've seen APIs taking days to be processed because those APIs probably should have been replaced with a job queue system. And so there's pretty much everything out there.
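
For the long-running case mentioned above, where an API "probably should have been replaced with a job queue system," the usual shape is that the API accepts the work, returns a job ID right away, and lets the client check back later. The Python sketch below shows that shape in memory only. `JobQueue` and its methods are hypothetical names, and a real system would use a durable queue and separate worker processes.

```python
import itertools
from collections import deque
from typing import Any, Callable, Deque, Dict, Tuple

class JobQueue:
    """In-memory stand-in for a job queue: submit now, process later, poll for status."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._pending: Deque[Tuple[int, Any]] = deque()
        self._status: Dict[int, str] = {}
        self._results: Dict[int, Any] = {}

    def submit(self, payload: Any) -> int:
        """What the API endpoint would do: enqueue the work and return immediately."""
        job_id = next(self._ids)
        self._pending.append((job_id, payload))
        self._status[job_id] = "queued"
        return job_id

    def run_next(self, handler: Callable[[Any], Any]) -> bool:
        """What a background worker would do: process the oldest queued job."""
        if not self._pending:
            return False
        job_id, payload = self._pending.popleft()
        self._results[job_id] = handler(payload)
        self._status[job_id] = "done"
        return True

    def status(self, job_id: int) -> str:
        return self._status.get(job_id, "unknown")

    def result(self, job_id: int) -> Any:
        return self._results.get(job_id)

queue = JobQueue()
job_id = queue.submit([3, 1, 2])          # the client gets an ID back at once
status_before = queue.status(job_id)      # still queued; nothing has run yet
queue.run_next(sorted)                    # the worker picks the job up later
status_after = queue.status(job_id)
```

The request itself stays short no matter how long the processing takes, and the client polls `status` (or is notified) instead of holding a connection open for days.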

RD So for those high-value APIs, how do you ensure reliability without sort of duplicating effort?

MP It's very important to put the right API infrastructure in place. And this is why building an API is only half of the job. The other half is to make sure that these APIs are reliable. How do you make them reliable and secure? Well, we need to build that for every API that we have. And there is a series of things that have to happen to make sure that APIs are reliable. First and foremost, there is reliability intended as security, which has to be in place. Then there is reliability intended as low latency and performance: we need to be able to trace the full stack of our requests to determine where potential bottlenecks could be located in such a way that we can fix them. And then there is reliability intended as being able to measure the API response status codes and response parameters in such a way that we can detect anomalies and then act upon them. For high-value APIs that are low frequency, we're working with customers where every 500 error is an open investigation that may take two or three weeks to be resolved, because they cannot lose any API request because it would harm their reputation and the end user. And so there are different levels of reliability that we want to achieve sometimes. And then being able to also replicate our infrastructure across multiple clouds and multiple regions in such a way that we can tolerate failures that are unpredictable in the underlying infrastructure, that becomes very important. Now the thing is, when we have lots of APIs, it's very hard to think of these problems on an ad hoc basis for each one of these APIs and it becomes much easier to provide this reliable infrastructure for APIs to the whole organization in such a way that we can cater to everything that's creating APIs and not just a subset of it.
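
Measuring response status codes to detect anomalies, as described above, can be sketched as a rolling error-rate check. The Python below is a minimal illustration under stated assumptions: `ErrorRateMonitor`, its window size, and its threshold are made up for the example, and real monitoring would also look at latency and response contents.

```python
from collections import deque
from typing import Deque

class ErrorRateMonitor:
    """Tracks the last `window_size` status codes and flags a high 5xx share."""

    def __init__(self, window_size: int, threshold: float) -> None:
        # A deque with maxlen drops the oldest status automatically.
        self.statuses: Deque[int] = deque(maxlen=window_size)
        self.threshold = threshold

    def record(self, status_code: int) -> bool:
        """Record one response; return True if the 5xx share exceeds the threshold."""
        self.statuses.append(status_code)
        errors = sum(1 for status in self.statuses if 500 <= status <= 599)
        return errors / len(self.statuses) > self.threshold

# Hypothetical traffic: alert when more than 25% of the last 4 responses are 5xx.
monitor = ErrorRateMonitor(window_size=4, threshold=0.25)
alerts = [monitor.record(code) for code in [200, 200, 500, 200, 503, 200]]
```

For the low-frequency, high-value APIs Marco mentions, where every 500 is an open investigation, the threshold would effectively be zero: any single 5xx response triggers the alert.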

[transition music plays]

MP I am speaking with customers that are telling me in the next five years they're going to be creating more APIs in the organization than all the APIs they've created up until now. So what we're going to be seeing in the next 10 years is an incredible amount of scale. And scale is both exciting and frightening. Scale is exciting because it allows us to build faster and better, and this is why we're adopting APIs. APIs allow us to turn every product and every silo into a platform. There is lots of value in that because we can build products faster on top of that, and we can create ecosystems that are much more efficient, like partner ecosystems across the globe. So there is lots of business value in that scale that we're going to be creating, but there is also a requirement to have the right infrastructure in place so that that scale can be enabled in the first place. If we are not making the application teams that are building all of these APIs extremely productive whenever they ship a new API, then the application teams are going to be worrying about all these complementary concerns that they shouldn't be worrying about. That's not their job. So it's very important that as we prepare for this type of scale we make sure that the application teams are builders of APIs but not builders of infrastructure. We want them to be consumers of infrastructure and builders of APIs.

RD Was there anything else you wanted to touch on before we sign off here?

MP No, this has been a fantastic conversation. APIs are fundamentally changing and shifting the way we think of software. The way I see it, APIs are providing us the opportunity to create an assembly line of software where you pick and choose different pieces like an assembly line, and put them together to ship new applications in a better way, in a faster way. And they are fundamentally changing how we are building software in the digital world. And so thinking about APIs really is thinking about the future of the business, because without an API vision there is not going to be a business vision that is going to be successful, because that business vision has to rely on an API to be successful. So it's becoming very strategic for every organization these days.

RD Yeah, that's been a very informative, great conversation about APIs. I've been Ryan Donovan, editor of the blog here at Stack Overflow. I’m going to thank my guest, Marco Palladino, CTO and co-founder at Kong. Where can they find out more about this?

MP Konghq.com.

RD All right. Thank you very much, and we'll talk to you next time.

[outro music plays]