What makes a good API? | endjin

Carmel Eve Matthew Adams 25th September 2020

API Specification Conference 2020

When you delve into GraphQL, OData, Hydra and pals, you see a lot of very clever, highly generalized, very "self-describing" API design patterns.

But sometimes we lose sight of the fact that, above all, we want people to be able to consume our APIs, to derive business value. Quickly. Effectively. With a minimum of fuss.

In this session we are going to talk about APIs from the consumer's point of view. What is "easy" to understand for the average developer? Where do common patterns create obstacles? Or bugs? How can we keep it simple?

We're going to look at some popular APIs and analyse them for ease of understanding, and look at the kind of code that's generated by popular client tooling, to see which patterns work well, and which get in the way.

At the end of the session we'll walk away with some API design patterns that are less "clever", but much simpler. And therefore more beautiful.

Transcript

Matthew Adams: So hello everyone, and welcome to our breakout session on what makes a good API. My name is Matthew Adams and I'm one of the co-founders of endjin, who are a UK-based Microsoft Gold Partner cloud consultancy. We help small teams do big things, and I'm joined by my colleague Carmel. She is a recently graduated apprentice from the endjin apprenticeship program and is now a fully fledged engineer. I'll leave you to derive your own pun in the spelling of engineer. So this breakout talk is all about what makes a good API, and its subtitle is "just don't make things difficult", which is really my definition of a good API. But, and I'm sorry, if we can get this disappointment out of the way... Oops. What was happening there? To get this disappointment out of the way right at the beginning: don't expect any answers to that question in this talk.

This is really all about stimulating discussion and challenging some of the assumptions that we make about API design. Let's start. How do we define good? Does it mean popular? Does it mean widespread? Does it mean standardized? Documented? Lots of users? What came to mind when you saw this session title and thought about a good API? Maybe we need an example. Here's one. This is an API: these are the filtering-by-rules operations from the Twitter API. I think Carmel's got the documentation for this on their website, if she pops that up.

Carmel Eve: Yeah, so here's the documentation for the Twitter API. You can see that there's a lot of documentation. You've got quick starts to help you with various things.

You've got a full API reference. And also I think the really interesting bit about this API is they've documented everything in the context of their use cases. So if you go in here, you can see a tutorial for specific use cases that they thought of for this API, and then they've driven all of the documentation from there, which I think is a pretty interesting way to do things.

But yeah, so there's a lot of documentation about it, but I guess is it good?

Matthew Adams: Excellent question. It's leading you through a sort of pit of success. But yeah, as you say, is that necessarily good? I dunno, let's look at another one. This one is the list users operation in the Microsoft Graph API. How similar does that look to the Twitter one, Carmel?

Carmel Eve: Yeah, so this is, as Matthew's saying, the documentation for the list users API, and again, it's pretty extensive. You've got all of the requests that you can make and the responses documented. And the interesting thing about this one is that it's all based off of OData specifications. So you've got a full list of the query parameters that you can supply in order to get different results back from the API, which follows a standardized pattern. So I think that's, yeah, obviously a lot of documentation. It's very understandable from the outset what to expect when you supply certain parameters.
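As a rough illustration of the standardized query parameters mentioned here, the sketch below builds a list users request URL using three OData system query options ($select, $filter and $top). The helper name build_list_users_url and the example field names are assumptions made for illustration, not something shown in the talk, and the sketch only builds the URL; it does not authenticate or call the service.

```python
from urllib.parse import urlencode

# Endpoint for the Microsoft Graph "list users" operation.
BASE_URL = "https://graph.microsoft.com/v1.0/users"


def build_list_users_url(select=None, filter_expr=None, top=None):
    """Build a list-users URL from a few OData system query options.

    Hypothetical helper for illustration only.
    """
    params = {}
    if select:
        # $select narrows the properties returned for each user.
        params["$select"] = ",".join(select)
    if filter_expr:
        # $filter restricts which users are returned.
        params["$filter"] = filter_expr
    if top is not None:
        # $top caps the number of results in one response.
        params["$top"] = str(top)
    if not params:
        return BASE_URL
    # Keep "$" and "," readable; other reserved characters are percent-encoded.
    return BASE_URL + "?" + urlencode(params, safe="$,")


print(build_list_users_url(select=["displayName", "mail"], top=5))
```

The same three option names would apply to any other OData-style collection, which is the standardization being described: a consumer who has learned them once can guess how to shape a query elsewhere.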

Matthew Adams: Isn't that great? Yeah. Is that good though, Carmel? I don't know. Is it? No, exactly. Of course, the answer is you don't know. They might be. Because what might be a good API for me now to solve my problem might not be a good API for you tomorrow to solve your similar problem. So when designing an API or reviewing an existing API, I think there are four main factors that you need to consider.

Now you could pick any four, but these are the four that I'm going to talk about today: audience, time, scope and technology. So what do I mean by that? Audience is who's meant to be using this API, and also who's maintaining it, and maybe others as well. And there's time: when are we expecting people to be using this? You might be wondering exactly what I mean by that, and we'll definitely look at that in some more detail. Then there's scope. What's the scope of the API? Is it narrow in function and capability, or is it part of something larger? Finally there's technology. What technologies are we expecting consumers to be using? And on what technologies are we implementing the API?

Probably the easiest one to start with is audience. So we've seen from that Twitter documentation that their approach is very use case driven. And that's definitely one way to start thinking about the audience. What are they trying to do? But it is only a small part of it. In addition to their use case, you need to think about their skills and their experience. What do they like doing? What are their preferred ways of working? Do they have any prejudices or pet hates? And how do you know? This is standard persona development stuff, but recontextualized for APIs, and the best way to understand all of this is literally just to talk to users.

That's always the best way to develop personas: talk to real users. And it doesn't all have to be face to face communication. Look at other APIs that your users consume. Is there anything common about them? Is there anything dissimilar? The nice thing about APIs is also that they're intrinsically machine driven. Provided you are careful about sensitive data (and bearing in mind that could be literally anything; throughput stats, for example, are often commercially sensitive), you can ask really interesting questions about the API telemetry.

Which is another point. Do you, in fact, gather telemetry about the way people use your APIs and if not, why not? Do you publish aggregated telemetry about how your users use your APIs and solicit feedback on those insights from the users themselves, perhaps about the aggregate as a whole or their own individual usage?

Can you group your users together into patterns of behaviours beyond just their use cases? I think Carmel's got a really interesting example, which is explorers and map readers.

Carmel Eve: Yeah. So I guess you can partition these people up into two groups. You've got explorers, who are very data driven and really enjoy the experimentation.

So that's making a request on an API or an operation, not knowing exactly what you'll get back, and experimenting with the different things you can do, discovering new behaviors and things that you perhaps hadn't thought about before. And then on the other hand, you've got map readers, who are incredibly goal driven, who like to know exactly what's gonna happen when they make a request.

They want to basically take example code and put that into their solution, and know exactly what's gonna happen whenever they hit an endpoint. And this means they want full documentation of what's gonna happen; they hate documentation discrepancies. And so you have these people falling into these two camps. So I think Matthew has a bold statement to make.

Matthew Adams: Yeah, so I, I would say that there are a lot more map readers out there in our industry than there are explorers. But equally, I would contend that a lot more API designers are explorers rather than map readers. And what does that tell us about the kind of APIs we are likely to design and whether they're suitable for the larger audience of consumers?

I'm not gonna make any bolder assertions than that, but I think it's definitely something worth thinking about. But your audience is not just the users. There are countless others involved in the exploitation of your APIs. How does your API look from all those perspectives, which do you care about, and why or why not? Is an API good if business decision makers aren't able to make business decisions, for example? What might interest them? Might they be interested in cost per transaction, or total cost of ownership, or non-functional aspects, or standards and interoperability, or competing teams or choices that they have, or even other demands for their attention?

What is it that's going to make your API easy to...? And think about documenting your APIs beyond the technical and beyond the use case. Look at things like cost of deployment in common scenarios, and expected support requirements for those deploying and those consuming the API. Make it easy for people to understand the impact of adopting or maintaining the API. And again, talk widely. Don't focus too narrowly on the purely technical, even if only to consciously dismiss those areas as...

Okay. That brings us to the next topic, which is time. Remember, an API that is good for me today may not be good for me tomorrow. An API I need today, I may not need tomorrow. Would you design a custom API that you're going to stand up and use for a one-time data migration from a partner, that needs to be available tomorrow or yesterday, in the same way as you would design a general purpose API for onboarding a large customer base?

And if you wouldn't, why wouldn't you? And would one be good just because it was more general purpose, and one bad, or...? There's only one truism when it comes to time, and that's that the API has to be available to do what is needed when it's needed. A beautiful API a day too late, or one that's being maintained beyond its useful life, is always a bad API.

That's definitive. So an important aspect of this time factor is the natural life cycle of the API. APIs cost money to deploy and maintain, even if they're unused or underused. You always need to audit your API portfolio and build in effective monitoring procedures and processes just to manage their life cycle. Designing an API is not just about designing the endpoints, but the support and governance regime around them too. How do you know when an API is no longer performing efficiently, or cost effectively, or securely? How will you version and change it? How will you manage its end of life? How will you migrate users between versions, or when the API is deprecated and perhaps another is recommended in its place?

Who owns this audit and the overall portfolio of APIs? Is it the same as the owner of the API? And if not, why not? All these things need to be considered as part of the design process, right from the outset. You also need to think about the consumer's perspective on the life cycle. How often do they need to update their clients as the API evolves? Is your goal outward stability and extension with internal improvements? Or are you moving fast and breaking things? Always remembering that those things you're breaking are usually your...

And are you aware when you transition from one to the other? That's really a question of API maturity. If an API has not withered and died and remains useful, it does gradually mature, and a mature API usually changes little and its clients quietly depend on it. A good mature API still evolves, as we have discussed, but it does so slowly and in a carefully considered manner.

Backwards compatibility is important when many clients or critical systems depend on it just working as it's always done. And in some industries that might even be a critical regulatory requirement. The problem is that a lot of those mature APIs were built originally by move-fast-and-break-things teams. That's how it accelerated its evolution towards being a useful entity in the ecosystem. But the traits of a team that was good at moving fast are not necessarily the traits of a team good at predictable evolution. You need to have the right people involved at the right time. The critical moment comes when the API shifts imperceptibly from being seen by its clients as young and frisky to mature and...

The process by which we manage the change and the evolution of the API needs to change, and sometimes even aspects of the API itself, in order to support that change. For example, you might need to add the health probes and telemetry that you got away without while it was in its youth, now that people are expecting it to just work without error. That might be harder if the mature API has become large in scope, because APIs always seem to get larger, not smaller.

So, scope. How much does your API do? Is it the size of the Microsoft Graph API, or is it the HMRC Hello API? If you want to bring that up, Carmel, for people who haven't seen the HMRC Hello API. So HMRC, for those that don't know, is the government revenue service in the UK. And they have a hello world API.

Everybody should have a hello world API like this. It's great. It's got three endpoints on it: one that shows you an unrestricted endpoint, one that shows you an endpoint restricted by user, and one that shows you an endpoint restricted by application. And they're three of the core horizontal patterns for accessing APIs in the HMRC portfolio. Having endpoints like this, with the simplest possible way of accessing them, is genuinely a really great thing to do when you're building your APIs.

That way consumers, developers, can determine whether they're basically talking to it in the correct way for the common access patterns. And lots of APIs start out as simple as the Hello API and become the Microsoft Graph API over time. One of those is the Microsoft Graph API. It started out really simple and is now quite large. So what happens? Smaller services aggregate and become more interdependent, until eventually you add some kind of instant messaging type operations to your API surface area and, you know, you've reached enterprise API critical mass.

There are really four areas to look at, then, when you are considering the scope of an API, and we'll start with the easy one, which is use cases. Can you enumerate the end user use cases that are intended to be supported by the API? And does that look like a manageable amount of work for any one API team? If not, can you draw boundaries between those use cases and deliver the distinct subsets of the API? If so, you should consider doing so, or at least documenting it in that way. Similarly, how tightly coupled are the operations in the APIs? Which ones must deploy or version together, and which can be separated?

This helps you understand the natural boundaries within the API itself. You might call these services. Are they too big? Are they too small? Are they just right? Because there's no rule for what "just right" is: microservices, mini services, macro services. But you need to know it when you see it in your context. What constitutes good for...? And finally there's the overall surface area of the API. How many operations are there? How many knobs and dials on those operations? One area of scope creep in an API comes from what is supposed to be a good thing: listening to user feedback and implementing their suggestions.

And with that, APIs tend to go one of two ways over time. Almost always it's based on the addition of complexity and how we manage the addition of complexity, because people want to provide every way a client has ever imagined wanting to access the service. And within that, the complexity may manifest in different ways. The first is the apparently simple API that actually provides a highly generalized tool. We've seen one of those already: an OData API, which could come into that highly generalized category. GraphQL is another popular example of that approach. It provides a general purpose query language over some domain.

I think Carmel can pop up the URL for that site. There we go. So yeah, it's a query language for your API. Actually, it's an API as a query language, right? You expose this query language out through your APIs, and it's designed to let you be very highly generalized. This "evolve your API without versions" thing: actually, it defines an approach to versioning, which is essentially a free...

And is that a good or a bad thing? It's up to you to decide. The advantage is that it provides a standardized way of delivering some horizontal functionality. The danger with this approach is that you become a prisoner of that generality. You try to be as general as possible and lose any connection to real use cases. On the other hand, there are simple, single-purpose APIs. They're easy to understand and target one particular precise use. The challenge is that they either proliferate and become a complex mass of many slightly different operations, where it's hard to understand exactly which one services your exact use case, or they start to gather more and more properties and parameters, levers and dials, and head back towards being general purpose, although often worse than the general purpose case, because they've not been designed systematically.

They're just a mass of buttons: one of those classic developer settings forms with 3000 fields, all of which seem to say the same thing. Once again, there is no generally good choice, but you need to understand where you are on this spectrum and in which direction you're heading. One common anti-pattern, one common, definitely bad scenario, is to be the worst of both worlds. You have lots of APIs that started out specialized and have all become more and more general. So they overlap in functionality, they are overloaded with properties and parameters, and there's no clear way through the system for any use.

So try and spot that happening early and stop it, because it's one of those boiling-the-water kind of things: you don't really notice what's been going on until it's too late. So, we've got one last thing to talk about, and that is technology. You'll notice that we haven't really talked about technology at all here.

Once, this would've been a key part of a talk like this. Depending on your tech stack, you'd be strongly guided down one API design path or another. Is my platform of choice RPC-based, or object remoting, or message pipes? Modern client and server technologies are pretty flexible in the types of API patterns they support.

And I'm guessing the vast majority of people here are designing APIs implemented over HTTP. Not all; I'm sure there's at least someone implementing something very different on an IoT device. But there has been a lot of technology convergence. That doesn't mean that technology doesn't impinge on good API design, in particular on the types of client that we support.

As we said already, a good API needs to fit with the client's needs and expectations. One of the nice things about modern API design is that we could document our design using something like OpenAPI and just generate clients for all manner of languages and operating systems. That would give us the widest possible reach. Or we could carefully handcraft clients for selected use cases, in code idiomatic to the platform concerned. That would give the best experience. Now Microsoft's approach to designing clients is interesting in this regard. They usually generate a low-level REST API for the platform and then build idiomatic clients for .NET and JavaScript and Python and Java, usually over that lower-level API.

It's a lot more work, but it gives a better overall experience. And they can version the server APIs independently of the client APIs, interestingly, because they've got this sort of adapter. The deeper point behind this part of the discussion is that when we design, we have to think of our API as it extends out into the client space, not just as it's exposed from the server.

For example, we could design beautiful APIs that include non-polymorphic types that might be represented as a discriminated union. Great for clients in JavaScript or F#, say, but not so good for C# developers, who don't yet have an efficient implementation of that.

Our APIs might be beautifully expressive if you can cheaply and easily reflect over the responses to discover their shape and content (bearing in mind the differences between explorers and map readers we talked about earlier on), but if our client platform imposes a serialization layer on us, say, that might be brittle or expensive to operate. You may be able to mitigate this by handcrafting clients, as we mentioned on the previous slide, but you may equally have to balance economy of expression in the server against elegance of expression in the client. Finally, at least in this session, we come to the thorny question of designing for non-functional requirements.

A good API for smaller data sets of, say, a million or so records becomes a bad API over billions of purchase orders, the limitation being largely technological. Perhaps what you can page into memory on the client is the constraint, or the bandwidth of the client's connection to the service, for example.

A common mistake, of course, as we all know, is to design the API as if it's never going to see more than a dozen records. Layering in paging or continuation tokens after the fact is always a pain. And interestingly, there, we have the question of whether paging or continuation tokens are good or bad. Again, your mileage may vary, and some of that may be directed by your implementation technologies under the covers, because your data store may support a continuation token model rather than a paging model.

And so it might be easier to follow one path than another. Conversely, another common mistake is to design and test the API as if it has to support billions of records, whereas in fact you'll only ever face a few thousand in your real world scenarios. That leads to excessive cost of implementation and maintenance, and actually often poor performance in the smaller-numbers-of-records cases.
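To make the paging versus continuation token distinction above a little more concrete, here is a minimal sketch of both styles over a small in-memory list. Everything in it is an assumption for illustration: the RECORDS data, the function names get_page, get_batch and read_all, and the choice of a base64-encoded JSON token. A real service would typically let the data store produce and interpret the token.

```python
import base64
import json

# Hypothetical data set standing in for the server's store.
RECORDS = [{"id": i, "item": f"order-{i}"} for i in range(1, 11)]


def get_page(page, page_size):
    """Offset paging: the client asks for page N of a fixed size."""
    start = (page - 1) * page_size
    return RECORDS[start:start + page_size]


def get_batch(token=None, limit=3):
    """Continuation tokens: the server returns an opaque 'resume here' token."""
    if token:
        last_id = json.loads(base64.urlsafe_b64decode(token))["last_id"]
    else:
        last_id = 0
    batch = [r for r in RECORDS if r["id"] > last_id][:limit]
    next_token = None
    # Only hand out a token when there is more data after this batch.
    if batch and batch[-1]["id"] < RECORDS[-1]["id"]:
        payload = json.dumps({"last_id": batch[-1]["id"]}).encode()
        next_token = base64.urlsafe_b64encode(payload).decode()
    return batch, next_token


def read_all(limit=3):
    """Client loop: keep following tokens until the server stops issuing them."""
    items, token = [], None
    while True:
        batch, token = get_batch(token, limit)
        items.extend(batch)
        if token is None:
            return items


print([r["id"] for r in get_page(2, 3)])
print(len(read_all()))
```

With paging, the client decides where to jump and needs to know the page arithmetic; with tokens, the client can only move forward but never needs to understand how the server tracks its position, which is why the underlying data store often nudges the design one way or the other.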

So again, good and bad are very context sensitive. So what's a kicking-off point for discussion from this session? As we say, good and bad are entirely context sensitive. You need to define what you mean by good or bad in all of these areas, and others that you might think of or prioritize as more important, before you finalize your initial API design.

And you need to reevaluate this throughout the life cycle of the API. You need to understand that life cycle and consider it from the outset. You need to think beyond the service boundary and out into the clients and the users to be able to do this. So thanks very much for joining us. I hope this kicks off a good debate.

We'll be around now online. This was past us pre-recording this talk, but current, present us are, I hope, providing we haven't had any major incidents, online right now, and we'd love to chat to you about this and start to kick off this discussion from here. I'm Matthew Adams.

My colleague Carmel is here too. Thanks very much. And we look forward to talking to you. Thanks very much.