Let's Make Extending API Formats Safe and Easy
It turns out, there's a pretty easy and very safe way to support API stability and change over time.
One of the common challenges in creating long-lasting, stable APIs is designing them in a way that supports easily updating them with new fields, modified responses, and other details. Getting the API design right the first time is extremely unlikely. On the flip side, getting requests to modify an existing API design is almost inevitable.
So how do we reconcile these seemingly opposing forces? It turns out, there's a pretty easy and very safe way to solve for both stability and change over time. I'll share with you a pattern that I use to address this problem for almost all my API message designs.
The type is strong with this one
Typically, API responses are designed as strongly-typed objects. For example, here's a Person API message:
{
"givenName": "John",
"familyName": "Doe",
"age": 21
}
And this object is usually described -- defined, actually -- by a schema document. Notice that I've added the "additionalProperties": false element to the schema below.
{
"$id": "https://api.example.org/person.schema.json",
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "Person",
"type": "object",
"additionalProperties": false,
"properties": {
"givenName": {
"type": "string",
"description": "The person's first name."
},
"familyName": {
"type": "string",
"description": "The person's last name."
},
"age": {
"description": "Age in years which must be equal to or greater than zero.",
"type": "integer",
"minimum": 0
}
}
}
The good news is we have a simple object and a strict validation schema. We can release this into the wild and all will be good -- until someone wants to add a new field to the object (for example, middleName). Now, we need to update the schema and update the runtime message and update the API producer code and update the API consumer code.
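To make that breakage concrete, here is a minimal sketch of what "additionalProperties": false means at runtime. The checkStrict function and the hard-coded allowed list are hypothetical stand-ins for a real JSON Schema validator; they mimic only the one rule that matters here and ignore type checks entirely.

```javascript
// Hypothetical stand-in for a JSON Schema validator. It mimics only
// "additionalProperties": false from the Person schema above.
var allowed = ["givenName", "familyName", "age"];

function checkStrict(message) {
  // collect every property the strict schema does not know about
  var unknown = Object.keys(message).filter(function (key) {
    return allowed.indexOf(key) === -1;
  });
  return { valid: unknown.length === 0, unknown: unknown };
}

var original = { givenName: "John", familyName: "Doe", age: 21 };
var updated = { givenName: "John", middleName: "Seymore", familyName: "Doe", age: 21 };

console.log(checkStrict(original)); // passes: only known properties
console.log(checkStrict(updated));  // fails: middleName is not in the schema
```

Any consumer still validating against the old schema would reject the updated message, which is why every party has to move at the same time.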
NOTE: I'll point out that we could code the API service and/or client apps to depend on the shared schema alone as a way to process and validate the messages. This is an added level of abstraction that we'll address in another post.
The problem we run into with changing strongly-typed API messages is that you depend on all the API producers and consumers supporting the change before you can safely roll it out. That's easy when your team implements all the API producers and all the API consumers. However, if your API design is used by people you don't control -- including people you will never actually meet -- this total control of strongly-typed updates is very unlikely.
Instead, what we need is a safe way to modify an existing design. One that doesn't require all the API consumers and producers to implement the change ahead of time. And one that is not likely to break any existing API applications.
And we can do that with an extension pattern.
I can haz extensions?
A common challenge when updating existing API message formats is adding new fields. The problem is that adding the field (e.g. middleName) usually means updating the schema -- unless you already accounted for that possibility.
How do you do that? By adding an extension structure to your API designs from the very start. I do this by adding an array of name/value pairs like this:
{
"givenName": "John",
"familyName": "Doe",
"age": 23,
"nvp" : [
{"name" : "middleName", "value" : "Seymore"}
]
}
Note that there is an array (nvp) that can hold one or more simple objects ({"name": "...", "value": ...}). Here's a JSON Schema document for this new Person type:
{
"$id": "https://example.com/person.schema.json",
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "Person",
"type": "object",
"additionalProperties": false,
"properties": {
"givenName": {
"type": "string",
"description": "The person's first name."
},
"familyName": {
"type": "string",
"description": "The person's last name."
},
"age": {
"description": "Age in years which must be equal to or greater than zero.",
"type": "integer",
"minimum": 0
},
"nvp" : {
"type" : "array",
"items" : { "$ref": "#/$defs/nvp"},
"description" : "List of name/value pairs."
}
},
"$defs": {
"nvp": {
"type": "object",
"required": [ "name", "value" ],
"properties": {
"name": {
"type": "string",
"description": "The name of the property."
},
"value": {
"type":["number","string","boolean","object","array", "null"],
"description": "The value of the property."
}
}
}
}
}
Now, any time you want to add a new field to the message, you don't need to update the schema. You just need to add the field to the array of name-value pairs (NVP). Notice that the above schema allows the value of an NVP to be any JSON type, including object and null.
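On the producer side, adding a field then comes down to pushing one more pair into the array. Here is a minimal sketch, assuming the array is called nvp as in the schema above. The setExtension helper is hypothetical, and so are two of its rules: the schema itself neither forbids an empty name nor requires names to be unique.

```javascript
// Hypothetical producer-side helper: add or update one name/value pair.
// The property name "nvp" matches the example schema above.
function setExtension(message, name, value) {
  if (typeof name !== "string" || name === "") {
    throw new TypeError("name must be a non-empty string");
  }
  if (value === undefined) {
    // JSON has no undefined, and the schema marks "value" as required
    throw new TypeError("value is required; use null for 'no value'");
  }
  if (!Array.isArray(message.nvp)) {
    message.nvp = [];
  }
  var existing = message.nvp.filter(function (pair) {
    return pair.name === name;
  })[0];
  if (existing) {
    existing.value = value; // assumption: keep names unique within the array
  } else {
    message.nvp.push({ name: name, value: value });
  }
  return message;
}

var person = { givenName: "John", familyName: "Doe", age: 23 };
setExtension(person, "middleName", "Seymore");
setExtension(person, "hatsize", null);
setExtension(person, "middleName", "Seymour"); // updates, does not duplicate

console.log(JSON.stringify(person.nvp));
```

The explicit properties are never touched, so the message stays valid against the schema no matter how many pairs get added.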
Take a walk on the mild side
This additional name-value pair element in my API designs gives me a "mildly-typed" message format. I know there will be a strict structure and I allow future extensions to be added safely. This is a kind of indirection. What you might call a jujitsu ("gentle art") of applying schema to a message format.
NOTE: I've seen people want to strengthen this "mildly-typed" schema by adding a collection of _possible_ values for the name-value pair. That's not a good idea. It's better to keep this part of the message loosely typed to avoid conflicts down the road.
Now, for those who consume a lot of JSON objects, this might seem like an inefficient design. As time goes on, it's likely that important data elements you need to use are "hidden" in the NVP array (called nvp in my example), making it harder to access these "additional properties."
But there's a handy way to solve that, too.
One ring to access them all
At the outset, this looks like we're in for a "two-tier" message model. The strongly-typed properties (I call them explicit properties) and the loosely-typed properties (I call them the implicit properties). It would seem to take additional effort to know whether the property you are looking for is explicit or implicit before you write code to access it. All because we have this added level of indirection through the use of an NVP in your message.
But we can use a simple accessor routine to overcome -- actually hide -- this indirection. Here's what it looks like:
//
// return a single property
// whether explicit or implicit
// args = {name:n, message:m, nameValuePair:p}
//
function find(args) {
  var a = args || {};
  var n = a.name || "";
  var m = a.message || {};
  var p = a.nameValuePair || "nvp";
  var r = undefined;

  // nothing to look up without a property name
  if (n === "") {
    return r;
  }

  if (Object.prototype.hasOwnProperty.call(m, n)) {
    // explicit property: part of the message structure
    r = m[n];
  }
  else if (Array.isArray(m[p])) {
    // implicit property: first matching entry in the NVP array
    var hit = m[p].filter(function(i) { return i && i.name === n; })[0];
    r = hit ? hit.value : undefined;
  }
  return r;
}
Now I can write simple property requests like this:
console.log(find({name:"givenName", message:person}));
console.log(find({name:"middleName", message:person}));
Even better, I can add new elements into the NVP space without worrying that client apps will break. For example:
{
"givenName": "John",
"familyName": "Doe",
"age": 21,
"nvp" : [
{"name" : "hatsize", "value" : null},
{"name" : "middleName", "value" : "Seymore"},
{"name" : "nicknames", "value" : ["J","JJ","Johnboy","Jack"]},
{"name" : "address", "value": {"street":"123 main", "city": "byteville",
"state": "MD", "zip": "12345"}}
]
}
Now, whether the target of my search is an explicit property or an implicit property, I can use the same routine (find()) and the results will always be deterministic -- even if the result is undefined.
But there's more...
One more thing...
Yes, this is an added bit of structure in your objects and in your code. But the payoff is more than a universal way to access properties in a typed object. As a message consumer, I am now using a module that lets me ignore whether the property I am looking for is an explicit field (part of the message structure) or an implicit field (a member of the extended NVP). That means the following two messages are -- as far as I am concerned -- equivalent.
Here is a message where the middleName property is part of the NVP:
{
"givenName": "John",
"familyName": "Doe",
"age": 21,
"nvp" : [
{"name" : "middleName", "value" : "Seymore"}
]
}
And here is a message where the middleName property is part of the message structure:
{
"givenName": "John",
"middleName": "Seymore",
"familyName": "Doe",
"age": 21
}
Now, when I use this line of code in my message consumer:
console.log(find({name:"middleName", message:person}));
I'll get the same reply for each of the example messages.
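A quick way to check that claim is to run the same lookups against both messages and compare the answers. The sketch below uses a compact restatement of the accessor so that it runs on its own; it assumes the array is named nvp, and the variable names implicitPerson and explicitPerson are made up for this example.

```javascript
// Compact restatement of the accessor idea: explicit properties first,
// then the name/value pair array (assumed to be called "nvp").
function find(args) {
  var m = args.message || {};
  var n = args.name || "";
  if (n === "") { return undefined; }
  if (Object.prototype.hasOwnProperty.call(m, n)) {
    return m[n];
  }
  var pairs = Array.isArray(m.nvp) ? m.nvp : [];
  var hit = pairs.filter(function (pair) { return pair.name === n; })[0];
  return hit ? hit.value : undefined;
}

var implicitPerson = {
  givenName: "John", familyName: "Doe", age: 21,
  nvp: [{ name: "middleName", value: "Seymore" }]
};
var explicitPerson = {
  givenName: "John", middleName: "Seymore", familyName: "Doe", age: 21
};

// "hatsize" is in neither message, so both lookups come back undefined
["givenName", "middleName", "familyName", "age", "hatsize"].forEach(function (name) {
  var a = find({ name: name, message: implicitPerson });
  var b = find({ name: name, message: explicitPerson });
  console.log(name, a === b ? "same" : "different", a);
});
```

One design choice to be aware of: in this sketch an explicit property wins if the same name also appears in the NVP array. That ordering is what lets a field move from the extension space into the message structure later without consumers noticing.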
This provides an added level of abstraction between message consumers and message producers. That means minor changes in the layout of messages are unlikely to cause breakage or confusion when you are passing them back and forth between machines. It also offers the possibility of supporting multiple schemas at runtime -- another way to ensure compatibility for future changes.