Immutable vs mutable schemas in APIs
How mutability affects reliability of APIs when consuming historical data.
For some time, I have been pondering how the nature of data, whether mutable or immutable, provided by an API, changes how consumers can build solutions on top of the API.
In this post, I'm focusing on API consumers that do not contribute data. I suggest that such API consumers will prefer an immutable schema API, meaning that the same request will provide the same data when requested at a different date and/or time regardless of preceding events in the system of the API.
Let's explore this characteristic with an example of two order systems: Impala, which has immutable characteristics, and Mantee, which has mutable characteristics. Both systems have public APIs that expose commerce order data.
Both APIs provide the endpoint below, and each returns orders in its own single-order schema, outlined after it:
api/orders?startDate=XXX&endDate=YYY
The endpoint would return an array or paging object, but let's omit that to avoid a bunch of boilerplate.
Mantee Order Schema
{
"orderId": 0,
"orderItems" : [{ ... } ],
"payments" : [ { ... } ]
}
Impala Order Schema
{
"orderId": 0,
"events" : [
{
"items": [{...} ],
"payment" : { ... }
}
]
}
Note that the immutable schema in this post is only as detailed as it needs to be to explain the concept.
Scenario
The Mantee and Impala APIs are consumed by a consumer that tries to sum up how much VAT was accumulated on each day.
At 10:30 on the 11th of January, a customer pays for an order of 100 pebbles and 100 books.
At 9:30 on the 12th of January, the customer decides that 10 of the pebbles are not needed, so a refund with a quantity of 10 is made.
At 18:01 on the 13th of January, the customer notes that 10 of the books were not needed either, so 10 books are refunded.
For this scenario, we will now explore what the order data looks like when making the same request after each of the events above.
Requests at 11:00 on 11th
Mantee
GET api/orders?startDate=2024-01-11T00:00:00
{
"orderId": 39,
"orderItems" : [
{
"quantity": 100,
"vat": 0.25,
"amount": 33.92,
"description": "pebble"
},
{
"quantity": 100,
"vat": 0.06,
"amount": 40.00,
"description": "books"
}
],
"payments" : [
{
"id": "xyz",
"timestamp": "2024-01-11T10:33:46Z",
"amount": 8480
}
],
"updated": "2024-01-11T10:33:52Z"
}
Impala
GET api/orders?startDate=2024-01-11T00:00:00
{
"orderId": 39,
"events" : [
{
"type": "purchase",
"timestamp": "2024-01-11T10:33:46Z",
"items":
[
{
"quantity": 100,
"vat": 0.25,
"amount": 33.92,
"description": "pebble"
},
{
"quantity": 100,
"vat": 0.06,
"amount": 40,
"description": "books"
}
],
"payment" :
{
"id": "xyz",
"amount": 8480
}
}
],
"updated": "2024-01-11T10:33:52Z"
}
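Before moving on, it helps to see how the numbers in this first response relate to each other. The sketch below is my reading of the sample data, not something either API documents: I assume that "amount" is the unit price excluding VAT, that "vat" is the VAT rate, and that the payment amount is the gross total. The helper names gross and vat_part are made up for illustration.

```python
from decimal import Decimal

# Order items copied from the first response. Assumption: "amount" is the
# unit price excluding VAT and "vat" is the VAT rate.
items = [
    {"quantity": 100, "vat": Decimal("0.25"), "amount": Decimal("33.92"), "description": "pebble"},
    {"quantity": 100, "vat": Decimal("0.06"), "amount": Decimal("40.00"), "description": "books"},
]


def gross(item):
    """Line total including VAT."""
    return item["quantity"] * item["amount"] * (1 + item["vat"])


def vat_part(item):
    """VAT portion of the line total."""
    return item["quantity"] * item["amount"] * item["vat"]


total_gross = sum(gross(i) for i in items)   # 4240 + 4240 = 8480, the payment amount
total_vat = sum(vat_part(i) for i in items)  # 848 + 240 = 1088
print(total_gross, total_vat)
```

Under these assumptions both lines come to 4240 including VAT, which adds up to the payment of 8480. The VAT accumulated on the 11th is 1088.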
Requests at 10:00 on 12th
Mantee
GET api/orders?startDate=2024-01-11T00:00:00
{
"orderId": 39,
"orderItems" : [
{
"quantity": 100,
"vat": 0.25,
"amount": 33.92,
"description": "pebble"
},
{
"quantity": 100,
"vat": 0.06,
"amount": 40,
"description": "books"
},
{
"quantity": -10,
"vat": 0.25,
"amount": 33.92,
"description": "pebble"
}
],
"payments" : [
{
"id": "xyz",
"timestamp": "2024-01-11T10:33:46Z",
"amount": 8480
},
{
"id": "abc",
"timestamp": "2024-01-12T09:29:46Z",
"amount": -424
}
],
"updated": "2024-01-11T10:33:52Z"
}
Impala
GET api/orders?startDate=2024-01-11T00:00:00
{
"orderId": 39,
"events" : [
{
"type": "purchase",
"timestamp": "2024-01-11T10:33:46Z",
"items":
[
{
"quantity": 100,
"vat": 0.25,
"amount": 33.92,
"description": "pebble"
},
{
"quantity": 100,
"vat": 0.06,
"amount": 40,
"description": "books"
}
],
"payment" :
{
"id": "xyz",
"amount": 8480
}
},
{
"type": "refund",
"timestamp": "2024-01-12T09:29:46Z",
"items":
[
{
"quantity": -10,
"vat": 0.25,
"amount": 33.92,
"description": "pebble"
}
],
"payment" :
{
"id": "abc",
"amount": -424
}
}
],
"updated": "2024-01-11T10:33:52Z"
}
Requests at 19:00 on 13th
Mantee
GET api/orders?startDate=2024-01-11T00:00:00
{
"orderId": 39,
"orderItems" : [
{
"quantity": 100,
"vat": 0.25,
"amount": 33.92,
"description": "pebble"
},
{
"quantity": 100,
"vat": 0.06,
"amount": 40,
"description": "books"
},
{
"quantity": -10,
"vat": 0.25,
"amount": 33.92,
"description": "pebble"
},
{
"quantity": -10,
"vat": 0.06,
"amount": 40,
"description": "books"
}
],
"payments" : [
{
"id": "xyz",
"timestamp": "2024-01-11T10:33:46Z",
"amount": 8480
},
{
"id": "abc",
"timestamp": "2024-01-12T09:29:46Z",
"amount": -424
},
{
"id": "def",
"timestamp": "2024-01-13T18:01:46Z",
"amount": -424
}
],
"updated": "2024-01-11T10:33:52Z"
}
Impala
GET api/orders?startDate=2024-01-11T00:00:00
{
"orderId": 39,
"events" : [
{
"type": "purchase",
"timestamp": "2024-01-11T10:33:46Z",
"items":
[
{
"quantity": 100,
"vat": 0.25,
"amount": 33.92,
"description": "pebble"
},
{
"quantity": 100,
"vat": 0.06,
"amount": 40,
"description": "books"
}
],
"payment" :
{
"id": "xyz",
"amount": 8480
}
},
{
"type": "refund",
"timestamp": "2024-01-12T09:29:46Z",
"items":
[
{
"quantity": -10,
"vat": 0.25,
"amount": 33.92,
"description": "pebble"
}
],
"payment" :
{
"id": "abc",
"amount": -424
}
},
{
"type": "refund",
"timestamp": "2024-01-13T18:01:46Z",
"items":
[
{
"quantity": -10,
"vat": 0.06,
"amount": 40,
"description": "books"
}
],
"payment" :
{
"id": "def",
"amount": -424
}
}
],
"updated": "2024-01-11T10:33:52Z"
}
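One way to read the three Impala responses is that the events list only ever grows: each later response starts with exactly the events of the earlier one. Here is a minimal sketch of that check, with each event abbreviated to its type and timestamp. The function name is_append_only is my own and not part of either API.

```python
def is_append_only(earlier, later):
    """Return True if `later` starts with exactly the events in `earlier`."""
    return later[:len(earlier)] == earlier


# Events abbreviated to (type, timestamp) pairs, as seen in the three responses.
events_11th = [("purchase", "2024-01-11T10:33:46Z")]
events_12th = events_11th + [("refund", "2024-01-12T09:29:46Z")]
events_13th = events_12th + [("refund", "2024-01-13T18:01:46Z")]

print(is_append_only(events_11th, events_12th))  # True
print(is_append_only(events_12th, events_13th))  # True
```

A consumer that stored the response from the 11th can therefore pick up on the 13th and only needs to process the events it has not seen yet.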
Summing up VAT
When consuming the Mantee API, we can match payments with order items as long as each event is processed when it occurs. Even though the consumer has to correlate the order items with the payments itself, this works well under normal circumstances, when we are not interested in data from the past.
Notice what happens if we try to sum up VAT for each day after the last refund. Since there is only one positive payment, we can match it with the positive order items and get the correct result. For the refunds abc and def we try to do the same, but unfortunately both payments have the same amount (-424), so we cannot tell which refunded item, and therefore which VAT rate, belongs to which day.
Identical amounts are only one way the correlation becomes unreliable. A large number of possible order item combinations, or complex use of discounts, can make the correlation unreliable as well.
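To make the ambiguity concrete, here is a small sketch that tries to match each Mantee refund payment to the negative order items by amount. It rests on my assumption that "amount" on an item is the unit price excluding VAT and that a payment carries the gross total. The helpers gross and candidates are hypothetical names.

```python
from decimal import Decimal

# Negative order items and refund payments as in the last Mantee response.
refund_items = [
    {"quantity": -10, "vat": Decimal("0.25"), "amount": Decimal("33.92"), "description": "pebble"},
    {"quantity": -10, "vat": Decimal("0.06"), "amount": Decimal("40"), "description": "books"},
]
refund_payments = [
    {"id": "abc", "timestamp": "2024-01-12T09:29:46Z", "amount": Decimal("-424")},
    {"id": "def", "timestamp": "2024-01-13T18:01:46Z", "amount": Decimal("-424")},
]


def gross(item):
    """Line total including VAT (assumed to be what a payment covers)."""
    return item["quantity"] * item["amount"] * (1 + item["vat"])


def candidates(payment, items):
    """Descriptions of all items whose gross total equals the payment amount."""
    return [i["description"] for i in items if gross(i) == payment["amount"]]


matches = {p["id"]: candidates(p, refund_items) for p in refund_payments}
print(matches)  # {'abc': ['pebble', 'books'], 'def': ['pebble', 'books']}
```

Both payments match both items. Depending on which pairing we pick, the VAT for the 12th is either -84.8 (pebbles at 25%) or -24 (books at 6%), and the data gives us no way to decide.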
With Impala (the immutable schema), payments and items are grouped into events. For this reason, it makes no difference whether we process events as they occur or long after the fact.
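As a rough sketch of what the consumer could do with the Impala data, the snippet below sums VAT per day directly from the events of the scenario (the purchase on the 11th, the pebble refund on the 12th and the book refund on the 13th). The assumptions are mine: "amount" is the unit price excluding VAT, the day is taken from the UTC date part of the timestamp, and vat_per_day is a made-up helper name.

```python
from collections import defaultdict
from decimal import Decimal

# Events of the scenario, reduced to the fields needed for the VAT sum.
events = [
    {"type": "purchase", "timestamp": "2024-01-11T10:33:46Z", "items": [
        {"quantity": 100, "vat": Decimal("0.25"), "amount": Decimal("33.92")},
        {"quantity": 100, "vat": Decimal("0.06"), "amount": Decimal("40")},
    ]},
    {"type": "refund", "timestamp": "2024-01-12T09:29:46Z", "items": [
        {"quantity": -10, "vat": Decimal("0.25"), "amount": Decimal("33.92")},
    ]},
    {"type": "refund", "timestamp": "2024-01-13T18:01:46Z", "items": [
        {"quantity": -10, "vat": Decimal("0.06"), "amount": Decimal("40")},
    ]},
]


def vat_per_day(events):
    """Sum the VAT of every item, keyed by the date part of the event timestamp."""
    totals = defaultdict(Decimal)
    for event in events:
        day = event["timestamp"][:10]  # "YYYY-MM-DD" prefix of the ISO timestamp
        for item in event["items"]:
            totals[day] += item["quantity"] * item["amount"] * item["vat"]
    return dict(totals)


for day, vat in vat_per_day(events).items():
    print(day, vat)
```

No correlation step is needed, because each event already carries both the items and the time at which they were paid or refunded. The result is 1088 for the 11th, -84.8 for the 12th and -24 for the 13th, whether the events are processed as they occur or all at once later.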
Conclusion
Mutable APIs lack reliable data handling for any consumer working with historical data. This is most prominent when the consumer works with data that was produced before it started consuming the mutable API, but it is not the only case where a mutable API lacks the reliability that an immutable API provides. Imagine a consumer that normally handles the example scenario by receiving a webhook on every purchase, and that becomes unresponsive and cannot handle incoming webhooks. After the outage, it has to rely on risky correlation to catch up. With a growing number of integrations, the likelihood of being unable to correlate correctly increases.