speculative nonfiction
Thinking about heuristics for avoiding code duplication across the stack
At the office I’ve sprinkled some glue labor on a piece of documentation that attempts a heuristics for avoiding code duplication across client and api layers. I kept seeing this type of repetition occur and it was creating too much maintenance risk in the code.
Here’s an example. Imagine a piece of code that takes inventory items from an order and derives a cost based on the order status. This calculation could be executed in the api layer, where it would be serialized into response body adjacent to the original cost data. Something like:
{
inventory_items: [
{
order_id: 1234
cost: {
major: ...,
minor: ...,
// etc...
},
cost_display: 'MXN186.67' // <== augmented data
}
]
}
This is a totally reasonable API design. However, a client developer could decide to do the calculation in the view code with a more raw serialization of the inventory item and order objects, and then merge that data:
const composeInventoryItemForDisplay = (inventoryItem) {
...inventoryItem,
costDisplay: calculateCostForDisplay(inventoryItem);
}
function calculateCostForDisplay(inventoryItem) {
let newCost;
const order = lookupOrder(inventoryItem.get('order_id'));
const currency = order.get('currency');
// subsequent processing steps...
return newCost;
}
In a software system there are many floating nodes urged toward similar ends: teams, software that’s running, software that’s being worked on, etc… It hardly uncommon for both a front and back end team to write some code to solve a similar problem – what might even be a simple calculation – while totally blind to the other side tapping away. Frequent team project reassignments, geospatial and organizational distance, depth of expertise, weak cultural value around curious coding, etc… these types of contention costs create the regrettable complicatedness, the vivant natura, that results in risky duplications. Especially risky for inconsitencies that might develop inside essential software modules like pricing computations.
Supplemental thoughts from Coda Hale’s new blog post going around:
Contention costs grow superlinearly as new members are added.
Coherence costs grow quadratically as new members are added.
Limit the number of people an individual needs to talk to in order to do their job to a constant factor.
I’m not sure I would have believed this type of thing would happen before I joined a larger company with an engineering group spread across geos; teams flung across the ownership matrix by the high winds of market shift. This investor and wall street casino shit drives tough team decisions. Lol team pet names: stunted protologisms. Like, what happened to the Squirrel team? Oh, they are the Crib Gto Pomp team now. Well, at least we are consistently circumlocutory at naming teams.
But lo and behold, such an example like the above cost calculation exists. I saw something like it in our JavaScript code. Then I asked the folks who work on backend if the frontend really needed to be doing this work. We shouldn’t. We literally shouldn’t because the computation for this existed in our backend service already. This is one reason we can’t have nice things.
So I’ve been thinking a lot about an approach to code, a kind of mental shim, to mitigate the coherence costs across the stack. Even with a fluctuating number of devs with knowledge about the code. Maybe we should just call this predicament patchy; patchy coding. Here’s what I’ve got so far with a cute story and real life examples:
Avoid duplicating code across the stack
Consider the following scenario which is based on true events.
One day Team RocketShip is working on a feature for Big Initiative X. They implement some computational logic in Ticket Availability Service which adds a new attribute on TicketClass. The code is great, it ships, customers are elated. But soon after, Team SpaceShip is tasked with another piece of Big Initiative X and needs to display the result of similar computational logic in a React app. However, Team SpaceShip doesn't know about Team RocketShip's previous implementation in TAS, and they decide they can actually write this computational logic into the client code using existing TicketClass data. The code is great, it ships, customers are elated.
Uh oh. Now we have duplicate code on the backend and frontend. Which means code that is very likely to become out-of-sync and result in disruptive maintenance work or bugs in the future. But how do we avoid this situation? Unfortunately, this is one of those social (read: people) problems of software engineering. There's no way to reliably automate prevention with static analysis of the code.
Heuristics:
There > 1 attributes returned from the API that are similarly named
Here's a real life example:
TicketClass used to have attributes for auto_hide, hide, auto_hide_before, and auto_hide_after that client code in checkout utilized to compute the ticket's "hidden" status. In other words, client code asked of these associated attributes: should this ticket be unavailable for purchase. However, creating statically derived properties from a mixture of data may be an indication the backend can do this work for you, or already has.
Ticket Availability Service already had code that computed this logic. Ultimately the frontend computation was replaced with a new attribute added to the TicketClass API.
There > 1 attributes returned from the API that are related "communicationally," in that they reference similar types of data that are computed differently.
Similar to the example above, the existence of auto_hide and hide attributes seem conceptually related – through "hide" – and can be a clue that there may be logic on the backend that could be reused to for hidden status calculation.
I need a static value that won't be impacted by user behavior after first load
Make the backend do the work
Heuristics:
Will the iOS and Android team's have to implement this too?
Always remember that the desktop website is just one place where we write code for consumers of our applications. If the feature you are building will likely end up on a device, it's likely the backend code can expose what you need in the API.
Monday January 20, 2020