> While small microservices are certainly simpler to reason about, I worry that this pushes complexity into the interconnections between services
100% true in retrospect.
I've found a lot of bugs in software in my career and basically none of them were at a single spot in a codebase. They've all been where two remote spots (or, even more frequently, two spots in two different codebases) interact. This was true even before microservices were a thing.
There's a stat, I think quoted in "Code Complete" by McConnell, that says the number of bugs in a system strongly correlates with the number of coders. The conclusion is that as the # of coders goes up, the # of lines of communication between them grows quadratically, and it's the lines of (mis)communication that lead to bugs.
This:
1. explains Brooks' assertion that adding coders to a late project makes it later
2. emphasises the importance of clearly defining interfaces between components, interfaces being the "paths of communication" between the coders of those components.
So your assertion is well founded.
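To put rough numbers on the "lines of communication" point: with n coders there are n(n-1)/2 possible pairs. A tiny sketch (the function name is made up for illustration):

```python
def communication_lines(coders: int) -> int:
    """Number of distinct pairs among `coders` people: n * (n - 1) / 2."""
    if coders < 0:
        raise ValueError("coders must be non-negative")
    return coders * (coders - 1) // 2


for n in (2, 5, 10, 20):
    # Prints: 2 1, 5 10, 10 45, 20 190 (one pair per line)
    print(n, communication_lines(n))
```

Doubling the team from 10 to 20 roughly quadruples the pairs (45 to 190), which is the intuition behind adding people to a late project making it later.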
I mean, microservices split the code into smaller chunks, but now lots of little pieces communicate over the network, and unless you are using some form of RPC, these communication channels are not typed and there's a lot more stuff that could go wrong (packets dropped, DNS not resolving). Plus you could update one microservice and not update its dependents. I think a lot of people jumped on the hype without realising that it's a trade-off.
I work at a largish org (where microservices make sense, but there are monoliths too) and the scary bits are unowned functionality. We lean into a platform, but being a business, it isn't purely generic like, say, AWS; it knows about the business. Some features are distributed across dozens of services. It is a skill hunting down who to blame for a problem. Not blame, ask for help, of course ;)
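For what it's worth, here is a minimal sketch of what "untyped channel plus version skew" looks like in practice, and one way to fail loudly at the boundary. The `OrderCreated` message and its field names are hypothetical, not from any real service:

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderCreated:
    """Hypothetical message shape the consuming service expects."""
    order_id: str
    amount_cents: int


def parse_order_created(raw: str) -> OrderCreated:
    """Validate an untyped JSON payload at the service boundary."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")
    order_id = data.get("order_id")
    amount = data.get("amount_cents")
    if not isinstance(order_id, str):
        raise ValueError("order_id missing or not a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(amount, int) or isinstance(amount, bool):
        raise ValueError("amount_cents missing or not an integer")
    return OrderCreated(order_id=order_id, amount_cents=amount)


# A producer that was updated to send "amount" instead of "amount_cents"
# is rejected here instead of failing somewhere deep inside the consumer.
try:
    parse_order_created('{"order_id": "o-1", "amount": 1250}')
except ValueError as err:
    print("rejected:", err)
```

This doesn't make the channel typed, it just moves the failure to a place where it is easy to spot. A schema-based RPC layer does the same job with less hand-written checking.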
I dream of a SQL-like engine for distributed systems where you can declaratively say "svc A uses the results of B & C where C depends on D."
Then the engine would find the best way to resolve the graph and fetch the results. You could still add your imperative logic on top of the fetched results, but you don't concern yourself with the minutiae of resilience patterns and how to traverse the dependency graph.
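A toy version of that idea, assuming the dependencies can be written down as a plain mapping. `fake_fetch` is a stand-in for real network calls, and the standard library's `graphlib` (Python 3.9+) does the graph ordering; a real engine would also run independent calls in parallel and handle retries, which this sketch does not:

```python
from graphlib import TopologicalSorter

# Declarative part: each service maps to the services whose results it needs.
# "A uses the results of B & C where C depends on D."
DEPENDENCIES = {
    "A": {"B", "C"},
    "B": set(),
    "C": {"D"},
    "D": set(),
}


def fake_fetch(service, inputs):
    """Hypothetical stand-in for a network call; echoes what it was given."""
    args = ", ".join(inputs[dep] for dep in sorted(inputs))
    return f"{service}({args})"


def resolve(dependencies, fetch):
    """Call every service after its dependencies and collect the results."""
    results = {}
    # static_order() yields each node only after all of its predecessors.
    for service in TopologicalSorter(dependencies).static_order():
        inputs = {dep: results[dep] for dep in dependencies.get(service, ())}
        results[service] = fetch(service, inputs)
    return results


results = resolve(DEPENDENCIES, fake_fetch)
print(results["A"])  # A(B(), C(D()))
```

The imperative logic would live inside the fetch function; the traversal order falls out of the declaration.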
Isn't this a common architecture in CQRS systems?
Commands go to specific microservices with local state persisted in a small DB; queries go to a global aggregation system.
Datomic?
AI has also changed the dynamics around this. Splitting things into smaller components now has a dev advantage because AI programs better with a smaller scope.
A separated component does not necessarily mean a microservice. It could be its own process, its own module, or even just its own function, which is fine. But microservices bring their own problems.
> AI has also changed the dynamics around this. Splitting things into smaller components now has a dev advantage because AI programs better with a smaller scope.
This is not AI specific and nothing new, and it is also precisely why microservices are a good solution to some problems: they reduce a team's cognitive load (if architected properly, caveats, team topologies, etc, etc).
Well yea... but the big con of microservices is still a thing: unexpected interactions
But some of that could be mitigated I guess.
99% of systems out there are not truly microservices but SOA (fat services). A microservice is something that sends emails, transforms images, encodes video and so on. Most real services are 100x bigger than that.
Secondly, if you are not doing event sourcing from the get go, doing a distributed system is stupid beyond imagination.
When you do event sourcing, you can do CQRS and therefore have zero need for some humongous database that scales ad infinitum and costs an arm and a leg.
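For anyone who hasn't seen the two together, a minimal in-memory sketch: commands append events to a log, and queries read from a projection built out of that log. The bank-account domain and every name here are made up for illustration, and a real system would persist the log and update projections incrementally:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposited:
    account: str
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    account: str
    amount: int


class EventStore:
    """Append-only in-memory log; the events are the source of truth."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def all(self):
        return list(self._events)


def project_balances(events):
    """Query side: a read model rebuilt by replaying the event log."""
    balances = {}
    for event in events:
        sign = 1 if isinstance(event, Deposited) else -1
        balances[event.account] = balances.get(event.account, 0) + sign * event.amount
    return balances


def deposit(store, account, amount):
    """Command side: validate, then record what happened."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    store.append(Deposited(account, amount))


def withdraw(store, account, amount):
    """Command side: check the current state before recording the event."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    if project_balances(store.all()).get(account, 0) < amount:
        raise ValueError("insufficient funds")
    store.append(Withdrawn(account, amount))


store = EventStore()
deposit(store, "alice", 100)
withdraw(store, "alice", 30)
deposit(store, "bob", 50)
print(project_balances(store.all()))  # {'alice': 70, 'bob': 50}
```

The read model can be thrown away and rebuilt from the log at any time, which is what lets each query side stay small.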
A lot of this first law was specifically coupled to how these systems often hid that distributed objects were distributed. In the past 10 years, async has become far more commonplace, and it makes the distributed boundary much less like a secret special anomaly that you wouldn't otherwise deal with and far more like just another type of async code.
I still thoroughly want to see third-party handoff emerge in capnproto or capnweb, so we can do distributed systems where we tell microservice-b to use the results from microservice-a to run its compute, without needing to proxy those results through ourselves. Oh to dream.
Async fixes one problem with microservices. It does not fix the unexpected latency swings, the network timeouts and errors, the service disruptions when the microservice is unavailable, etc.
Or the mismatch between request and response when using HTTP, or the overhead of using RPCs to protect against the previous scenario, or the issue of updating one microservice and not updating all the dependents.
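On the point that async doesn't remove timeouts or unavailability: the call site still has to decide what to do when the other side is slow. A minimal sketch with `asyncio.wait_for`, where `flaky_service` is a hypothetical remote call simulated with a sleep and the fallback value is an assumption about what the caller could live with:

```python
import asyncio


async def flaky_service(delay: float) -> str:
    """Hypothetical remote call, simulated with a sleep."""
    await asyncio.sleep(delay)
    return "fresh result"


async def call_with_timeout(delay: float, timeout: float, fallback: str) -> str:
    """Await the remote call, but give up and use a fallback after `timeout`."""
    try:
        return await asyncio.wait_for(flaky_service(delay), timeout=timeout)
    except asyncio.TimeoutError:
        # The code is async either way; the failure mode is still ours to handle.
        return fallback


async def main():
    print(await call_with_timeout(0.01, 0.5, "cached result"))  # fresh result
    print(await call_with_timeout(0.5, 0.01, "cached result"))  # cached result


if __name__ == "__main__":
    asyncio.run(main())
```

The awaiting looks like any other async code, but the timeout value, the fallback, and whether to retry are all decisions that only exist because there is a network in the middle.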