Avoid Transactions over Multiple Microservices
Context
Microservices are in use or are planned to be adopted. There is an apparent need to implement transactions across microservices in order to guarantee strong consistency.
Problem
- Distributed transactions are notoriously difficult and error-prone to implement.
- Distributed transactions introduce coupling between microservices since all involved microservices need to be available for the transaction.
Solution
Avoid distributed ACID transactions spanning multiple microservices for the following reasons:
- Distributed transactions come at the price of high coupling since all microservices need to be available.
- The more participants in a distributed transaction, the lower the overall availability of the system.
- ACID systems don't perform well at large scale L59.
- Distributed transactions are notoriously difficult to implement.
- Heterogeneous data stores of microservices pose an additional hurdle to ACID transactions.
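The availability argument can be made concrete with a small calculation. The sketch below assumes that every participant fails independently and has the same availability; the 99.5% figure and the participant counts of 10 and 50 are taken from the Interview D notes under Sources of Evidence, and the function name is hypothetical. Under these assumptions the result is roughly 95% for 10 participants and roughly 78% for 50, which is in line with the rough estimates quoted in the interview.

```python
def overall_availability(per_service: float, participants: int) -> float:
    """Probability that all participants are up at the same time.

    Assumes independent failures and identical availability per service,
    which is a simplification of real systems.
    """
    return per_service ** participants


# Figures from the Interview D notes: 99.5% per system, 10 and 50 systems.
for n in (1, 10, 50):
    print(f"{n:>2} participants: {overall_availability(0.995, n):.1%}")
```

A transaction that needs all participants at once can only succeed when all of them are available, so the product of the individual availabilities is an upper bound on its success rate.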
We recommend the following techniques to avoid distributed ACID transactions over multiple microservices:
- Question the transactional nature of your domain.
- If eventual consistency is acceptable:
- use workflows or non-ACID transactions in combination with compensation workflows to realize the use case, or
- use data replication to make data available to other microservices with eventual consistency guarantees.
- If there is a need from domain perspective for strong consistency:
- reconsider the service cut to satisfy strong consistency needs.
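The combination of local transactions and compensation workflows mentioned above can be sketched as follows. This is a minimal, in-process illustration under stated assumptions, not a definitive implementation: `SagaStep`, `SagaFailed`, `run_saga`, and the web shop steps are hypothetical names, and a shared list stands in for the private database of each service. Each step is a local action paired with a domain-level compensation; when a step fails, the already completed steps are compensated in reverse order, and compensations that fail themselves are reported so that they can be detected and fixed manually.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]        # local transaction inside one service
    compensation: Callable[[], None]  # domain-level undo of that action


class SagaFailed(Exception):
    """Raised after a step failed and compensations were attempted."""

    def __init__(self, failed_step: str, uncompensated: List[str]):
        super().__init__(f"step '{failed_step}' failed; uncompensated: {uncompensated}")
        self.failed_step = failed_step
        self.uncompensated = uncompensated  # steps that need manual fixing


def run_saga(steps: List[SagaStep]) -> None:
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
        except Exception as exc:
            uncompensated: List[str] = []
            # Undo the already completed steps in reverse order.
            for done in reversed(completed):
                try:
                    done.compensation()
                except Exception:
                    uncompensated.append(done.name)
            raise SagaFailed(step.name, uncompensated) from exc
        completed.append(step)


# Hypothetical web shop flow; the list stands in for the services' databases.
log: List[str] = []


def charge_payment() -> None:
    raise RuntimeError("payment service unavailable")


order_saga = [
    SagaStep("create_order",
             lambda: log.append("order created"),
             lambda: log.append("order cancelled")),
    SagaStep("reserve_stock",
             lambda: log.append("stock reserved"),
             lambda: log.append("stock released")),
    SagaStep("charge_payment",
             charge_payment,
             lambda: log.append("payment refunded")),
]

try:
    run_saga(order_saga)
except SagaFailed as err:
    log.append(f"saga failed at {err.failed_step}")
```

In this run the payment step fails, so the stock reservation and the order are compensated in that order. Intermediate states are visible to other services until compensation has finished, which is why this approach only fits where eventual consistency is acceptable.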
Maturity
Proposed, requires evaluation.
Sources of Evidence
L14:
- decentralized responsibility for data across microservices => implications for managing updates
- traditionally transactions when updating multiple resources
- (+) consistency
- (-) coupling
- distributed transactions notoriously difficult to implement
- microservices emphasize transaction-less coordination
- accept that only eventual consistency is guaranteed
- problems are dealt with by compensating operations
L20:
- heavy distribution of data => challenging to implement distributed transactions
- even to query data due to heterogeneity of data stores
- need for techniques to simplify executing transactions/queries on distributed and heterogeneous data stores
L22:
- principles from before microservices
- "no distributed transactions"
L34:
- Transactions spanning multiple microservices: complex
- avoid distributed transactions => however, choreography
- design should target transactions spanning single microservices or involve message queue
- a non-ACID transaction type proposed for this context: compensation transactions
- further research required
L52:
- Microservices not well suited for complex and long-running transactions with data updates
- usually embrace eventual consistency within highly distributed environment
L55:
- each microservice has private database
- => difficult to implement business transactions across microservices
- => difficult to maintain data consistency across multiple databases
L59:
- monolith often based on reliable ACID transactions
- proven to be impossible to run ACID systems at very large scale
- large systems must live in a BASE world: basic availability, soft state, eventual consistency
- often get data through API calls to microservices
L61:
- data consistency and transaction managing as challenge
LN43:
- Context: comparison to previous monolith
- monolith had DB with ACID transactions, was easy
- simply had to wait for the whole transaction to complete
- now multiple services with own db
- transactions harder to handle
- more time to be spent on dealing with transactions
- rather than using transactions, microservices agree on eventual consistency of data
- changes done by other services might not persist immediately
- but eventually; when service processed message
- user cannot immediately explore the data that dependency service will create
Interview B:
- Example of web shop
- if cut: Order service, customer service, inventory service, address service, payment service, ...
- need for transaction over multiple services
- (-) complex, especially if it doesn't work
- (-) introduces a dependency: all services need to be available during the transaction's time span
- transactions over multiple distributed dbs don't work
- there are mechanisms
- but they don't work in a world where services should be independent and have own db
- Monolithic DB => transaction, but won't lead to success!
- Saga pattern usually used
- distributed domain transaction split into multiple local technical transactions
- in case of error: need for domain compensation
- in case of error of compensation: need for another compensation
- can become arbitrarily complex
- end with error in a database column that requires manual fixing
- but I need to detect the error first!
- Big problem: developers are used to thinking in transactions since they come from the monolith
- trick: keep flow within service => no transaction
- => need for transactions influenced by service cut
- Trick: question the domain functionality => is the reality transactional?
- not as transactional as we might think!
- Example: opening a bank account
- fill out many forms
- mandatory fields: without cannot go further
- example: birthdate of wife
- can't open bank account
- in reality bank account would have been opened, but without wife having access
- DDD, distribution, etc. force us to think about domain alternatives
Interview C:
- Past: don't do anything related to transactions
- Now: Saga pattern, event sourcing
- solve parts of the problem + nice additional benefits
- still hard to find root cause if errors occur
- otto.de
- where in system is eventual consistency not okay?
- customer gets delivery in 2-3 days, even with prime next day
- system has to be running really badly if order processing does not happen within 8-10 hours
- change of catalogue
- what if synced overnight? No problem!
- bank transfer
- I'm not interested in it happening consistently right now, but within 2-3 days
- 2-step mode: reserve money
- bank hopes there are not 2 reservations in too quick succession
- usually cannot use my card that fast twice to go over my limit
- the actual processing happens overnight because it is expensive (HW security modules, transactions, 2-phase commit over multiple institutions, ...)
- bank: what are the odds that something goes wrong? very low!
- how much would it cost to build system better? much more!
- => accept the risk and that's it!
- Need to get "transaction as something holy" out of heads
- otherwise end up in central database
Interview D:
- Context: consistency huge topic in microservices
- customer: everything has to be ACID over all systems => strong consistency
- he starts calculating in sense of availability (from past experience, not really calculating)
- availability of 99.5% incl. planned downtimes
- usual data centers (RZs): 99.7% on mainframe with transaction monitor up for 14h per day => effectively a guarantee of 60%
- 10 systems involved => overall availability of 95%
- 50 systems involved => overall availability of 75%
- => every fourth transaction fails; is that okay for you, customer?
- quick answer: no
- eventual consistency requires much explanation
- how much consistency is really needed?
- are there places where I need more? => please don't distribute!
- replication => eventual consistency - awareness for that!
- which systems are leading for which data? (differs from monolith which was leading for all data)
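The replication point in these notes, with exactly one leading system per piece of data, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the `CustomerService` and `OrderService` classes, the address example, and the in-memory `deque` that stands in for a message broker. The leading service writes to its own store and publishes the change; the consuming service only updates its replica when it processes the message, so reads in between return stale data.

```python
from collections import deque
from typing import Deque, Dict


class CustomerService:
    """Leading system for customer addresses (hypothetical example)."""

    def __init__(self, events: Deque[dict]):
        self.addresses: Dict[str, str] = {}
        self.events = events

    def change_address(self, customer_id: str, address: str) -> None:
        self.addresses[customer_id] = address  # local write in the own store
        # Publish the change so other services can update their copies later.
        self.events.append({"customer_id": customer_id, "address": address})


class OrderService:
    """Keeps a read-only replica of addresses; never writes to the owner."""

    def __init__(self) -> None:
        self.address_replica: Dict[str, str] = {}

    def process_pending(self, events: Deque[dict]) -> None:
        while events:
            event = events.popleft()
            self.address_replica[event["customer_id"]] = event["address"]


events: Deque[dict] = deque()  # stands in for a message broker
customers = CustomerService(events)
orders = OrderService()

customers.change_address("c1", "Main Street 1")
stale = orders.address_replica.get("c1")  # replica has not caught up yet
orders.process_pending(events)
fresh = orders.address_replica.get("c1")  # now matches the leading system
```

The window between the write and `process_pending` is the inconsistency that users and developers need to be aware of when data is replicated instead of shared.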