Microservices: Distributed Transaction Management (Part 1)

Agnivesh Verma
Jul 2, 2021

Most of the applications we have worked on or used in the past relied on a relational DB. DB transactions are supposed to abide by the ACID properties to ensure the accuracy and consistent state of data. ACID properties are fundamental in the database context, so let’s see what they mean. ACID stands for Atomicity, Consistency, Isolation and Durability.

Atomicity: It’s all or nothing. Either the complete transaction happens as a single unit or it doesn’t happen at all.

Consistency: The database should be in a consistent state before and after the transaction.

Isolation: Multiple transactions should be able to run in parallel without interfering with each other (i.e., in isolation).

Durability: Once a transaction completes successfully, its updates are durable (persistent). They are written to disk, not held ephemerally, and should survive a system failure.

In a monolithic application that uses a relational DB, it’s comparatively easy to have ACID-compliant transactions. There is a single database, and a transaction opened against that DB can be committed or rolled back based on its outcome. While this is easy to work with, it brings challenges of its own, such as resources becoming unavailable in a high-volume system. A record involved in a transaction stays locked until the transaction completes, which makes it unavailable to any other transaction that needs that record.
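As a minimal sketch of the single-database case, the example below uses Python’s built-in `sqlite3` module with an in-memory database. The `accounts` table and the `transfer`, `make_db` and `balances` helpers are hypothetical names chosen for illustration, not part of any particular application.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither."""
    try:
        # The connection context manager commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return all balances as a dict for easy inspection."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))
```

When the second transfer in a sequence fails, the debit that was already applied inside the transaction is rolled back, so the balances look as if the failed transfer never started. That is the atomicity a single database gives almost for free.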

With a Microservices architecture, which recommends SRP (Single Responsibility Principle), things become much more complex. Applying SRP to data leads to one DB per microservice, and no other service can use that DB without going through the corresponding microservice. When multiple databases are involved, a transaction is called a distributed transaction, and there is no simple way to maintain ACID properties in a distributed transaction.

There are different levels of abstraction that can be applied to conform with SRP:

Server Based: Each service has its own database server. This fits when different microservices use different DBMSs, such as Oracle and MySQL, or when a service needs very high throughput and a single instance shared among multiple services would limit performance.

Table Based: Each service has its own table(s). This fits when a database table is relevant to only one business context and doesn’t need to be duplicated for another one.

Schema Based: Each service has its own schema, and all the tables for that service belong to that schema.

The abstraction can be chosen based on the use case; however, table-based and schema-based abstractions are the ones used most often.

Transaction management in distributed systems is a real challenge, and it is part of almost every business application. Every business application provides some functionality to the end user: the user provides some input, the system performs a transaction, and depending on the functionality it either returns results to the user or persists the data for further transactions. A transaction can be thought of as a unit of work. Real-world examples include a deposit, a withdrawal, sending an email, placing an online order and making a payment.

Transaction management in a Microservice architecture is handled with the help of patterns. These patterns are not mutually exclusive: in some cases a single pattern might be enough, in others you may need a combination of several. The following strategies are commonly used to handle distributed transactions in Microservices:

1. Eventual Consistency

One strategy is to make sure that the transaction will eventually succeed, even if it takes time. The data will not be consistent at every moment, but it will become consistent within an expected time. This works well for long-lived transactions.

Let’s take the example of an online banking application. After logging in, you have probably seen a note next to your account balance that says “Balance as of MM/DD/YYYY” and explains that transactions in progress may not be reflected instantly. This is an example of eventually consistent data: the data is not consistent all the time, but it will be consistent eventually, within a reasonable time agreed in the system SLAs.

Microservices design is always a trade-off. In some cases, even monolith systems might be better based on the need.

(Note: take a look at this article if you are interested in finding out whether Microservices are the right choice for you.)

What a service does depends entirely on what you want to offer to your service consumers.

So, what is the trade-off in our example? Most of the time the displayed account balance will match the one in the DB; it differs only while a transaction is in progress. If the service had to go to the DB for the latest balance on every request, it would be slow. Instead, the service can cache the balance and refresh it at a defined interval. If the delay in seeing the updated balance is agreed between the service provider and the consumer, this approach works very well. In this example the trade-off is between performance and the availability of the most recent data.
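The caching idea can be sketched roughly as follows. This is only an illustration under assumed names: `CachedBalance`, `load_balance` and `ttl_seconds` are hypothetical, and `load_balance` stands in for whatever call actually reads the balance from the DB.

```python
import time


class CachedBalance:
    """Serve a cached balance and refresh it only after `ttl_seconds`."""

    def __init__(self, load_balance, ttl_seconds, clock=time.monotonic):
        self._load = load_balance   # hypothetical function that hits the DB
        self._ttl = ttl_seconds     # staleness agreed with the consumer
        self._clock = clock         # injectable so the sketch is testable
        self._cache = {}            # account_id -> (balance, fetched_at)

    def get(self, account_id):
        now = self._clock()
        entry = self._cache.get(account_id)
        if entry is not None and now - entry[1] < self._ttl:
            # Possibly stale, but within the agreed window.
            return entry[0]
        balance = self._load(account_id)
        self._cache[account_id] = (balance, now)
        return balance
```

A short usage example with a dictionary standing in for the DB:

```python
db = {"acc-1": 100}
service = CachedBalance(lambda account_id: db[account_id], ttl_seconds=60)
service.get("acc-1")   # reads from the "DB" and caches the value
db["acc-1"] = 40       # a transaction changes the balance
service.get("acc-1")   # still served from the cache until the TTL expires
```

The `ttl_seconds` value is where the agreed delay between provider and consumer shows up in code: a larger value means fewer DB reads and staler balances.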

2. Abort All Operations

Another strategy is to abort the whole operation when one of its transactions cannot complete. This is achieved via compensating transactions.

A typical piece of business functionality consists of a series of steps, each of which performs a transaction. With compensating transactions, every completed step is undone by a matching transaction when a later step fails. Essentially, that’s like playing the process in reverse.

This pattern can get complex quickly if the system involves many transactions, because every transaction needs a compensating transaction, and keep in mind that these compensating transactions may fail too. So, this pattern suits a low number of transactions. The compensating transactions don’t necessarily have to run in exactly the reverse order, as long as, whether they run in parallel or serially, they undo the actions taken by the original transactions and put the system back in a consistent state.

Due to the complexity of this design pattern, choose it only if it’s absolutely necessary to undo the failed transactions.

Let’s take a real-world example to understand this.

I have taken an online order as the example; remember that not all online order systems work in the same way. Here the order creation business process has 3 steps:

  1. Add item to cart.
  2. Make Payment.
  3. Update inventory to mark the item as sold.

Now, if the item is added to the cart but the payment fails, the cart needs to be updated via a compensating transaction. In the same way, if the payment is successful but the item turns out to be out of stock while the order is being finalized, the payment needs to be marked as eligible for refund via a compensating transaction.
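The three steps above and their compensating transactions can be sketched as follows. Everything here is hypothetical and in-memory: `run_with_compensation`, `place_order` and the `state` dictionary only illustrate the control flow, whereas in a real system each action would be a call to a separate service. The sketch also assumes compensations never fail, which the earlier discussion warns is not guaranteed.

```python
class StepFailed(Exception):
    """Raised by a step that cannot complete."""


def run_with_compensation(steps):
    """Run (name, action, compensation) steps; undo completed ones on failure."""
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
        except StepFailed:
            log.append(f"failed:{name}")
            # Undo the completed steps, most recent first.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"undone:{done_name}")
            return False, log
        log.append(f"done:{name}")
        completed.append((name, compensate))
    return True, log


def place_order(stock):
    """Hypothetical order flow: cart, payment, inventory."""
    state = {"cart": [], "payment": None, "stock": stock}

    def add_to_cart():
        state["cart"].append("item")

    def empty_cart():
        state["cart"].clear()

    def pay():
        state["payment"] = "paid"

    def mark_refund_eligible():
        state["payment"] = "refund_eligible"

    def update_inventory():
        if state["stock"] == 0:
            raise StepFailed("out of stock")
        state["stock"] -= 1

    def restock():
        state["stock"] += 1

    ok, log = run_with_compensation([
        ("add_to_cart", add_to_cart, empty_cart),
        ("pay", pay, mark_refund_eligible),
        ("update_inventory", update_inventory, restock),
    ])
    return ok, log, state
```

Calling `place_order(stock=0)` fails at the inventory step, so the payment is marked as eligible for refund and the cart is emptied. Calling `place_order(stock=1)` runs all three steps and needs no compensation.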

This is a simplified example for the purpose of understanding; in reality it’s not that simple, and many more transactions happen during an online order. For example, once an item is added to the cart, it may be marked in the inventory too, so even if the customer abandons the cart without ordering, the data shows how much interest there is in that particular item and can be used for further analysis. Likewise, some systems let you place the order, deduct the money, and finalize the order later based on the inventory status. In other systems the order is finalized first and the item is taken out of the inventory before the payment happens; if the payment fails, the inventory is updated again.

These compensating transactions involve business logic, and their design depends entirely on the business requirements. If compensating transactions are the chosen pattern, the Circuit Breaker design pattern is very useful alongside them. Because compensating transactions are expensive, calls to a failing service should be guarded by a fault-handling mechanism such as Circuit Breaker, which stops calling that service for a while instead of retrying blindly.
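A rough sketch of the Circuit Breaker idea is shown below. It is not taken from any specific library: the `CircuitBreaker` class, its `max_failures` and `reset_after` parameters, and `CircuitOpenError` are hypothetical names used only to show the mechanism.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures, reset_after, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self._clock = clock
        self._failures = 0
        self._opened_at = None            # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit is open; call skipped")
            # Cool-down elapsed: allow one trial call (half-open state).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Wrapping a remote step in `breaker.call(...)` means that once the downstream service has failed `max_failures` times in a row, further calls fail fast with `CircuitOpenError` until `reset_after` seconds have passed. That limits how often a doomed step is attempted and, with it, how often compensation has to run.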

There are some more patterns that are used for managing distributed transactions in a Microservice architecture. They are:

3. API Composition

4. Saga

5. CQRS

6. Event Sourcing

Now we have a fair idea of the challenges of data management in a Microservice architecture; the next step is how to deal with them with the help of these patterns. Soon I’ll share my thoughts on the rest of the patterns in detail, with some real-world examples. Stay tuned.
