Saga Pattern | How to Implement Business Transactions Using Microservices – Part II

In the previous post, we saw some of the challenges of implementing distributed transactions and how to implement the Saga pattern using the Event/Choreography approach. In this article, let’s talk about how to address some of that approach’s problems, such as complex transactions and cyclic dependencies between events, by using another type of Saga implementation called Command/Orchestration.

Saga’s Command/Orchestration Sequencing Logic

In the orchestration approach, we define a new service with the sole responsibility of telling each participant what to do and when. The saga pattern orchestrator communicates with each service in a command/reply style telling them what operation should be performed.

Let’s see what it looks like using our previous e-commerce example:

Order Service saves a pending order and asks Order Saga Orchestrator (OSO) to start a create order transaction.
OSO sends an Execute Payment command to Payment Service, which replies with a Payment Executed message.
OSO sends a Prepare Order command to Stock Service, which replies with an Order Prepared message.
OSO sends a Deliver Order command to Delivery Service, which replies with an Order Delivered message.

In the case above, Order Saga Orchestrator knows the flow needed to execute a “create order” transaction. If anything fails, it is also responsible for coordinating the rollback by sending commands to each participant to undo the previous operation.
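
The flow above can be sketched as a minimal in-process orchestrator. This is only an illustration under assumptions, not the implementation behind this series: the participants are plain Python callables standing in for services that would normally sit behind a message broker, and names such as `OrderSagaOrchestrator` and `create_order` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A participant takes the order data and returns the name of its reply message.
Handler = Callable[[dict], str]


class OrderSagaOrchestrator:
    """Sends each command in order and checks the reply before moving on."""

    def __init__(self, participants: Dict[str, Handler]):
        self.participants = participants
        # (command, expected reply) pairs for the "create order" transaction.
        self.steps: List[Tuple[str, str]] = [
            ("ExecutePayment", "PaymentExecuted"),
            ("PrepareOrder", "OrderPrepared"),
            ("DeliverOrder", "OrderDelivered"),
        ]

    def create_order(self, order: dict) -> List[str]:
        replies = []
        for command, expected_reply in self.steps:
            reply = self.participants[command](order)
            replies.append(reply)
            if reply != expected_reply:
                break  # an unexpected reply stops the flow; rollback is out of scope here
        return replies


# Hypothetical stand-ins for Payment, Stock and Delivery services.
participants = {
    "ExecutePayment": lambda order: "PaymentExecuted",
    "PrepareOrder": lambda order: "OrderPrepared",
    "DeliverOrder": lambda order: "OrderDelivered",
}

oso = OrderSagaOrchestrator(participants)
happy_path_replies = oso.create_order({"order_id": "order-1"})
```

Note that only the orchestrator knows the order of the steps; each participant just handles one command and replies.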

A standard way to model a saga orchestrator is a State Machine where each transition corresponds to a command or message. State machines are an excellent pattern for structuring well-defined behavior, as they are easy to implement and particularly easy to test.
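
As a rough sketch of that idea, the orchestrator can be reduced to a transition table keyed by the current state and the reply received. The state names and the `on_reply` function are assumptions made for this example; only the command and reply names come from the e-commerce flow.

```python
from enum import Enum, auto
from typing import Dict, Optional, Tuple


class State(Enum):
    AWAITING_PAYMENT = auto()
    AWAITING_STOCK = auto()
    AWAITING_DELIVERY = auto()
    COMPLETED = auto()


# (current state, reply received) -> (next state, next command to send or None)
TRANSITIONS: Dict[Tuple[State, str], Tuple[State, Optional[str]]] = {
    (State.AWAITING_PAYMENT, "PaymentExecuted"): (State.AWAITING_STOCK, "PrepareOrder"),
    (State.AWAITING_STOCK, "OrderPrepared"): (State.AWAITING_DELIVERY, "DeliverOrder"),
    (State.AWAITING_DELIVERY, "OrderDelivered"): (State.COMPLETED, None),
}


def on_reply(state: State, reply: str) -> Tuple[State, Optional[str]]:
    """Return the next state and command, or raise if the reply is not expected."""
    try:
        return TRANSITIONS[(state, reply)]
    except KeyError:
        raise ValueError(f"unexpected reply {reply!r} in state {state.name}") from None


# Drive the machine through the happy path and record the commands it emits.
state = State.AWAITING_PAYMENT
sent = []
for reply in ["PaymentExecuted", "OrderPrepared", "OrderDelivered"]:
    state, command = on_reply(state, reply)
    if command is not None:
        sent.append(command)
```

Because the whole behavior lives in one table, a test can simply feed replies to `on_reply` and assert on the resulting state, with no broker or services involved.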

Rolling Back in Saga’s Command/Orchestration

Rollbacks are a lot easier when you have an orchestrator to coordinate everything:

Stock Service replies to OSO with an Out-Of-Stock message.
OSO recognizes that the transaction has failed and starts the rollback.
In this case, only a single operation was executed successfully before the failure, so OSO sends a Refund Client command to Payment Service and sets the order state to failed.
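
The rollback above can be sketched by pairing each step with a compensating command and undoing the completed steps in reverse order. This is a simplified illustration: `run_saga` and `fake_send` are hypothetical, and the `ReleaseStock` command and `ClientRefunded` reply are names invented for this example (only Refund Client appears in the flow above).

```python
from typing import Callable, List, Optional, Tuple

# (command, expected reply, compensating command or None)
STEPS: List[Tuple[str, str, Optional[str]]] = [
    ("ExecutePayment", "PaymentExecuted", "RefundClient"),
    ("PrepareOrder", "OrderPrepared", "ReleaseStock"),  # hypothetical compensation
    ("DeliverOrder", "OrderDelivered", None),
]


def run_saga(send: Callable[[str], str]) -> Tuple[str, List[str]]:
    """Run the steps; on failure, compensate the completed steps in reverse order.

    Returns the final order state and the compensating commands that were sent.
    """
    completed: List[str] = []  # compensations for the steps that succeeded
    for command, expected_reply, compensation in STEPS:
        reply = send(command)
        if reply != expected_reply:
            undo = list(reversed(completed))
            for compensating_command in undo:
                send(compensating_command)
            return "FAILED", undo
        if compensation is not None:
            completed.append(compensation)
    return "COMPLETED", []


def fake_send(command: str) -> str:
    """Stand-in for the message broker: Stock Service is out of stock."""
    replies = {
        "ExecutePayment": "PaymentExecuted",
        "PrepareOrder": "OutOfStock",
        "RefundClient": "ClientRefunded",
    }
    return replies[command]


order_state, compensations = run_saga(fake_send)
```

Here only the payment succeeded before the failure, so the single compensation sent is `RefundClient` and the order ends up in the failed state.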

Benefits and Drawbacks of Using Saga’s Command/Orchestration Design

Orchestration-based sagas have a variety of benefits:

Avoid cyclic dependencies between services, as the saga orchestrator invokes the saga participants but the participants do not invoke the orchestrator
Centralize the orchestration of the distributed transaction
Reduce participants’ complexity, as they only need to execute commands and reply
Easier to implement and test
The transaction complexity remains linear when new steps are added
Rollbacks are easier to manage
If a second transaction wants to change the same target object, you can easily put it on hold in the orchestrator until the first transaction ends.
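
To illustrate the last point, here is one possible way an orchestrator could hold back a second transaction on the same target object. It is an in-memory sketch only; the `SagaGate` class and its methods are hypothetical, and a real orchestrator would need to persist this state.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SagaGate:
    """Allows one running saga per target object and queues the others."""

    def __init__(self) -> None:
        self.active: Dict[str, str] = {}  # target id -> running saga id
        self.waiting: Dict[str, Deque[str]] = defaultdict(deque)

    def request(self, target_id: str, saga_id: str) -> bool:
        """Return True if the saga may start now, False if it was put on hold."""
        if target_id not in self.active:
            self.active[target_id] = saga_id
            return True
        self.waiting[target_id].append(saga_id)
        return False

    def finish(self, target_id: str) -> Optional[str]:
        """Mark the running saga as done and return the next saga to start, if any."""
        del self.active[target_id]
        if self.waiting[target_id]:
            next_saga = self.waiting[target_id].popleft()
            self.active[target_id] = next_saga
            return next_saga
        return None


gate = SagaGate()
first_started = gate.request("order-1", "saga-a")   # starts immediately
second_started = gate.request("order-1", "saga-b")  # held until saga-a ends
resumed = gate.finish("order-1")                    # saga-a ends, saga-b may start
```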

However, this approach still has some drawbacks. One of them is the risk of concentrating too much logic in the orchestrator and ending up with an architecture where a smart orchestrator tells dumb services what to do.

Another downside of orchestration-based sagas is that they slightly increase your infrastructure complexity, as you need to manage an extra service.

Saga Pattern Tips

Create a Unique Id per Transaction

Having a unique identifier for each transaction is a common technique for traceability, but it also gives participants a standard way to request data from each other. By using a transaction Id, for instance, Delivery Service could ask Stock Service where to pick up the products and double-check with Payment Service whether the order was paid.
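
A minimal sketch of this tip, assuming a UUID is an acceptable transaction Id: every command of the same saga carries the same Id, and a participant can use it as a lookup key. The `Command` class, `is_paid` function and the in-memory payment store are hypothetical.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Command:
    name: str
    transaction_id: str
    payload: dict = field(default_factory=dict)


def start_transaction() -> str:
    """Generate one Id that every message of this saga will carry."""
    return str(uuid.uuid4())


transaction_id = start_transaction()
commands = [
    Command(name, transaction_id)
    for name in ("ExecutePayment", "PrepareOrder", "DeliverOrder")
]

# Hypothetical Payment Service store, keyed by the transaction Id.
payments_by_transaction = {transaction_id: "PAID"}


def is_paid(tx_id: str) -> bool:
    """What Delivery Service could ask Payment Service using only the Id."""
    return payments_by_transaction.get(tx_id) == "PAID"
```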

Add the Reply Address Within the Command

Instead of designing your participants to reply to a fixed address, consider sending the reply address within the message. This way, you enable your participants to reply to multiple orchestrators.
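
As a sketch of this tip, the command below carries a `reply_to` field and the participant answers to whatever address it finds there. The queue names, the `reply_to` field name and the in-memory `queues` dictionary (standing in for a real broker) are all assumptions for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, List


@dataclass
class Command:
    name: str
    transaction_id: str
    reply_to: str  # address of the queue that expects the reply


# In-memory stand-in for broker queues: queue name -> messages.
queues: DefaultDict[str, List[dict]] = defaultdict(list)


def payment_service_handle(command: Command) -> None:
    """Execute the payment and reply to the address given in the command."""
    reply = {"type": "PaymentExecuted", "transaction_id": command.transaction_id}
    queues[command.reply_to].append(reply)


# Two different orchestrators reuse the same participant.
payment_service_handle(Command("ExecutePayment", "tx-1", reply_to="order-saga-replies"))
payment_service_handle(Command("ExecutePayment", "tx-2", reply_to="subscription-saga-replies"))
```

Payment Service has no knowledge of either orchestrator; it only echoes the reply to the address it was given.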

Idempotent Operations

If you are using queues for communication between services (like SQS, Kafka, RabbitMQ, etc.), I recommend making your operations idempotent. Most of these queues might deliver the same message more than once.

It might also increase the fault tolerance of your service. Quite often, a bug in a client triggers or replays unwanted messages and messes up your database.
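
One common way to make a handler idempotent is to remember which transaction Ids were already processed and return the stored reply for duplicates. The sketch below keeps that record in memory; the `PaymentService` class and its fields are hypothetical, and in a real service the record and the charge would need to be saved in the same local transaction.

```python
from typing import Dict


class PaymentService:
    """Charges each transaction at most once, even if the command is redelivered."""

    def __init__(self) -> None:
        self.total_charged = 0
        self.processed: Dict[str, str] = {}  # transaction id -> reply already sent

    def execute_payment(self, transaction_id: str, amount: int) -> str:
        # Duplicate delivery: return the earlier reply without charging again.
        if transaction_id in self.processed:
            return self.processed[transaction_id]
        self.total_charged += amount
        reply = "PaymentExecuted"
        self.processed[transaction_id] = reply
        return reply


service = PaymentService()
first_reply = service.execute_payment("tx-1", 50)
second_reply = service.execute_payment("tx-1", 50)  # same message delivered twice
```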

Avoiding Synchronous Communications

As the transaction progresses, don’t forget to add to the message all the data needed for each operation to be executed. The goal is to avoid synchronous calls between the services just to request more data. This enables your services to execute their local transactions even when other services are offline.

The downside is that your orchestrator will be slightly more complex as you will need to manipulate the requests/responses of each step, so be aware of the tradeoffs.
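
A small sketch of what this means for the orchestrator: it combines the data it already holds with the data from the previous reply, so the next participant does not need to call anyone. The field names (`items`, `payment_id`) and the `next_command` function are assumptions for illustration.

```python
def next_command(context: dict, reply: dict) -> dict:
    """Build the Prepare Order command from data the orchestrator already holds."""
    return {
        "type": "PrepareOrder",
        "transaction_id": context["transaction_id"],
        "items": context["items"],           # known since the order was created
        "payment_id": reply["payment_id"],   # taken from the Payment Executed reply
    }


context = {"transaction_id": "tx-1", "items": [{"sku": "book-42", "quantity": 2}]}
reply = {"type": "PaymentExecuted", "transaction_id": "tx-1", "payment_id": "pay-7"}
command = next_command(context, reply)
```

Stock Service can now prepare the order from the command alone, at the cost of the orchestrator having to know which fields each step needs.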

If you have any questions, feel free to ask me at @deniswsrosa

Denis Rosa, Developer Advocate, Couchbase

Author

Denis Rosa is a Developer Advocate for Couchbase and lives in Munich, Germany. He has solid experience as a software engineer and is fluent in Java, Python, Scala and JavaScript. Denis likes to write about search, Big Data, AI, Microservices and everything else that helps developers build beautiful, fast, stable and scalable apps.

4 Comments

JuliaCarmen January 16, 2018 at 1:59 am

Very good!

Tom Stachura July 27, 2018 at 11:42 am

Is there an implementation?

Bertil Muth July 29, 2019 at 6:03 am

Here’s an alternative to a state machine as implementation: https://github.com/bertilmuth/requirementsascode. I’m glad if you let me know what you think.
  
mrLSD November 15, 2020 at 9:19 am

Hello, in your visualization you used a message broker and channels. So it looks like the Saga publishes to a broker channel, and the Payment service, for example, subscribes to that channel and receives the message. After that, it should send the result to another channel for the Saga. So it’s a PUB/SUB MQ. If the Payment service is offline, the Saga has to provide some timeout logic for the response. So why not use REQ/REP (the request/response MQ model) for that case, or a direct request to the Payment service? Then we would know the response and the service availability immediately.

