Implementing a Reliability Pattern for Unreliable Transport

June 12, 2015
Appno Blogger

Appnovation Coop

I recently had a non-functional requirement to implement retry mechanisms in every flow that accesses an external system. In this blog, I walk through the steps to implement a reliability pattern using the Until Successful scope for an unreliable transport such as HTTP.

  • The project had the following constraints:
    1. Mule 3.4 EE must be used
    2. The Workday connector could not be used because custom functionality was required from Workday; a custom report was created instead
    3. Orchestrated flows needed to be implemented synchronously


  • The integration orchestrated the following steps in an application called Workday Report Adapter:
    1. A request is sent to the Workday report (which is internally exposed as a REST service) every 4 hours.
    2. The request is sent through an HTTP outbound endpoint; for every interval, the path of the outbound endpoint is created dynamically with different query parameters.
    3. The response from the Workday report is in JSON format, and its payload is transformed into data transfer objects. Business rules are applied to these data transfer objects, after which they are divided into 4 lists, where each list holds the objects related to a specific event.
    4. The objects in each event list are split by object id, and each object goes through the following:
      1. Two attributes of each data transfer object are used to update the corresponding canonical domain object in a GemFire cache
      2. Each object is transformed into a JSON payload, which includes the objectId and event type, and is then sent to a RabbitMQ exchange


For clarity, and to keep the focus on the main topic, the implementation of step 2 is referred to as subflow2, and the implementation of steps 3 and 4 as subflow3. In the following code snippets, subflow2 therefore stands for an abstract implementation of step 2, and subflow3 for an abstract implementation of steps 3 and 4.

The following diagram illustrates touch points that Workday Report Adapter is accessing:


As the above diagram shows, there are three touch points: Workday Report, RabbitMQ and GemFire. Workday Report Adapter uses the RabbitMQ and GemFire connectors, so retrying transient failures for those two is handled by the connectors' own retry mechanisms. For the remaining touch point, Workday Report Adapter uses the unreliable HTTP outbound transport. To make the HTTP outbound transport resilient to transient failures, I needed to use the Until Successful scope. However, before Mule 3.5, Until Successful's processing always occurs asynchronously from the main flow, and this is not the behaviour Workday Report Adapter expects.

In the code snippet below, the code works perfectly well if we remove until-successful, that is, without a retry mechanism. With until-successful, however, the code does not work as one would expect. The reason is that until-successful's processing occurs asynchronously from the main flow, on a different thread than the main thread. As a result, the https:outbound-endpoint gets called without a path while subflow2 is still being executed. subflow2 creates the dynamic path and assigns it to a flow variable, which is used to set the payload in line 38. In Mule 3.5 and later, this behaviour can be changed by setting the synchronous attribute on until-successful to "true". Before Mule 3.5, however, this was not possible.


In the following code snippets, the above code (line 28 to line 57) is refactored so that Until Successful can keep processing asynchronously while providing the same behaviour as if it had been implemented synchronously. To do that, subflow0 and subflow1 are introduced, along with a one-way VM endpoint that decouples setting the Workday report path from the rest of the processing steps, as shown in lines 39 and 43. This decoupling guarantees that the thread in the main flow first executes subflow2 to create the dynamic path and then dispatches the path to the VM outbound endpoint in line 39. Once the dynamic path is received by the vm:inbound-endpoint in line 43, subflow1 is executed asynchronously by a different thread under the until-successful scope. In subflow1, the https:outbound-endpoint is executed first, followed by subflow3. Since the https:outbound-endpoint message processor and subflow3 are both executed under the until-successful scope, any transient failure, whether a SocketException, SocketTimeoutException or ConnectException, or an http.status other than 200, is retried 3 times with an interval of 60 seconds.
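As a rough illustration of this arrangement, the sketch below shows one way the decoupling might be wired in Mule 3.4 XML. It is not the original listing, so the line numbers cited above do not map onto it. The VM path, the names of the two enclosing flows, the flowVars.reportPath variable, the Workday host and the in-memory object store bean are all assumptions made for the example, and namespace declarations, the 4-hour trigger and connector configuration are left out.

```xml
<!-- Sketch only: every name except subflow0..subflow3 is hypothetical. -->

<!-- Asynchronous until-successful needs an object store to hold pending events.
     An in-memory store is assumed here; a persistent one may suit production better. -->
<spring:beans>
    <spring:bean id="objectStore" class="org.mule.util.store.SimpleMemoryObjectStore"/>
</spring:beans>

<!-- Main flow: the trigger that fires every 4 hours is omitted. -->
<flow name="workdayReportMainFlow">
    <flow-ref name="subflow0"/>
</flow>

<!-- subflow0: build the dynamic path on the main thread, then hand it off. -->
<sub-flow name="subflow0">
    <!-- subflow2 is assumed to store the path in flowVars.reportPath -->
    <flow-ref name="subflow2"/>
    <set-payload value="#[flowVars.reportPath]"/>
    <vm:outbound-endpoint exchange-pattern="one-way" path="workday.report.path"/>
</sub-flow>

<!-- Receiving flow: the path arrives as the payload, already fully built,
     before until-successful starts its asynchronous processing. -->
<flow name="workdayReportRetryFlow">
    <vm:inbound-endpoint exchange-pattern="one-way" path="workday.report.path"/>
    <!-- Exceptions thrown inside the scope trigger a retry. The failureExpression
         (an assumption about how the non-200 check could be written) also treats
         any other HTTP status as a failure. -->
    <until-successful objectStore-ref="objectStore"
                      maxRetries="3"
                      secondsBetweenRetries="60"
                      failureExpression="#[message.inboundProperties['http.status'] != 200]">
        <flow-ref name="subflow1"/>
    </until-successful>
</flow>

<!-- subflow1: call the Workday report, then run steps 3 and 4. -->
<sub-flow name="subflow1">
    <https:outbound-endpoint exchange-pattern="request-response"
                             host="workday.example.com" port="443"
                             path="#[payload]" method="GET"/>
    <flow-ref name="subflow3"/>
</sub-flow>
```

The point of the hand-off is ordering: the message only reaches the VM queue after subflow2 has finished, so the retried call never sees an empty path, whichever thread until-successful runs on.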


In the above example, any transient exception resulting from the execution of subflow1 is retried 3 times, with an interval of 60 seconds between retries. Although this solution was introduced because the until-successful scope is processed asynchronously in Mule 3.4, this implementation of until-successful turned out to fit the overall project's requirements better.


In this blog, we have looked at how to use until-successful in asynchronous mode to implement a reliability pattern around the unreliable HTTP transport and so increase the resilience of a MuleSoft integration solution. As of Mule 3.5, this behaviour can be changed by setting the synchronous attribute on the until-successful scope to "true", so that it is processed synchronously as well. However, using until-successful in asynchronous mode is still a very valuable approach for many use cases.
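For comparison, here is a minimal sketch of what the synchronous variant might look like on Mule 3.5 or later. The host and the flowVars.reportPath variable are hypothetical, and the retry settings simply mirror the 3 retries and 60-second interval used in this blog. As far as I understand, no object store is configured in synchronous mode, because the retries run on the calling thread.

```xml
<!-- Sketch only, assuming Mule 3.5+; host and variable names are hypothetical. -->
<flow name="workdayReportSyncFlow">
    <!-- The path is built first, on the same thread that performs the call. -->
    <flow-ref name="subflow2"/>

    <!-- synchronous="true" keeps the retries on the calling thread,
         so the flow waits until the scope succeeds or gives up. -->
    <until-successful maxRetries="3"
                      secondsBetweenRetries="60"
                      synchronous="true">
        <processor-chain>
            <https:outbound-endpoint exchange-pattern="request-response"
                                     host="workday.example.com" port="443"
                                     path="#[flowVars.reportPath]" method="GET"/>
            <flow-ref name="subflow3"/>
        </processor-chain>
    </until-successful>
</flow>
```

The trade-off is that the calling thread stays blocked for the whole retry window, which is one reason the asynchronous hand-off can still be the better fit.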