Caching is one of the most effective design strategies for improving the latency and performance of an API. An API that takes a long time to respond can cause cascading delays and timeouts in the consuming applications. Longer processing times also consume and block more server resources during high-volume API calls, which may end in server crashes or deadlocks. Low latency is therefore one of the most desirable qualities of an API implementation, and it is imperative to adopt the right design strategy to improve the API response time as much as possible.
Caching can be a very effective technique for improving API response time. However, care must be taken to use it only in appropriate scenarios, in order to avoid functional errors resulting from stale data. In this blog, I will discuss some guidelines and scenarios where caching can be used effectively while developing APIs in Mule.
Typically, in APIs, caching can be used effectively for data such as:
- Static data which rarely changes over time (e.g. static lookup data)
- Data that changes only at known intervals (e.g. data from a back-end data source system that allows only daily data updates; while working on integration projects with various client organizations, I frequently come across internal data systems that allow only daily/nightly batch updates)
- Fixed-lifetime data (e.g. access tokens that expire after a fixed time interval)
Note that the data has to be non-consumable in order to be cached in Mule.
It is important to note that caching pays off only when the API that caches the data is invoked frequently within the cached data's lifespan (Time To Live). The more cache hits, the better the overall performance (average API response time).
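To make the effect of the hit ratio concrete, here is a back-of-the-envelope sketch. The function name and the latency figures (5 ms for a hit, 400 ms for a full back-end call) are illustrative assumptions, not measurements from a Mule deployment.

```python
def average_response_time(hit_ratio: float, hit_ms: float, miss_ms: float) -> float:
    """Expected response time when a fraction `hit_ratio` of calls is served from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


# Hypothetical numbers: 5 ms for a cache hit, 400 ms for a cache miss.
for ratio in (0.0, 0.5, 0.9):
    print(f"hit ratio {ratio:.0%}: {average_response_time(ratio, 5, 400):.1f} ms")
```

With these assumed figures, the average only drops substantially once most calls within the TTL window are hits, which is why an API that is rarely called with the same input gains little from caching.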
Also, a word of caution: every caching implementation has to be accompanied by a suitable cache-invalidation approach, in order to avoid the API returning stale data to its consumers. Functional errors in a production environment due to stale data are very difficult to catch and debug, as they leave no trace in the form of errors in the application logs.
In most cases, setting the right TTL (Time To Live) period is good enough to handle timely cache invalidation. In a few cases, when it is critical to avoid stale data in the cache at any cost, you might need to devise custom cache-invalidation logic that is triggered externally as soon as any cached data changes in the source system.
There are multiple approaches that can be used in Mule to implement caching for APIs, depending on what data you decide to cache.
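The two invalidation styles mentioned above, TTL expiry and an externally triggered invalidation, can be sketched in a few lines. This is a minimal in-memory illustration, not how Mule implements it; the class name `TtlCache`, its methods and the sample key are all hypothetical.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TtlCache:
    """In-memory cache whose entries expire after `ttl_seconds` (illustrative only)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # TTL elapsed: drop the stale entry and report a miss.
            del self._entries[key]
            return None
        return value

    def invalidate(self, key: str) -> None:
        """Meant to be called by an external trigger when the source data changes."""
        self._entries.pop(key, None)


# Usage with a fake clock (hypothetical key and values).
now = [0.0]
cache = TtlCache(ttl_seconds=60, clock=lambda: now[0])
cache.put("country-codes", ["IN", "US"])
fresh = cache.get("country-codes")      # still within the TTL
now[0] = 61.0
expired = cache.get("country-codes")    # TTL elapsed, treated as a miss
```

TTL expiry bounds how long stale data can survive; `invalidate` is the hook for the stricter case where the source system signals a change.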
The two prominent approaches are:
- Caching the whole API response using “HTTP Caching” policy in API Manager
- Caching a specific back-end response within the API code using “Cache Scope” (We'll discuss this in Part 2!)
Approach #1: API response caching using the HTTP Caching policy
Mule 4 provides an out-of-the-box HTTP Caching policy that can be easily configured and applied to a registered API using the Anypoint API Manager.
With this policy, you can cache the entire HTTP response (including the response body, headers, status code and session variables) in order to avoid re-executing the underlying API code for subsequent invocations with a matching cache key (the cache key has to be defined in the policy configuration).
Here is the link to the detailed MuleSoft documentation for this policy: https://docs.mulesoft.com/api-manager/2.x/http-caching-policy
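Conceptually, whole-response caching keyed on the request looks like the sketch below. It is a simplified stand-in written for illustration, not the policy's actual implementation; the `CachingGateway` class, the choice of the request path as cache key, and the sample flight path are assumptions of mine.

```python
from typing import Callable, Dict, List, NamedTuple


class Response(NamedTuple):
    status: int
    headers: Dict[str, str]
    body: str


class CachingGateway:
    """Caches whole responses in front of a back-end handler (illustrative only)."""

    def __init__(self, backend: Callable[[str], Response]):
        self._backend = backend
        self._store: Dict[str, Response] = {}

    def handle(self, path: str) -> Response:
        key = path  # assumed cache key: the request path
        if key in self._store:
            return self._store[key]  # cache hit: the back end is not executed
        response = self._backend(path)  # cache miss: run the API implementation
        self._store[key] = response
        return response


calls: List[str] = []


def flight_history_api(path: str) -> Response:
    calls.append(path)  # stands in for the real processing cost
    return Response(200, {"Content-Type": "application/json"}, '{"status": "landed"}')


gateway = CachingGateway(flight_history_api)
first = gateway.handle("/flights/AB123/2020-01-15")
second = gateway.handle("/flights/AB123/2020-01-15")  # served from the cache
```

The second call returns the stored status, headers and body without invoking `flight_history_api` again, which is the behaviour the policy provides at the gateway.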
Some of the key benefits of this approach are:
- It supports some of the HTTP caching directives (RFC 7234), which make it convenient to skip or invalidate the cache simply by passing appropriate HTTP header values. You can therefore design a cache-invalidation mechanism outside of the underlying API implementation.
- Because the policy implementation uses MuleSoft's Object Store, it provides the features and flexibility of the Object Store, such as high availability and cache sharing across the nodes of a cluster, persistent caching, a cache key specified as a DataWeave expression, a maximum number of cache entries, and a configurable TTL.
- Since the caching is done at the API gateway layer, it can save workload on the runtime worker where the underlying API implementation is deployed (if that worker is separate from the gateway worker).
- The policy leverages Mule 4 features such as non-blocking policy execution and classloader isolation between the policies and the API runtime.
Note that the policy limits the size of an individual cache entry to 1 MB.
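To illustrate how header-driven cache control works in general, here is a small sketch of two `Cache-Control` request directives from RFC 7234. It is a simplification under my own assumptions: `no-cache` is treated as "skip the lookup and refetch" (the RFC strictly requires revalidation with the origin), and `no-store` as "do not store the response". Which directives the policy actually honors, and how, should be checked in the MuleSoft documentation linked above. The function names are hypothetical.

```python
from typing import Callable, Dict, List, Set


def parse_cache_control(header_value: str) -> Set[str]:
    """Return the lowercase directive names from a Cache-Control header value."""
    directives = set()
    for part in header_value.split(","):
        name = part.split("=", 1)[0].strip().lower()
        if name:
            directives.add(name)
    return directives


def fetch(path: str, headers: Dict[str, str], store: Dict[str, str],
          backend: Callable[[str], str]) -> str:
    directives = parse_cache_control(headers.get("Cache-Control", ""))
    # Simplified no-cache handling: bypass the stored copy and refetch.
    if "no-cache" not in directives and path in store:
        return store[path]
    body = backend(path)
    # no-store: the response to this request must not be kept in the cache.
    if "no-store" not in directives:
        store[path] = body
    return body


versions: List[str] = []


def rates_api(path: str) -> str:
    versions.append(path)
    return f"v{len(versions)}"  # a new value on every real execution


store: Dict[str, str] = {}
a = fetch("/rates", {}, store, rates_api)                             # miss, stored
b = fetch("/rates", {}, store, rates_api)                             # hit
c = fetch("/rates", {"Cache-Control": "no-cache"}, store, rates_api)  # forced refresh
d = fetch("/rates", {}, store, rates_api)                             # hit on refreshed entry
```

A consumer or an operations script can therefore refresh a single entry by sending one request with the right header, without touching the API implementation.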
When would you use the HTTP Caching policy?
Because this is a gateway caching policy, a cache hit completely skips the execution of the underlying API. It is therefore important to be careful when selecting candidate APIs for the caching policy.
Some guidelines:
- This caching approach is useful when it is safe to cache the complete API response without affecting the overall end-to-end functionality, that is, when you do not expect the API's whole response (including headers, body and session variables) to change for a given set of inputs over a predictable period of time.
- Since a cache hit results in no execution of the underlying API, it is also important to make sure that the underlying API implementation performs only read operations on back ends (if any), and that it is safe to leave no footprint of the API execution (e.g. log entries) during cache hits.
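The read-only guideline can be expressed as a simple guard: cache only methods that are expected to be free of side effects and let everything else reach the implementation. The sketch below is illustrative; the `CACHEABLE_METHODS` set, the `handle` function and the orders API are hypothetical and do not represent the policy's configuration.

```python
from typing import Callable, Dict, List, Tuple

CACHEABLE_METHODS = {"GET", "HEAD"}  # assumption: only read-only methods are cached


def handle(method: str, path: str, store: Dict[str, str],
           backend: Callable[[str, str], str]) -> str:
    method = method.upper()
    if method not in CACHEABLE_METHODS:
        # Writes must always execute, so they bypass the cache entirely.
        return backend(method, path)
    key = f"{method} {path}"
    if key not in store:
        store[key] = backend(method, path)
    return store[key]


executed: List[Tuple[str, str]] = []


def orders_api(method: str, path: str) -> str:
    executed.append((method, path))  # the "footprint" a cache hit would skip
    return f"{method} {path} handled"


store: Dict[str, str] = {}
handle("GET", "/orders/42", store, orders_api)
handle("GET", "/orders/42", store, orders_api)   # served from the cache
handle("POST", "/orders", store, orders_api)
handle("POST", "/orders", store, orders_api)     # executed again, never cached
```

After these four calls, `executed` holds one GET and two POST entries: the repeated GET left no trace, while both writes reached the back end.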
Being a gateway policy, it can be more effective in deployment scenarios where a separate API proxy layer is isolated from the underlying back-end API layer (which may or may not be Mule-based). In such cases, caching on the gateway can save workload on the underlying API workers. However, this does not limit the usage and benefits of the policy in a unified runtime environment (a common runtime for the gateway and the API instances).
Some example use cases for this caching approach are:
- In REST APIs, a GET operation is an ideal candidate for this approach when the back-end data source does not change the data often. It is also useful for APIs that retrieve historical (read-only) data, which never changes.
E.g. suppose you are building a GET operation for a Flight History record that provides the history details of an already completed flight. Since the results of this API call will never change for a given flight number and past date, you can safely apply the HTTP Caching policy to it.
Note: as mentioned earlier, although you can safely apply caching to this API, it is beneficial only if the API is likely to be called frequently with the same input data.
- Another scenario for using HTTP caching is when your API is purposely built to generate the same output for a given set of inputs over a predefined period of time.
E.g. a custom access token generator that is meant to generate the same token for a given client ID when called within a specific period of time. Applying the HTTP Caching policy to such an API avoids unnecessary execution of the API code to generate the exact same token, and thus reduces the API response time.
- Another use case for this caching approach is an API that performs complex data transformations and processing on the input data alone in order to produce its response.
E.g. APIs meant to encapsulate proprietary, complex business logic/rules that involve time-consuming data transformations and calculations without any dependency on external data systems. Applying HTTP-level caching saves the processing time and resources spent on the complex processing and in turn improves latency. Many business rule engines use this kind of caching in their external-facing REST APIs.
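The last use case, a response that depends on the inputs alone, is essentially memoization. As an in-process analogy only (not the gateway policy itself), Python's standard `functools.lru_cache` shows the effect; the `premium_quote` rule and its numbers are made up for illustration. Note that `lru_cache` has no TTL, which is acceptable here only because the function is assumed to be pure.

```python
from functools import lru_cache


@lru_cache(maxsize=1024)
def premium_quote(age: int, coverage: int) -> float:
    """Hypothetical pure business rule: the result depends only on the inputs."""
    base = coverage * 0.002
    loading = 1.0 + max(age - 30, 0) * 0.015
    return round(base * loading, 2)


quote_1 = premium_quote(45, 100_000)   # computed
quote_2 = premium_quote(45, 100_000)   # returned from the cache
info = premium_quote.cache_info()      # exposes hit and miss counters
```

Caching at the HTTP layer gives the same saving one level up: identical requests skip the whole computation instead of a single function call.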
Stay tuned for our next post, where I'll be discussing Approach #2: Caching a specific back-end response within the API code using “Cache Scope”.