Scaling Microservices Architecture using Caching

Caching can be a great way to boost the performance and scalability of microservices-based applications. Web applications have used it for decades to improve performance and responsiveness. Because microservices-based architectures are distributed, you need scalability strategies that protect the services and make them fault-tolerant. This is where caching comes in.

This article examines how caching can help boost the scalability, performance, and availability of a microservice.

Why Do Microservices Architectures Need a Cache?

Caching minimizes database hits and network round trips, and is a strategy often used for scaling along the Z-axis of the AKF Scale Cube. In a microservices-based application it can boost performance, scalability, and availability: serving frequently read, relatively static data from a cache reduces round trips to the database server and can decrease downtime.

Additionally, by using caching, you can avoid making redundant calls to other microservices. You should associate a key with any item that is added to the cache so that you can use the key to retrieve the cached data when needed. Further, caching may enhance availability by allowing data to be retrieved from the cache in the event that a service is not accessible.

However, you should be aware of the challenges involved and plan how you will manage your cache. You would not, for instance, cache all the data the application needs; several considerations apply, and we examine them here.

Considerations for Caching Data in Microservices

Deciding what data to cache, when to cache it, and how long it should stay in the cache are key to leveraging caching effectively. We outline some of these considerations below.

Determine What to Cache

This is an important consideration: you should cache only data that is frequently accessed and/or relatively static, that is, data that changes infrequently. Typical candidates are business entities or other objects that hold such data, along with application-wide settings.

What is Cache Warming?

Cache warming refers to a strategy of loading the most frequently used data into the cache ahead of time so that there will be a cache hit the next time a real visitor searches for that data in the app. An e-commerce app, for example, might store its most frequently ordered items in the cache so that new orders can be placed using item data already in the cache rather than having to call another service to retrieve it.
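
As a rough illustration of cache warming, the sketch below picks the most requested items from an access log and loads them into a cache before any real request arrives. The names `access_log`, `catalog`, and `warm_cache` are hypothetical, and a plain dictionary stands in for both the cache and the backing service.

```python
from collections import Counter

# Hypothetical access log: item IDs in the order they were requested.
access_log = ["sku-1", "sku-2", "sku-1", "sku-3", "sku-1", "sku-2"]

# Hypothetical backing store standing in for another service or a database.
catalog = {
    "sku-1": {"name": "Keyboard"},
    "sku-2": {"name": "Mouse"},
    "sku-3": {"name": "Monitor"},
}


def warm_cache(log, store, top_n):
    """Return a cache pre-populated with the top_n most requested items."""
    most_common = Counter(log).most_common(top_n)
    return {item_id: store[item_id] for item_id, _count in most_common}


# Warm the cache with the two most frequently requested items.
cache = warm_cache(access_log, catalog, top_n=2)
```

How many items to warm, and how far back the access log should reach, depends on the memory you can spare and how quickly popularity shifts in your application.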

Pre-loaded and Lazy-loaded Caches

There are two kinds of caches based on when data becomes available in the cache: pre-loaded and lazy-loaded. A pre-loaded cache is populated before the service starts, so data is available even before the first request to the service. A lazy-loaded cache is populated on demand, the first time a piece of data is requested; from then on, the cache serves all subsequent requests for the same data.
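
The difference between the two styles can be sketched with one small class that supports both. This is only an illustration: the `Cache` class, its `loads` counter, and the `load_setting` function are hypothetical names, and the slow lookup is simulated in-process.

```python
class Cache:
    def __init__(self, loader):
        self._loader = loader   # function that fetches a value on a miss
        self._data = {}
        self.loads = 0          # number of times the loader was called

    def preload(self, keys):
        """Pre-loaded style: populate entries before any request arrives."""
        for key in keys:
            self._data[key] = self._load(key)

    def get(self, key):
        """Lazy-loaded style: populate an entry the first time it is requested."""
        if key not in self._data:
            self._data[key] = self._load(key)
        return self._data[key]

    def _load(self, key):
        self.loads += 1
        return self._loader(key)


def load_setting(key):
    # Hypothetical slow lookup (a database or a remote service).
    return f"value-for-{key}"


preloaded = Cache(load_setting)
preloaded.preload(["theme", "locale"])   # cost is paid at startup
loads_after_startup = preloaded.loads
preloaded.get("theme")                   # already cached, no extra load

lazy = Cache(load_setting)
loads_before_request = lazy.loads        # nothing loaded yet
lazy.get("theme")                        # first request triggers the load
lazy.get("theme")                        # later requests are served from the cache
```

Pre-loading moves the cost to startup and makes the first request fast; lazy loading keeps startup cheap but makes the first request for each key pay for the lookup.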

Cache Storage Location

Depending on where you would like to cache the data, you can have two types of caches: In-Memory Cache and Distributed Cache.

In-Memory Cache is faster compared to Distributed Cache and the cached data resides in the memory of the service instance. In a Distributed Cache the cached data resides outside the memory of the service instance but is accessible to all service instances.

These two options are also known as private caching and shared caching. Private caching is the in-memory approach: the data resides in the memory of a service instance, so if your microservices-based application has several instances, each instance has its own copy of the cached data. Though this type of caching is fast, the amount of data you can store is limited by the memory available on the system where the service instance is running.

The other approach is shared caching, where the cached data is held in one place and shared among the application instances. Because the cache is not tied to the memory of any single instance, you can scale your application by adding more servers as needed.
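
The practical difference between private and shared caching shows up when one instance writes a value and another tries to read it. The sketch below simulates this in a single process; `ServiceInstance` is a hypothetical class, and a plain dictionary stands in for an external cache server that real instances would reach over the network.

```python
class ServiceInstance:
    def __init__(self, name, cache):
        self.name = name
        self.cache = cache   # dict-like store used as this instance's cache

    def put(self, key, value):
        self.cache[key] = value

    def get(self, key):
        return self.cache.get(key)


# Private caching: every instance owns a separate in-memory store.
private_a = ServiceInstance("a", cache={})
private_b = ServiceInstance("b", cache={})
private_a.put("price:sku-1", 10)
private_seen_by_b = private_b.get("price:sku-1")   # None: b has its own cache

# Shared caching: all instances point at the same store.
shared_store = {}
shared_a = ServiceInstance("a", cache=shared_store)
shared_b = ServiceInstance("b", cache=shared_store)
shared_a.put("price:sku-1", 10)
shared_seen_by_b = shared_b.get("price:sku-1")     # 10: visible to every instance
```

With private caches, instances can briefly disagree about the same key; a shared cache avoids that at the cost of a network hop on every lookup.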

Cache Invalidation

How long data should be kept in the cache depends on the requirements of your application. If you don’t need changes to be reflected instantly, you can refresh the data in the cache periodically. The other approach is to refresh based on the business workflow: whenever data changes, you invalidate the cache so that the data in the cache and the data in the data store stay in sync. Cached data may become stale over time, so you must determine how long it can reside in the cache (i.e., for how long the cached data can remain stale).

So, you might set the Time to Live or TTL and then when the time elapses, you would invalidate the data and remove it from the cache. Hence, a request for data (for which TTL has expired) would result in a cache miss – here’s where you can apply cache patterns to handle this situation.
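
A minimal sketch of both invalidation styles, assuming a simple in-process cache: entries expire once their TTL elapses, and `invalidate` removes an entry right away when the underlying data changes. The `TTLCache` class and the injectable `clock` parameter are illustrative choices, not a standard API; the fake clock only exists to make the example deterministic.

```python
import time


class TTLCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be demonstrated
        self._entries = {}       # key -> (value, expires_at)

    def set(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        """Return the value, or None on a miss (absent or expired)."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]   # TTL elapsed: invalidate and remove
            return None
        return value

    def invalidate(self, key):
        """Drop an entry immediately, for example right after the data changes."""
        self._entries.pop(key, None)


# A fake clock (seconds) keeps the example deterministic.
now = [0.0]
cache = TTLCache(ttl_seconds=30, clock=lambda: now[0])

cache.set("rates", {"USD": 1.0})
fresh = cache.get("rates")       # within the TTL: cache hit
now[0] = 31.0
expired = cache.get("rates")     # TTL elapsed: cache miss

cache.set("rates", {"USD": 1.1})
cache.invalidate("rates")        # data changed upstream: drop the cached copy
after_invalidate = cache.get("rates")
```

On a miss, the caller would typically reload the value from the data store and call `set` again.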

Cache Hit or Cache Miss?

When the requested data is found in the cache, it is called a cache hit. A cache miss occurs when the requested data is not in the cache. A good caching strategy should achieve a high cache hit ratio, which is calculated by dividing the number of cache hits by the total number of lookups (hits plus misses), as shown below:

Cache Hit Ratio = Cache Hits / (Cache Hits + Cache Misses)
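
To make the ratio concrete, the sketch below counts hits and misses on a small cache and computes the ratio from those counters. `CountingCache` is a hypothetical class written for this example, and `str.upper` merely stands in for an expensive lookup.

```python
class CountingCache:
    def __init__(self):
        self._data = {}
        self.hits = 0
        self.misses = 0

    def get(self, key, loader):
        if key in self._data:
            self.hits += 1
        else:
            self.misses += 1
            self._data[key] = loader(key)   # load and cache on a miss
        return self._data[key]

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = CountingCache()
# Lookups: "a" (miss), "b" (miss), "a" (hit), "a" (hit)
for key in ["a", "b", "a", "a"]:
    cache.get(key, loader=str.upper)

ratio = cache.hit_ratio()   # 2 hits / (2 hits + 2 misses) = 0.5
```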

Failover Caching

Failover caching is a strategy adopted to provide the requested data even when services have failed. Typically, two expiration times are used, a shorter one and a longer one. The former specifies how long the cached data can be used under normal conditions; the latter specifies how long data in the cache can be used in the event of a failure.
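
One way to sketch the two-expiration idea is shown below. It is an illustration under stated assumptions, not a reference implementation: `FailoverCache`, `fresh_ttl`, `failover_ttl`, and the `healthy` and `failing` functions are hypothetical names, and a fake clock replaces real time so the behavior is easy to follow.

```python
import time


class FailoverCache:
    def __init__(self, fresh_ttl, failover_ttl, clock=time.monotonic):
        self._fresh_ttl = fresh_ttl         # shorter expiry: normal operation
        self._failover_ttl = failover_ttl   # longer expiry: used only on failure
        self._clock = clock
        self._entries = {}                  # key -> (value, stored_at)

    def get(self, key, fetch):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._fresh_ttl:
            return entry[0]                 # still fresh: serve from the cache
        try:
            value = fetch(key)
        except Exception:
            # The service failed: serve the stale copy if it is recent enough.
            if entry is not None and now - entry[1] < self._failover_ttl:
                return entry[0]
            raise
        self._entries[key] = (value, now)
        return value


now = [0.0]
cache = FailoverCache(fresh_ttl=60, failover_ttl=3600, clock=lambda: now[0])


def healthy(key):
    return "v1"


def failing(key):
    raise ConnectionError("service unavailable")


first = cache.get("profile", healthy)      # fetched from the service and cached
now[0] = 120.0                             # past fresh_ttl, within failover_ttl
fallback = cache.get("profile", failing)   # service is down: stale copy is served
```

Once the longer expiration has also passed, the sketch re-raises the error rather than serving data that is too old to trust.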

Microservices and Caching Tutorial

Despite all of the benefits that caching provides, there are trade-offs as well, so follow the recommended best practices when using caching in your application. For example, you should refresh cached data neither too often nor too infrequently; there must be a balance. Used judiciously, caching helps you build applications that are scalable and highly performant.
