Performance Improvements: Caching, Page 2
Whether the data you cache is not volatile at all or very volatile, you have another decision to make: the level at which to cache it. ASP.NET has built-in techniques that let you cache the output of a control as well as cache individual objects, as we've been discussing. With both options available, the question is bound to come up: which of these techniques should you use, or should you use both?
The key benefit of output caching is that once the data has been processed for output, it doesn't have to be reprocessed. In other words, if you have some complex algorithm for building a set of data or a graph, then caching the output of that process prevents reprocessing the same data over and over again. Caching the output is also a good choice when you need only a small subset of the data to create the output and caching the objects themselves would require a larger cache.
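The idea of caching finished output rather than the data behind it can be sketched in a few lines. This is a minimal, language-neutral illustration in Python, not the ASP.NET output cache itself; `render_report`, `cached_render`, and the cache key are hypothetical names made up for the example.

```python
# Minimal sketch of output caching: store the rendered result, not the data.
# All names here are hypothetical; this is not an ASP.NET API.

_output_cache = {}  # maps a cache key to already-rendered output


def render_report(rows):
    """Stand-in for an expensive rendering step; counts how often it runs."""
    render_report.calls += 1
    total = sum(rows)
    return f"<div class='report'>Total: {total}</div>"


render_report.calls = 0


def cached_render(key, rows):
    """Return cached output for `key`, rendering only on the first request."""
    if key not in _output_cache:
        _output_cache[key] = render_report(rows)
    return _output_cache[key]


first = cached_render("sales-summary", [10, 20, 30])
second = cached_render("sales-summary", [10, 20, 30])
print(first == second, render_report.calls)
```

The second request is served from the cache, so the rendering step runs only once. Note that the sketch never expires an entry; deciding when cached output becomes stale is a separate concern.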
However, as a general rule, output caching produces a larger cache because you may have several different visual representations of the same data. For instance, you might have a control that is a vertical sidebar representation of a cart, the actual cart display page, and a summary indicator for the number of items that appears on every page. These controls would likely require more memory if cached at the output level than would be required to cache the cart itself and rebuild the controls from it.
Of course, disk files can be used for this kind of cache. SharePoint, for instance, uses this strategy for page output caching. This approach can be particularly valuable if most of what you're generating comes from a database and is thus relatively expensive to get to. Nearly all content in a SharePoint site comes from a database. If you decide on a strategy that uses the disks on the front-end web servers, you'll want to make sure that the disk performance is acceptable. This generally means choosing SAS (Serial Attached SCSI) instead of SATA disks to get better responsiveness.
The reality is that processing power is cheap and is rarely the bottleneck in the system. More often, the coordination point, the database, is the bottleneck in any large system. Because of that, reprocessing a bit to regenerate the user interface generally isn't as big an issue as mitigating the impact on the back-end database server.
Once you've decided whether to cache the output or the objects, you're still not done. You still have to consider which items you're going to put into the cache and what priority you're going to give them, should the cache manager need to decide which objects get thrown out of the cache to make room for new items.
As a practical matter, caches can't be of infinite size. There's a limited amount of memory in a server, there's a limited number of entries a database can hold before performance suffers, and so on. Given that you have a fixed-size cache to work with, the question often comes up as to which objects should end up in the cache, for how long, and which objects should be purged first if there is pressure to reduce the cache size.
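To make the idea of a fixed-size cache that purges by priority concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the ASP.NET cache manager: the `PriorityCache` class, its `max_items` limit, and the numeric `priority` argument are all hypothetical, and a real cache manager would typically weigh memory usage and expiration as well.

```python
# Minimal sketch of a fixed-size cache that evicts the lowest-priority entry.
# PriorityCache and its parameters are hypothetical, not an ASP.NET API.


class PriorityCache:
    def __init__(self, max_items):
        if max_items < 1:
            raise ValueError("max_items must be at least 1")
        self.max_items = max_items
        self._items = {}  # key -> (priority, value)

    def put(self, key, value, priority):
        """Store a value; if the cache is full, purge the lowest priority first."""
        if key not in self._items and len(self._items) >= self.max_items:
            victim = min(self._items, key=lambda k: self._items[k][0])
            del self._items[victim]
        self._items[key] = (priority, value)

    def get(self, key):
        """Return the cached value, or None if it is absent or was purged."""
        entry = self._items.get(key)
        return entry[1] if entry is not None else None


cache = PriorityCache(max_items=2)
cache.put("sidebar", "<ul>...</ul>", priority=1)
cache.put("catalog", {"sku-1": "Widget"}, priority=5)
cache.put("cart", ["sku-1"], priority=3)  # cache is full: "sidebar" is purged
print(cache.get("sidebar"), cache.get("catalog"), cache.get("cart"))
```

Here the third insert forces a decision, and the entry with the lowest priority is the one thrown out. This sketch also evicts an existing entry even when the incoming item has a lower priority than everything already cached; whether that is desirable is a design choice.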
Selecting objects to cache is really a balance between four factors: volatility, frequency of access, cost to recreate, and size in cache. The ideal candidate for caching is one that never changes, is used very frequently, is expensive to recreate, and is very small. Of course, real objects rarely have all of these attributes. In the real world, objects may be relatively volatile but cheap to recreate. (These objects may not need to be cached at all, let alone cached with a high priority.) Similarly, you may find that an invoice is hard to recreate but is accessed relatively infrequently and is fairly large in memory.
The greater the performance gain from caching an object, the more important it is to keep it in memory. In review, slowly changing (low volatility), frequently accessed, hard-to-recreate, and small objects are the best to maintain in a cache. The more closely an object fits these characteristics, the more important it is to keep in the cache.
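One way to turn the four factors into a rough ranking is a simple score. The formula below is an illustrative assumption, not something prescribed here or by ASP.NET: it rewards frequent access and high cost to recreate, and penalizes size and volatility. The `Candidate` fields and all sample numbers are hypothetical.

```python
# Illustrative heuristic for ranking cache candidates by the four factors.
# The formula, field names, and sample values are assumptions for this sketch.
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    changes_per_hour: float  # volatility
    reads_per_hour: float    # frequency of access
    recreate_ms: float       # cost to recreate
    size_kb: float           # size in cache (must be greater than zero)


def cache_score(c):
    """Higher is better: benefit of caching divided by size and volatility."""
    benefit = c.reads_per_hour * c.recreate_ms
    penalty = c.size_kb * (1.0 + c.changes_per_hour)
    return benefit / penalty


candidates = [
    Candidate("invoice", changes_per_hour=0, reads_per_hour=2,
              recreate_ms=800, size_kb=400),
    Candidate("live-clock", changes_per_hour=3600, reads_per_hour=5000,
              recreate_ms=1, size_kb=1),
    Candidate("catalog", changes_per_hour=0.1, reads_per_hour=5000,
              recreate_ms=200, size_kb=50),
]

ranked = sorted(candidates, key=cache_score, reverse=True)
print([c.name for c in ranked])
```

With these made-up numbers, the stable, heavily read catalog ranks first, the large and rarely read invoice ranks well below it, and the volatile, cheap-to-recreate value ranks last. A score like this could feed the priority given to each item, but the weights would need tuning against real measurements.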
Caching is a useful tool, but it has its own issues and limitations. Using caching means you have to consider when the cached data should expire and how to invalidate it. It also means a renewed concern for the amount of memory in use on the server and in the process. Caching can greatly improve performance, but it can also lull you into a false sense of security. In some cases it can even make performance worse. The next stop in our journey to improve performance is to evaluate ways to track down problems and to use bigger and better hardware to solve them.
About the Author
Robert Bogue, MS MVP Microsoft Office SharePoint Server, MCSE, MCSA:Security, etc., has contributed to more than 100 book projects and numerous other publishing projects. Robert's latest book is The SharePoint Shepherd's Guide for End Users. You can find out more about the book at http://www.SharePointShepherd.com. Robert blogs at http://www.thorprojects.com/blog. You can reach Robert at Rob.Bogue@thorprojects.com.