Optimizing EhCache configuration for WebCenter Sites

In this blog article I would like to share the ideas and strategies I have used to configure WebCenter Sites’ page cache, pagelet cache and inCache. I will discuss several trade-offs in configuring the caches. I will not go into detail on ticket caching for CAS.

This blog expects that you have a good understanding of WebCenter Sites and its caches. It also focuses on Delivery server cache tuning. For Editorial servers the AssetCache and resultset cache are much more important than the pagelet cache.

 

Introduction

 

It is valuable to realise that the caching strategy of WebCenter Sites has two goals:
  1) Take load off the database for repeated-read items and therefore provide a huge gain in scalability, as a potential source of contention is reduced.
  2) Reduce CPU operations. Because compound objects are cached, a lot of separate calls to either the database or the rendering engine (the JSP engine, for instance) are avoided. For instance, reading an asset from the database can easily take 20 database calls. If the asset POJO is stored in an in-memory map, the CPU cycles involved in combining the results of those 20 calls are avoided. This is in addition to avoiding 20 round-trips to the database to read the data.
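
To make the second goal concrete, here is a minimal sketch of a composed-object cache in front of a simulated database. It is an illustration only and not the WebCenter Sites AssetCache implementation; the class name, the counter and the fake "queries" are all hypothetical, and the figure of 20 queries per asset is simply the example number used above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustration only: a composed-object cache in front of a simulated database. */
final class ComposedAssetCacheSketch {
    // Example figure from the text: reading one asset costs about 20 queries.
    static final int QUERIES_PER_ASSET = 20;

    // Counts the simulated database round-trips.
    final AtomicInteger databaseCalls = new AtomicInteger();
    private final Map<Long, String> assetsById = new ConcurrentHashMap<>();

    /** Simulates composing one asset out of many separate queries. */
    private String loadFromDatabase(long assetId) {
        StringBuilder asset = new StringBuilder("asset-" + assetId + ":");
        for (int query = 0; query < QUERIES_PER_ASSET; query++) {
            databaseCalls.incrementAndGet(); // one round-trip per query
            asset.append(query % 10);        // stand-in for combining the result
        }
        return asset.toString();
    }

    /** Returns the cached asset and composes it only on the first read. */
    String read(long assetId) {
        return assetsById.computeIfAbsent(assetId, this::loadFromDatabase);
    }

    public static void main(String[] args) {
        ComposedAssetCacheSketch cache = new ComposedAssetCacheSketch();
        for (int i = 0; i < 100; i++) {
            cache.read(42L);
        }
        System.out.println("database calls: " + cache.databaseCalls.get());
    }
}
```

In this sketch one hundred reads of the same asset cost the 20 simulated queries of the first read and nothing more; both the round-trips and the work of combining the results are skipped on every later read.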

This caching also comes with a cost. Caches need to be managed. This can be done with a time-to-live timeout, or in combination with managed expiration at the time of change. The simplest form to implement (time-to-live [http://en.wikipedia.org/wiki/Time_to_live]) suffers from a data freshness problem: if data is changed before the timeout occurs, the client sees old and stale data.
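
The difference between the two approaches can be sketched in a few lines. This is a simplified illustration under my own assumptions, not how WebCenter Sites or EhCache implement expiration; the class and method names are hypothetical, and the clock is injected only so that the behaviour can be shown without waiting.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Illustration only: time-to-live expiration next to managed expiration. */
final class ExpirationSketch {
    private static final class Entry {
        final String value;
        final long storedAtMillis;

        Entry(String value, long storedAtMillis) {
            this.value = value;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so the example is deterministic

    ExpirationSketch(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void put(String key, String value) {
        entries.put(key, new Entry(value, clock.getAsLong()));
    }

    /** Time-to-live: the entry is served until it is at least ttlMillis old. */
    String get(String key) {
        Entry entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() - entry.storedAtMillis >= ttlMillis) {
            entries.remove(key);
            return null;
        }
        return entry.value;
    }

    /** Managed expiration: called by whatever code changes the source data. */
    void invalidate(String key) {
        entries.remove(key);
    }

    public static void main(String[] args) {
        long[] now = {0L};
        ExpirationSketch cache = new ExpirationSketch(60_000L, () -> now[0]);
        cache.put("headline", "old text");

        // The source data changes after 10 seconds, but TTL alone cannot know that.
        now[0] = 10_000L;
        System.out.println("TTL only: " + cache.get("headline"));

        // With managed expiration the change itself removes the entry.
        cache.invalidate("headline");
        System.out.println("after invalidate: " + cache.get("headline"));
    }
}
```

With time-to-live alone the stale value keeps being served until the timeout passes; the explicit `invalidate` call is what closes that window.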

In WebCenter Sites most caches are managed, meaning that items expire when data changes. This expiration can be at the level of the whole cache ‘region’, for instance with resultset caches, or at the level of an individual cache item, as with pagelet caches.

The other cost of caching is memory consumption. As caching is by definition storing objects in a quickly accessible store, all the caches in WebCenter Sites are on-heap caches, sometimes with overflow to disk. As heap memory (and disk space too) is limited, the cache sizes need to be configured to the correct limits in conjunction with garbage collection tuning.

This overflow to disk has two functions: hold more items in cache than memory allows and be able to persist a cache so it can survive JVM restarts. The latter resolves the common cold start and thundering herd (http://en.wikipedia.org/wiki/Thundering_herd_problem) issues.

After this insight into the trade-offs between CPU cycles, database access, freshness of data and memory usage, I would like to explain a little about how the WebCenter Sites caches work together.

At the lowest level is resultset caching. This cache stores (if so requested) the results of data queries. As most data, for instance asset data, is read from the database, most data will be stored here. There is a one-to-one relationship between the query and the resultset.

On top of that cache is the AssetCache. This cache stores assets as they are entirely read from the database. As it takes multiple database queries to read one asset, this cache is a composed cache. As there is considerable overhead in composing an asset from its resultsets, this is a very efficient cache, especially when assets are read in uncached pagelets. The AssetCache sits on top of the resultset cache for all the queries that are fired to read an asset. It also covers only a subset of the resultset cache, as not all cached queries are used to read assets.

On top of the AssetCache is the ContentServer pagelet cache. This cache stores the output of rendering parts of a page or even the whole page. It stores character data, which can be HTML, JSON, XML or plain text. The cache consists of the rendering results of invoking a SiteCatalog entry, if that SiteCatalog entry was configured to cache its results. The result includes the rendering results of calls to the database, assets, search results, calls to external systems etc. As such it is a composed cache, layered on top of the resultset cache and AssetCache.

Next to the ContentServer pagelet cache is the Blob Cache. This holds the results of calls to BlobServer. Technically it is comparable to the resultset cache, as it caches the database query and a pointer to the binary file on disk. Where the ContentServer pagelet cache holds character data and the output of rendered ‘elements’, the Blob Cache holds binary data.

On top of the ContentServer pagelet cache and the Blob Cache is the SatelliteServer pagelet cache. This cache is used by SatelliteServer to assemble the full pages from the individual pagelets for character data like HTML or JSON. These pagelets can be cached or uncached. It queries the ContentServer pagelet cache for these pagelets and stores them almost identically in its own cache; just some metadata is different. For binary data it queries the Blob Cache and then also caches the binary data. Depending on the size of the binaries, this can be a lot of bytes being stored for a long time on the heap. Depending on the available heap size, this might or might not be a challenge to configure optimally.

Besides pagelets, SatelliteServer also caches the web references for the vanity URLs. This is similar to the resultset cache, as there is a direct relationship between the URL (= cache key) and the database query issued. Technically there is more to it that is not relevant to this blog, but for cache configuration it is just an in-memory store of name/value pairs.

Now there is one more cache layer, and that is the Remote SatelliteServer pagelet cache. This is the same cache as the SatelliteServer pagelet cache, but now stored on another JVM and thus another heap. Just like the Co-Resident SatelliteServer pagelet cache, the Remote SatelliteServer pagelet cache queries the ContentServer pagelet cache and Blob Cache. For Remote SatelliteServer the transport from ContentServer to SatelliteServer is done over HTTP; for Co-Resident it is in-process.

In the case of Co-Resident SatelliteServer, the resultset, asset, blob, CS pagelet, web reference and SS pagelet caches all live on the same heap. As some caches contain data read from other caches, you may want to use that information to tune the caches when the heap is not large enough to hold all the data in its in-memory caches, since storing all the data in-memory will result in the best performance and throughput. And this is the main point of this blog that I want to bring across: how to tune the caches in conjunction with each other.

In the table below is a summary of the different characteristics I just mentioned.

| cache | what | on heap | on disk | composed | remarks |
|---|---|---|---|---|---|
| resultset cache | database query results | yes | no | no | |
| asset cache | fully read assets | yes | no | yes | |
| CS pagelet cache | composed pagelets | yes | yes | yes | |
| blob cache | binary data | yes | no | no | |
| SS pagelet cache | composed pagelets and binaries | yes | yes (no!) | yes | subset of CS pagelet cache |
| SS webref cache | vanity URL mapping | yes | no | no | |


Let’s consider some use cases. In all these cases we only consider the situation where the heap size is not large enough to store all items in memory. This can also be the case when the machine has enough physical memory but Garbage Collection tuning indicates that you are better off with a smaller heap to avoid (too) long Full GC pauses. If all items can be stored in-memory and no items are evicted from the cache because the cache is full, you are done with tuning, as you have reached an ideal situation.

  1) Fully cached pages without a CDN (Content Delivery Network)
  2) Fully cached pages with a CDN
  3) Partially cached pages without a CDN
  4) Partially cached pages with a CDN
  5) Fully uncached pages without a CDN
  6) Fully uncached pages with a CDN

It is important to realise that adding an item to a full EhCache is expensive, as at the same time that the item is added (by the same thread) an item needs to be selected for removal from the cache. The selection for removal is expensive both in CPU cycles and in the locking involved. It is even more expensive if items also need to be removed from disk, as additional I/O is added to the mix. So a full cache where items are constantly added and removed is something that you desperately want to avoid.
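
The churn of a full cache can be made visible with a toy example. The sketch below is not EhCache's eviction code; it uses a plain `LinkedHashMap` in access order as a stand-in least-recently-used cache, and the class name and the sizes are hypothetical. It only shows how quickly the number of evictions grows once the working set is larger than the cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: a tiny LRU map that counts how often it has to evict. */
final class FullCacheChurnSketch<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;
    private long evictions;

    FullCacheChurnSketch(int capacity) {
        super(16, 0.75f, true); // access order gives least-recently-used eviction
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        boolean evict = size() > capacity;
        if (evict) {
            evictions++; // every eviction happens on the thread doing the put
        }
        return evict;
    }

    long evictions() {
        return evictions;
    }

    public static void main(String[] args) {
        // Hypothetical working set of 150 pagelets, requested over and over.
        FullCacheChurnSketch<Integer, String> tooSmall = new FullCacheChurnSketch<>(100);
        FullCacheChurnSketch<Integer, String> largeEnough = new FullCacheChurnSketch<>(150);
        for (int round = 0; round < 10; round++) {
            for (int key = 0; key < 150; key++) {
                tooSmall.put(key, "pagelet-" + key);
                largeEnough.put(key, "pagelet-" + key);
            }
        }
        System.out.println("evictions with capacity 100: " + tooSmall.evictions());
        System.out.println("evictions with capacity 150: " + largeEnough.evictions());
    }
}
```

Once the undersized cache has filled up, every further insert also pays for an eviction, while the cache that fits the whole working set never evicts at all.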

For monitoring the caches it is best to use the WebCenter Sites SupportTools, which can be requested from Oracle Support. A lot of real-time information can be gathered from the reports in the cache section of the SupportTools. JMX can also be used, but then you will need to write the reports yourself. If you want to graph and trend over time, JMX is a very solid way to read cache statistics data.
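
If you go the JMX route, a report can start as small as the sketch below. It only uses the standard `javax.management` API. The object name pattern `net.sf.ehcache:type=CacheStatistics,*` and the attribute names `CacheHits` and `CacheMisses` are assumptions based on the naming of the EhCache 2.x management beans; whether and under which names your WebCenter Sites installation registers them should be verified first, for instance with JConsole. The class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Illustration only: reads hit and miss counters over JMX and derives hit ratios. */
final class CacheHitRatioReport {

    /** Fraction of lookups served from cache; 0.0 when there were no lookups. */
    static double hitRatio(long hits, long misses) {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    /**
     * Collects a hit ratio per MBean matching the pattern. The pattern and the
     * attribute names are passed in because they are assumptions to verify.
     */
    static Map<String, Double> collect(MBeanServer server, String pattern,
                                       String hitsAttribute, String missesAttribute)
            throws Exception {
        Map<String, Double> report = new TreeMap<>();
        for (ObjectName name : server.queryNames(new ObjectName(pattern), null)) {
            long hits = ((Number) server.getAttribute(name, hitsAttribute)).longValue();
            long misses = ((Number) server.getAttribute(name, missesAttribute)).longValue();
            report.put(name.getCanonicalName(), hitRatio(hits, misses));
        }
        return report;
    }

    public static void main(String[] args) throws Exception {
        // Assumed names; an empty report means no matching MBeans in this JVM.
        Map<String, Double> report = collect(
                ManagementFactory.getPlatformMBeanServer(),
                "net.sf.ehcache:type=CacheStatistics,*",
                "CacheHits",
                "CacheMisses");
        report.forEach((cache, ratio) -> System.out.println(cache + " -> " + ratio));
    }
}
```

As written, this reads the MBean server of the JVM it runs in, so it would have to run inside the application server or be adapted to a remote JMX connection. Logging the ratios at a fixed interval gives the trend over time mentioned above.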

1) Fully cached pages without a CDN

This is a relatively simple case, as the SatelliteServer pagelet cache is used most, compared to the lower-level caches like AssetCache and resultset cache. The other caches are only used when something changes, i.e. when assets are published. All focus should be on the pagelet cache, and this should be made as large as possible. Blob Cache is hardly used and serves no function, as blobs will be cached on SatelliteServer. The ContentServer pagelet cache serves as a store for embedded pagelets (see render:calltemplate style=embedded in the WebCenter Sites documentation). But as the same data is available on SatelliteServer, its in-memory pagelet cache is mostly redundant. The ContentServer pagelet cache in-memory store can be sized small and the on-disk store very large. This makes the ContentServer pagelet cache a persistent store for pagelets, and in this way it does not compete for heap with the Satellite pagelet cache.

WebReferences cache should ideally not be full (at 100% utilization) and configured large enough to hold all data.

Depending on the frequency of publishing (or in general the frequency and amount of change) you will need to monitor the AssetCache and resultset cache. They serve as helpers to quickly regenerate pagelets that are expired from cache due to the publish. What I mean is that for repeated reads of database queries or assets, both over time (before and after publish) and across multiple pagelets, it is faster and more scalable to read them from a cache than from the database. But if there is a memory trade-off to be made between pagelet cache and asset cache, the pagelet cache should take precedence.
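
One way to reason about such a trade-off is to write the priorities down and hand out the available memory in that order. The sketch below is only a thinking aid under my own assumptions: the class name, the cache labels, the megabyte figures and the order of the web reference cache relative to the asset cache are hypothetical, and real cache limits may be expressed in numbers of entries rather than megabytes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: grants memory to caches in priority order. */
final class CacheBudgetSketch {

    /**
     * Grants each cache its requested megabytes, in the iteration order of the
     * given map, until the budget is spent. Pass an insertion-ordered map.
     */
    static Map<String, Long> allocate(long budgetMb, Map<String, Long> requestedMbByPriority) {
        Map<String, Long> granted = new LinkedHashMap<>();
        long remaining = budgetMb;
        for (Map.Entry<String, Long> request : requestedMbByPriority.entrySet()) {
            long grant = Math.min(remaining, request.getValue());
            granted.put(request.getKey(), grant);
            remaining -= grant;
        }
        return granted;
    }

    public static void main(String[] args) {
        // Hypothetical wishes for use case 1, highest priority first.
        Map<String, Long> requested = new LinkedHashMap<>();
        requested.put("SS pagelet cache", 2048L);
        requested.put("web reference cache", 256L);
        requested.put("asset cache", 1024L);
        requested.put("resultset cache", 1024L);

        // Hypothetical part of the heap that is available for caches.
        allocate(3072L, requested)
                .forEach((cache, mb) -> System.out.println(cache + ": " + mb + " MB"));
    }
}
```

The caches at the end of the list are the ones that get squeezed, which is the intended outcome when the pagelet cache takes precedence.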

2) Fully cached pages with a CDN

With the addition of a CDN to the architecture, the read patterns to the Satellite caches change drastically. As the CDN will cache blobs for a long time, there is not a lot of use in caching these blobs on Satellite. If you use a CDN or another front cache like Squid or Varnish for blob data, there is no need to store blobs in Satellite’s page cache, as is done by default. In this case it is advisable to set the expiration of blobs to immediate, as blobs are cached externally. There is no benefit in polluting the page cache with large blob data.

In satellite.properties you can set expiration=immediate. It is also strongly advisable to add a Cache-Control header with a long expiration time to blobs. This can be done with either the web references configuration for blobs, or with an additional header in the render:bloburl call to construct a URL with the Cache-Control header.
The SatelliteServer pagelet cache serves mainly as a persistent store for the CDN that is initially queried upon first request and periodically when the CDN cache times out. More advanced configuration schemes are possible, but for the cache configuration this is the basic idea. In this use case the HTML/JSON/XML response can also be cached on the CDN and the CDN can be configured to do so.
AssetCache, web reference cache, blob cache and resultset cache should be configured and monitored as described under 1).

3) Partially cached pages without a CDN

Compared to 1), the access characteristics change in such a way that AssetCache and, to a lesser extent, resultset cache become more important. As CSElements and Templates are executed more frequently, we cannot rely on pagelet caching alone for optimal results. Again, close monitoring of AssetCache and resultset cache will guide you to the right configuration. Repeated reads of assets by various uncached pagelets become more important than repeated reads over time. By ‘over time’ I mean that pre- and post-publish reads of unchanged assets can be served from cache without going to the database and composing the asset POJO.

4) Partially cached pages with a CDN

Compared to 2), the access characteristics are also radically different. Blobs can still be cached on the CDN, but HTML pages cannot. This means that for each page request the CDN needs to phone home and get the HTML page. SatelliteServer is now accessed much more often for HTML (actually any character output that is rendered through Templates and CSElements). It also means that for the HTML pages the characteristics become similar to 3). The remarks in 3) around AssetCache etc. also hold for this use case.

5) Fully uncached pages without a CDN

This is not a typical use case. It might be one where Sites is used as a CMS only and delivery of the HTML is done through another front-end framework. The output of WebCenter Sites might be binary data (images) and JSON or XML. That JSON is then used by a front-end framework to render pages. Caching might be done in the front-end framework.
SatelliteServer pagelet caching is only used for blobs. Its configuration is irrelevant for pagelets. The ContentServer pagelet cache is not used at all. Blob Cache does not matter much, as the Satellite Server pagelet cache is used to cache the blobs.
AssetCache and resultset cache are very important. It is important that you monitor and tune those caches carefully.

6) Fully uncached pages with a CDN

This is also not a typical use case. The difference with 5) is that the blob caching function is now mostly performed at the CDN layer. This means that the Satellite Server pagelet cache layer is not relevant and blobs can be set to expire immediately.

When does deploying Remote Satellite Server make sense?

Remote Satellite Server brings most value when your ContentServer JVMs are maxed out on either CPU or memory and a lot of CPU cycles are spent on assembling the page. The latter is the case when you have a lot of pagelets per page. In this case you can off-load memory (the Satellite Server pagelet cache) or CPU cycles to another machine. When you have uncached pagelets on your pages (maybe just an uncached outer wrapper) you are likely to see a degradation in performance, as at least some pagelets in the page need to be fetched from ContentServer over HTTP. Those uncached pagelets will also need to be parsed. In this case you will need to balance memory (if you are memory bound) and performance. If you are memory bound and cache blobs in the SatelliteServer page cache, it is advisable to off-load caching of blobs to either a CDN or another front-end cache like Varnish or Squid.
Another pattern you can deploy, if you are memory bound, have many domains/sites to serve and have a multi-node cluster, is to shard traffic over the cluster nodes. You can appoint 2 nodes per virtual host to receive traffic, whilst dedicating other nodes to other virtual hosts. A cluster node can serve traffic for multiple virtual hosts, but it should not serve traffic for all hosts. This will limit the number of pagelets and assets cached per JVM.
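
The sharding idea can be sketched as a small planning helper. In practice the routing itself would be configured in the load balancer or web tier, so this is only a way to compute a stable assignment; the class name, the host names and the node names are hypothetical, and hashing the host name is just one possible way to spread the hosts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustration only: assigns a fixed number of cluster nodes to each virtual host. */
final class VirtualHostShardingSketch {

    /** Picks nodesPerHost consecutive nodes, starting at a position derived from the host name. */
    static List<String> nodesFor(String virtualHost, List<String> clusterNodes, int nodesPerHost) {
        if (nodesPerHost < 1 || nodesPerHost > clusterNodes.size()) {
            throw new IllegalArgumentException("nodesPerHost must be between 1 and the cluster size");
        }
        int start = Math.floorMod(virtualHost.hashCode(), clusterNodes.size());
        List<String> assigned = new ArrayList<>();
        for (int i = 0; i < nodesPerHost; i++) {
            assigned.add(clusterNodes.get((start + i) % clusterNodes.size()));
        }
        return assigned;
    }

    public static void main(String[] args) {
        // Hypothetical four-node cluster and three virtual hosts.
        List<String> nodes = Arrays.asList("node1", "node2", "node3", "node4");
        for (String host : Arrays.asList("www.example.com", "shop.example.com", "blog.example.com")) {
            System.out.println(host + " -> " + nodesFor(host, nodes, 2));
        }
    }
}
```

Each host always maps to the same two nodes, so the pagelets and assets of that host are only cached on those JVMs. With many hosts it is worth checking that the resulting load per node is reasonably even.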

It should also be noted that the default settings for resultset caching and pagelet caching are not optimal. For instance, the default settings for cc.AssetTypeCSz, cc.cacheResults and cc.cacheResultsTimeout are too low for production Delivery use.

 

General Guidelines

To conclude, here are the general guidelines.
  1) Default out-of-the-box settings are not optimal for Production Delivery.
  2) SatelliteServer pagelet cache should only be used for in-memory caching and ContentServer pagelet cache mostly as a disk cache. In this way they work well together without much in-memory overhead.
  3) AssetCache and resultset cache become important when you have uncached pagelets. Even with uncached pagelets it is important to understand what is happening in those uncached components in order to optimise the lower-level caches.
  4) Web References cache is important and should ideally not be 100% full. Assign a large enough size to it. You don’t want to constantly add and remove items from this cache.
  5) Make sure you shut down your web application nicely. If EhCache does not get shut down properly, it will mark its whole disk cache as invalid. This means that after a restart all the pagelets will need to be regenerated. This can be a drain on the system.

As you can see, there are many ways to tune the WebCenter Sites caches. This blog should be seen as a starting point or an intermediate touchpoint for an optimal caching strategy. And an optimal caching strategy requires a good understanding of the business goals around performance, scalability and content freshness.
