WebCenter Sites has multiple layers of cache. Closest to the database server is the Result Set Cache, followed by the Asset Cache, the WebCenter Sites Page Cache, and finally the WebCenter Sites Co-Resident/Remote Satellite Server Cache. Remote Satellite Server is often positioned as an edge cache, but it has many limitations in that role. For details, refer to the A-Team Chronicles article “Should Remote Satellite Server be used for Edge Caching”.
Many of our customers use a CDN to improve the performance of their web site. Typically, WebCenter Sites customers use a CDN to cache static images and web pages rendered by WebCenter Sites. A CDN usually caches images and web pages based on the URL. Let’s look at some of the use cases for what artifacts can be cached on a CDN in front of a WebCenter Sites rendered web site.
Many clients use a CDN just to cache the static content. In an image-rich web site, a significant portion of the web page size is due to static content such as images, CSS, fonts, and JS. The dynamic HTML may be comparatively small. Caching the static content in the CDN reduces the time required to download it, and thereby improves the page response time.
To cache static content on a CDN, it’s best to put the static content (images, CSS, JS) on the web servers/HTTP servers. The CDN can then cache this static content directly.
To implement a robust cache strategy for static content it is important to determine the duration for which static content should be cached. You should consider how frequently the static content is updated, how long you can tolerate the stale static content, and how you inform CDN that the content has been updated.
The web server/HTTP server should set the relevant caching instructions in the response headers of static files such as JS, images, CSS, and fonts. The CDN uses these instructions to determine how long to cache the content.
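As a rough illustration of the caching instructions described above, the following Python sketch maps a static file path to a Cache-Control header value. The function name, the extension table, and the max-age values are assumptions made for this example; in practice these headers are normally set in the web server configuration, with durations chosen from how often your static content changes.

```python
from pathlib import PurePosixPath

# Hypothetical cache durations in seconds per file type (assumed values,
# not recommendations): tune them to how often each type is updated.
MAX_AGE_BY_EXTENSION = {
    ".css": 86400,       # 1 day
    ".js": 86400,        # 1 day
    ".png": 604800,      # 7 days
    ".jpg": 604800,      # 7 days
    ".woff2": 2592000,   # 30 days
}


def cache_control_for(path: str) -> str:
    """Return a Cache-Control header value for a static file path."""
    extension = PurePosixPath(path).suffix.lower()
    max_age = MAX_AGE_BY_EXTENSION.get(extension)
    if max_age is None:
        # Unknown types: allow storage but require revalidation before reuse.
        return "no-cache"
    return f"public, max-age={max_age}"
```

For example, `cache_control_for("/static/site.css")` yields a one-day public cache lifetime, while a path with an unlisted extension falls back to `no-cache`.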
Dynamic content usually refers to HTML that is generated in response to a client’s request to view a web page. Generating the HTML requires logic to execute on the application servers and data/content to be read from the database. The time required to generate the HTML can be quite large and a significant part of the response time.
WebCenter Sites uses its “Page Caching” to cache the generated html. The cache maintains a list of all the parameters required to generate the html for the dynamic web page. Thus, when the next request with the same parameters comes to view the same page the html does not have to be generated. WebCenter Sites retrieves the html from its ‘Page Cache’.
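The idea of keying cached HTML by the full set of rendering parameters can be sketched as follows. This is a simplified illustration, not WebCenter Sites' actual implementation: the `PageCache` class, `make_key` helper, and the render callback are hypothetical names, and the parameter names used in the usage note are only examples.

```python
from typing import Callable, Dict, Mapping, Tuple

CacheKey = Tuple[Tuple[str, str], ...]


def make_key(params: Mapping[str, str]) -> CacheKey:
    """Build an order-independent key from all rendering parameters."""
    return tuple(sorted(params.items()))


class PageCache:
    """Hypothetical parameter-keyed cache for generated HTML."""

    def __init__(self, render: Callable[[Mapping[str, str]], str]) -> None:
        self._render = render
        self._store: Dict[CacheKey, str] = {}
        self.render_count = 0  # how many times HTML was actually generated

    def get(self, params: Mapping[str, str]) -> str:
        key = make_key(params)
        if key not in self._store:
            # Cache miss: generate the HTML once and remember it.
            self.render_count += 1
            self._store[key] = self._render(params)
        return self._store[key]
```

Two requests that carry the same parameters, in any order, share one cached entry, so the render callback runs only once for them; a request with a different parameter value produces a separate entry.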
Additionally, a CDN can be used to cache the dynamic web pages, as discussed below.
To cache web pages rendered by WebCenter Sites, the web site should be coded to use the friendly URL capability. This makes it simpler to cache the web pages against the given URLs. For the purpose of caching WebCenter Sites rendered dynamic web pages in a CDN, two cases are important: fully cached web pages, and partially cached or un-cached pages.
WebCenter Sites breaks the page into many pagelets when it renders a web page. Each of these individual pagelets can be ‘page cached’ on WebCenter Sites and Satellite Servers. A ‘fully cached’ web page is one where all the pagelets that comprise the page are cached, so it has no component that is not ‘page cached’ on WebCenter Sites. A ‘partially cached’ web page is one where some of the pagelets are not ‘page cached’.
When WebCenter Sites generates and renders a ‘fully cached’ web page, it sets the Last-Modified time in the HTTP header to the time when the web page was generated. When WebCenter Sites gets subsequent requests for the page, it does not have to generate the HTML again and retrieves the page from its ‘Page Cache’. In that case WebCenter Sites does not change the Last-Modified time. It’s only after the page cache expires or is flushed that WebCenter Sites needs to regenerate the page. When it generates and caches the page again, it sets the Last-Modified time accordingly. Thus the Last-Modified time refers to the time the web page was last generated by WebCenter Sites.
The CDN can use the Last-Modified time to determine whether its cache is up-to-date or needs to be refreshed.
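In standard HTTP this check is done with a conditional request: the cache sends the Last-Modified value it holds in an If-Modified-Since header, and the origin answers 304 (Not Modified) if the page has not been regenerated since, or 200 with a fresh copy otherwise. The sketch below models only that comparison; the function name is hypothetical, and how a particular CDN or WebCenter Sites handles revalidation should be confirmed against their documentation.

```python
from email.utils import parsedate_to_datetime


def revalidation_status(if_modified_since: str, page_last_modified: str) -> int:
    """Return the HTTP status an origin would give to a conditional request.

    Both arguments are HTTP date strings, e.g. 'Wed, 21 Oct 2015 07:28:00 GMT'.
    """
    cdn_copy_time = parsedate_to_datetime(if_modified_since)
    origin_time = parsedate_to_datetime(page_last_modified)
    # 304: the CDN copy is still current; 200: the page was regenerated later.
    return 304 if origin_time <= cdn_copy_time else 200
```

For a fully cached page the two timestamps stay equal until the page cache expires or is flushed, so the comparison keeps returning 304 and the CDN can keep serving its copy.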
When WebCenter Sites generates and renders a ‘partially cached’ or ‘un-cached’ web page, it sets the Last-Modified time in the HTTP header to the current time, that is, the time when the web page is generated. On subsequent requests for the page, some templates/CSElements need to be executed again, so WebCenter Sites again sets the Last-Modified time to the current time. Thus the Last-Modified time always corresponds to the time the request for the web page is received.
In this case the Last-Modified time is not very useful in determining whether to refresh the CDN cache. For such pages, customers usually specify a timeout value for the duration for which the web page should be cached. After the timeout, the CDN makes another request to WebCenter Sites to get a fresh copy of the web page.
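The timeout approach can be pictured with a small time-to-live cache. This is only a sketch of the behavior, with hypothetical names (`TtlCache`, `fetch`) and an injectable clock so the expiry logic can be exercised without waiting; a real CDN implements this internally and is configured through its own settings or through Cache-Control headers.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """Hypothetical URL-keyed cache that refetches after a fixed timeout."""

    def __init__(
        self,
        ttl_seconds: float,
        fetch: Callable[[str], str],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._fetch = fetch      # stands in for a request to the origin
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # still fresh: serve the cached copy
        body = self._fetch(url)  # expired or missing: go back to the origin
        self._entries[url] = (now, body)
        return body
```

With a timeout of 300 seconds, every request inside the five-minute window is served from the cache, and the first request after it triggers exactly one new fetch from the origin.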
Just because it is technically feasible does not mean that you should always cache ‘un-cached’ pages on CDN. You need to carefully consider the reasons why these web pages are not cached on WebCenter Sites. If those reasons are still valid for CDN, you need to be careful before you decide to cache them on CDN. For very heavy traffic sites caching web pages for a short duration, say around 5 or 10 minutes, may also make a difference in the performance.
Your site may have some images that are managed by content editors using WebCenter Sites. These images are rendered using blob server. The blob server URL for the images can be quite unfriendly, and the CDN may or may not be able to cache them. Furthermore, the blob server URL changes after the image asset is modified and published. Some of the web pages that are cached on the CDN, however, may continue to use the old URL. Those web pages will either keep getting the old image or get a missing image. If the old image with the old URL is present in the CDN cache, the web page will get the old image. If the old image is not present in the CDN cache and the CDN makes a request to WebCenter Sites for the old URL, WebCenter Sites will return an error and the web page will get a missing image.
It’s important to remember that not all content can be cached on a CDN. For example, any ‘forms’ or pages that show any personal information should not be cached. Similarly, unless you have taken great care and your CDN allows it, you should not cache any ‘authenticated’ pages.
Do not cache any forms on CDN. Even the blank form pages should not be cached or should be cached with great care. Usually the URL for a blank form and filled form submit is the same. In this case you should not cache the blank form. In my opinion, it’s best to avoid caching any type of forms.
You should not cache any web page that has personal information. The visibility of a page that has personal information is very limited. It is not worthwhile to cache a page on CDN that has very limited visibility.
You need to be very careful in caching authenticated web pages that are behind a login. For this to work, the CDN must have a way to find out whether the visitor has logged in. Some CDNs have mechanisms for doing this. You should also consider how much visibility the authenticated pages have. Usually, authenticated pages are not cached on a CDN.
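The rules above for forms, personal information, and authenticated pages can be summarized as a simple decision that picks a Cache-Control value per page. The sketch below is one conservative way to express it; the `PageTraits` fields, the function name, and the default timeout are assumptions for illustration, and how each page is classified remains a decision for the site's developers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageTraits:
    """Hypothetical flags describing a rendered page."""

    contains_form: bool = False
    shows_personal_info: bool = False
    requires_login: bool = False


def cdn_cache_control(traits: PageTraits, ttl_seconds: int = 300) -> str:
    """Return a Cache-Control value that keeps sensitive pages off the CDN."""
    if traits.contains_form or traits.shows_personal_info or traits.requires_login:
        # Shared caches such as a CDN must not store this response.
        return "private, no-store"
    return f"public, max-age={ttl_seconds}"
```

A page with none of the flags set gets a short public lifetime, while any single flag is enough to exclude the page from the CDN cache.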
In short, you can use a CDN with WebCenter Sites very effectively to cache both static and dynamic content. You should be careful about what you cache and about how you invalidate and refresh the cache. You should not cache any form pages or pages that have personal information.