Exporting rendered assets from WebCenter Sites

In this blog post I explain how you can publish rendered assets from WebCenter Sites into other systems. This allows assets to be exported in a rendered form, so they can be consumed by external systems, like Oracle Commerce or a Content Distribution Network (CDN).

The initial focus of this work was to improve the existing WebCenter Sites and Oracle Commerce integration, as published by the Oracle Commerce team, by making it possible to inject rendered and composed WebCenter Sites assets into Oracle Commerce, so that unstructured content can be used in Commerce catalogs. The productized Sites-Commerce integration uses only the raw (meta) data of assets. As a result, business users lose some of the capabilities of WebCenter Sites, and the rendering logic in Oracle Commerce becomes complex, because the Commerce developers writing the rendering code need to know some of the internals of WebCenter Sites. For more information on how to integrate WebCenter Sites and Oracle Commerce please read this A-Team produced whitepaper.

In this post I focus on the internals of Sites and not on the actual integration with the third party systems. The sample source code is included as an attachment to this post.

How is the exporting described here different from the (deprecated) Static Publishing that is available in WebCenter Sites? What problems does it solve? Let me go through the three most important ones.

Integration possibilities

The main issue that Static Publishing does not solve is that the out-of-the-box export operation has no hooks to notify other systems which rendered versions of an asset have changed and may need to be updated. This is important for any system that expects incremental updates. In this blog (and in more detail in the source code) I show how you can use WebCenter Sites' internal compositional dependency tracking to keep track of those relationships and to notify other systems. For instance, if a Page asset is exported, the rendered form of that asset will include Articles, links to other (rendered forms of) Pages, Images and so on. If one of the used Articles is changed and published after that initial publish, the third party system might want to know that the Page has changed and needs to be invalidated. In the case of a CDN integration, you might integrate with the API of the CDN to invalidate the URL for that rendered form of the Page.

Renditions

In the previous paragraph I introduced the concept of a ‘rendered form of an asset’. From now on I will call this a Rendition. An asset can be rendered in many different ways; in other words, it can be rendered into many Renditions. Examples are the full page view, the tablet device view, the binary images view, and an XML representation of the (meta) data of an asset. This is the second advantage over Static Publishing: it is clearly defined (in code) what the Renditions of an asset can be, because these are defined upfront. This makes the system more predictable. The Renditions do not have to be discovered through crawling, as they are with Static Publishing.
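To make "defined upfront" a bit more concrete, here is a minimal sketch of how the possible Renditions per asset type could be declared in code. The RenditionType enum, its values, the rules in RenditionPlan and the URI layout are all assumptions for illustration; they are not taken from the attached source.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical catalog of the Renditions that may exist for an asset.
enum RenditionType {
    FULL_PAGE("html"), TABLET("html"), METADATA_XML("xml");

    final String extension;

    RenditionType(String extension) {
        this.extension = extension;
    }
}

final class RenditionPlan {
    // Which Renditions to produce for a given asset type (illustrative rules only).
    static Set<RenditionType> typesFor(String assetType) {
        if ("Page".equals(assetType)) {
            return EnumSet.allOf(RenditionType.class);
        }
        return EnumSet.of(RenditionType.METADATA_XML);
    }

    // Every planned Rendition gets a predictable URI, so nothing has to be crawled.
    static List<String> urisFor(String assetType, long assetId) {
        List<String> uris = new ArrayList<>();
        for (RenditionType type : typesFor(assetType)) {
            uris.add("/" + assetType + "/" + assetId + "/"
                    + type.name().toLowerCase(Locale.ROOT) + "." + type.extension);
        }
        return uris;
    }
}
```

Because the plan is plain code, the full list of URIs for an asset is known before anything is rendered.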

Full featured and composed assets

The third improvement, compared to XML export, is that (almost) all of the rendering and editing features of WebCenter Sites can be used. Device Templates, Recommendations and InSite editing can all be used. On top of that, and maybe most important compared with XML export, assets can be exported in a composed and rendered way. The consuming third party system therefore does not have to render the asset; it does not have to know about the internal relationships and complex meta-data semantics. It can just consume the rendered content. Imagine putting the Renditions with some meta data on a message bus, with other systems just listening for updates.

Technical details

I will not elaborate deeply on the technical details in this blog. I urge you to study the source code closely to understand all the moving parts. The key concepts and design choices are explained in this post.

Approval and Publishing

The export function is integrated with WebCenter Sites through its publishing subsystem. It makes use of the approval and publishing functions in WebCenter Sites. For this, a custom publish type (Delivery Type) was implemented. This allows for custom approval rules and custom rendering. This implementation uses the same approval rules as XML export and RealTime publishing. The exporting/publishing function is completely rewritten. The PublishApprovedAssets element calls the PublishApprovedAssets Java class, which performs the publishing. This class reads the list of approved but not yet published assets and starts the export per asset by calling the Exporter. Before exporting, it collects the list of possibly changed Renditions from the RenditionRegistry and sends those to the Notifier. Both the RenditionRegistry and Notifier concepts are explained below.
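The order of operations in this paragraph can be sketched as follows. This is not the PublishApprovedAssets class from the attachment: the Registry, Notifier and Exporter interfaces below, their method names, and the use of plain strings as asset ids are simplified assumptions, and the sketch leaves out the WebCenter Sites publish queue entirely.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;

// Simplified stand-ins for the collaborators named in the text (assumed signatures).
interface Registry {
    Set<String> urisFor(Collection<String> changedAssets);
}

interface Notifier {
    void renditionsMayChange(Collection<String> renditionUris);
}

interface Exporter {
    void export(String assetId);
}

final class PublishFlow {
    private final Registry registry;
    private final Notifier notifier;
    private final Exporter exporter;

    PublishFlow(Registry registry, Notifier notifier, Exporter exporter) {
        this.registry = registry;
        this.notifier = notifier;
        this.exporter = exporter;
    }

    // approvedAssets: the assets that are approved but not yet published.
    void publish(List<String> approvedAssets) {
        // 1. Before exporting, look up every known Rendition that uses one of the assets.
        Set<String> affected = registry.urisFor(approvedAssets);

        // 2. Tell the third party systems which Renditions may be stale.
        if (!affected.isEmpty()) {
            notifier.renditionsMayChange(affected);
        }

        // 3. Export each asset; in the real code this re-renders and re-registers its Renditions.
        for (String assetId : approvedAssets) {
            exporter.export(assetId);
        }
    }
}
```

Notifying before exporting means a consumer hears about a possibly stale Rendition before the new version is written.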

Exporter

The Exporter is the main class. Based on the publish queue, it invokes the rendering engine to create the Renditions, registers the Renditions and their compositional dependencies, and persists the Renditions and the asset meta data. This is the class that you will need to modify to interact with other systems. The default implementation persists the asset meta data to disk in JSON format, together with the Rendition bodies themselves. The Exporter also decides which Renditions to generate. It does so by, for instance, creating name/value pairs and invoking the RenderEngine. It detects if a Vanity URL is defined for the asset and uses that data to create a Rendition. Your implementation might have different business rules to create Renditions. Each Rendition needs to have a unique URI.

To give some idea of the business logic, here is the sample method.

    protected AssetData export(final AssetId assetId) throws Exception {
        final AssetData data = load(assetId);
        final ExportedAsset exportedAsset = new ExportedAsset(data);
        boolean rendered = false;

        // Create one Rendition per web reference that resolves (HTTP 200) and has a template.
        for (final WebReference ref : getWebReferences(data)) {
            if (ref.getHttpStatus() == 200 && ref.getTemplate() != null) {
                final Rendition rendition = render(data, ref);
                exportedAsset.add(rendition);
                rendered = true;
            }
        }

        // No usable web reference: fall back to the default Renditions.
        if (!rendered) {
            final Collection<Rendition> renditions = renderDefault(data);
            exportedAsset.addAll(renditions);
        }

        // The following two operations could be done asynchronously.
        registerRenditions(exportedAsset);
        exportToCAS(exportedAsset);

        return data;
    }
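Because each Rendition needs a unique URI, one possible approach is to derive the URI from the asset id plus the name/value pairs that are passed to the rendering engine. The sketch below is an assumption about how that could look, not the scheme used in the attached code; the RenditionUri class and its path layout are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;
import java.util.TreeMap;

final class RenditionUri {
    // Builds the same URI for the same asset and arguments, regardless of map ordering.
    static String of(String assetType, long assetId, Map<String, String> renderArgs) {
        StringBuilder uri = new StringBuilder("/").append(assetType).append('/').append(assetId);
        // A TreeMap iterates in key order, which makes the result deterministic.
        Map<String, String> sorted = new TreeMap<>(renderArgs);
        char separator = '?';
        for (Map.Entry<String, String> arg : sorted.entrySet()) {
            uri.append(separator)
               .append(encode(arg.getKey()))
               .append('=')
               .append(encode(arg.getValue()));
            separator = '&';
        }
        return uri.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Sorting the arguments matters: two render requests that differ only in argument order would otherwise end up registered as two different Renditions.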

RenderEngine

The RenderEngine is the third component implemented. It makes use of the native rendering capabilities of WebCenter Sites to create Renditions. It renders the various pagelets through calls to ics.ReadPage(), mimicking SatelliteServer. This means that all the page cache and URL composition mechanisms in WebCenter Sites are used to render assets correctly and quickly. For this rendering, standard WebCenter Sites Templates, CSElements and SiteEntries are used, as you would use in live delivery rendering. The rendering engine also collects all the compositional dependencies of the Rendition. For the compositional dependency tracking to work correctly, only cached Templates and SiteEntries should be used to render the Rendition.
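The sample RenderEngine relies on WebCenter Sites itself for both rendering and dependency tracking. To illustrate only the idea of collecting compositional dependencies while nested pagelets are rendered, here is a self-contained toy model. It does not call any WebCenter Sites API, and Pagelet, RenderResult and ToyRenderEngine are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// A pagelet renders one asset and may include other pagelets.
final class Pagelet {
    final String assetId;
    final String markup;
    final List<Pagelet> children = new ArrayList<>();

    Pagelet(String assetId, String markup) {
        this.assetId = assetId;
        this.markup = markup;
    }

    Pagelet include(Pagelet child) {
        children.add(child);
        return this;
    }
}

// The rendered body plus every asset that contributed to it.
final class RenderResult {
    final StringBuilder body = new StringBuilder();
    final Set<String> dependencies = new LinkedHashSet<>();
}

final class ToyRenderEngine {
    static RenderResult render(Pagelet root) {
        RenderResult result = new RenderResult();
        renderInto(root, result);
        return result;
    }

    private static void renderInto(Pagelet pagelet, RenderResult result) {
        // Each asset that contributes markup becomes a dependency of the Rendition.
        result.dependencies.add(pagelet.assetId);
        result.body.append(pagelet.markup);
        for (Pagelet child : pagelet.children) {
            renderInto(child, result);
        }
    }
}
```

The resulting dependency set is what would be handed to the registry together with the URI of the Rendition.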

RenditionRegistry

To keep track of the exported Renditions and their compositional dependencies, the two are mapped to each other and stored in a RenditionRegistry. The RenditionRegistry is an inverted index from each compositional dependency (an asset) to the URIs of the Renditions that use it. This allows for fast lookup of the Rendition URIs by compositional dependency. At the start of the publish, PublishApprovedAssets asks the RenditionRegistry for all the known URIs for the list of assets to publish and sends those URIs to the Notifier. The sample implementation uses Lucene for the inverted index capabilities. A relational database is another option for implementing this inverted index.
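The sample implementation uses Lucene; the sketch below swaps that for a plain in-memory map, purely to show the shape of the inverted index and the lookup. The class and method names are assumptions and do not match the attached source.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class InMemoryRenditionRegistry {
    // Inverted index: dependency (asset id) -> URIs of the Renditions that use it.
    private final Map<String, Set<String>> urisByDependency = new HashMap<>();

    // Replaces whatever was registered for this URI with the new dependency set.
    void register(String renditionUri, Collection<String> dependencies) {
        remove(renditionUri);
        for (String dependency : dependencies) {
            urisByDependency.computeIfAbsent(dependency, k -> new TreeSet<>()).add(renditionUri);
        }
    }

    // Removes the URI from every dependency entry and drops entries that become empty.
    void remove(String renditionUri) {
        Iterator<Set<String>> it = urisByDependency.values().iterator();
        while (it.hasNext()) {
            Set<String> uris = it.next();
            uris.remove(renditionUri);
            if (uris.isEmpty()) {
                it.remove();
            }
        }
    }

    // All Rendition URIs that may be stale when one of the given assets changes.
    Set<String> urisFor(Collection<String> changedAssets) {
        Set<String> result = new TreeSet<>();
        for (String asset : changedAssets) {
            result.addAll(urisByDependency.getOrDefault(asset, Collections.<String>emptySet()));
        }
        return result;
    }
}
```

Re-registering a URI first removes its old entries, so a Page that no longer includes an Article stops being reported when that Article changes.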

Notifier

The Notifier’s job is to notify the third party systems about imminent changes. It is up to the implementer to write the interaction code. The Notifier is an interface. The sample implementation does nothing.
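As a starting point for your own implementation, here is a minimal sketch of what a Notifier could look like. The method name renditionsMayChange and both implementing classes are assumptions; check the interface in the attached source for the actual signature.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Assumed shape of the interface; the one in the attachment may differ.
interface Notifier {
    void renditionsMayChange(Collection<String> renditionUris);
}

// Equivalent of a default that does nothing.
final class NoOpNotifier implements Notifier {
    @Override
    public void renditionsMayChange(Collection<String> renditionUris) {
        // intentionally empty
    }
}

// Collects URIs so that an integration can send them in one batch,
// for example as a single invalidation request to a CDN.
final class BatchingNotifier implements Notifier {
    private final List<String> pending = new ArrayList<>();

    @Override
    public void renditionsMayChange(Collection<String> renditionUris) {
        pending.addAll(renditionUris);
    }

    // Returns the collected URIs and clears the batch.
    List<String> drain() {
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        return batch;
    }
}
```

The actual call to the third party system would go where drain() is consumed; that part depends entirely on the API of that system and is left out here.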

Installation

This work has been tested on WebCenter Sites 11.1.1.8.

  1. Unzip the attached commerce-publish.zip.
  2. Load the src/main/java folder in your IDE, attach all the jars from WebCenter Sites’ WEB-INF/lib folder, compile the Java files, export them to a jar file and deploy that jar to WebCenter Sites’ WEB-INF/lib. Restart the app server if needed.
  3. Import the WebCenter Sites catalogs from src/main/elements into WebCenter Sites.
  4. Create a new publish destination in WebCenter Sites and choose ‘Export to Commerce’ as the destination type.

Now the export is ready to use. To test it, you will need to approve some assets (Web Referenceable Assets) for the newly created destination, and then publish the assets to that destination.

You should now see two new directories in Sites/11.1.1.8.0/export: cas and lucene. The lucene directory holds the inverted index; the cas directory holds the exported Renditions.

Further work

As indicated, this blog and sample code explain the concepts of how to export assets in any form to third party systems. The code shows where to integrate but not how to integrate with, for instance, a CAS Record Store from Oracle Commerce. A developer who knows how to interact with the CAS RecordStore API and the supported document formats can now easily inject fully rendered WebCenter Sites assets into Oracle Commerce.

Comments

  1. Phillip Knoll says:

    Hi Dolf,

    As mentioned in this excellent blog post, the pre-rendering of WCS assets is an excellent medium for ingestion into Endeca. What is missing here is the actual Endeca code, however I actually implemented this a while ago and would be happy to share my source code including the CAS record generation. In short, it is useful to produce a CAS record format using a flat map of properties that can be ingested corresponding to the content model in question, and generate either programmatic or XML records for the WCS crawl. This is what we are doing in our B2B Commerce POC asset and have additionally made plans to implement this approach at a customer in Canada. If you are interested in updating the code here with that component, let me know and I’ll be happy to provide it.

    • Navin Kumar says:

      Thanks Dolf for an excellent piece of work.

      Hi Philip, It will be great if you can share the missing piece here, the actual Endeca code.

      Thanks in advance. Much appreciated.

      –Navin