Improving Performance via Parallelism in Oracle Event Processing Pipelines with High-Availability

Introduction

This posting explains how to use parallelism to improve the performance of Oracle Event Processing (OEP) applications deployed in active-active high-availability (HA) configurations. Parallelism is exploited for performance gain within each server instance of an HA configuration. This is achieved by identifying sections of an application’s processing pipeline that can operate in parallel and can therefore be mapped to separate processing threads. Both pipeline parallelism and independent query parallelism are described.

Main Article

Pipeline Parallelism

A pipeline architecture has inherent concurrency because each of its stages works in parallel on different data elements flowing through it. For example, in the pipeline in figure 1, if each stage is assigned its own processing thread, the following actions can occur concurrently: the input JMS adapter reads event #3 from a JMS topic, the CQL query processor handles event #2, and the output JMS adapter writes event #1 to a queue.

 

Figure 1. OEP Pipeline with three concurrent stages

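To make the mapping concrete, the following EPN assembly fragment sketches how the three stages of figure 1 could be wired together. The component ids and the event type are hypothetical, and the sketch follows the usual OEP assembly conventions; the actual assembly file depends on the application:

<!-- Sketch of the three-stage pipeline in figure 1 (hypothetical ids and
     event type). Listener/source references wire each stage to the next. -->
<wlevs:adapter id="jmsInputAdapter" provider="jms-inbound">
    <wlevs:listener ref="inputChannel" />
</wlevs:adapter>

<wlevs:channel id="inputChannel" event-type="TradeEvent">
    <wlevs:listener ref="eventProcessor" />
</wlevs:channel>

<wlevs:processor id="eventProcessor" />

<wlevs:channel id="outputChannel" event-type="TradeEvent">
    <wlevs:source ref="eventProcessor" />
    <wlevs:listener ref="jmsOutputAdapter" />
</wlevs:channel>

<wlevs:adapter id="jmsOutputAdapter" provider="jms-outbound" />
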
Although OEP HA pipelines are limited to one thread per stage, significant performance gains can be achieved by running each stage in a separate thread, as compared to running all stages on a single thread or on fewer threads than there are pipeline stages.

A key constraint in OEP when using active-active HA (see Oracle Fusion Middleware Developer’s Guide for Oracle Event Processing 11g Release 1 (11.1.1.7) for Eclipse, section 24) is that the input streams to both the primary and the secondary instances must be identical and must maintain the same event ordering as events flow through the OEP Event Processing Network (EPN). This constraint limits the EPN topology to either a linear pipeline, starting from an input adapter and ending with an output adapter, or a tree where each node with downstream branching replicates every event to each of its branches.

The event ordering requirement also limits each stage of the EPN to a single thread. Having more than one thread in a stage, for example in an input JMS adapter, would fail to assure that the order of events entering the following stage, such as the input channel, is the same in both the primary and secondary instances.

The reason event order cannot be assured when using multiple threads on a pipeline stage is that a pipeline stage operates as a queue with multiple worker threads serving it. Since the execution time for serving each event and the thread scheduling order cannot be kept in complete alignment across the primary and secondary instances, the order of events passed to the following pipeline stage could be out of order across HA instances. The mechanism recommended in OEP best practices for assuring that the primary and secondary instances of an HA configuration receive identical input streams is a JMS topic.
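As an illustration, the fragment below sketches a JMS input adapter configuration in META-INF/wlevs/config.xml that subscribes to such a topic. The JNDI names and provider URL are hypothetical; the point is that the adapters of both the primary and the secondary instances reference the same topic and therefore see identical input streams:

<jms-adapter>
    <name>jmsInputAdapter</name>
    <!-- Hypothetical JNDI names/URL: both HA instances subscribe to this topic -->
    <jndi-provider-url>t3://jmshost:7001</jndi-provider-url>
    <connection-jndi-name>weblogic.jms.ConnectionFactory</connection-jndi-name>
    <destination-jndi-name>jms/inputTopic</destination-jndi-name>
</jms-adapter>
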

Since the processing speeds of consecutive stages can vary, buffers are used to couple stages, holding the output of one stage until the following stage can consume it. In OEP these inter-stage buffers are the EPN channels. In addition to operating as buffers, EPN channels are also the configuration mechanism for specifying the number of processing threads assigned to the stage following the channel. A channel’s buffer length and the number of threads assigned to its following stage are defined in the corresponding channel element of the EPN’s META-INF/wlevs/config.xml file by assigning values to the max-size and max-threads parameters. For example, the inputChannel element in the pipeline in figure 1 is configured as follows:

<channel>
    <name>inputChannel</name>
    <max-size>1000</max-size>       <!-- buffer length: events held between stages -->
    <max-threads>1</max-threads>    <!-- threads serving the stage after this channel -->
</channel>

For input adapters, which don’t have a preceding channel, thread assignment is done by setting the concurrentConsumers property to one in the corresponding JMS input adapter element of the META-INF/spring/MonitoracaoTransacao.xml file:

<wlevs:adapter id="jmsInputAdapter" provider="jms-inbound">
    <wlevs:listener ref="inputChannel" />
    <wlevs:instance-property name="converterBean" ref="jmsMessageConverter" />
    <!-- A single consumer thread preserves event order across HA instances -->
    <wlevs:instance-property name="concurrentConsumers" value="1" />
    <wlevs:instance-property name="sessionTransacted" value="false" />
</wlevs:adapter>

Query Parallelism

Query parallelism refers to processing stages where multiple independent queries are applied simultaneously to each event of the input stream. This is achieved with a channel that has multiple downstream elements: each event flowing through the channel is broadcast to all of them. This is illustrated in figure 2, where the input channel has five downstream processors, each running a concurrent query. As explained above, this scenario results in a pipeline tree topology, as opposed to a linear pipeline topology.

 

Figure 2. Query parallelism

Each of the five concurrent queries in figure 2 is independent of the others because each one consumes a separate copy of every event that flows out of the input channel. This configuration forks the single pipeline of the JMS input adapter followed by the input channel into five independent pipelines, each comprising a CQL query processor followed by an output channel and an HA and JMS output adapter pair.
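In the EPN assembly file, this fork is expressed by registering each branch processor as a listener of the input channel, so every event is broadcast to all of them. A minimal sketch with hypothetical ids and event type, showing only two of the five branches:

<!-- Hypothetical fan-out: every event on inputChannel is delivered to all
     registered listeners, one per branch pipeline (branches 3 to 5 omitted). -->
<wlevs:channel id="inputChannel" event-type="TradeEvent">
    <wlevs:listener ref="processor1" />
    <wlevs:listener ref="processor2" />
</wlevs:channel>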

To increase performance, each of these forked pipelines can be treated as an independent linear pipeline whose stages can be parallelized. In the example in figure 2, on each branch pipeline the CQL query processor is assigned one thread, and the HA and JMS output adapter pair is also assigned one thread.

Thread assignment for the CQL query processor stage is defined in the input channel configuration element in META-INF/wlevs/config.xml by setting the max-threads property to 5 as follows:

<channel>
    <name>inputChannel</name>
    <max-size>1000</max-size>
    <max-threads>5</max-threads>
</channel>

The max-threads value should not be larger than the number of processors fanning out from the input channel. This setting configures a pool of threads capable of handling one event simultaneously on each of the forked pipelines. The remaining stages in each pipeline are assigned one thread, as in the single linear pipeline case.
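Each branch processor carries its own query, defined in its processor element in META-INF/wlevs/config.xml. The sketch below shows one branch; the processor name and the CQL statement are illustrative only:

<processor>
    <name>processor1</name>
    <rules>
        <!-- Illustrative query: each of the five processors runs its own CQL -->
        <query id="q1"><![CDATA[
            select avg(price) as avgPrice
            from inputChannel [range 60 seconds]
        ]]></query>
    </rules>
</processor>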

In summary, even though HA OEP configurations have strong event ordering requirements that prevent parallelism within each stage, there is still end-to-end pipeline concurrency that can be effectively exploited by assigning at most one thread to each element on each of the linear pipelines in an OEP EPN.

 

 
