X

Best Practices from Oracle Development's A‑Team

Implementing Upsert for Oracle Service Cloud APIs

Gerhard Drasch
Sr. Director Oracle A-Team

Introduction

Oracle Service Cloud provides a powerful, highly scalable SOAP based batch API supporting all the usual CRUD style operations. We have recently worked with a customer who wants to leverage this API in large scale but requires the ability to have 'upsert' logic in place, i.e. either create or update data in Oracle Service Cloud (OSvC) depending on whether an object already exists in OSvC or not. At this time the OSvC API does not provide native support for upsert, but this article will show an approach to accomplish the same leveraging Oracle SOA Suite. It also provides data points regarding the overhead and the scalability in the context of high-volume interfaces into OSvC.

Main Article

Why Upsert?

One might ask why would one need upsert logic in the first place. Aside that this is common practice in some well established applications such as Siebel, there are situations where upsert capabilities come in very handy. For example, if one cannot rely on the source system feeding data into a target application to be able to tell whether some data has been provided earlier or not, it's useful to be able to determine this on the target side and take the right action. I.e. create a new record or object in the target or update an existing record/object with new data. Clearly, creating duplicate information in the target is the one thing to be avoided most.

Maintaining Cross-References

In order to determine if a particular source record has already been loaded into OSvC previously or not, cross-reference information must be maintained at some place. There are different approaches to this, depending on system capabilities this could be either in the source system, the integration middleware, or the target system. There are specific advantages for each approach, but this is outside the scope of this article. In this case we want to leverage OSvC extensibility capabilities to provide additional attributes that can hold the references to the source record in the source system. A common practice is to use a pair of attributes such as (SourceSystem,SourceSystemId) for this purpose. With the OSvC object designer it's a straightforward task to do this, e.g as shown for the Contact object below:

Custom Cross-Reference Attributes

Performance and scalability really matter in this scenario, so we have to make sure that the queries to determine if a records already exists will perform well. We will ultimately construct ROQL queries that will translate to point lookup queries in the OSvC database in order to verify is a set of (SourceSystem,SourceSystemId) pairs exist in the OSvC database. Therefore, having a custom Index on these two custom attributes will allow the OSvC database to execute such queries in a performant way avoiding full table scans. In the OSvC object designer, defining a custom index is straight-forward:

Custom Index

With that in place (after deploying the updated object to the system) we have all we need to store, maintain, and query the cross-references to the source record in OSvC. In the next section we will discuss how a this can be leveraged in a SOA implementation to realise the upsert logic.

SOA Implementation

As we are looking at a batch-style interface here with the need to process large volumes of records, it certainly does not make sense to query OSvC for each record to determine wether we need to execute a Create or Update operation for each record. Instead, as we want to process a bulk of say 100 objects in one service invocation against OSvC, we rather design it in the following way to keep round trips at a minimum:

Step 1: SOA composite receives a bulk of 100 records.

Step 2: BPEL process constructs a single ROQL query to determine for all records in one go whether they already exist in OSvC or not. This ROQL will be executed via the queryCSV API method. Running individual object queries would not scale very well for this scenario.

Step 3: BPEL constructs the bulk API payload for OSvC by combining Create and Update operations.

Step 4: BPEL invokes the OSvC batch API and processes the response e.g. for a reply to the source system.

In other words, we have two interactions with OSvC. The first one is to retrieve the cross-referencing information held in custom attributes and the second one does the actual data processing taking the cross-referencing into account.

Upsert BPEL Process

As stated previously, in Step 2 above we need to construct a single ROQL query that takes care of looking up any cross-references for the list of records currently processed by the BPEL process. This is accomplished by string concatenation adding a criteria to the ROQL where clause per record. The condition ensures that the core structure of the query 'SELECT ... FROM ... WHERE' is set for the first record while for each subsequent records it will just add another OR clause.

  <xsl:variable name="whereclause">
    <xsl:for-each select="/ns0:LoadDataCollection/ns0:LoadData">
      <xsl:choose>
        <xsl:when test="position() = 1">
          <xsl:value-of select="concat (&quot;select c.id, c.CustomFields.CO.SourceSystemId from Contact c where (c.CustomFields.CO.SourceSystem='LegacyApp1' and c.CustomFields.CO.SourceSystemId='&quot;, ns0:cdiId, &quot;')&quot; )"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="concat (&quot; or (c.CustomFields.CO.SourceSystem='LegacyApp1' and c.CustomFields.CO.SourceSystemId='&quot;, ns0:cdiId, &quot;')&quot; )"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:variable>
  <xsl:template match="/">
    <tns:QueryCSV>
      <tns:Query>
        <xsl:value-of select="$whereclause"/>
      </tns:Query>
      <tns:PageSize>10000</tns:PageSize>
      <tns:Delimiter>,</tns:Delimiter>
      <tns:ReturnRawResult>false</tns:ReturnRawResult>
      <tns:DisableMTOM>true</tns:DisableMTOM>
    </tns:QueryCSV>
  </xsl:template>

This results in a ROQL query in the following structure:

select c.id, c.CustomFields.CO.SourceSystemId
from Contact c
where (c.CustomFields.CO.SourceSystem='LegacyApp1' and c.CustomFields.CO.SourceSystemId='15964985')
or (c.CustomFields.CO.SourceSystem='LegacyApp1' and c.CustomFields.CO.SourceSystemId='15964986')
or (c.CustomFields.CO.SourceSystem='LegacyApp1' and c.CustomFields.CO.SourceSystemId='15964987')
etc.

The corresponding result from running this ROQL against OSvC using the QueryCSV operation provides us entries for all source records that already exists based on the SourceSystemId criteria. Vice versa, for non-existing references in OSvC there isn't a result record in the queryCSV response:

         <n0:QueryCSVResponse xmlns:n0="urn:messages.ws.rightnow.com/v1_2">
            <n0:CSVTableSet>
               <n0:CSVTables>
                  <n0:CSVTable>
                     <n0:Name>Contact</n0:Name>
                     <n0:Columns>ID,SourceSystemId</n0:Columns>
                     <n0:Rows>
                        <n0:Row>12466359,15964985</n0:Row>
                        <n0:Row>12466369,15964987</n0:Row>
                        <n0:Row>12466379,15964989</n0:Row>
                        <n0:Row>12466387,15965933</n0:Row>
                        <n0:Row>12466396,15965935</n0:Row>
                        <n0:Row>12466404,15965937</n0:Row>
                     </n0:Rows>
                  </n0:CSVTable>
               </n0:CSVTables>
            </n0:CSVTableSet>
         </n0:QueryCSVResponse>

So in the case of the example we can conclude that for the record referencing 15964985, it would have to be an update, while it would be a create for reference 15964986.

In the next Step 3 this result needs to be merged with the actual data to construct the payload for the OSvC Batch API. We conditionally either construct a CreateMsg or UpdateMsg structure depending on wether the previous ROQL has retrieved the source application key or not. And if it's an update, it's essential to include the OSvC object identifier in the RNObjects structure so that the API is pointed to the right object in OSvC for update.

  <xsl:template match="/">
    <ns1:Batch>
      <xsl:for-each select="/ns0:LoadDataCollection/ns0:LoadData">
        <xsl:variable name="appKey" select="ns0:appKey"/>
        <xsl:choose>
          <xsl:when test="count($InvokeLookup_QueryCSV_OutputVariable.parameters/ns1:QueryCSVResponse/ns1:CSVTableSet/ns1:CSVTables/ns1:CSVTable/ns1:Rows/ns1:Row[contains(text(),$appKey)]) = 0 ">
            <ns1:BatchRequestItem>
              <ns1:CreateMsg>
                <ns1:RNObjects xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="v13:Contact">
                  <v13:CustomFields xmlns:n3="urn:generic.ws.rightnow.com/v1_2">
                    <n3:GenericFields name="CO" dataType="OBJECT">
                      <n3:DataValue>
                        <n3:ObjectValue xsi:type="n3:GenericObject">
                          <n3:ObjectType>
                            <n3:Namespace/>
                            <n3:TypeName>ContactCustomFieldsCO</n3:TypeName>
                          </n3:ObjectType>
                          <n3:GenericFields name="SourceSystem" dataType="STRING">
                            <n3:DataValue>
                              <n3:StringValue>LegacyApp1</n3:StringValue>
                            </n3:DataValue>
                          </n3:GenericFields>
                          <n3:GenericFields name="SourceSystemId" dataType="STRING">
                            <n3:DataValue>
                              <n3:StringValue>
                                <xsl:value-of select="$appKey"/>
                              </n3:StringValue>
                            </n3:DataValue>
                          </n3:GenericFields>
                        </n3:ObjectValue>
                      </n3:DataValue>
                    </n3:GenericFields>
                  </v13:CustomFields>
                  <v13:Name>
                    <v13:First>
                      <xsl:value-of select="ns0:firstName"/>
                    </v13:First>
                    <v13:Last>
                      <xsl:value-of select="ns0:lastName"/>
                    </v13:Last>
                  </v13:Name>
                </ns1:RNObjects>
                <ns1:ProcessingOptions>
                  <ns1:SuppressExternalEvents>true</ns1:SuppressExternalEvents>
                  <ns1:SuppressRules>true</ns1:SuppressRules>
                </ns1:ProcessingOptions>
              </ns1:CreateMsg>
            </ns1:BatchRequestItem>
          </xsl:when>
          <xsl:otherwise>
            <ns1:BatchRequestItem>
              <ns1:UpdateMsg>
                <ns1:RNObjects xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="v13:Contact">
                  <ID xmlns="urn:base.ws.rightnow.com/v1_2">
                    <xsl:attribute name="id">
                      <xsl:value-of select="substring-before($InvokeLookup_QueryCSV_OutputVariable.parameters/ns1:QueryCSVResponse/ns1:CSVTableSet/ns1:CSVTables/ns1:CSVTable/ns1:Rows/ns1:Row[contains(text(),$appKey)]/text(),',')"/>
                    </xsl:attribute>
                  </ID>
                  <v13:CustomFields xmlns:n3="urn:generic.ws.rightnow.com/v1_2">
                    <n3:GenericFields name="CO" dataType="OBJECT">
                      <n3:DataValue>
                        <n3:ObjectValue xsi:type="n3:GenericObject">
                          <n3:ObjectType>
                            <n3:Namespace/>
                            <n3:TypeName>ContactCustomFieldsCO</n3:TypeName>
                          </n3:ObjectType>
                          <n3:GenericFields name="SourceSystem" dataType="STRING">
                            <n3:DataValue>
                              <n3:StringValue>LegacyApp1</n3:StringValue>
                            </n3:DataValue>
                          </n3:GenericFields>
                          <n3:GenericFields name="SourceSystemId" dataType="STRING">
                            <n3:DataValue>
                              <n3:StringValue>
                                <xsl:value-of select="$appKey"/>
                              </n3:StringValue>
                            </n3:DataValue>
                          </n3:GenericFields>
                        </n3:ObjectValue>
                      </n3:DataValue>
                    </n3:GenericFields>
                  </v13:CustomFields>
                  <v13:Name>
                    <v13:First>
                      <xsl:value-of select="ns0:firstName"/>
                    </v13:First>
                    <v13:Last>
                      <xsl:value-of select="ns0:lastName"/>
                    </v13:Last>
                  </v13:Name>
                </ns1:RNObjects>
                <ns1:ProcessingOptions>
                  <ns1:SuppressExternalEvents>true</ns1:SuppressExternalEvents>
                  <ns1:SuppressRules>true</ns1:SuppressRules>
                </ns1:ProcessingOptions>
              </ns1:UpdateMsg>
            </ns1:BatchRequestItem>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
    </ns1:Batch>

The outcome of this transformation is a list of UpdateMsg and CreateMsg elements that are passed to the OSvC API in a single invocation.

From the perspective of pushing data into the SOA layer, this now transparently provides upsert logic and cross-referencing is maintained in OSvC. The next question one might ask is what is the performance overhead of the above? Or in other words: how does the extra round trip impact the overall throughput of the interface?

Performance Analysis

In order to understand the performance implications we have tested a batch interface into OSvC to synchronise Contact information with and without this upsert logic. The below diagram illustrates the throughput, i.e. number of processed Contact objects per time period for a different set of scenarios. We were executing batches of 1 million records each time with a concurrency of 50 parallel client threads.

Upsert performance results

The first, blue bar represents the throughput when the Upsert logic is not in place, i.e. there is no extra round-trip and all 1M records translate to Create operations in OSvC. The second bar represents also 1M create operations, but this time with the upsert logic in place. It turns out that the overhead for doing the extra round trip is negligible in such as scenario as all the heavy lifting is done during the actual data processing. The fast lookup queries (<1s for a batch of 100 records) are practically irrelevant for this specific use case.

We have conducted further tests with a growing proportion of update operation as opposed to create operations. The throughput keeps increasing as there are more updates and less creates. The simple reason is that the updates in our test case were rather light (updating 5 attributes of the object) compared to the creation of the full object with a much higher number of standard and custom attributes.

Conclusion

This article has provided an approach for implementing upsert capabilities for Oracle Service Cloud APIs. We have chosen to maintain cross-referencing information in OSvC and to use Oracle SOA Suite as the integration technology. We have also provided test results indicating the performance impact of the proposed design in high-volume scenarios.

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.Captcha