
Best Practices from Oracle Development's A‑Team

Configuring Oracle Data Integrator for Oracle Big Data Cloud: High-Availability Configuration

Introduction

 

This article discusses how to configure Oracle Data Integrator (ODI) for Oracle Big Data Cloud (BDC) using the ODI Enterprise or High-Availability installation.  ODI offers out-of-the-box integration with Big Data technologies such as Apache Hadoop, Apache Spark, Apache Hive, Apache Pig, and Apache Kafka, among others.  ODI supports two distributions of Hadoop:  Hortonworks Data Platform (HDP) and Cloudera Enterprise Data Hub (CDH).  ODI can also be used on other distributions of Hadoop such as Amazon Elastic MapReduce (EMR).

For additional information on how to use ODI with BDC, go to “Using Oracle Data Integrator with Oracle Big Data Cloud.”  A pre-recorded live demonstration that supports this discussion can be found at the following Oracle Data Integration webcast: “Mastering Oracle Data Integrator with Big Data Cloud.”

 

Configuring Oracle Data Integrator for Big Data Cloud

 

To use ODI with BDC, users can install and configure ODI in one of two ways:  ODI Standalone or ODI with High-Availability.  The ODI Standalone configuration requires the installation and configuration of the ODI Standalone agent in an instance of BDC.  The ODI with High-Availability configuration is an extension of the ODI Standalone configuration, but it uses the ODI J2EE agent as an orchestrator for Big Data workloads.  The following sections of this article provide guidelines for installing and configuring ODI on BDC using the ODI High-Availability configuration.

For additional information on how to install and configure ODI for BDC using the ODI Standard configuration, go to “Configuring Oracle Data Integrator for Big Data Cloud: Standard Configuration.”

 

ODI High-Availability Configuration for Big Data Cloud

 

The ODI High-Availability configuration is an extension of the ODI Standalone configuration.  Under this configuration, both ODI agents, the ODI Standalone and the ODI J2EE agent, are installed and configured on two cloud services in order to achieve high-availability.
The ODI Standalone agent is installed and configured on an instance of BDC – at least two ODI standalone agents are recommended.  The ODI J2EE agent is installed and configured on an instance of the Oracle Java Cloud Service (JCS)  – at least two WebLogic managed servers (one ODI J2EE agent on each managed server) are recommended.  The ODI High-Availability configuration allows users to submit both Big Data and non-Big Data ODI workloads directly to the load balancer on JCS.  The load balancer distributes the ODI workloads among the J2EE agents on JCS.  If the ODI workloads are Big Data workloads, then the J2EE agent sends the Big Data workloads to the Standalone agent on BDC for execution.  Figure 1, below, illustrates the ODI High-Availability configuration for Big Data Cloud.

The ODI High-Availability configuration for BDC, shown in Figure 1, below, requires an on-premises license of Oracle Data Integrator for Big Data.  Thus, users must download the ODI installer from the Oracle Middleware Data Integrator Download Site.  The ODI Cloud Service (ODICS) found on Java Cloud Service cannot be used for this configuration.

 

 

Figure 1 – Configuring ODI High-Availability for Big Data Cloud
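The request flow described in this section can be modeled with a small Python sketch. It is only a toy model of the routing logic and does not use any ODI or Oracle Traffic Director API. The class names, the agent names, and the round-robin policy are all assumptions made for illustration; the load balancer's actual distribution policy depends on how it is configured.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Iterator, List


@dataclass
class Workload:
    name: str
    big_data: bool  # True for workloads that must run on BDC


@dataclass
class J2EEAgent:
    """Hypothetical stand-in for an ODI J2EE agent on a JCS managed server."""
    name: str
    standalone_agents: Iterator[str]  # Standalone agents on BDC (assumed round-robin)

    def run(self, workload: Workload) -> str:
        if workload.big_data:
            # Big Data workloads are handed to a Standalone agent on BDC.
            target = next(self.standalone_agents)
            return f"{workload.name}: {self.name} -> {target}"
        # Non-Big Data workloads are executed by the J2EE agent itself.
        return f"{workload.name}: {self.name}"


class LoadBalancer:
    """Hypothetical stand-in for the JCS load balancer (round-robin assumed)."""

    def __init__(self, agents: List[J2EEAgent]) -> None:
        self._agents = cycle(agents)

    def submit(self, workload: Workload) -> str:
        return next(self._agents).run(workload)
```

With two J2EE agents sharing a pool of two Standalone agents, consecutive submissions alternate between the J2EE agents, and only the Big Data workloads are forwarded to BDC.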


 

To install and configure ODI for BDC, using the ODI High-Availability configuration, follow these steps:

 

Provision an Instance of Database Cloud Service (DBCS)

To host an ODI repository, users must provision an instance of a SQL database.  Oracle Cloud offers MySQL Cloud Service and Oracle Database Cloud Service.  The following instructions use the Oracle Database Cloud Service to host the ODI repository:

  • Provision an instance of DBCS to host the ODI repository. For information on how to provision an instance of DBCS, go to “Getting Started with Database Cloud Service (DBCS).”
  • Enable DBCS access rules on the new instance of DBCS to install the ODI repository. The ODI repository installation should be done from Compute Classic or BDC; thus, users must enable rules on DBCS to allow the installation of the ODI repository from a remote location such as Compute Classic or BDC.  For additional information on how to enable access rules on DBCS, go to “Using Oracle Database Cloud Service: Access Rules.”
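Before installing the ODI repository, it can help to confirm that the access rules actually let the remote host reach the database listener. The following is a minimal sketch using only the Python standard library; it checks TCP reachability and nothing more, so it does not validate database credentials or the service name. The host name in the usage note is hypothetical.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run from the Compute Classic or BDC host, a call such as `can_connect("dbcs.example.com", 1521)` should return True once the rule is in place. Port 1521 is the default Oracle Net listener port; use whichever port your DBCS instance actually exposes.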

Provision the Compute Classic Cloud Service Instance

ODI Studio is the user interface that ODI offers for ETL development.  It is recommended to perform this development on a compute resource such as Oracle Compute Classic.  Installing ODI Studio on Oracle Compute Classic keeps the ODI Studio installation independent of the ODI agent installation, and makes it easier to scale as more developers join the ETL project.  Use the following instructions to provision an instance of Oracle Compute Classic and install the ODI Studio on this instance:

  • Create an instance of Compute Classic Cloud Service. For information on how to provision a new instance of Compute Classic Cloud Service, go to “Compute Classic: Create and Manage Instances.”
  • Download the ODI installer from the Oracle Middleware Data Integrator Download Site, and copy the ODI installer into the new Compute Classic instance.
  • Install ODI on the new Compute Classic instance, and create an ODI repository – use the new instance of DBCS to create the ODI repository. For information on how to install ODI and create a new ODI repository, go to “Installing Oracle Data Integrator.”
  • Use the new Compute Classic instance for all your ODI development work.

Provision an Instance of Java Cloud Service (JCS)

To configure ODI for high-availability, use the following instructions to provision an instance of JCS and install the ODI J2EE agents:

  • Before provisioning a new instance of JCS, ensure that the WebLogic and Java versions available in JCS are compatible and certified with the ODI version that is available on the Oracle download site. For a list of supported configurations, go to “Oracle Fusion Middleware Supported System Configurations.”
  • Provision an instance of JCS to host the ODI J2EE agent and the High-Availability environment. For information on how to create a new instance of JCS, go to “Oracle Java Cloud Service.”
  • The subscription of JCS requires an instance of Database Cloud Service (DBCS). To host the JCS metadata, you can use the same instance of DBCS that was provisioned in the section “Provision an Instance of Database Cloud Service (DBCS).”
  • When creating the new JCS instance, add the Load Balancer option – Oracle Traffic Director – and provision the JCS instance with at least one load balancer.
  • Download ODI from the Oracle Middleware Data Integrator Download Site, and copy the ODI installer into JCS.  Make sure you download the same version of the ODI installer used in other sections of this article – the installer must always be the same one.
  • On JCS, install ODI and use the ODI Fusion Middleware Configuration wizard to modify the WebLogic domain. Create two new WebLogic managed servers to host the ODI J2EE agents.  For information on how to configure WebLogic managed servers for ODI, go to “Configuring the WebLogic Domain for the ODI Java EE Agent.”
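Because the same ODI installer must be used on every host, one simple way to confirm that the copies match is to compare file checksums. The sketch below uses SHA-256 from the Python standard library; the function names are hypothetical, and this is one possible check rather than an Oracle-prescribed procedure.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_installer(reference_digest: str, path: Path) -> bool:
    """Compare a local installer file against a digest recorded on another host."""
    return sha256_of(path) == reference_digest.strip().lower()
```

Compute the digest once on the host where the installer was first downloaded, then call `same_installer` with that digest on JCS and on any other host that receives a copy.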

Provision the Big Data Cloud Instance (BDC)

Use the following instructions to provision an instance of BDC and install the ODI Standalone agent:

Configuring Access Between Cloud Services

The ODI High-Availability configuration requires additional access rules between cloud instances, so the instances can communicate with each other.  For instance, the ODI Standalone agent, on BDC, must access the ODI repository on DBCS.  Also, the ODI J2EE agent must access the BDC instance in order to orchestrate Big Data workloads with the ODI Standalone agent.  Thus, follow these instructions in order to configure additional access rules between cloud instances:

  • On DBCS, enable access rules to allow JCS, Compute Classic, and BDC to access the ODI repository.
  • On JCS, enable access rules to access BDC.
  • On Compute Classic, enable access rules to access JCS, DBCS, and BDC.
  • On BDC, enable access rules to access the ODI repository on DBCS. 

Once the access rules have been configured, users can launch the ODI Studio on Compute Classic, and start their ETL development work.
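The access rules listed in this section can be written down as a small source-to-target matrix, which makes it easier to check that none has been missed. The sketch below is an illustration only: the service labels are plain strings taken from this article, and `probe` is a hypothetical callable that you would supply (for example, a TCP reachability test run from each source instance).

```python
from typing import Callable, Dict, List, Set, Tuple

# Source service -> services it must be able to reach, per the list above.
REQUIRED_ACCESS: Dict[str, Set[str]] = {
    "JCS": {"DBCS", "BDC"},
    "Compute Classic": {"JCS", "DBCS", "BDC"},
    "BDC": {"DBCS"},
}


def missing_rules(
    probe: Callable[[str, str], bool],
    required: Dict[str, Set[str]] = REQUIRED_ACCESS,
) -> List[Tuple[str, str]]:
    """Return the (source, target) pairs for which probe reports no access."""
    return sorted(
        (source, target)
        for source, targets in required.items()
        for target in targets
        if not probe(source, target)
    )
```

An empty result means every required path is open; any pair returned points at an access rule that still needs to be enabled on the target service.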

 

Conclusion

 

ODI offers out-of-the-box integration with Big Data technologies such as Apache Hadoop, Apache Spark, Apache Hive, Apache Pig, and Apache Kafka, among others.  ODI supports two distributions of Hadoop:  Hortonworks Data Platform (HDP) and Cloudera Enterprise Data Hub (CDH).  ODI can also be used on other distributions of Hadoop such as Amazon Elastic MapReduce (EMR).  This article discussed how to configure ODI with BDC using the ODI High-Availability configuration.

For more Oracle Data Integrator best practices, tips, tricks, and guidance that the A-Team members gain from real-world experiences working with customers and partners, visit “Oracle A-Team Chronicles for Oracle Data Integrator (ODI).”

 

ODI Related Articles

Using Oracle Data Integrator with Oracle Big Data Cloud

Configuring Oracle Data Integrator for Big Data Cloud: Standard Configuration

Configuring Oracle Data Integrator for Big Data Cloud: Environment Configuration

Configuring Oracle Data Integrator for Big Data Cloud: Topology Configuration

Webcast: “Mastering Oracle Data Integrator with Big Data Cloud Service - Compute Edition.”

 
