Best Practices from Oracle Development's A‑Team

Configuring Oracle Data Integrator for Oracle Big Data Cloud: Standalone Configuration

Introduction

 

This article discusses how to configure Oracle Data Integrator (ODI) for Oracle Big Data Cloud (BDC) using the ODI Standalone installation. ODI offers out-of-the-box integration with Big Data technologies such as Apache Hadoop, Apache Spark, Apache Hive, Apache Pig, and Apache Kafka, among others. ODI supports the Hortonworks Data Platform (HDP) and Cloudera Enterprise Data Hub (CDH) distributions of Hadoop, and it can also be used on other Hadoop distributions such as Amazon Elastic MapReduce (EMR).

For additional information on how to use ODI with BDC, go to “Using Oracle Data Integrator with Oracle Big Data Cloud.”  A pre-recorded live demonstration that supports this discussion can be found at the following Oracle Data Integration webcast: “Mastering Oracle Data Integrator with Big Data Cloud.”

 

Configuring Oracle Data Integrator for Big Data Cloud

 

To use ODI with BDC, users can install and configure ODI in one of two ways: ODI Standalone or ODI with High-Availability. The ODI Standalone configuration requires installing and configuring the ODI Standalone agent on an instance of BDC. The ODI with High-Availability configuration extends the ODI Standalone configuration by using the ODI J2EE agent as an orchestrator for Big Data workloads. The following sections of this article provide guidelines for installing and configuring ODI on BDC using the ODI Standalone configuration.

For additional information on how to install and configure ODI for BDC using the ODI with High-Availability configuration, go to “Configuring Oracle Data Integrator for Big Data Cloud: High-Availability Configuration.”

 

ODI Standalone Configuration for Big Data Cloud

 

The ODI Standalone Configuration for Big Data requires that the ODI Standalone agent be hosted on the BDC cluster. The ODI Standalone agent is installed and configured as a standalone, lightweight Java application, and it is hosted on the master node of the BDC cluster. The ODI Standalone agent uses an ODI repository installed on an instance of the Oracle Database Cloud Service (DBCS). Figure 1, below, illustrates this configuration:

 

Figure 1 – Configuring ODI Standalone for Big Data Cloud

 

To install and configure ODI for BDC, using the ODI Standalone configuration, follow these steps:

 

Provision an Instance of Database Cloud Service (DBCS)

To host an ODI repository, users must provision an instance of a SQL database.  Oracle Cloud offers MySQL Cloud Service and Oracle Database Cloud Service.  The following instructions use the Oracle Database Cloud Service to host the ODI repository:

  • Provision an instance of DBCS to host the ODI repository. For information on how to provision an instance of DBCS, go to “Getting Started with Database Cloud Service (DBCS).”
  • Enable DBCS access rules on the new instance of DBCS to install the ODI repository. The ODI repository installation should be done from Compute Classic or BDC; thus, users must enable rules on DBCS to allow the installation of the ODI repository from a remote location such as Compute Classic or BDC.  For additional information on how to enable access rules on DBCS, go to “Using Oracle Database Cloud Service: Access Rules.”

Provision the Compute Classic Cloud Service Instance

ODI Studio is the user interface that ODI offers for ETL development. We recommend performing this development on a compute resource such as Oracle Compute Classic. Installing ODI Studio on Oracle Compute Classic keeps the ODI Studio installation independent of the ODI agent installation, and it makes the environment easier to scale when more developers join the ETL project. Use the following instructions to provision an instance of Oracle Compute Classic and install ODI Studio on this instance:

  • Create an instance of Compute Classic Cloud Service. For information on how to provision a new instance of Compute Classic Cloud Service, go to “Compute Classic: Create and Manage Instances.”
  • Download the ODI installer from the Oracle Middleware Data Integrator Download Site, and copy the ODI installer into the new Compute Classic instance.
  • Install ODI on the new Compute Classic instance, and create an ODI repository – use the new instance of DBCS to create the ODI repository. For information on how to install ODI and create a new ODI repository, go to “Installing Oracle Data Integrator.”
  • Use the new Compute Classic instance for all your ODI development work.
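
When the repository is created on DBCS and ODI Studio later connects to it, the connection is typically described by a JDBC URL. The following is a minimal sketch, not taken from the installation guide, that assembles a URL in the Oracle thin-driver `//host:port/service_name` form. The function name `repository_jdbc_url` is hypothetical, and the host, port, and service name shown in the usage note are placeholders that must be replaced with the values of your own DBCS instance.

```python
def repository_jdbc_url(host: str, port: int, service_name: str) -> str:
    """Build an Oracle thin-driver JDBC URL of the form
    jdbc:oracle:thin:@//host:port/service_name.

    All three arguments are assumptions supplied by the caller; this helper
    only validates and formats them, it does not contact the database.
    """
    if not host or not service_name:
        raise ValueError("host and service_name are required")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"jdbc:oracle:thin:@//{host}:{port}/{service_name}"
```

For example, `repository_jdbc_url("dbcs-host.example.com", 1521, "PDB1.example.com")` produces a URL for a placeholder DBCS host; port 1521 is the conventional Oracle listener port, but your instance may be configured differently.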

Provision the Big Data Cloud Instance (BDC)

Use the following instructions to provision an instance of BDC and install the ODI Standalone agent:

Configuring Access Between Cloud Services

The ODI Standalone configuration requires additional access rules between cloud instances so that the instances can communicate with each other. For example, the ODI Standalone agent on BDC must access the ODI repository on DBCS. Follow these instructions to configure access rules between cloud instances:

  • On DBCS, enable access rules to allow Compute Classic and BDC to access the ODI repository.
  • On Compute Classic, enable access rules to access DBCS and BDC.
  • On BDC, enable access rules to access the ODI repository on DBCS.

Once the access rules have been configured, users can launch the ODI Studio on Compute Classic, and start their ETL development work.
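
One hedged way to confirm that the access rules are in effect is to attempt a plain TCP connection from one instance to another, for example from the BDC master node to the database listener on DBCS. The sketch below uses only the Python standard library; the helper names `can_reach` and `check_targets` are hypothetical, and any host names and ports passed to them are assumptions that depend on your own instances. A successful connection shows only that the network path is open, not that the ODI repository credentials are valid.

```python
import socket
from typing import Dict, Tuple


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the 'with' block closes the socket immediately afterwards.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_targets(targets: Dict[str, Tuple[str, int]],
                  timeout: float = 3.0) -> Dict[str, bool]:
    """Check each named (host, port) pair and report whether it is reachable."""
    return {name: can_reach(host, port, timeout)
            for name, (host, port) in targets.items()}
```

Run from the BDC master node, a call such as `check_targets({"ODI repository on DBCS": ("dbcs-host.example.com", 1521)})` (placeholder host, conventional listener port) returns a dictionary that maps each label to `True` or `False`. A `False` result suggests that an access rule is missing or that the host or port is wrong.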

 

Conclusion

 

ODI offers out-of-the-box integration with Big Data technologies such as Apache Hadoop, Apache Spark, Apache Hive, and Apache Pig, among others. ODI supports the Hortonworks Data Platform (HDP) and Cloudera Enterprise Data Hub (CDH) distributions of Hadoop, and it can also be used on other Hadoop distributions such as Amazon Elastic MapReduce (EMR). This article discussed how to configure ODI with BDC using the ODI Standalone configuration.

For more Oracle Data Integrator best practices, tips, tricks, and guidance that the A-Team members gain from real-world experiences working with customers and partners, visit “Oracle A-Team Chronicles for Oracle Data Integrator (ODI).”

 

ODI Related Articles

Using Oracle Data Integrator with Oracle Big Data Cloud

Configuring Oracle Data Integrator for Big Data Cloud: High-Availability Configuration

Configuring Oracle Data Integrator for Big Data Cloud: Environment Configuration

Configuring Oracle Data Integrator for Big Data Cloud: Topology Configuration

Webcast: “Mastering Oracle Data Integrator with Big Data Cloud Service - Compute Edition.”

 

 
