Using Oracle Partition Exchange with Oracle Data Integrator (ODI)

December 17, 2014 | 18 minute read

Validation

Content validated on 5/5/2021 with
  • ODI Version 12.2.1.4.200721.1540
  • Oracle DB System 19.7.0.0

Introduction

 

This article presents a data integration recipe on how to use Oracle Partition Exchange with Oracle Data Integrator (ODI) to upload data very quickly into the partitioned tables of a large Oracle data warehouse.

This article features a new ODI knowledge module (KM), the IKM Oracle Partition Exchange Load, which uses the Oracle Partition Exchange technology to upload data, by partition or subpartition, into a partitioned table of an Oracle database.

This KM supports three partitioning strategies:  range, hash, and list.  Single-level partitioning and composite partitioning (sub-partitions) are supported by this KM as well.

Additionally, the KM includes options to support table maintenance operations that may be required before or after a partition exchange operation.  Some of these KM options include flow control, disabling and enabling table constraints, rebuilding local and global indexes, and gathering table statistics.

If you would like to download a copy of this knowledge module, go to Oracle Data Integrator Code Samples and download "IKM Oracle Partition Exchange Load."  Additionally, an ODI 12.1.3 repository with examples can be found at “ODI Repository Sample for Oracle Partition Exchange.”

For a detailed description of how to use the KM options, and how to configure ODI with Oracle Partition Exchange, go to “Configuring Oracle Data Integrator (ODI) with Oracle Partition Exchange.”

If your Oracle data warehouse has partitioned tables, Oracle Partition Exchange offers a fast method for uploading data into your large partitioned tables.

 

Using Oracle Partition Exchange with Oracle Data Integrator (ODI)

 

Oracle Partitioning is a feature of the Oracle database that allows the decomposition of very large database objects – such as tables and indexes – into smaller and more manageable pieces called partitions.

When tables and indexes are partitioned, users and applications can query and manage data by partition or sub-partition; thus, the database can return and process data much faster than if the entire table or index is scanned.
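
To make this concrete, here is a minimal sketch of a range-partitioned table and a query that benefits from partition pruning.  The table and column names are hypothetical:

    CREATE TABLE sales (
      sale_id   NUMBER,
      sale_date DATE,
      amount    NUMBER(12,2)
    )
    PARTITION BY RANGE (sale_date) (
      PARTITION jul2014 VALUES LESS THAN (DATE '2014-08-01'),
      PARTITION aug2014 VALUES LESS THAN (DATE '2014-09-01')
    );

    -- The filter on the partition key lets the database scan only JUL2014
    SELECT SUM(amount)
    FROM   sales
    WHERE  sale_date >= DATE '2014-07-01'
    AND    sale_date <  DATE '2014-08-01';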

Oracle Partition Exchange is the ability to swap the data segment of a non-partitioned table with a partition of a partitioned table, or vice versa.   The benefit of Oracle Partition Exchange is that the exchange process is a data definition language (DDL) operation with no actual data movement; thus, the exchange operation is immediate – it only takes seconds.
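
In its simplest form, the exchange is a single ALTER TABLE statement.  The sketch below assumes the hypothetical SALES table above and a staging table, SALES_STAGE, with an identical column structure:

    -- Swap the JUL2014 data segment with the staging table; this is a
    -- data dictionary update only, so no rows are physically moved
    ALTER TABLE sales
      EXCHANGE PARTITION jul2014
      WITH TABLE sales_stage;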

On very large databases (VLDBs) such as Oracle data warehouses, Oracle Partition Exchange can facilitate high-speed uploads of new and incremental data into partitioned objects such as facts, cubes, and large dimension tables.

This article presents a new KM called the IKM Oracle Partition Exchange Load.  This new KM allows users to create and design ODI mappings that can take advantage of the Oracle Partition Exchange technology.

ODI mappings can now exchange or swap the content of a source dataset with a partition of a partitioned table.  The IKM Oracle Partition Exchange Load replaces traditional data upload techniques – such as insert, update, and merge operations – used in other ODI Integration KMs (IKMs) that have been designed for the Oracle technology.

This article focuses on how to use the IKM Oracle Partition Exchange Load with Oracle Data Integrator to accelerate the speed of your data uploads for your very large Oracle partitioned tables.

For more information on the benefits of using Oracle Partitioning on very large databases, see Oracle Database VLDB and Partitioning Guide and Using Oracle Partitioning in a Data Warehouse Environment.

For more information on how to use Oracle Partition Exchange, see section Exchanging Partitions of the Oracle Database VLDB and Partitioning Guide.

 

Benefits of Using Oracle Partition Exchange

 

Oracle Partition Exchange is a great database tool that can be used in data integration activities to upload data very fast into partitioned tables of an Oracle database.

Table 1 summarizes some of the most important benefits of using Oracle Partition Exchange for data integration activities:

 

 

Table 1 - Benefits of Using Oracle Partition Exchange


 

When is Oracle Partition Exchange a Suitable Option?

 

Table 2 shows examples where Oracle Partition Exchange can be a suitable option for an ELT data integration environment.

Table 2 - When is Oracle Partition Exchange a Suitable Option?


 

IKM Oracle Partition Exchange Load: Overview

 

The IKM presented in this article, the IKM Oracle Partition Exchange Load, uses the Oracle Partition Exchange technology to upload data into a partition of a partitioned table.

In an ODI mapping, the partitioned table is the target datastore of a data flow.  The source data for the partitioned table comes from the ODI flow table, also known as the ODI integration (I$) table.  The source data for the ODI flow table comes from the source dataset defined in the ODI mapping.  The ODI flow table holds the transformed data for the partitioned table.

In most Oracle-based IKMs, the data in the ODI flow table is added to the target table by issuing additional database operations such as insert, update, and merge statements.  These additional database operations require more table space, memory, and database resources, thus increasing the overall time it takes to run the ELT data integration activities.

The knowledge module presented in this article exchanges the partition of a partitioned table with the ODI flow table; thus, eliminating the need for additional database operations during the seeding of the partitioned table.

Figure 1 shows how the IKM Oracle Partition Exchange Load performs the partition exchange operation for a table called Orders Fact. This fact table has been partitioned by month.

 

Figure 1 - IKM Oracle Partition Exchange Load – Sample Load


 

In the above example, a staging table called Orders is joined with three warehouse dimensions – Customer, Product, and Status – and a filter is used to select data for the month of July 2014.  At a minimum, the knowledge module performs the following three tasks:

 

  • Joins the source datastores, applies the data filter, and transforms the source data.
  • Loads the transformed data into the ODI flow table.
  • Exchanges partition JUL2014 with the ODI flow table.
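
Expressed as plain SQL, these three tasks are roughly equivalent to the following sketch.  All object names are hypothetical, and the actual KM generates its own flow (I$) table name and column mappings:

    -- 1) and 2) Join the sources, filter on July 2014, and load the flow table
    CREATE TABLE i$_orders_fact AS
    SELECT o.order_id, c.customer_key, p.product_key, s.status_key,
           o.order_date, o.amount
    FROM   orders o
    JOIN   customer c ON c.customer_id = o.customer_id
    JOIN   product  p ON p.product_id  = o.product_id
    JOIN   status   s ON s.status_id   = o.status_id
    WHERE  o.order_date >= DATE '2014-07-01'
    AND    o.order_date <  DATE '2014-08-01';

    -- 3) Exchange partition JUL2014 with the flow table
    ALTER TABLE orders_fact
      EXCHANGE PARTITION jul2014
      WITH TABLE i$_orders_fact;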

Some of the knowledge module tasks – such as disabling and enabling constraints, rebuilding indexes, and collecting statistics – are optional.  These optional tasks can be controlled by the user via the knowledge module options.  The next section of this article presents an overview of the KM tasks.

 

IKM Oracle Partition Exchange Load:  Tasks

 

Figure 2 shows a list of KM tasks.  All tasks are based on the Oracle technology.

 

Figure 2 - IKM Oracle Partition Exchange Load Tasks


 

Some of the knowledge module tasks above – such as Inserting data into flow table – are mandatory, and they are always executed by the knowledge module.  Other knowledge module tasks – such as Flow control – are optional, and they are controlled by the knowledge module options.

Although mandatory tasks are always executed by the knowledge module, they may not have any effect on the overall execution of the data flow.  For instance, if the partitioned table does not have a primary key, the Create PK on flow table task will not add a primary key to the flow table.

Only one task, Drop flow table, is configured to ignore errors.  All other tasks will stop running if errors are detected during the execution of the KM.

 

 

IKM Oracle Partition Exchange Load:  Execution Flow

 

Figure 3 shows the execution flow of the IKM Oracle Partition Exchange Load.  A summary of the KM execution flow follows; a SQL sketch of the main steps appears after the list:

 

  • The KM enables parallel DML (data manipulation language) for the current database session.  Thus, the Oracle database will attempt to execute all DML operations of the KM in parallel.
  • If the user chooses to enable incremental statistics on the partitioned table, the KM modifies the statistics preference of the partitioned table to incremental.  Also, if the user chooses to publish statistics on the partitioned table, the publish preference of the partitioned table is set to true.
  • If the user chooses to disable the constraints of the partitioned table, the constraints are disabled before creating the flow table.  If the flow table exists, the knowledge module drops it.
  • The flow table is created using the same table structure of the target partitioned table.  If the partitioned table has sub-partitions, the flow table is created with partitions that resemble the sub-partitions.
  • Data is inserted into the flow table by joining the source datastores, applying the data filters, and transforming the source data.
  • If the user chooses to activate flow control, the KM invokes the check knowledge module (CKM) for the Oracle technology (CKM Oracle) to validate the data in the flow table.  Invalid rows will be copied into error tables and removed from the flow table.
  • Once data is validated by the CKM, the constraints of the partitioned table are added into the flow table.  This includes primary keys, foreign keys, unique keys, and check constraints.  If the partitioned table has local indexes, they are added into the flow table as well.
  • If the user chooses to lock the partition of the partitioned table, the partition is locked before the exchange operation.  The partition exchange operation is performed by swapping a specified partition with the flow table.
  • If the user chooses to rebuild local or global indexes on the partitioned table, unusable indexes are rebuilt after the partition exchange operation.  If the user chooses to enable the constraints of the partitioned table, the constraints are enabled after the rebuild of the indexes.
  • If the user chooses to gather table statistics, statistics are gathered for the partitioned table.  Incremental statistics are gathered if the user enabled incremental statistics on the partitioned table.
  • Finally, the flow table is dropped, and the parallel DML session is disabled.
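
The following sketch shows what this execution flow looks like in plain SQL.  It is a simplified illustration with hypothetical object names, not the exact statements generated by the KM:

    ALTER SESSION ENABLE PARALLEL DML;

    -- Optional: set incremental and publish statistics preferences
    EXEC DBMS_STATS.SET_TABLE_PREFS(USER, 'W_ORDERS_F', 'INCREMENTAL', 'TRUE');
    EXEC DBMS_STATS.SET_TABLE_PREFS(USER, 'W_ORDERS_F', 'PUBLISH', 'TRUE');

    -- Optional: disable constraints before loading (hypothetical constraint name)
    ALTER TABLE w_orders_f MODIFY CONSTRAINT w_orders_f_fk1 DISABLE;

    -- Create and load the flow table, add constraints and local indexes,
    -- then perform the exchange
    ALTER TABLE w_orders_f
      EXCHANGE PARTITION jul2014
      WITH TABLE i$_w_orders_f;

    -- Optional: rebuild unusable indexes, re-enable constraints, gather statistics
    ALTER INDEX w_orders_f_idx1 REBUILD;
    ALTER TABLE w_orders_f MODIFY CONSTRAINT w_orders_f_fk1 ENABLE;
    EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'W_ORDERS_F');

    DROP TABLE i$_w_orders_f PURGE;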

 

Figure 3 - IKM Oracle Partition Exchange Load Execution Flow


 

 

IKM Oracle Partition Exchange Load:  Options

 

Figure 4 shows a list of the KM options with default values.

 

Figure 4 - IKM Oracle Partition Exchange Load Options


 

The default values for the KM options are as follows:

 

  • The Partition Name option uses the ODI getPop method to get the partition name from the Partition/Sub-Partition option – a property of the datastore in the logical design of the ODI mapping.
  • The Exchange Type option has been set to PARTITION; thus, the partition exchange operation will be done at the partition level.  The user can modify this value to perform the exchange operation of a subpartition as well – by setting the value to SUBPARTITION.
  • The Degree of Parallelism option has been set to PARALLEL; thus, the Oracle database will determine the optimal degree of parallelism to use when loading data into the flow table.
  • The Select Optimizer Hint option has no value, but the user can specify an optimizer hint to speed up the query that selects data from the source datastores.
  • The Flow Control option has been set to true; thus, the data in the flow table will be validated before the partition exchange operation.
  • The Lock Partition option has been set to false.  The partition will not be locked before the exchange operation.
  • The Partition Exchange Options have been configured to perform the exchange operation with validation; thus, Oracle will ensure that the data to be exchanged belongs in the partition you exchange.
  • The Delete Temporary Objects option has been set to true.  All temporary objects will be deleted once the KM completes its execution successfully.
  • All other KM options have been set to false.
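
For illustration, the exchange DDL implied by these defaults would look roughly like the statements below.  Object names are hypothetical:

    -- Exchange Type = PARTITION, Partition Exchange Options = WITH VALIDATION
    ALTER TABLE w_orders_f
      EXCHANGE PARTITION jul2014
      WITH TABLE i$_w_orders_f
      WITH VALIDATION;

    -- With Exchange Type = SUBPARTITION, the KM exchanges at the subpartition level
    ALTER TABLE w_orders_f
      EXCHANGE SUBPARTITION jul2014_east
      WITH TABLE i$_w_orders_f
      WITH VALIDATION;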

All these options can be configured, changed, and customized by the user.  Some of these options can be configured with ODI variables as well.

 

Loading Partitioned Tables with Oracle Partition Exchange and ODI

 

Using Oracle Partition Exchange to perform the data upload operation of large partitioned tables offers tremendous performance benefits.  Since the partition exchange operation only involves data dictionary updates, Oracle Partition Exchange is the fastest method for uploading data into large partitioned tables.

Oracle Partition Exchange can be used to orchestrate both the initial and the incremental data upload operations of partitioned tables in a very large data warehouse.

The following sections discuss how to use Oracle Partition Exchange with ODI to perform both the initial and the incremental data upload operations of large partitioned tables.

 

Initial Load of a Partitioned Table with Oracle Partition Exchange and ODI

 

Oracle Partition Exchange is a great tool for performing the initial data upload operation of a large partitioned table.  For instance, the initial upload operation of a large partitioned table can be orchestrated in parallel, and multiple partitions can be loaded and exchanged asynchronously.

This section illustrates how to design and orchestrate the initial data upload operation of a large partitioned table with ODI and the IKM Oracle Partition Exchange Load.

 

Initial Load of a Partitioned Table: The Logical Design

 

The ODI flow-based mapping editor offers great features that can be used to design the initial data upload operation of a partitioned table.  ODI components such as filters, joins, lookups, and datasets can be used to upload data into partitioned tables.

Figure 5 illustrates the logical design of an ODI mapping that uploads data by partition into a warehouse fact partitioned table called W_ORDERS_F.  This mapping is designed to perform the initial data upload operation for this partitioned table (W_ORDERS_F).

Figure 5 - ODI Mapping - Logical Design of an Initial Load


 

The ODI mapping, above, includes components such as datasets, lookups, and filters to select data from a group of source tables.  The target datastore, W_ORDERS_F, is the partitioned table, and it has been partitioned by month.

The source table called ORDERS is joined with three warehouse dimensions – W_CUSTOMER_D, W_PRODUCT_D, and W_STATUS_D.  The customer dimension, W_CUSTOMER_D, is a Type-2 slowly changing dimension, and it has a filter to select the current record of a given customer.  The condition for the filter is defined in an ODI variable, #A_TEAM.DefaultType2CurrentFlag.  The value for this ODI variable is set to 1 – only records with a current flag equal to 1 are selected.

The ORDERS table has a filter to select data by month:  FILTER1.  Thus, the upload operation can be orchestrated by month.  The filter uses the month of the partition to be exchanged.    This guarantees that the selected source dataset belongs to the partition to be exchanged.

#A_TEAM.DefaultType2CurrentFlag, #A_TEAM.PARTITION_MONTH, and #A_TEAM.PartitionDateFormat are ODI variables that can be refreshed in an ODI package.  Section “Initial Load of a Partitioned Table: The ELT Orchestration” discusses the design of the ODI package and how the ODI variables are refreshed by this package.
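
As an illustration, the condition of FILTER1 could be written with these ODI variables as shown below.  The exact expression depends on the mapping; this sketch assumes the partition names follow a MONYYYY pattern such as JUL2014:

    TO_CHAR(ORDERS.ORDER_DATE, '#A_TEAM.PartitionDateFormat') = '#A_TEAM.PARTITION_MONTH'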

 

Initial Load of a Partitioned Table: The Physical Design

 

ODI 12c offers the capability to design multiple implementations of the same ODI mapping – this is known as ODI Physical Deployment Specifications.

Figure 6 illustrates an ODI mapping with three physical deployment specifications:  First Partition Load, Next Partition Load, and Last Partition Load.  These three deployment specifications use the IKM Oracle Partition Exchange Load to perform the initial upload operation of the partitioned table (W_ORDERS_F).

 

Figure 6 - ODI Mapping – Three Physical Deployment Specifications


 

Figure 7 shows the knowledge module options for the first deployment specification, First Partition Load.

 

Figure 7 - ODI Deployment Specification – First Partition Load


 

The first deployment specification (First Partition Load), Figure 7 above, performs initial table maintenance operations such as disabling table constraints, enabling incremental statistics, and enabling publish statistics on the partitioned table.

Flow control is activated to validate the source dataset against the constraints of the partitioned table.  The data validation is done by the Oracle check knowledge module (CKM Oracle).  If invalid records are found in the flow table, they are copied into an error table and removed from the flow table before the exchange operation.

The logical design of this mapping, shown in Figure 5, includes a filter to select data by month – using the month of the partition.  Hence, the exchange operation can be done without validation, since the source dataset is guaranteed to belong to the partition to be exchanged.

This deployment specification loads the first partition, and it rebuilds the partitioned or local indexes of the first exchanged partition.  Figure 8 shows the knowledge module options for the next deployment specification, Next Partition Load.

 

Figure 8 - ODI Deployment Specification – Next Partition Load


 

The next deployment specification (Next Partition Load), Figure 8 above, performs the partition exchange operations for all subsequent partitions, except the last partition of the partitioned table.

Flow control is also used in this deployment specification.  Partitioned or local indexes are rebuilt after every partition exchange load.  Figure 9 shows the knowledge module options for the last deployment specification, Last Partition Load.

 

Figure 9 - ODI Deployment Specification – Last Partition Load


 

The last deployment specification (Last Partition Load), Figure 9 above, loads the last partition of the partitioned table.  Flow control is also used in this deployment specification.

Additional table maintenance operations are performed with this deployment specification:   local and global indexes are rebuilt, table constraints are enabled, and incremental statistics are gathered for the partitioned table.  All three deployment specifications use an ODI variable called #A_TEAM.PARTITION_MONTH.  This ODI variable will be refreshed in an ODI package at runtime.

For additional information on how to design deployment specifications with ODI 12c, go to Creating Deployment Specifications with ODI 12c.

 

 

Initial Load of a Partitioned Table: The ELT Orchestration

 

The IKM Oracle Partition Exchange Load can be used in conjunction with ODI packages and ODI load plans to orchestrate the entire data upload operation of a partitioned table.  For instance, an ODI package can be designed to upload multiple partitions of the same partitioned table in parallel.

Figure 10 shows the design flow of an ODI package called PARTITION_EXCHANGE_LOAD.  This package uses ODI scenarios and ODI variables to orchestrate the initial data upload operation for a partitioned table called W_ORDERS_F.

 

 

Figure 10 - ODI Package - Partition Exchange Load


 

The package above invokes three ODI scenarios: Load First Partition, Load Next Partition, and Load Last Partition.  All three ODI scenarios have been generated from the same mapping, using the appropriate deployment specification each time.

The package counts the total number of partitions in the partitioned table.  For each partition found, the package refreshes an ODI variable with the partition name, and it proceeds to upload the partition.
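
A refresh query for such a variable can read the Oracle data dictionary directly.  A minimal sketch, assuming the package tracks a position counter in a hypothetical ODI variable called #A_TEAM.PARTITION_INDEX:

    SELECT partition_name
    FROM   user_tab_partitions
    WHERE  table_name = 'W_ORDERS_F'
    AND    partition_position = #A_TEAM.PARTITION_INDEX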

Scenario Load Next Partition runs in asynchronous mode; thus, this scenario performs the partition uploads in parallel.
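
In an ODI package, asynchronous execution can be achieved by invoking the scenario with the OdiStartScen tool using SYNC_MODE=2, and then waiting for the child sessions before the final load.  The scenario name below is an assumption:

    OdiStartScen "-SCEN_NAME=LOAD_NEXT_PARTITION" "-SCEN_VERSION=001" "-SYNC_MODE=2"
    OdiWaitForChildSession "-PARENT_SESS_NO=<%=odiRef.getSession("SESS_NO")%>"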

Figure 11 shows the execution logs for this package.

 

Figure 11 - Partition Exchange Load Package – Execution Log


 

In the example above, the first partition load (JAN2014, at the bottom of the list) was performed by the scenario called Load First Partition.  This scenario executed the mapping with the physical deployment specification called First Partition Load.

The next set of partition uploads (FEB2014 through NOV2014) was performed by the scenario called Load Next Partition.  This scenario executed the mapping with the physical deployment specification called Next Partition Load.  This scenario executed in asynchronous mode; thus, all partitions were loaded in parallel.

The last partition load (DEC2014) was performed by the scenario called Load Last Partition.  This scenario executed the mapping with the physical deployment specification called Last Partition Load.

All three scenarios used the IKM Oracle Partition Exchange Load to upload data into the partitioned table.

Alternatively, ODI load plans can be used to orchestrate partition uploads as well.  ODI load plans offer additional features such as exception handling, parallelism, and restartability.  For additional information on how to use ODI load plans, go to “Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator - Using Load Plans.”

 

Incremental Load of a Partitioned Table with Oracle Partition Exchange and ODI

 

Oracle Partition Exchange can be used to perform incremental data uploads for partitioned tables.  For instance, Oracle Partition Exchange can be used to perform daily data uploads for tables that have been partitioned by month.

In ODI, components such as SET and LOOKUP can be used to merge daily datasets with the existing data of a partition, and Oracle Partition Exchange can be used to replace the content of the partition with the merged dataset.

This section illustrates how to design and orchestrate the incremental data upload operation of a large partitioned table with ODI and the IKM Oracle Partition Exchange Load.

 

Incremental Load of a Partitioned Table: The Logical Design

 

The ODI flow-based mapping editor offers additional components such as SET and LOOKUP to design incremental data uploads for partitioned tables.  For instance, if a table is partitioned by month, and data is uploaded once a day into the partitioned table, a SET component can be used to merge an incremental source dataset with data already stored in a partition.

Figure 12 illustrates the logical design of an ODI mapping that uploads incremental data by partition into the partitioned table called W_ORDERS_F.

This mapping uses the SET component to perform a union of the incremental source dataset with the existing data of a given partition.  The partitioned table is used as an input source for the SET component (W_ORDERS_F_AS_OF_TODAY).

W_ORDERS_F_AS_OF_TODAY has been configured to select data for a specific partition.   The partition name is set in the Partition/Sub-Partition property of the datastore, and an ODI variable (#A_TEAM.PARTITION_MONTH) is used to select the partition name dynamically.

 

Figure 12 - ODI Mapping – Incremental Load Design


 

Figure 13 illustrates how the SET component has been configured.  This component uses two input connector points:  DATASET and W_ORDERS_F_AS_OF_TODAY.

 

 

Figure 13 - ODI Mapping – Set Component Attributes & Operators


 

The SET component (above) uses the UNION operator to merge the incremental dataset (DATASET) with the existing data of the partition (W_ORDERS_F_AS_OF_TODAY).   Thus, the flow table is populated with both datasets.
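
In SQL terms, the flow table load produced by this SET component is roughly the following.  Column and table names are hypothetical and abbreviated; note the PARTITION clause, which reads only the partition being exchanged:

    INSERT /*+ APPEND */ INTO i$_w_orders_f
    SELECT s.order_wid, s.customer_wid, s.amount        -- incremental DATASET
    FROM   orders_stage s
    WHERE  s.order_date >= DATE '2014-07-01'
    AND    s.order_date <  DATE '2014-08-01'
    UNION
    SELECT f.order_wid, f.customer_wid, f.amount        -- existing partition rows
    FROM   w_orders_f PARTITION (jul2014) f;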

 

Incremental Load of a Partitioned Table: The Physical Design

 

Figure 14 shows the knowledge module options for the incremental data upload operation.

In this example, the Flow Control option is used to validate the new dataset against the constraints of the partitioned table.  The data validation is performed by the Oracle check knowledge module (CKM Oracle).  If invalid records are found in the flow table, they are copied into an error table and removed from the flow table before the exchange operation.

Local indexes are included during the partition exchange operation.  This is done by using the INCLUDING INDEXES value in the Partition Exchange Options.  Thus, there is no need to rebuild local indexes after the exchange operation, since they remain in a usable state.

The exchange operation is performed with validation.  This is done by using the WITH VALIDATION value in the Partition Exchange Options.

The KM does not disable or enable the table constraints for the partitioned table during the partition exchange operation.  However, in the database, the table constraints for the partitioned table are enabled.

Global indexes are updated – in parallel – during the partition exchange operation.  This is done by using the UPDATE INDEXES PARALLEL value in the Partition Exchange Options.  Thus, global indexes will stay usable during the exchange operation.
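
Taken together, these three option values (INCLUDING INDEXES, WITH VALIDATION, and UPDATE INDEXES PARALLEL) imply an exchange statement along the following lines, again with hypothetical object names:

    ALTER TABLE w_orders_f
      EXCHANGE PARTITION jul2014
      WITH TABLE i$_w_orders_f
      INCLUDING INDEXES
      WITH VALIDATION
      UPDATE INDEXES PARALLEL;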

Incremental statistics will be gathered after the partition exchange operation.  The statistic and publish preferences of the partitioned table have been set to incremental during the initial upload operation.  Thus, table statistics are gathered incrementally.  If the partitioned table is composite, both subpartition-level and partition-level statistics are gathered for the exchanged partition.  If the partitioned table is single-level, only partition-level statistics are gathered for the exchanged partition.
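
Because the INCREMENTAL and PUBLISH preferences were set during the initial load, a standard statistics call is enough; the database rescans only the exchanged partition and derives global statistics from synopses.  A minimal sketch:

    EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'W_ORDERS_F', granularity => 'AUTO');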

 

Figure 14 - ODI Deployment Specification – Incremental Partition Load


 

The examples discussed in the above sections have been developed with ODI 12c.  If you would like to download a copy of these examples, go to “ODI Repository Sample for Oracle Partition Exchange.”  Additional examples are also available in this ODI repository.

For a detailed description of how to use the KM options, and how to configure ODI with Oracle Partition Exchange, go to “Configuring Oracle Data Integrator (ODI) with Oracle Partition Exchange.”

 

 

Combining Oracle Partition Exchange with Oracle Data Pump

 

If the source data for your partitioned table is located in an Oracle data server, consider using Oracle Data Pump to extract and load data into your partitioned target table.  Oracle Data Pump is the fastest way to extract and load data between Oracle data servers.

The combination of using Oracle Data Pump with Oracle Partition Exchange offers tremendous performance benefits.  If you would like to learn more about using Oracle Data Pump with ODI, please visit “Using Oracle Data Pump with Oracle Data Integrator (ODI).”

 

Conclusion

 

If your Oracle data warehouse has partitioned tables, Oracle Partition Exchange offers a fast method for uploading data into your large partitioned tables.

If you would like to download a copy of this knowledge module, go to Oracle Data Integrator Code Samples and download "IKM Oracle Partition Exchange Load."  Additionally, an ODI 12.1.3 repository with examples can be found at “ODI Repository Sample for Oracle Partition Exchange.”

For more Oracle Data Integrator best practices, tips, tricks, and guidance that the A-Team members gain from real-world experiences working with customers and partners, visit Oracle A-Team Chronicles for Oracle Data Integrator (ODI).

 

Related Articles

Configuring Oracle Data Integrator (ODI) with Oracle Partition Exchange

Using Oracle Data Pump with Oracle Data Integrator (ODI)

 

Benjamin Perez-Goytia

