Content validated on 4/14/2021 with
This article discusses the best practices that users should follow when using Oracle Data Integrator Marketplace (ODI) - to perform incremental updates in the Oracle Autonomous Data Warehouse Cloud Service (ADW).
ADW is a fully-managed and pre-configured Oracle database cloud service that is tuned and optimized for data warehouse workloads. As an autonomous cloud service, ADW does not require database administration or database tuning – the service automatically optimizes the database for high-performance queries; thus, users do not need to consider any details about parallelism, partitioning, indexing, or compression. ADW is built upon the Oracle Database, hence, business intelligence (BI) applications and data integration tools that support the Oracle Database also support ADW. ADW is designed as a load-and-go cloud service – users provision the service, define tables, load data, and then run their queries.
To perform data loads in ADW, Oracle offers various tools such as ADW Packages, Data Pump, and SQL*Loader, among others. For customers wanting to utilize these ADW tools and integrate data with other data stores (on-premises or in the Cloud), Oracle offers Oracle Data Integrator (ODI) Marketplace. ODI Marketplace is a cloud-based unified data platform that offers batch data processing capabilities. To perform self-service data discovery, predictive analytics, and business intelligence reporting in ADW, Oracle offers Oracle Analytics Cloud. Figure 1, below, shows these three Oracle cloud services: ODI Marketplace, ADW, and OAC. In this example, DIPC, specifically ODI, is used to perform batch data loads in ADWC. OAC is then used for business intelligence (BI) analysis and end-user ad hoc queries.
Figure 1 - ODI Marketplace, Autonomous Data Warehouse Cloud and Analytics Cloud
ODI offers Knowledge Modules (KMs) to perform data loads, data transformations, and data validation. KMs are code templates designed to perform a specific task in the overall data integration process. KMs are technology-specific, and they are optimized for each of the technologies supported by ODI. Out-of-the-box, ODI provides numerous KMs to extract, load, and transform data in the Oracle database. Since ADWC is based on the Oracle database, ODI users can use KMs that are based on the Oracle technology to load and transform data in ADWC. KMs fall into six categories: Reverse-Engineering, Check, Loading, Integration, Journalizing, and Service. This article addresses the Integration KMs that provide the best performance options when performing incremental updates with ODI in ADW.
This article is one of three articles that discuss best practices when using ODI with ADW:
Oracle Data Integrator Best Practices: Loading External Datasets into the Oracle Autonomous Data Warehouse Cloud Service
The incremental-update operations of an ADW data warehouse includes adding new records and updating existing records on dimensions, facts, and aggregates. The new rows and the records to be updated are usually located on on-premises data sources, other database cloud services, or on cloud-based object stores such as Oracle Object Storage or Amazon S3. A staging area (database schema) is usually created on ADW to load and transform data before changes are applied to the ADW data warehouse. The staging area is then used as the source dataset for the incremental-update operations of the ADW data warehouse. When performing incremental-update operations in an ADW data warehouse, ODI recommends two Integration Knowledge Modules (IKMs): IKM Oracle Merge, and IKM Oracle Update. Table 1, below, shows these two IKMs.
Table 1 - ODI Integration Knowledge Modules for Incremental Updates in ADWC
The IKM Oracle Merge offers the best performance option when both operations – inserts and updates - are required. This IKM uses the Oracle MERGE statement to select rows from an ADWC source table and perform the insert and the update operations with a single SQL statement. Figure 2, below, shows an example.
Figure 2 - Incremental Upload of an ADWC Table with ODI
Figure 2 above, shows how the IKM Oracle Merge performs the incremental update operation of a warehouse dimension called w_product_d. In this example, the IKM uses a condition to determine whether to update or to insert into the dimension table. The condition is based on the natural key, prod_id, of the dimension table. When there is no match, new rows are inserted into the dimension table using a database sequence. The database sequence has been previously created by the user. When there is a match, the update operation updates the target columns of the dimension table with the new values.
The IKM Oracle Merge can also be used for updates-only operations. Alternatively, ODI users can use the IKM Oracle Update for updates-only operations, but the IKM Oracle Merge offers the best performance when updating tables in an ADWC data warehouse. The IKM Oracle Update uses the Oracle UPDATE statement to perform the update operation based on an update key, and it uses subqueries to detect the rows that needs to be updated. Figure 3, below shows an example of how the IKM Oracle Update performs an update operation for the w_product_d dimension:
Figure 3 - Updates-Only Operation with the IKM Oracle Update
The IKM Oracle Update can be used for small target tables where updates are only required. However, as a best practice, it is recommended to test both the IKM Oracle Merge and the IKM Oracle Update to determine which IKM provides the best performance.
This article discusses the best practices that users should follow when using Oracle Data Integration Marketplace to perform incremental updates in an ADWC data warehouse. For more Oracle Data Integrator best practices, tips, tricks, and guidance that the A-Team members gain from real-world experiences working with customers and partners, visit “Oracle A-team Chronicles for Oracle Data Integrator (ODI).”