* This blog was last tested on Oracle Analytics Cloud Data Sync (2.6.1) + ADW 18c by Jay Pearson (Oracle A-Team) *
The Data Sync tool provides the ability to extract from both on-premise, and cloud data sources, and to load that data into Oracle Autonomous Data Warehouse (ADW), and other relational databases. The functionality to run Post Load SQL and Stored Procedures was added in Data Sync v2.2 .
Currently this functionality is only available for Oracle ADW, Oracle DBCS, and Oracle DB target databases. It will NOT work for a Schema Service database target - although this article provides details of a workaround when the target is a schema service database.
This article will walk through an example to set up both a post load SQL command, and to execute a stored procedure on the target database.
Be sure to download and install the latest version of the Data Sync Tool from Oracle Analytics Cloud Downloads through this link.
Data Sync requires JDK8. You can download that through this link.
(Note - customers still using the legacy/limited DBCS-Schema database, can not use this version of Data Sync. The version for BICS is available to download here, but does not have all the functionality of the OAC / OAAC version.)
For further instructions on configuring Data Sync, see this article. If a previous version of Data Sync is being upgraded, use the documentation on OTN.
This article will present a simple use case that can be expanded for real world load scenarios.
A Post Load Processing session will be set up to run both a SQL statement, and a stored procedure on the target database once the load has completed.
In this example, a summary table will be loaded with a row of data once the underlying fact table has been refreshed in ADW. Because the summary table only exists in the target database, we need to create it as a target in data sync.
1. Under 'Project' -> 'Target Objects' -> click 'New'
In this example the summary table is called 'AUDIT_EVENT_SUMMARY', and consists of just 2 fields.
Click Details -> New -> Add an 'AUDIT_RECORD_COUNT' numeric field, and a 'CAPTURE_DATE' date field.
2. Create the columns as shown, then 'Save'
Now that we have the target table defined, we can set up the post-load SQL, and the stored procedure.
1. From 'Project' -> 'Script Processing' -> lick 'New'
2. Enter an appropriate name, hit 'Save', then select the 'Source Objects' tab
Data Sync offers the ability to execute the SQL and Stored Procedure either at the end of the entire load process, or after the load completion of one or more individual tables. This is controlled within the 'Source Objects' section.
If the post load processing is to be run after all tables have been loaded, then no source tables need to be added. If this section is left empty, then by default the data sync tool will run the post load processing only after all tables are loaded.
If the post load processing can be run after one or more tables have been loaded, then that dependency can be set up here.
3. Select 'Add/Remove' and then the 'Go' search button to generate a list of table sources being used.
In this example we will trigger the load after the fact table ('AUDIT_EVENT_ADW') has been loaded.
4. Select the table, then hit 'Add', and finally 'Save' to close out of the screen.
There is a 'Target Objects' tab as well. This is useful if the target table needs to be truncated as part of the update process.
Truncating and reloading tables with indexes and large record volumes can result in performance issues. The data sync tool will handle this by having the target database perform the following steps:
If the target table is always going to be loaded incrementally, then select the 'Truncate for Full Load' check box, else 'Truncate Always'.
For demonstration purposes, we will select our target summary table.
5. Select 'Target Objects' -> 'Add/Remove'
6. Select 'Go' to list the available target tables
7. Select the table(s) and 'Add'. Then chose the appropriate option as to whether to 'Truncate Always' or only 'Truncate For Full Load'.
The next steps will be used to define the SQL and Stored Procedure.
8. Select 'OK' to return to the 'Edit' tab, hit 'Save', and then select the check box within the 'Scripts' box
9. In the next screen select 'Add', enter an appropriate name, and then select whether this step is to run a 'SQL' statement, or a 'Stored Procedure'. In this first example we will set up a post load SQL command.
10. There is also the option to run this post load process on just an 'Initial Load', an 'Incremental Load' or 'Both'. In this example we select 'Both'.
11. In the section below, as shown, enter the valid SQL statement to be run on the target database. In this case a single row is added to the summary table that we had created previously.
12. Click 'OK' to return to the previous screen.
To create a Stored Procedure follow similar steps. In this example we will set up the post load processing entry to run both the SQL, and a Stored Procedure.
13. Select 'Add', enter a suitable name, and select the 'Stored Procedure' type.
14. Enter the name of the procedure in the entry box. You do not need to type in 'execute' - the data sync tool will take care of that - just enter the name of the stored procedure, then click 'OK' and 'OK' again to exit out of the Post Load Processing set-up.
When the Job is next run, the SQL and Procedure will be run after the fact table has been loaded.
It is possible to set up multiple post load processes, with different dependencies. Each will be run independently once the source tables defined have been loaded.
This article walked through the steps to create a Post Load SQL and Stored Procedure within the Data Sync tool.
For other A-Team articles about OAC and Data Sync, click here.