The Oracle Data Sync tool provides the ability to extract from both on-premise, and cloud data sources, and to load that data into Oracle Analytics Cloud Service (OAC), Oracle Autonomous Analytics Cloud Service (OAAC), and other oracle relational databases, or into Essbase.
In the most recent 2.5 release, Data Sync added the ability to pull data directly from Fusion SaaS view objects. The same view objects that other BI reporting, and extract tools use.
In this article I'll cover how the Data Sync tool can be set up to pull the same data, directly from the Fusion view objects.
There are pros and cons to both approaches. Your use case will dictate which best fits your needs:
Licensing for the Data Sync tool was intended originally for customers of BICS, OACS, or OAAC.
However, if the customer licenses Oracle products, then they can use the Data Sync tool based on the OTN license agreement.
This means the product is available to other customers, for instance Fusion SaaS customers, who would like to set up data extracts from Fusion and load that to an oracle DBCS instance, or on premise.
For more details, check the OTN Licensing Agreement link on this download page.
Be sure to download and install the latest version of the Data Sync Tool from OTN through this link.
Data Sync requires JDK8. You can download that through this link.
(Note - customers still using BI Cloud Service (BICS) and loading into the schema service database, can not use this version of Data Sync. The version for BICS is available to download here, but does not have all the functionality of the OAC / OAAC version.)
For further instructions on configuring Data Sync, see this article. If a previous version of Data Sync is being upgraded, use the documentation on OTN.
As covered in the recent article, in Fusion applications, it is possible to delete data. Perhaps a transaction was entered in error and was subsequently deleted, or a new product was added, but then a decision made to remove it. If Fusion data is being extracted to populate a downstream warehouse, it's important to be able to identify new, changed and deleted data.
This article will show how to create both a Data Extract, and a Deleted Record (Primary Key) Extract via a view object in Fusion SaaS.
The Data extract, as the name suggests, is the data from the selected view object(s). This can either be a full extract of all data in the view object(s), or an incremental extract, which includes all the changes in the data from the view object(s) since the last extract.
The Deleted record (primary key) extract lists all the primary keys that still exist in the view object. This can then be used to cross-reference with previously extracted data, to see which primary keys no longer exist. This allows the warehouse data to be marked as deleted for accurate downstream reporting.
This is likely to be the most time-consuming part of the process.
In this case, we're going to pull the Financial Business Unit view object: 'FscmTopModelAM.FinFunBusinessUnitsAM.BusinessUnitPVO'
7. Use the spreadsheet to work out the primary key, which is needed for the 'Deleted Record Extract'.
In this case, the primary key is the ID field - BUSINESSUNITID
This is an optional step, although it helps troubleshoot any issues with the BI Server SQL that you will be running within Data Sync. If there is an issue with your SQL, you will get more meaningful error messages by testing it within your OTBI environment in Fusion. Note, you need to have a Fusion user who has administration rights within the OTBI portal.
1. Log into your Fusion OTBI analytics portal, and select the 'Administration' menu, and then the 'Issue SQL' option.
2. In the SQL box, enter the SQL you are planning on running - and then 'Issue SQL' to test this.
Note - because we are going against the physical layer of the BI Server, we need to use the 'select_physical' command to select the columns we need.
You can use the * wildcard character with the select_physical statement, to select all columns:
or explicitly name the fields you want to return:
The second approach is recommended where possible to:
(a) Avoid errors - for large view objects, the 'select_physical *' approach may give errors - you can use the 'Error Details' link to provide more information to troubleshoot the error
(b) To limit the amount of data you are pulling back to only fields that are relevant to you. This will improve read and load performance.
3. In this case the 'Issue SQL' returned results, so we can be confident the SQL will work from Data Sync:
The locale of the user that is being used to run the BI report is important.
The recommended approach is to set the locale to be 'English - United States' within the OTBI analytics tool. That way the format of any dates that you read into data sync are correct, without further changes being necessary.
To set the locale - within the Analytics portal of the OTBI reporting tool, select the dropdown next to the user name, and then select 'My Account'
then within the 'Locale (location)' setting, select 'English - United States'
If it's not possible to change the locale, then run the SQL in OTBI as described above, look at the format of the dates being returned, and then within the Data Sync tool, go to the OTBI connection (details on how to configure this are in the next section), and in the advanced properties of the connection, edit the 'Timestamp format for bulk loads' option to match the date format.
1. Within the Data Sync Tool, create a new 'Connection' for the BI Server source, give it a suitable name, and choose the 'Oracle BI Connector' type.
2. Within the properties of the newly created OTBI connection, select the URL from the BI Server and paste in. For more information on how to identify the URL - see this previous article.
3. Complete the connection by entering in the username and password for a user with access to the OTBI reporting portal. Then save the connection and test it to confirm it was configured correctly.
4. Take a look at the 'Advanced Properties' tab, and confirm that 'Bulk Read' is set to 'true'. This will improve read performance. You can also edit the timestamp format in this section. This is useful if English - United States is not your preferred locale. You will need to change the timestamp format here to match the output from OTBI for your date format. Otherwise you may hit data issues.
1. Within Data Sync, select 'Project', 'Source Data' and 'Manual Entry' and select the OTBI connection created in the previous step.
2. Give the mapping a suitable name, and a target. In this case we will create a new table in the target database called 'BUSINESS_UNIT'.
3. In the following screen you need to select either 'SQL' or 'Day Based Partitioning Read from SQL' in the 'Data From' options The choice will depend on your use case and whether you need to chunk the data from the view object by date to reduce the amount of data, or not.
4. In the next screen we need to define the properties of the view object extract.
Examples of both the 'SQL', and 'Day Based Partition Read from SQL' ,versions are provided:
a. in the 'SQL' version, enter the SQL statement tested previously in the 'Initial SQL' value:
b. Data Sync can optionally do incremental updates for the view object. To show the syntax for this example, the following SQL would be entered into the 'Incremental SQL' value. It is the same as the statement above, but with an additional where close to filter past on a suitable date field:
where "FscmTopModelAM.FinFunBusinessUnitsAM.BusinessUnitPVO"."BUSINESSUNITLASTUPDATEDATE" > timestamp '%LAST_REPLICATION_DATETIME%'
Note - if you start with the regular 'SQL' version, and later decide you want to add day based partitioning, you can do that. Right click on the source, select 'Attributes', then change the 'Read type'.
'Day Based Partition Read from SQL' Version
a. Use the same SQL statement above for the 'Initial SQL'.
b. While you can do both date based partitioning and incremental loads (in which case you could also re-use the incremental SQL statement from above), in this example we are just going to use date based partitioning, so will not edit the default incremental SQL statement.
c. Define a Periodicity Column with a suitable Date field. Note - this can not contain null values, and ideally is indexed on the source side. In this example:
d. Enter the SQL query to find the minimum date value of that field, in this case:
select_physical min ("BUSINESSUNITLASTUPDATEDATE") from "FscmTopModelAM.FinFunBusinessUnitsAM.BusinessUnitPVO"
e. Enter the SQL query to find the maximum date value of that field, in this case:
select_physical max ("BUSINESSUNITLASTUPDATEDATE") from "FscmTopModelAM.FinFunBusinessUnitsAM.BusinessUnitPVO"
f. Enter the Partition Read period in days. This is the 'chunk' size that Data Sync will use when breaking the data down. There is no one value that can define 'Partition Read (Number of days)'. It depends on the volume of data, and also the time period in the data to be extracted. You need to assess the increase in volume of data in a object per day. Let's say there are 10,000 new transactions per day on a view object - then you might consider 10 days as the partition range. In this example of a Business Unit, it does not make a lot of sense to use, unless that table has many rows. For demonstration purposes though, we have entered '10' to break the data into chunks of 10 day periods.
g. Consider limiting the number of rows by using the optional Min Date. This will only look for data older than that date. For example, if a view object has 10 years worth of data, consideration should be given as to whether all of that data is needed. The Optional Max Date can be set to a static value or 'sysdate'. Setting these two, will also mean data sync will not attempt to discover the date range from the data as part of the set-up.
5. Click 'OK'. Data Sync will confirm that the entries are correct. If there are any errors, update until resolved.
If you did use 'select_physical * from....', then it is recommended you go to the 'Target Object', select the target created, the in the 'Columns' menu, inactivate the fields you don't need :
If you also created a Primary Key extract to identify deleted data, you will need to set up a separate mapping to handle that.
1. Within the Project, select Source Data and Manual Entry, and give a suitable logical name and details on the target. In this case we will create a new table called 'BUSINESS_UNIT_PK' in the target database.
2. For a primary key query, we need to select all the data - so select the 'SQL' option, and in the initial SQL, enter the SQL to select the primary key from the view object. You only need to select the primary key, identified earlier, from the view object.
3.The other options can be left as the default.
While this article just covers loading the Data and Primary Key files into the target Database, it is likely you will need to run a downstream process in the database to process the Primary Key file to identify existing data in the target database, that has now been deleted in Fusion, so those rows can be marked as deleted.
This could be done in a stored procedure in the target database, or via post load SQL. Data Sync can call automate this process by calling the stored procedure, or post load SQL, once the data is loaded. For more information on this functionality, see this previous article.
4.Run the Data Sync Job and confirm the load was successful.
In this example there were 3 mappings. One for the SQL version, one for the Date Based Partitioning SQL, and one for the Primary Key extract.
This article walked through the steps to set up the Data Sync tool to extract directly from Fusion SaaS view objects.