Oracle AI Data Platform brings your data and AI work into one place. It helps teams move, organize, and use data for analytics and machine learning.
This blog focuses on getting data out of Oracle Fusion so it can be prepared and merge with other dataset on Oracle AI Data Platform. We will look at practical options to extract data, move it safely, and make it ready for use. You will see what each option needs, how it fits in an end-to-end flow, and when to pick one over another.
BI Cloud Connector
BI Cloud Connector (BICC) is the recommended built-in bulk extract service for Oracle Fusion Applications. It exports data from curated view objects into flat files for downstream use in your lake or warehouse. It handles large volumes, supports full and incremental loads, and can be scheduled and monitored through ESS.
Oracle AI Data Platform has a built-in Spark connector for BICC. Before invoking it, let’s setup a few parameters for reusability.
oci_bucket=oidlUtils.parameters.getParameter("OCI_BUCKET", "oci://<bucket_name>@<namespace>")
folder_name=oidlUtils.parameters.getParameter("FOLDER_NAME", "bicc")
file_name=oidlUtils.parameters.getParameter("FILE_NAME", "ItemExtractPVO")
catalog_name=oidlUtils.parameters.getParameter("CATALOG_NAME", "hive")
schema_name=oidlUtils.parameters.getParameter("SCHEMA_NAME", "default")
external_volume_name=oidlUtils.parameters.getParameter("EXTERNAL_VOLUME_NAME", "ext_volume")
external_table_name=oidlUtils.parameters.getParameter("EXTERNAL_TABLE_NAME", "ext_table")
BICC_PVO=oidlUtils.parameters.getParameter("BICC_PVO", "FscmTopModelAM.PrcExtractAM.PozBiccExtractAM.SupplierExtractPVO")
# Define paths
oci_file_path = f"{oci_bucket}/{folder_name}/{file_name}"
external_volume_path = f"/Volumes/{catalog_name}/{schema_name}/{external_volume_name}/{folder_name}/{file_name}"
external_table_path = f"{oci_bucket}/{folder_name}/{file_name}"
We can now use the Oracle AI Data Platform’s BICC connector API to trigger a stateless job in BICC. It will land the data on the external storage defined in BICC and passed under fusion.external.storage option and load it into a Spark dataframe.
df = spark.read.format("aidataplatform") \
.option("type", "FUSION_BICC") \
.option("fusion.service.url", "https://<fusion_instance>.fa.ocs.oraclecloud.com/") \
.option("user.name", "<username>") \
.option("password", "<password>") \
.option("schema", "Financial") \
.option("fusion.external.storage", "<bucket>") \
.option("datastore", f"{BICC_PVO}") \
.load()
df.show()
We can then use the dataframe to enrich and transform the data. For example, merging with data from different sources or format it in json as shown below.
df.write.mode("overwrite").format("json").save(oci_file_path)
print(f"Data written directly to {oci_file_path}")
Incremental mode is also supported through the fusion.initial.extract-date option.
BICC_initial_extract_date=oidlUtils.parameters.getParameter("BICC_initial_extract_date", "2013-11-01")
df2 = spark.read.format("aidataplatform") \
.option("type", "FUSION_BICC") \
.option("fusion.service.url", "https://<fa-pod>.fa.ocs.oraclecloud.com/") \
.option("user.name", "<username>") \
.option("password", "<password>") \
.option("schema", "Financial") \
.option("fusion.external.storage", "ateamsaas-bucket-bicc") \
.option("datastore", f"{BICC_PVO}") \
.option("fusion.initial.extract-date", f"{BICC_initial_extract_date}") \
.load()
df2.show()
Here are the other options available for the BICC connector:
| Parameter name | Valid values | Mandatory | Description |
|---|---|---|---|
type | FUSION_BICC | Y | Data source type. |
fusion.service.url | Valid Fusion Apps hostname | Y | Fusion Apps service URL or hostname. |
user.name | Valid username | Y | Username for Fusion access. |
password | Valid password | Y | Password for Fusion access. |
schema | Valid offering name | Y | BICC offering. |
fusion.external.storage | Valid configured storage name | Y | External storage configured in BICC. |
datastore | Fully qualified datastore or PVO name | Y | Fully qualified datastore name to extract. |
extract.as.string | true, false | N | Extract all columns as strings. |
fusion.column.filter.type | ALL, DEFAULT, PRIMARY, DEFAULT_AND_PRIMARY | N | Filters columns based on primary or default metadata. |
fusion.bi-broker.enabled | true, false | N | Enable BI Broker mode. |
fusion.last.extract-date | yyyy-MM-dd | N | Enables incremental extraction starting from this date. |
fusion.initial.extract-date | yyyy-MM-dd | N | Enables full extraction from this initial date. |
is.effective.date.disabled | true, false | N | Include historical data instead of filtering to current effective rows only. |
proxy.host | Valid hostname or IP | N | Proxy host for Fusion or BICC access. |
proxy.port | Valid port number | N | Proxy port for Fusion or BICC access. |
row.limit | Positive integer | N | Limits the number of rows read. |
bicc.preview | true, false | N | Return the supported BICC preview row set instead of using row.limit for preview reads. |
For more details about BICC configuration, check this blog: Bring Fusion Data into Oracle AI Data Platform Workbench Using BICC
Batch Extraction using Fusion Business Objects
Fusion released a new set of REST APIs allowing for batch extraction. We can use these REST APIs in Oracle AI Data Platform to create a job that will extract data in json format on OCI Object Storage.
First, we need to get a Token to pass to our batch extract job
import requests
import base64
host = '<host>'
podname = '<podname>'
# Define username and password
username = '<username>'
password = '<password>'
# Create Basic Authorization header
auth_string = f"{username}:{password}"
encoded_auth = base64.b64encode(auth_string.encode()).decode()
authorization_header = f"Basic {encoded_auth}"
# Get a Token for Batch Extraction
url = f"{host}/oauth2/v1/token"
params = {
'grant_type': 'password',
'scope': 'urn:opc:resource:fusion:<pod-name>:saas-batch'
}
headers = {
'Authorization': authorization_header
}
response = requests.get(url, params=params, headers=headers)
print("Token Response status code:", response.status_code)
token = response.text
Now we can invoke the saas-batch APIs to schedule a job. The job id will be stored in the Location header of the response.
url = f"{host}/api/saas-batch/jobscheduler/v1/jobRequests"
authorization_header = f"Bearer {token}"
payload = json.dumps({
"jobDefinitionName": "AsyncDataExtraction",
"serviceName": "boss",
"requestParameters": {
"boss.module": "oraHcmHrCoreEmployment",
"boss.resource.name": "workerAssignmentExtracts",
"boss.resource.version": "v1",
"boss.outputFormat": "json",
"boss.advancedQuery": "{\"collection\": {\"filter\": \"primaryFlag = true and timeUpdated > '2025-05-01T00:00:00Z' and assignmentType = 'E' and !(legalEmployer.id in (10001, 10002))\"},\"fields\": [ \"id\", \"assignmentType\", \"assignmentStatusType\", \"effectiveStartDate\", \"effectiveEndDate\", \"businessTitle\", \"workAtHomeFlag\", \"assignmentNumber\", \"timeUpdated\" ],\"accessors\": {\"personDetail\": {\"fields\": [ \"personNumber\" ]},\"globalName\": {\"fields\": [ \"firstName\", \"lastName\" ]},\"department\": {\"fields\": [ \"id\", \"name\", \"title\" ]},\"legalEmployer\": {\"fields\": [ \"id\", \"name\" ]},\"workRelationship\": {\"fields\": [ \"id\" ]}}}"
}
})
headers = {
'Content-Type': 'application/json',
'Authorization': authorization_header
}
response = requests.request("POST", url, headers=headers, data=payload)
jobLocation = r.headers['Location']
print("Job created:", jobLocation)
We can now reuse jobLocation to monitor the execution.
payload = {}
headers = {
'Authorization': authorization_header
}
response = requests.request("GET", jobLocation , headers=headers, data=payload)
print(response.json.["jobDetails"]["jobStatus"])
jobId = response.json.["jobDetails"]["jobRequestId"]
When the job is completed, we can list the output files and download them.
url = f"{host}/api/saas-batch/jobfilemanager/v1/jobRequests/{jobId}/outputFiles"
response = requests.request("GET", url, headers=headers, data=payload)
Conclusion
Oracle AI Data Platform provides a solid foundation to extract Fusion data, land it securely in OCI Object Storage or other 3rd party could storage, and prepare it for analytics and AI. Depending on the use case this post explored a couple of options to achieve this. For standard batch style exports, the AIDP BICC API is the recommended default. For use cases that demand more flexibility with data models or a higher frequency, the Fusion Business Objects batch REST APIs are a strong option. Choose based on data volume and operational needs—then leverage AIDP workbench to standardize, enrich, and govern the data for downstream consumption.
If you need a curated, high-volume, operationally proven approach, BI Cloud Connector (BICC) remains the go-to option—especially when you want scheduled bulk exports (full or incremental) and a straightforward path into Spark for transformation and publishing to Object Storage in formats like JSON/Parquet. If you need API-driven extraction with JSON output and fine-grained filtering/field selection, the newer Fusion Business Objects batch REST extraction provides a flexible alternative, particularly for targeted use cases and modern integration patterns.
Whichever path you choose, the end goal is the same: land Fusion data securely in OCI Object Storage, standardize and enrich it in Oracle AI Data Platform, and publish it into your lakehouse/warehouse layer so it’s ready for consumption by downstream analytics, ML pipelines, and governance controls. As you productionize, prioritize enterprise fundamentals—credential management (OCI Vault/secret scopes), least-privilege access, encrypted storage, reliable scheduling/monitoring, and repeatable incremental load strategies—so your Fusion data supply chain stays both scalable and compliant.
