Oracle AI Data Platform brings your data and AI work into one place. It helps teams move, organize, and use data for analytics and machine learning.

This blog focuses on getting data out of Oracle Fusion so it can be prepared and merged with other datasets on Oracle AI Data Platform. We will look at practical options to extract data, move it safely, and make it ready for use. You will see what each option needs, how it fits into an end-to-end flow, and when to pick one over another.

BI Cloud Connector

BI Cloud Connector (BICC) is the recommended built-in bulk extract service for Oracle Fusion Applications. It exports data from curated view objects into flat files for downstream use in your lake or warehouse. It handles large volumes, supports full and incremental loads, and can be scheduled and monitored through Oracle Enterprise Scheduler (ESS).

Oracle AI Data Platform has a built-in Spark connector for BICC. Before invoking it, let’s set up a few parameters for reusability.

oci_bucket = oidlUtils.parameters.getParameter("OCI_BUCKET", "oci://<bucket_name>@<namespace>")
folder_name = oidlUtils.parameters.getParameter("FOLDER_NAME", "bicc")
file_name = oidlUtils.parameters.getParameter("FILE_NAME", "ItemExtractPVO")
catalog_name = oidlUtils.parameters.getParameter("CATALOG_NAME", "hive")
schema_name = oidlUtils.parameters.getParameter("SCHEMA_NAME", "default")
external_volume_name = oidlUtils.parameters.getParameter("EXTERNAL_VOLUME_NAME", "ext_volume")
external_table_name = oidlUtils.parameters.getParameter("EXTERNAL_TABLE_NAME", "ext_table")
BICC_PVO = oidlUtils.parameters.getParameter("BICC_PVO", "FscmTopModelAM.PrcExtractAM.PozBiccExtractAM.SupplierExtractPVO")

# Define paths
oci_file_path = f"{oci_bucket}/{folder_name}/{file_name}"
external_volume_path = f"/Volumes/{catalog_name}/{schema_name}/{external_volume_name}/{folder_name}/{file_name}"
external_table_path = f"{oci_bucket}/{folder_name}/{file_name}"

We can now use Oracle AI Data Platform’s BICC connector API to trigger a stateless job in BICC. The job lands the data in the external storage configured in BICC (referenced by the fusion.external.storage option) and loads it into a Spark DataFrame.

df = spark.read.format("datalake") \
    .option("type", "FUSION_BICC") \
    .option("fusion.service.url", "https://<fusion_instance>.fa.ocs.oraclecloud.com/") \
    .option("user.name", "<username>") \
    .option("password", "<password>") \
    .option("schema", "Financial") \
    .option("fusion.external.storage", "<bucket>") \
    .option("datastore", f"{BICC_PVO}") \
    .load()

df.show()

We can then use the DataFrame to enrich and transform the data, for example by joining in data from other sources or, as below, writing it out in JSON format.

df.write.mode("overwrite").format("json").save(oci_file_path)
print(f"Data written directly to {oci_file_path}")
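The external table path computed earlier can be used to expose the JSON output as a table in the catalog. Below is a minimal sketch; the exact DDL supported by your workbench may differ, and the parameter values are inlined here for illustration (they normally come from the oidlUtils parameters defined above).

```python
# Parameter values inlined for illustration; normally they come from
# the oidlUtils parameters defined earlier.
catalog_name, schema_name = "hive", "default"
external_table_name = "ext_table"
external_table_path = "oci://<bucket_name>@<namespace>/bicc/ItemExtractPVO"

# Build a DDL statement that registers the JSON files as an external table.
ddl = (
    f"CREATE TABLE IF NOT EXISTS {catalog_name}.{schema_name}.{external_table_name} "
    f"USING json LOCATION '{external_table_path}'"
)
# spark.sql(ddl)  # run inside the workbench Spark session
```

Once registered, the data can be queried by name from the catalog instead of by storage path.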

Incremental mode is also supported through the fusion.initial.extract-date option.

BICC_initial_extract_date = oidlUtils.parameters.getParameter("BICC_initial_extract_date", "2013-11-01")

df2 = spark.read.format("datalake") \
    .option("type", "FUSION_BICC") \
    .option("fusion.service.url", "https://<fusion_instance>.fa.ocs.oraclecloud.com/") \
    .option("user.name", "<username>") \
    .option("password", "<password>") \
    .option("schema", "Financial") \
    .option("fusion.external.storage", "<bucket>") \
    .option("datastore", f"{BICC_PVO}") \
    .option("fusion.initial.extract-date", f"{BICC_initial_extract_date}") \
    .load()

df2.show()
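For scheduled incremental runs, the initial extract date often needs to be computed at runtime rather than hard-coded. A small hypothetical helper (the daily-run cadence is an assumption) could look like this:

```python
from datetime import date, timedelta

# Hypothetical helper for a daily incremental run: extract everything changed
# since yesterday, in the YYYY-MM-DD format the fusion.initial.extract-date
# option expects.
def daily_extract_date(today=None):
    today = today or date.today()
    return (today - timedelta(days=1)).isoformat()

BICC_initial_extract_date = daily_extract_date()
```

The computed value can then replace the hard-coded parameter default shown above.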

For more details about BICC configuration, check this blog: Bring Fusion Data into Oracle AI Data Platform Workbench Using BICC

Batch Extraction using Fusion Business Objects

Oracle Fusion has released a new set of REST APIs that allow for batch extraction. We can use these REST APIs in Oracle AI Data Platform to create a job that extracts data in JSON format to OCI Object Storage.

First, we need to get a TokenRelay to pass to our batch extraction job:

import requests
import base64

host = '<host>'
podname = '<podname>'
# Define username and password
username = '<username>'
password = '<password>'

# Create Basic Authorization header
auth_string = f"{username}:{password}"
encoded_auth = base64.b64encode(auth_string.encode()).decode()
authorization_header = f"Basic {encoded_auth}"

# Get a Token Relay for Batch Extraction
url = f"{host}/api/rwdtools/tokenrelay"
params = {
    'username': username,
    'audience': f'urn:opc:resource:fusion:{podname}:saas-batch',
    'scope': '/'
}
headers = {
    'Authorization': authorization_header
}
response = requests.get(url, params=params, headers=headers)
print("Token Relay Response status code:", response.status_code)
tokenrelay = response.text

Now we can invoke the saas-batch APIs to schedule a job. The job id will be stored in the Location header of the response.

url = f"{host}/api/saas-batch/jobscheduler/v1/jobRequests"

import json

payload = json.dumps({
  "jobDefinitionName": "AsyncDataExtraction",
  "serviceName": "boss",
  "requestParameters": {
    "boss.module": "oraHcmHrCoreEmployment",
    "boss.resource.name": "workerAssignmentExtracts",
    "boss.resource.version": "v1",
    "boss.outputFormat": "json",
    "boss.advancedQuery": "{\"collection\": {\"filter\": \"primaryFlag = true and timeUpdated > '2025-05-01T00:00:00Z' and assignmentType = 'E' and !(legalEmployer.id in (10001, 10002))\"},\"fields\": [ \"id\", \"assignmentType\", \"assignmentStatusType\", \"effectiveStartDate\", \"effectiveEndDate\", \"businessTitle\", \"workAtHomeFlag\", \"assignmentNumber\", \"timeUpdated\" ],\"accessors\": {\"personDetail\": {\"fields\": [ \"personNumber\" ]},\"globalName\": {\"fields\": [ \"firstName\", \"lastName\" ]},\"department\": {\"fields\": [ \"id\", \"name\", \"title\" ]},\"legalEmployer\": {\"fields\": [ \"id\", \"name\" ]},\"workRelationship\": {\"fields\": [ \"id\" ]}}}"
  }
})
headers = {
  'Content-Type': 'application/json',
  'Authorization': authorization_header
}

response = requests.post(url, headers=headers, data=payload)

jobLocation = response.headers['Location']
print("Job created:", jobLocation)

We can now reuse jobLocation to monitor the execution.

payload = {}
headers = {
  'Authorization': authorization_header
}

response = requests.get(jobLocation, headers=headers)

print(response.json()["jobDetails"]["jobStatus"])

jobId = response.json()["jobDetails"]["jobRequestId"]
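Rather than checking the status manually, a small helper can poll the job until it reaches a terminal state. The set of terminal status values below is an assumption; verify it against the statuses your pod actually returns.

```python
import time
import requests

# Assumed terminal statuses for a saas-batch job request; adjust as needed.
TERMINAL_STATES = {"SUCCEEDED", "ERROR", "CANCELLED"}

def wait_for_job(job_location, headers, poll_seconds=30):
    """Poll the job request URL until jobStatus reaches a terminal state."""
    while True:
        status = requests.get(job_location, headers=headers).json()["jobDetails"]["jobStatus"]
        print("Job status:", status)
        if status.upper() in TERMINAL_STATES:
            return status
        time.sleep(poll_seconds)

# final_status = wait_for_job(jobLocation, headers)
```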

When the job is completed, we can list the output files and download them.


url = f"{host}/api/saas-batch/jobfilemanager/v1/jobRequests/{jobId}/outputFiles"
response = requests.get(url, headers=headers)
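The shape of the listing response below is an assumption (an items array with a fileName per entry); check the actual payload on your pod. A small helper can turn the listing into download URLs:

```python
# Hypothetical helper: build per-file download URLs from the listing response.
# The "items"/"fileName" field names are assumptions; verify against your pod.
def output_file_urls(listing, base_url):
    return [f"{base_url}/{item['fileName']}" for item in listing.get("items", [])]

# Example usage in the flow above:
# for file_url in output_file_urls(response.json(), url):
#     content = requests.get(file_url, headers=headers).content
#     with open(file_url.rsplit("/", 1)[-1], "wb") as out:
#         out.write(content)
```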

Conclusion

Oracle AI Data Platform provides a solid foundation to extract Fusion data, land it securely in OCI Object Storage or other third-party cloud storage, and prepare it for analytics and AI. This post explored a couple of options to achieve this, each suited to different use cases. For standard batch-style exports, the AIDP BICC API is the recommended default. For use cases that demand more flexibility with data models or a higher frequency, the Fusion Business Objects batch REST APIs are a strong option. Choose based on data volume and operational needs, then leverage the AIDP workbench to standardize, enrich, and govern the data for downstream consumption.

If you need a curated, high-volume, operationally proven approach, BI Cloud Connector (BICC) remains the go-to option, especially when you want scheduled bulk exports (full or incremental) and a straightforward path into Spark for transformation and publishing to Object Storage in formats like JSON/Parquet. If you need API-driven extraction with JSON output and fine-grained filtering/field selection, the newer Fusion Business Objects batch REST extraction provides a flexible alternative, particularly for targeted use cases and modern integration patterns.

Whichever path you choose, the end goal is the same: land Fusion data securely in OCI Object Storage, standardize and enrich it in Oracle AI Data Platform, and publish it into your lakehouse/warehouse layer so it’s ready for consumption by downstream analytics, ML pipelines, and governance controls. As you productionize, prioritize enterprise fundamentals, such as credential management (OCI Vault/secret scopes), least-privilege access, encrypted storage, reliable scheduling/monitoring, and repeatable incremental load strategies, so your Fusion data supply chain stays both scalable and compliant.