In part 2 of this blog series, we explored how to load and transform OCI FOCUS cost data in Autonomous Database (ADW) and prepare it for AI based analysis. Now, we take the next step – using Oracle’s in-database machine learning to automatically detect cost anomalies using unsupervised learning.
Why Anomaly Detection Matters
Manual tracking of cost spikes is not scalable. Unusual resource behavior—whether from a misconfiguration, unauthorized usage, or budget overrun can go unnoticed. With anomaly detection, we can flag these patterns early and act faster. Using Oracle Autonomous Database, we can do this entirely in SQL without needing external tools or complex pipelines.
Step 1: Create a Training View from Historical Data
CREATE OR REPLACE VIEW COST_USAGE_TRAIN_V AS
SELECT
STANDARD_HASH(REGION || '|' || RESOURCETYPE || '|' || TO_CHAR(TRUNC(ChargePeriodStart, 'IW'), 'YYYYIW'), 'SHA256') AS RECORD_ID,
REGION,
RESOURCETYPE,
SUM(NVL(UsageQuantity, 0)) AS TOTAL_USAGE_QUANTITY,
SUM(NVL(EffectiveCost, 0)) AS TOTAL_EFFECTIVE_COST
FROM
OCI_COST_DATA
WHERE
CAST(ChargePeriodStart AS DATE) BETWEEN
ADD_MONTHS(TRUNC(SYSDATE, 'MM'), 6) AND
ADD_MONTHS(TRUNC(SYSDATE, 'MM'), 1)
GROUP BY
REGION,
RESOURCETYPE,
TRUNC(ChargePeriodStart, 'IW');
Purpose: This creates a normalized, hashed identifier for each record and ensures the model is trained on recent, relevant data (last 6 months, excluding the current month).
Step 2: Train the ExpectationMaximization Model
DECLARE
v_settings DBMS_DATA_MINING.SETTING_LIST;
BEGIN
v_settings('ALGO_NAME') := 'ALGO_EXPECTATION_MAXIMIZATION';
v_settings('PREP_AUTO') := 'ON';
v_settings('EMCS_OUTLIER_RATE') := '0.2'; Flag top 20% as anomalies
DBMS_DATA_MINING.CREATE_MODEL2(
MODEL_NAME => 'OCI_COST_ANOMALY_MODEL',
MINING_FUNCTION => 'CLASSIFICATION',
DATA_QUERY => 'SELECT * FROM COST_USAGE_TRAIN_V',
CASE_ID_COLUMN_NAME => 'RECORD_ID',
SET_LIST => v_settings,
TARGET_COLUMN_NAME => NULL
);
END;
Purpose: Trains an unsupervised anomaly detection model on the last 5 month cost/usage view.
For more details on building and managing models in Autonomous Database, refer to the page.
Key Points:
- The model is unsupervised (no target column).
- The EM algorithm is wellsuited for identifying unusual patterns in multivariate data.
- The `EMCS_OUTLIER_RATE` parameter controls the sensitivity.
Step 3: Score the Latest Cost Data
CREATE OR REPLACE VIEW COST_USAGE_SCORE_V AS
SELECT
STANDARD_HASH(
REGION || '|' || RESOURCETYPE || '|' || TO_CHAR(TRUNC(ChargePeriodStart, 'IW'), 'YYYYIW'),
'SHA256'
) AS RECORD_ID,
REGION,
RESOURCETYPE,
TO_CHAR(ResourceName) AS RESOURCE_NAME,
ChargeDescription,
oci_CompartmentName,
ResourceId,
SUM(NVL(UsageQuantity, 0)) AS TOTAL_USAGE_QUANTITY,
SUM(NVL(EffectiveCost, 0)) AS TOTAL_EFFECTIVE_COST
FROM
OCI_COST_DATA
WHERE
ChargePeriodStart >= TRUNC(SYSDATE) 30
GROUP BY
REGION,
RESOURCETYPE,
TO_CHAR(ResourceName),
ChargeDescription,
oci_CompartmentName,
ResourceId,
TRUNC(ChargePeriodStart, 'IW');
Purpose: Generates a scoring dataset from the most recent 30 days, in the same format as the training data, enabling the model to detect current anomalies.
Step 4: Extract and Rank Anomalies
CREATE OR REPLACE VIEW EM_COST_ANOMALOUS_RESULTS AS
SELECT
FROM (
SELECT
RECORD_ID,
prediction(OCI_COST_ANOMALY_MODEL USING ) AS ANOMALOUS,
ROUND(prediction_probability(OCI_COST_ANOMALY_MODEL, '0' USING ), 2) AS PROB_ANOMALOUS,
REGION,
RESOURCETYPE,
RESOURCE_NAME,
ResourceId,
ChargeDescription,
oci_CompartmentName,
TOTAL_USAGE_QUANTITY,
TOTAL_EFFECTIVE_COST,
RANK() OVER (
ORDER BY prediction_probability(OCI_COST_ANOMALY_MODEL, '0' USING ) DESC
) AS RNK
FROM COST_USAGE_SCORE_V
)
WHERE RNK <= 10
ORDER BY PROB_ANOMALOUS DESC;
The view returns the top 10 most anomalous records, with a probability score indicating the likelihood of being an outlier.
SELECT * FROM EM_COST_ANOMALOUS_RESULTS;
Purpose: Fetches the top flagged anomalies for further analysis.
Step 5: Drill Down into Specific Resources
SELECT
TO_CHAR(ChargePeriodStart, 'YYYYMM') AS USAGE_MONTH,
ResourceId,
oci_CompartmentName,
ResourceType,
ServiceName,
ChargeDescription,
SUM(NVL(UsageQuantity, 0)) AS TOTAL_USAGE_QUANTITY,
SUM(NVL(EffectiveCost, 0)) AS TOTAL_EFFECTIVE_COST
FROM
OCI_COST_DATA
WHERE
ResourceId = resource_ocid_value --replace it with resource OCID value
AND ChargePeriodStart >= ADD_MONTHS(TRUNC(SYSDATE, 'MM'), 6)
GROUP BY
TO_CHAR(ChargePeriodStart, 'YYYYMM'),
ResourceId,
oci_CompartmentName,
ResourceType,
ServiceName,
ChargeDescription
ORDER BY
ChargeDescription, USAGE_MONTH DESC;
Purpose: Step 4 provided a list of anomalies based on the trained model. In this step, you will dive deeper into each flagged resource to analyze its cost and usage patterns over the last 6 months. This helps understand the context behind it.
Conclusion
This workflow enables you to proactively detect unusual cost or usage patterns in OCI, empowering FinOps teams to investigate and remediate potential issues before they escalate. By leveraging Autonomous Database and builtin machine learning, you can automate anomaly detection and drive actionable cost intelligence in your cloud environment.
