In this blog series, we will discuss ways to use Oracle Cloud Infrastructure (OCI) Observability and Management services and apply them to network resources, with examples. Below are three common requests I've received from customers, which we will cover in more detail in this and future posts in the series.
Before we dive into Part One, let's briefly review the relevant OCI services that we'll be covering.
The OCI Audit service automatically records calls to all supported OCI public Application Programming Interface (API) endpoints and logs them to the Audit Log. This includes all API calls made by the OCI Console, the Command Line Interface (CLI), the Software Development Kits (SDKs), and other OCI services. Information in the log events includes:
The Audit Log supports GET, POST, PUT, PATCH, and DELETE actions. These are the five most common Hypertext Transfer Protocol (HTTP) methods for exchanging data with a server, and they map to Create, Read, Update, and Delete (CRUD) operations: POST creates, GET reads, PUT and PATCH update, and DELETE deletes.
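To make the mapping concrete, here is a minimal Python sketch of the conventional HTTP-to-CRUD correspondence. The names `HTTP_TO_CRUD` and `is_change` are made up for illustration and are not part of any OCI tooling; the point is only that, when hunting for a change, the actions worth filtering on are the ones that create, update, or delete.

```python
# Conventional mapping from HTTP method to CRUD operation.
# HTTP_TO_CRUD and is_change are illustrative names, not OCI APIs.
HTTP_TO_CRUD = {
    "POST": "Create",
    "GET": "Read",
    "PUT": "Update",
    "PATCH": "Update",
    "DELETE": "Delete",
}


def is_change(method: str) -> bool:
    """Return True if the HTTP method normally modifies a resource."""
    return HTTP_TO_CRUD.get(method.upper()) in {"Create", "Update", "Delete"}


if __name__ == "__main__":
    for method in ("GET", "PUT", "DELETE"):
        print(method, HTTP_TO_CRUD[method], is_change(method))
```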
See the links below for more detailed information on the OCI Audit service and the Audit Log:
The Audit Log can be checked for changes to all types of OCI resources, but for this example we will focus on a rule change to a security list. Let's say that this morning we were able to establish a Secure Shell (SSH) session to our bastion host over the public Internet from our on-premises location, but now it is not working. We've checked the bastion host instance, and it is running. We've checked our on-premises Internet connection, and it is passing traffic; we can also ping the bastion host's public Internet Protocol (IP) address. We suspect something has changed, but we are not sure what. The steps below outline how to check the OCI Audit Log for a change that may have caused this disruption. First we'll look at finding the change in the Audit Log itself, and then we'll use a JSON Diff tool to highlight what has changed in more complicated scenarios.
As you can see above, our filter has identified a "PUT" action on the security list that is applied to the public subnet where the bastion host resides, so this could be the cause. Note that we also see the user and event time for this "PUT" action, so we can tell who made the change and when, in addition to what was changed.
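If you prefer to work with exported audit events rather than the console, the same filter can be approximated in a few lines of Python. This is only a sketch under assumptions: the field names (`data.request.action`, `data.resourceName`, `data.identity.principalName`, `eventTime`) reflect my understanding of the audit event layout and should be checked against your own log entries, and the sample events, the security list name `Public-Subnet-SL`, and the date are hypothetical.

```python
from typing import Any


def find_updates(events: list[dict[str, Any]], resource_name: str) -> list[dict[str, str]]:
    """Return who/when/what for every PUT action on the named resource."""
    matches = []
    for event in events:
        data = event.get("data", {})
        # Keep only update (PUT) actions; field names are assumed, see lead-in.
        if data.get("request", {}).get("action") != "PUT":
            continue
        if data.get("resourceName") != resource_name:
            continue
        matches.append({
            "when": event.get("eventTime", ""),
            "who": data.get("identity", {}).get("principalName", ""),
            "what": event.get("eventType", ""),
        })
    return matches


# Hypothetical sample events; names, date, and event types are placeholders.
sample_events = [
    {
        "eventType": "GetSecurityList",
        "eventTime": "2024-01-01T19:30:00Z",
        "data": {
            "resourceName": "Public-Subnet-SL",
            "identity": {"principalName": "email@example.com"},
            "request": {"action": "GET"},
        },
    },
    {
        "eventType": "UpdateSecurityList",
        "eventTime": "2024-01-01T19:34:39Z",
        "data": {
            "resourceName": "Public-Subnet-SL",
            "identity": {"principalName": "email@example.com"},
            "request": {"action": "PUT"},
        },
    },
]

if __name__ == "__main__":
    for match in find_updates(sample_events, "Public-Subnet-SL"):
        print(match["when"], match["who"], match["what"])
```

Only the second sample event survives the filter, which mirrors what the console filter shows: one "PUT", with the user and event time attached.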
With the above steps, we've determined that at 19:34:39 UTC, user email@example.com changed the security list for the public subnet in which the bastion host resides. The change restricted the SSH source to 10.0.0.0/8, where it previously allowed all sources (0.0.0.0/0). This could be the reason our SSH session from the Internet is no longer working.
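To see why this change would break SSH from the Internet, we can check whether a client address falls inside each source range using Python's standard `ipaddress` module. The client address below is a hypothetical public IP taken from a documentation range, and `source_allowed` is an illustrative helper, not an OCI function.

```python
import ipaddress


def source_allowed(client_ip: str, source_cidr: str) -> bool:
    """Return True if client_ip falls within the rule's source CIDR."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(source_cidr)


# Hypothetical on-premises public IP (RFC 5737 documentation range).
client_ip = "203.0.113.10"

if __name__ == "__main__":
    print(source_allowed(client_ip, "0.0.0.0/0"))   # previous rule: True
    print(source_allowed(client_ip, "10.0.0.0/8"))  # current rule: False
```

A public address matches 0.0.0.0/0 but not the private 10.0.0.0/8 range, so the updated rule no longer admits our SSH traffic.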
In the previous example, it was easy to identify what was changed because it was in the very first security list rule that we checked (labeled "0"). However, there may be scenarios where the audit log entry contains so much data that manually clicking through each item to spot the change takes a long time. For example, if our security list had 100 rules and the rule that was changed was the last one, you would have to click on all 100 rules, in both the current and previous sections, before you saw the one that changed. For this scenario, you can use a JSON Diff tool, which quickly shows what has changed between two JSON outputs. JSON Diff tools are typically available online for free. For the example below I am using one from extendsclass.com, but you can use another if you prefer. We will copy the "current" section from our audit log and paste it into the JSON Diff tool, then do the same with the "previous" section, and the tool will show us exactly which part of the JSON output differs between the two.
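If you would rather not paste audit data into an online tool, the same idea can be sketched locally in Python. This is a minimal recursive comparison, not a replacement for a full diff tool: it compares list items by position, and the field names in the sample (`ingressSecurityRules`, `source`, `protocol`) and the 100-rule payload are illustrative stand-ins for the "previous" and "current" sections of an audit log entry.

```python
import copy
import json
from typing import Any


def json_diff(previous: Any, current: Any, path: str = "") -> list[str]:
    """Recursively list differences between two parsed JSON values."""
    changes: list[str] = []
    if isinstance(previous, dict) and isinstance(current, dict):
        for key in sorted(set(previous) | set(current)):
            child = f"{path}.{key}" if path else key
            if key not in previous:
                changes.append(f"added {child}: {json.dumps(current[key])}")
            elif key not in current:
                changes.append(f"removed {child}: {json.dumps(previous[key])}")
            else:
                changes.extend(json_diff(previous[key], current[key], child))
    elif isinstance(previous, list) and isinstance(current, list):
        # Lists are compared position by position.
        for i in range(max(len(previous), len(current))):
            child = f"{path}[{i}]"
            if i >= len(previous):
                changes.append(f"added {child}: {json.dumps(current[i])}")
            elif i >= len(current):
                changes.append(f"removed {child}: {json.dumps(previous[i])}")
            else:
                changes.extend(json_diff(previous[i], current[i], child))
    elif previous != current:
        changes.append(f"changed {path}: {json.dumps(previous)} -> {json.dumps(current)}")
    return changes


# Hypothetical security list with 100 rules; only the last rule differs.
previous = {
    "ingressSecurityRules": [
        {"protocol": "6", "source": "0.0.0.0/0", "description": f"rule {i}"}
        for i in range(100)
    ]
}
current = copy.deepcopy(previous)
current["ingressSecurityRules"][99]["source"] = "10.0.0.0/8"

if __name__ == "__main__":
    for line in json_diff(previous, current):
        print(line)
```

For this sample the script reports a single change at `ingressSecurityRules[99].source`, so there is no need to click through the other 99 rules.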
Stay tuned for part two of this series, where I will go over how to set up a notification for this kind of change so that you are notified proactively when it happens.