All Your (Python) Logs Are Belong to Us

April 1, 2024 | 6 minute read
KC Flynn
Senior Cloud Engineer

Any good application should have logs. Network logs, audit logs, access logs, error logs, system logs, etc. They're so common that one would think they grow on trees (yes, this was a logging joke). Many applications send their log data to standard files in a directory structure, such as /var/log/foo.log, but these are the days of cloud-native applications. Modern applications often run in ephemeral VM instances, containers, and serverless functions. There are tools available to make logging in these applications easier, but I prefer point-to-point communication whenever possible. The fewer layers between my logs and their destination, the better.

The OCI Logging service aggregates logs into OCI for storage and query. Logs in OCI fall into three categories. First are audit logs, which are located in the _Audit log group within each compartment. Second are service logs, which can be enabled to record activity for supported services inside OCI; an example would be the Activity Stream for Oracle Integration Cloud. Finally, custom logs cover all other use cases. Custom logs can be mapped to agent configurations that collect logs from compute instances and forward them into OCI Logging. They can also be populated via one of the supported OCI SDKs or the REST API, which requires no agent.

As someone who enjoys running containerized services via Oracle Kubernetes Engine, Container Instances, or occasionally just via podman on an Oracle Linux instance, log storage is something I've spent considerable time thinking about. There are tools that can be implemented to control logging from various sources, but I generally prefer to use native capabilities before adding new moving parts to any system. In this case, we will use the Python standard logging library to get logs from our application into OCI Logging.

The Python logging library has several standard classes that it uses to take a log message and write it to the chosen destination. For our purposes, the class type that we're going to use is the Handler. There are several handlers available via the standard library, and we're going to subclass one to take advantage of its existing internal logic. A diagram of the complete log event flow in the Python logging library, taken from the Python documentation, is shown below.

Python logging library log event flow

The diagram below gives a few use cases that are easily implemented once the logs reside in the Logging service. Log data can be sent to Streaming to be consumed by one or more endpoints, such as a SIEM. Log data can be stored in Object Storage to meet data preservation requirements. Finally, we can query the OCI Logging data directly for further analysis.

Logging flow diagram of sending logs to OCI from container
End to end log flow example
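
As a quick illustration of that last use case, here is a minimal sketch of querying log data with the OCI SDK's log search client. The OCIDs below are placeholders, and the exact query syntax should be checked against the Logging Query Language documentation.

from datetime import datetime, timedelta, timezone

import oci

# Hypothetical OCIDs; replace with your own compartment, log group, and log.
COMPARTMENT_OCID = "ocid1.compartment.oc1..example"
LOG_GROUP_OCID = "ocid1.loggroup.oc1..example"
LOG_OCID = "ocid1.log.oc1..example"

config = oci.config.from_file()
search_client = oci.loggingsearch.LogSearchClient(config)

# Search the last hour of entries in a single log.
now = datetime.now(timezone.utc)
details = oci.loggingsearch.models.SearchLogsDetails(
    time_start=now - timedelta(hours=1),
    time_end=now,
    search_query=f'search "{COMPARTMENT_OCID}/{LOG_GROUP_OCID}/{LOG_OCID}"',
)
response = search_client.search_logs(details)
for result in response.data.results or []:
    print(result.data)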

To send logs to OCI Logging, we will add logic to a Python logging Handler using the OCI Python SDK. To start, we will subclass the abstract class BufferingHandler to take advantage of batching when uploading our log data. Once the data is uploaded, we can perform other operations on it as well.

# Imports used by the handler and by its flush method shown below.
from datetime import UTC, datetime
from logging.handlers import BufferingHandler
from uuid import uuid4

from oci import loggingingestion as oci_log


class OciLoggingHandler(BufferingHandler):
    def __init__(self, log_ocid: str, config: dict, capacity: int):
        super().__init__(capacity)
        self.log_ocid = log_ocid
        self.client = oci_log.LoggingClient(config)

We see the initialization variables in the __init__ definition; log_ocid is the OCID of the OCI Log we want to write to (the log needs to be created first if it does not exist). The config contains authentication information that is used to create the client object. Finally, capacity is the buffer size for the handler. When the buffer is full, the handler will flush it, sending all log entries to OCI. This helps manage the number of requests made to the target log and takes advantage of the SDK's batch entry feature. We can fall back on the BufferingHandler logic, called via the super function, to send logs to the buffer and check when the buffer needs to be flushed. We will next need to define the flush method on the OciLoggingHandler.
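
As a quick sketch of constructing those arguments (the log OCID and capacity below are placeholders), a config loaded from the standard ~/.oci/config file can be passed straight through; on a compute instance you could instead extend the handler to accept a signer for instance principal authentication.

import oci

# Load credentials from ~/.oci/config (DEFAULT profile).
config = oci.config.from_file()

# Hypothetical log OCID; the log must already exist in a log group.
log_ocid = "ocid1.log.oc1..example"

# Flush to OCI Logging every 50 records.
handler = OciLoggingHandler(log_ocid, config, capacity=50)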

The methods implemented by BufferingHandler will handle adding log entries to the buffer and checking when the buffer is full, but we will need to create the flush method. This is where the magic will happen, and where we will utilize the OCI SDK.

def flush(self):
    entries = []
    for entry in self.buffer:
        entries.append(oci_log.models.LogEntry(
            data=self.format(entry),
            id=str(uuid4()),
            time=datetime.fromtimestamp(entry.created).astimezone(UTC)
        ))

In this snippet, we create an entries list to hold the log entries (of type LogEntry) that we generate from the data in the buffer. We append each entry to the list after calling the handler's format method on the record, assigning a random UUID to the entry, and converting the record's timestamp to UTC.

After generating our log entries, we add the entries to a LogEntryBatch to send the log entries in batches. If we tried to send the logs individually, we would be looking at significant IO and likely some unhappy neighbors on our network.

details = oci_log.models.PutLogsDetails(
    specversion="1.0",
    log_entry_batches=[oci_log.models.LogEntryBatch(
        defaultlogentrytime=datetime.now(UTC),
        source="Python OCI Logger",
        type="python_application",
        entries=entries
    )]
)

Here, the important bit is that we are adding our entries to the LogEntryBatch. The source and type are configurable as needed and will be included in the log data in OCI Logging. Finally, we will send the logs to OCI Logging and clear the buffer, which is expected of the flush method.

self.client.put_logs(self.log_ocid, details)
self.buffer.clear()
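
Since handlers can be called from multiple threads, it is also worth holding the handler's lock while flushing, the same way the standard library's BufferingHandler.flush does. Here is a minimal sketch of the assembled method with that locking added; the empty-buffer guard is an assumption rather than part of the example above.

def flush(self):
    # Hold the handler's I/O lock while building and sending the batch,
    # mirroring logging.handlers.BufferingHandler.flush.
    self.acquire()
    try:
        if not self.buffer:
            return
        entries = [
            oci_log.models.LogEntry(
                data=self.format(entry),
                id=str(uuid4()),
                time=datetime.fromtimestamp(entry.created).astimezone(UTC),
            )
            for entry in self.buffer
        ]
        details = oci_log.models.PutLogsDetails(
            specversion="1.0",
            log_entry_batches=[oci_log.models.LogEntryBatch(
                defaultlogentrytime=datetime.now(UTC),
                source="Python OCI Logger",
                type="python_application",
                entries=entries,
            )],
        )
        self.client.put_logs(self.log_ocid, details)
        self.buffer.clear()
    finally:
        self.release()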

We now have our class, the OciLoggingHandler. Putting the new handler into use is incredibly simple and requires minimal effort to integrate into existing code bases. We simply need to point existing loggers at the new handler instead of (or alongside) their current handlers. In practice, it will look something like this:

formatter = logging.Formatter('%(name)s - %(levelname)s - %(message)s')
handler = OciLoggingHandler(log_ocid, config, buffer_size)
handler.setFormatter(formatter)

log = logging.getLogger('Web')
log.setLevel(level=logging.DEBUG)
log.addHandler(handler)

We initialize our handler, associate a formatter for defining message formats, and then add it to our logger using the addHandler method. That's really all there is to it. Let's generate some data and see what the final result looks like.
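
For example (the messages below are purely illustrative), a few calls against the logger are enough to fill the buffer and trigger a flush to OCI Logging:

for i in range(buffer_size):
    log.info("Handled request %s", i)
log.warning("Cache miss for key %s", "user:42")
log.error("Upstream service returned status %s", 503)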

We start with an empty log:

An empty OCI log
Nothing to see here

Generate some logs and check again:

OCI log containing many entries
Ta-da!

Inspecting an individual entry gives us additional information:

OCI log entry data

Our data is now centralized in the cloud via the OCI Logging service! This was done using the OCI SDK and the Python standard logging library. All that was needed was a log handler class, created by subclassing an abstract handler from the standard library, that sends logs to OCI. Using this handler is as simple as assigning it to the logger used by the application or service. Just like that, we are able to send logs to OCI with a minimum of effort!
