A production pattern for frequent updates with PGX

June 28, 2018 | 3 minute read
Michael J. Sullivan
Principal Cloud Solutions Architect

Oracle’s PGX (Parallel Graph AnalytiX) is an in-memory engine for analyzing Oracle Property Graph data. Its analytics run in parallel, so they can be extremely fast, especially compared to typical database queries. That makes them a good fit for use cases such as recommendations or real-time accept/reject decisions on financial transactions. A list of the built-in analytics can be found at: https://docs.oracle.com/cd/E56133_01/latest/reference/algorithms/index.html.

This blog focuses on how to leverage Oracle PGX’s in-memory parallel analytics in a live production environment where there are frequent updates to the graph data.

There are a few assumptions/requirements we need to get out of the way first:

  • This pattern requires a properly configured Oracle RDBMS to enable transactional auto-sync from the graph store to the in-memory snapshots.
  • The client application has to poll PGX (say, every 60 seconds for this example) to determine whether the data has changed (i.e. “am I using the latest data?”). Implementing this polling is up to the developer.
  • True real-time updates to time-critical data like inventory or pricing should likely be handled outside of any PGX queries. The developer then has to combine the two sources in some sort of mash-up, in either the control or the presentation layer.
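
The polling requirement above could be sketched roughly as follows. This is a minimal illustration, not part of PGX: the class name FreshnessPoller and the two callbacks (isFresh and switchToLatest) are hypothetical stand-ins for whatever your application uses to ask "am I using the latest data?" and to move to the newest snapshot.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: periodically checks freshness and switches when stale. */
public class FreshnessPoller {
    // Assumption: returns true when the application already uses the latest data.
    private final BooleanSupplier isFresh;
    // Assumption: moves the application to the newest snapshot.
    private final Runnable switchToLatest;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public FreshnessPoller(BooleanSupplier isFresh, Runnable switchToLatest) {
        this.isFresh = isFresh;
        this.switchToLatest = switchToLatest;
    }

    /** Runs one polling pass; returns true if a switch was triggered. */
    public boolean pollOnce() {
        if (isFresh.getAsBoolean()) {
            return false;
        }
        switchToLatest.run();
        return true;
    }

    /** Starts polling every periodSeconds (e.g. 60) until stop() is called. */
    public void start(long periodSeconds) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                pollOnce();
            } catch (RuntimeException e) {
                // Swallow and log so that one failed pass does not cancel later runs.
                System.err.println("Freshness check failed: " + e.getMessage());
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    public void stop() {
        scheduler.shutdownNow();
    }
}
```

In a real application the two callbacks would wrap the PGX calls shown in the snapshot notes at the end of this post, and start(60) would match the 60-second example used here.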

Note: This pattern won’t work with non-RDBMS storage.

Now that we have that out of the way, let’s look at a typical flow for updating a graph store and its associated in-memory copy:

STEP 1

Load your graph store into PGX

see: https://docs.oracle.com/cd/E56133_01/2.4.1/reference/api/load-graphs.html

STEP 2

Enable auto refresh with delta updates by setting the following parameter in property-graph/pgx/conf/pgx.conf:

    {
        …
        "allow_user_auto_refresh": true
        …
    }

STEP 3

Configure the auto-refresh parameters for each of your PGX instances:

  • auto_refresh (set to true, as the default is false)
  • update_interval_sec (default = 60 seconds)
  • fetch_interval_sec (default = -1)
  • update_threshold (default = -1)
  • create_edge_id_mapping

[Diagram: the algorithm that these parameters impact]

Additionally, you will want to set the following properties to clean up stale snapshots:

  • release_memory_threshold (default = 85%)
  • memory_cleanup_interval (default = 10 minutes)
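
To make the Step 3 refresh parameters a little more concrete, here is a rough sketch of how update_interval_sec and update_threshold might combine to decide when a new snapshot gets published. This is only my reading of the parameter names, not the actual PGX implementation: the RefreshPolicy class is hypothetical, and so are the assumptions that a value of -1 disables a trigger and that no snapshot is created when there are no pending changes.

```java
/** Hypothetical model of the auto-refresh decision; not PGX code. */
public class RefreshPolicy {
    private final long updateIntervalSec; // assumption: <= 0 disables the time trigger
    private final long updateThreshold;   // assumption: <= 0 disables the count trigger

    public RefreshPolicy(long updateIntervalSec, long updateThreshold) {
        this.updateIntervalSec = updateIntervalSec;
        this.updateThreshold = updateThreshold;
    }

    /**
     * Decides whether a new snapshot should be published.
     *
     * @param secondsSinceLastSnapshot time elapsed since the last snapshot
     * @param pendingChanges           number of fetched but unapplied changes
     */
    public boolean shouldPublish(long secondsSinceLastSnapshot, long pendingChanges) {
        if (pendingChanges <= 0) {
            return false; // assumption: nothing changed, so no new snapshot
        }
        boolean intervalElapsed =
                updateIntervalSec > 0 && secondsSinceLastSnapshot >= updateIntervalSec;
        boolean thresholdReached =
                updateThreshold > 0 && pendingChanges >= updateThreshold;
        return intervalElapsed || thresholdReached;
    }
}
```

With the defaults listed above (60 seconds and -1), this model would publish at most once a minute and only when something actually changed. Check the PGX documentation for the authoritative behavior before relying on it.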

STEP 4 

Update your Property Graph data (i.e. vertices and edges) using one of the following methods:

  • DAL API
  • DAL REST API
  • Direct writes to the RDBMS using SQL

see: https://docs.oracle.com/en/database/oracle/oracle-database/18/spgdg/using-property-graphs-oracle-database.html

STEP 5 (automatic with this pattern)

Create a change set

STEP 6 (automatic with this pattern)

Add updates to the change set

STEP 7 (automatic with this pattern)

Apply the change set to the graph (which in turn creates a snapshot)

STEP 8

Have your application reference the new updated graph (see Note at the end of this blog)

STEP 9 (automatic with this pattern)

Delete the stale graph(s)

That's it! The key is leveraging all the built-in power of the Oracle database. Otherwise, you would need to implement all the steps listed above manually. Pretty sweet if you ask me.

A Note about Snapshots

    • A loaded graph is often a subset of a persistent Property Graph store
    • A loaded graph can have multiple copies called snapshots
    • Snapshots are full copies of the loaded graph
    • Each snapshot has a unique identifier (a UNIX timestamp). Queries return this id in the response, e.g.:
      pgx> pg = session.readGraphWithProperties(cfg)
      pgx> graphs = session.getGraphs()
      pgx> snapshots = session.getAvailableSnapshots(pg)
    • Analytics run against this snapshot
    • Every time a graph is updated (as constrained by the parameters) a new snapshot is created
    • Applications are responsible for switching to the most current snapshot using the DAL API (i.e. there is no messaging mechanism). A typical query might look like the following:
      pgx> if (!pg.isFresh()) {
      pgx>   gm = session.getAvailableSnapshots(pg)
      pgx>   session.setSnapshot(pg, gm[0])
      pgx> }
    • Unused snapshots are automatically deleted from memory
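
Since each snapshot is identified by a UNIX timestamp, an application can pick the most current one by comparing ids rather than relying on the order in which they are listed. The sketch below illustrates that idea with plain Java collections. The SnapshotSelector class and its method names are hypothetical, and the timestamps stand in for whatever ids your PGX session actually reports.

```java
import java.util.Collection;
import java.util.OptionalLong;

/** Hypothetical helper: picks the newest snapshot by its timestamp id. */
public class SnapshotSelector {

    /** Returns the largest (most recent) timestamp, or empty if none exist. */
    public static OptionalLong newest(Collection<Long> snapshotTimestamps) {
        return snapshotTimestamps.stream().mapToLong(Long::longValue).max();
    }

    /**
     * Returns the snapshot id the application should use: the newest
     * available one if it is more recent than the current id, otherwise
     * the current id unchanged.
     */
    public static long choose(long currentId, Collection<Long> available) {
        OptionalLong newest = newest(available);
        if (newest.isPresent() && newest.getAsLong() > currentId) {
            return newest.getAsLong();
        }
        return currentId;
    }
}
```

For example, an application currently on snapshot 100 that sees snapshots 100, 250 and 180 would move to 250, while an application already on the newest id would stay put.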

Many thanks to Albert Godfrind for his insight and knowledge regarding Oracle's graph technology.
