Introduction
This article is applicable to Oracle GoldenGate Classic installations, version 19c and prior.
Oracle GoldenGate (OGG) version 12.2 introduced an automatic heartbeat integrated into the application. In this article we shall demonstrate how to enable the OGG heartbeat in an Oracle to Oracle replication environment, demonstrate the additional information made available via heartbeat, and demonstrate how to manage and modify heartbeat across the replication environment.
The concepts, scripts and information presented in this article are for educational purposes only. They are not supported by Oracle Development or Support, and come with no guarantee or warrant for functionality in any environment other than the test system used to prepare this article. Before applying any changes presented in this article to your environment, you should thoroughly test to assess functionality and performance implications.
Main Article
The figure below depicts my very simple OGG replication environment used for this discussion.
Before we can enable OGG heartbeat we need to identify a database schema for the associated tables and views. In my Oracle target database I already have a database user, ggadmin, configured which is used for data apply. However, on my source Oracle database I only have a container user, c##ggadmin, which is used for data capture from a pluggable database. To keep things simple and consistent, I created the database user ggadmin in my source pluggable database, giving the user sufficient privileges to create tables, views, and scheduler jobs.
Now that the source and target database schemas are defined, we can create the heartbeat tables, views, and scheduler jobs.
First, we must set an Oracle GoldenGate GLOBALS parameter that identifies the schema where the heartbeat components will be created. In the source and target OracleGoldenGate installation directory, edit the file GLOBALS and specify the parameter ggschema.
[oracle@centos0ra12 goldengate]$ vi GLOBALS
ggschema ggadmin
~
Save and close the file.
If you have not used Oracle GoldenGate GLOBALS before, this is a special parameter file that defines instance wide runtime options. The file name must be uppercase and its contents are activated in the runtime environment each time GGSCI is started.
The ggschema parameter is used to specify the name of the schema that contains the database objects owned by Oracle GoldenGate.
To create the heartbeat objects in the source database, start GGSCI, execute the dblogin command to establish a database connection to the source Pluggable Database, and then execute the add heartbeattable command.
[oracle@centos0ra12 goldengate $ ./ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 12.2.0.1.1 OGGCORE_12.2.0.1.0_PLATFORMS_151211.1401_FBO
Linux, x64, 64bit (optimized), Oracle 12c on Dec 12 2015 02:56:48
Operating system character set identified as UTF-8.
Copyright (C) 1995, 2015, Oracle and/or its affiliates. All rights reserved.
GGSCI (centos0ra12) 1> dblogin userid ggadmin@pdborcl, password Oracle1
Successfully logged into database PDBORCL.
GGSCI (centos0ra12 as ggadmin@orcl/PDBORCL) 2> add heartbeattable
2016-08-04 11:10:48 INFO OGG-14001 Successfully created heartbeat seed table ["GG_HEARTBEAT_SEED"].
2016-08-04 11:10:48 INFO OGG-14032 Successfully added supplemental logging for heartbeat seed table ["GG_HEARTBEAT_SEED"].
2016-08-04 11:10:48 INFO OGG-14000 Successfully created heartbeat table ["GG_HEARTBEAT"].
2016-08-04 11:10:48 INFO OGG-14033 Successfully added supplemental logging for heartbeat table ["GG_HEARTBEAT"].
2016-08-04 11:10:48 INFO OGG-14016 Successfully created heartbeat history table ["GG_HEARTBEAT_HISTORY"].
2016-08-04 11:10:48 INFO OGG-14023 Successfully created heartbeat lag view ["GG_LAG"].
2016-08-04 11:10:48 INFO OGG-14024 Successfully created heartbeat lag history view ["GG_LAG_HISTORY"].
2016-08-04 11:10:48 INFO OGG-14003 Successfully populated heartbeat seed table with [ORCL].
2016-08-04 11:10:49 INFO OGG-14004 Successfully created procedure ["GG_UPDATE_HB_TAB"] to update the heartbeat tables.
2016-08-04 11:10:49 INFO OGG-14017 Successfully created procedure ["GG_PURGE_HB_TAB"] to purge the heartbeat history table.
2016-08-04 11:10:49 INFO OGG-14005 Successfully created scheduler job ["GG_UPDATE_HEARTBEATS"] to update the heartbeat tables.
2016-08-04 11:10:49 INFO OGG-14018 Successfully created scheduler job ["GG_PURGE_HEARTBEATS"] to purge the heartbeat history table.
GGSCI (centos0ra12 as ggadmin@orcl/PDBORCL) 3>
In the above command execution we took the defaults of the add heartbeattable command. We could have overridden the defaults by adding command line options:
ADD HEARTBEATTABLE [, FREQUENCY number in seconds] [, RETENTION_TIME number in days] | [, PURGE_FREQUENCY number in days]
FREQUENCY
Specifies how often the heartbeat seed table and heartbeat table are updated. For example, how frequently heartbeat records are generated. The default is 60 seconds.
RETENTION_TIME
Specifies when heartbeat entries older than the retention time in the history table are purged. The default is 30 days.
PURGE_FREQUENCY
Specifies how often the purge scheduler is run to delete table entries that are older than the retention time from the heartbeat history . The default is 1 day.
To create the heartbeat objects in the target database, execute the same GGSCI commands:
GGSCI (centos0ra12) 6> dblogin userid ggadmin@pdbnoreast, password Oracle1
Successfully logged into database PDBNOREAST.
GGSCI (centos0ra12 as ggadmin@orcl/PDBNOREAST) 7> add heartbeattable
2016-08-04 12:37:53 INFO OGG-14001 Successfully created heartbeat seed table ["GG_HEARTBEAT_SEED"].
2016-08-04 12:37:53 INFO OGG-14032 Successfully added supplemental logging for heartbeat seed table ["GG_HEARTBEAT_SEED"].
2016-08-04 12:37:53 INFO OGG-14000 Successfully created heartbeat table ["GG_HEARTBEAT"].
2016-08-04 12:37:53 INFO OGG-14033 Successfully added supplemental logging for heartbeat table ["GG_HEARTBEAT"].
2016-08-04 12:37:53 INFO OGG-14016 Successfully created heartbeat history table ["GG_HEARTBEAT_HISTORY"].
2016-08-04 12:37:54 INFO OGG-14023 Successfully created heartbeat lag view ["GG_LAG"].
2016-08-04 12:37:54 INFO OGG-14024 Successfully created heartbeat lag history view ["GG_LAG_HISTORY"].
2016-08-04 12:37:54 INFO OGG-14003 Successfully populated heartbeat seed table with [ORCL].
2016-08-04 12:37:54 INFO OGG-14004 Successfully created procedure ["GG_UPDATE_HB_TAB"] to update the heartbeat tables.
2016-08-04 12:37:54 INFO OGG-14017 Successfully created procedure ["GG_PURGE_HB_TAB"] to purge the heartbeat history table.
2016-08-04 12:37:54 INFO OGG-14005 Successfully created scheduler job ["GG_UPDATE_HEARTBEATS"] to update the heartbeat tables.
2016-08-04 12:37:54 INFO OGG-14018 Successfully created scheduler job ["GG_PURGE_HEARTBEATS"] to purge the heartbeat history table.
GGSCI (centos0ra12 as ggadmin@orcl/PDBNOREAST) 8>
If any Oracle GoldenGate Groups were running when you executed the add heartbeattable command, they must be restarted. Heartbeat is now active and the heartbeat data is automatically captured and replicated; you do not need to make any modifications to the OGG parameter files.
Since we took the defaults, every 60 seconds an update is performed against the gg_heartbeat_seed table. This record is captured and replicated to the target tables gg_heartbeat_seed, gg_heartbeat, and gg_heartbeat_history. The target table gg_heartbeat contains the data from the last heartbeat record processed by Replicat while the table gg_heartbeat_history contains all hearbeat records.
The source side gg_heartbeat_seed table records the local (source) database name and the timestamp of the last heartbeat.
LOCAL_DATABASE | HEARTBEAT_TIMESTAMP | REMOTE_DATABASE | INCOMING_EXTRACT |
ORCL | 04-AUG-16 04.58.52.111031000 PM | ||
INCOMING_ROUTING_PATH | INCOMING_REPLICAT | INCOMING_HEARTBEAT_TS | INCOMING_EXTRACT_TS |
INCOMING_ROUTING_TS | INCOMING_REPLICAT_TS | OUTGOING_EXTRACT | OUTGOING_ROUTING_PATH |
OUTGOING_REPLICAT | OUTGOING_HEARTBEAT_TS | OUTGOING_EXTRACT_TS | OUTGOING_ROUTING_TS |
OUTGOING_REPLICAT_TS |
The target side gg_heartbeat table contains the last heartbeat record, with additional information added by every Oracle GoldenGate Group in the replication stream.
LOCAL_DATABASE | HEARTBEAT_TIMESTAMP | REMOTE_DATABASE | INCOMING_EXTRACT |
ORCL | 04-AUG-16 05.34.52.379802000 PM | ORCL | ETPC |
INCOMING_ROUTING_PATH | INCOMING_REPLICAT | INCOMING_HEARTBEAT_TS | INCOMING_EXTRACT_TS |
PTPC | RTPC | 04-AUG-16 05.34.52.379802000 PM | 04-AUG-16 05.34.53.561655000 PM |
INCOMING_ROUTING_TS | INCOMING_REPLICAT_TS | OUTGOING_EXTRACT | OUTGOING_ROUTING_PATH |
04-AUG-16 05.34.54.961135000 PM | 04-AUG-16 05.34.57.632114000 PM | ||
OUTGOING_REPLICAT | OUTGOING_HEARTBEAT_TS | OUTGOING_EXTRACT_TS | OUTGOING_ROUTING_TS |
OUTGOING_REPLICAT_TS | |||
The target side gg_heartbeat_history table contains all retained heartbeat records.
HEARTBEAT_RECEIVED_TS | REMOTE_DATABASE | INCOMING_EXTRACT | INCOMING_REPLICAT |
04-AUG-16 04.48.40.927614000 PM | ORCL | ETPC | RTPC |
04-AUG-16 04.48.40.931696000 PM | ORCL | ETPC | RTPC |
04-AUG-16 04.48.40.932235000 PM | ORCL | ETPC | RTPC |
04-AUG-16 04.48.40.933346000 PM | ORCL | ETPC | RTPC |
04-AUG-16 04.48.40.934062000 PM | ORCL | ETPC | RTPC |
04-AUG-16 04.49.09.943003000 PM | ORCL | ETPC | RTPC |
04-AUG-16 04.50.10.073445000 PM | ORCL | ETPC | RTPC |
04-AUG-16 04.51.10.209285000 PM | ORCL | ETPC | RTPC |
04-AUG-16 04.52.09.107407000 PM | ORCL | ETPC | RTPC |
INCOMING_HEARTBEAT_TS | INCOMING_EXTRACT_TS | INCOMING_REPLICAT_TS | OUTGOING_EXTRACT |
04-AUG-16 04.43.52.075803000 PM | 04-AUG-16 04.43.56.971288000 PM | 04-AUG-16 04.48.38.295487000 PM | |
04-AUG-16 04.44.52.284489000 PM | 04-AUG-16 04.44.54.094577000 PM | 04-AUG-16 04.48.38.295487000 PM | |
04-AUG-16 04.45.52.094065000 PM | 04-AUG-16 04.45.53.226088000 PM | 04-AUG-16 04.48.38.295487000 PM | |
04-AUG-16 04.46.52.105854000 PM | 04-AUG-16 04.46.53.427845000 PM | 04-AUG-16 04.48.38.295487000 PM | |
04-AUG-16 04.47.52.097313000 PM | 04-AUG-16 04.47.53.552932000 PM | 04-AUG-16 04.48.38.295487000 PM | |
04-AUG-16 04.48.52.072194000 PM | 04-AUG-16 04.48.53.682091000 PM | 04-AUG-16 04.48.57.412599000 PM | |
04-AUG-16 04.49.52.265018000 PM | 04-AUG-16 04.49.53.807273000 PM | 04-AUG-16 04.49.57.688499000 PM | |
04-AUG-16 04.50.52.130920000 PM | 04-AUG-16 04.50.53.953876000 PM | 04-AUG-16 04.50.57.904505000 PM | |
04-AUG-16 04.51.52.097373000 PM | 04-AUG-16 04.51.53.050948000 PM | 04-AUG-16 04.51.57.085295000 PM | |
OUTGOING_ROUTING_PATH | OUTGOING_REPLICAT | OUTGOING_HEARTBEAT_TS | OUTGOING_EXTRACT_TS |
OUTGOING_ROUTING_TS | OUTGOING_REPLICAT_TS | ||
On the target, the views gg_lag and gg_lag_history retain information relevant to lag statistics. These views are queried whenever the GGSCI lag command is executed.
Contents of the gg_lag view will be similar to this:
LOCAL_DATABASE | CURRENT_LOCAL_TS | REMOTE_DATABASE | INCOMING_HEARTBEAT_AGE |
ORCL | 04-AUG-16 05.54.02.747524000 PM | ORCL | 70.568669 |
INCOMING_PATH | INCOMING_LAG | OUTGOING_HEARTBEAT_AGE | OUTGOING_PATH |
ETPC ==> PTPC ==> RTPC | 4.896061 | ||
OUTGOING_LAG | |||
Contents of the gg_lag_history table will be similar to this
LOCAL_DATABASE | HEARTBEAT_RECEIVED_TS | REMOTE_DATABASE | INCOMING_HEARTBEAT_AGE |
ORCL | 04-AUG-16 04.48.40.927614000 PM | ORCL | 288.851811 |
ORCL | 04-AUG-16 04.48.40.931696000 PM | ORCL | 228.647207 |
ORCL | 04-AUG-16 04.48.40.932235000 PM | ORCL | 168.83817 |
ORCL | 04-AUG-16 04.48.40.933346000 PM | ORCL | 108.827492 |
ORCL | 04-AUG-16 04.48.40.934062000 PM | ORCL | 48.836749 |
ORCL | 04-AUG-16 04.49.09.943003000 PM | ORCL | 17.870809 |
ORCL | 04-AUG-16 04.50.10.073445000 PM | ORCL | 17.808427 |
ORCL | 04-AUG-16 04.51.10.209285000 PM | ORCL | 18.078365 |
ORCL | 04-AUG-16 04.52.09.107407000 PM | ORCL | 17.010034 |
INCOMING_PATH | INCOMING_LAG | OUTGOING_HEARTBEAT_AGE | OUTGOING_PATH |
ETPC ==> PTPC ==> RTPC | 286.219684 | ||
ETPC ==> PTPC ==> RTPC | 226.010998 | ||
ETPC ==> PTPC ==> RTPC | 166.201422 | ||
ETPC ==> PTPC ==> RTPC | 106.189633 | ||
ETPC ==> PTPC ==> RTPC | 46.198174 | ||
ETPC ==> PTPC ==> RTPC | 5.340405 | ||
ETPC ==> PTPC ==> RTPC | 5.423481 | ||
ETPC ==> PTPC ==> RTPC | 5.773585 | ||
ETPC ==> PTPC ==> RTPC | 4.987922 | ||
OUTGOING_LAG |
The GGSCI lag command is used to obtain lag statistics for running Oracle GoldenGate Groups. On the target server, executing this command for the Replicat returns the following information:
GGSCI (centos0ra12) 2> lag rtpc
2016-08-04 14:03:57 INFO OGG-14054 Lag from heartbeat table requires DBLOGIN.
Sending GETLAG request to REPLICAT RTPC ...
Last record lag 4 seconds.
Low watermark lag: 73.
High watermark lag: 14.
Low watermark position: 3858356.
High watermark position: 3858411.
At EOF, no more records to process.
If I dblogin to the database, I am able to obtain additional lag information from the lag views:
GGSCI (centos0ra12) 5> dblogin userid ggadmin@pdbnoreast, password Oracle1
Successfully logged into database PDBNOREAST.
GGSCI (centos0ra12 as ggadmin@orcl/PDBNOREAST) 6> lag rtpc
Lag Information From Heartbeat Table
LAG AGE FROM TO PATH
5.77s 10m 22.87s ORCL ORCL ETPC ==> PTPC ==> RTPC
Sending GETLAG request to REPLICAT RTPC ...
Last record lag 561 seconds.
Low watermark lag: 76.
High watermark lag: 16.
Low watermark position: 3859364.
High watermark position: 3859418.
Now we are able to see details about lag for the entire replication path. Overall lag as determined by the last record Replicat read from the trail (current process timestamp minus record commit timestamp) is 561 seconds. The lag information from the heartbeat table is showing that this Replicat has not processed a heartbeat record for over 10 minutes and the last heartbeat record processed incurred 5.77 seconds of lag across the replication stream. Details about the replication path, listing each Oracle GoldenGate Group that processed the heartbeat record, are also shown.
You may notice four additional lines of data in the output: Low watermark lag, High watermark lag, Low watermark position, and High watermark position. The watermark position data provides SCN information for data applied by the Replicat at the time the lag command was executed. In the above example, the low watermark position for the set of transactions processed by Replicat was SCN 3859364. Basically this means that all transactions for SCNs less than this number have already been applied to the database. While transactions greater than SCN 3859418, the high watermark position have yet to be applied. Consequently, the lag encountered at the transaction at the low watermark position, SCN 3859364, was 76 seconds (Low watermark lag) and the lag encountered at the high watermark position, SCN 385418, was 16 seconds. So, we can determine that lag is reducing as the Replicat processes its backlogged data.
If you notice above, the last record lag reported by my Replicat was 561 seconds. That's great to know, but where in the replication stream did the lag come from? For that, we can query the gg_heartbeat_history table; however there are a couple of important things to note first:
1. The timestamps recorded are in UTC format.
2. Lag computations require that the source and target server clocks be setup correctly; i.e., regularly synced with a network time service. A negative lag means your target server clock is ahead of the source server; while, and inordinately high lag could mean the source is not synced with the target.
To compute lag at each point in the replication stream, we want to get information for these columns in the gg_heartbeat_history table:
(a) incoming_heartbeat_ts – source side heartbeat timestamp
(b) incoming_extract_ts – timestamp when extract processed the heartbeat record
(c) incoming_routing_ts – timestamp when data pump read the heartbeat record from the extract trail
(d) incoming_replicat_ts – timestamp when replicat read the heartbeat record from the remote trail
(e) heartbeat_received_ts - timestamp when replicat applied the heartbeat record to the target
With this information, we can now compute the following:
(a) heartbeat_received_ts - incoming_heartbeat_ts = total end-to-end lag
(b) incoming_extract_ts - incomning_heartbeat_ts = cdc extract lag
(b) incoming_routing_ts – incoming_extract_ts = data pump read lag
(c) incoming_replicat_ts – incoming_routing_ts = replicat read lag
(d) heartbeat_received_ts – incoming_replicat_ts = replicat apply lag
Then create a database query to return lag across the replication stream in seconds:
set hea on set wrap off set lin 300 column Extract format a9 column Data_Pump format a10 column Replicat format a9 select to_char(incoming_heartbeat_ts,'DD-MON-YY HH24:MI:SSxFF') Source_HB_Ts , incoming_extract Extract , extract (day from (incoming_extract_ts - incoming_heartbeat_ts))*24*60*60+ extract (hour from (incoming_extract_ts - incoming_heartbeat_ts))*60*60+ extract (minute from (incoming_extract_ts - incoming_heartbeat_ts))*60+ extract (second from (incoming_extract_ts - incoming_heartbeat_ts)) Extract_Lag , incoming_routing_path Data_Pump , extract (day from (incoming_routing_ts - incoming_extract_ts))*24*60*60+ extract (hour from (incoming_routing_ts - incoming_extract_ts))*60*60+ extract (minute from (incoming_routing_ts - incoming_extract_ts))*60+ extract (second from (incoming_routing_ts - incoming_extract_ts)) Data_Pump_Read_Lag , incoming_replicat Replicat , extract (day from (incoming_replicat_ts - incoming_routing_ts))*24*60*60+ extract (hour from (incoming_replicat_ts - incoming_routing_ts))*60*60+ extract (minute from (incoming_replicat_ts - incoming_routing_ts))*60+ extract (second from (incoming_replicat_ts - incoming_routing_ts)) Replicat_Read_Lag , extract (day from (heartbeat_received_ts - incoming_replicat_ts))*24*60*60+ extract (hour from (heartbeat_received_ts - incoming_replicat_ts))*60*60+ extract (minute from (heartbeat_received_ts - incoming_replicat_ts))*60+ extract (second from (heartbeat_received_ts - incoming_replicat_ts)) Replicat_Apply_Lag , extract (day from (heartbeat_received_ts - incoming_heartbeat_ts))*24*60*60+ extract (hour from (heartbeat_received_ts - incoming_heartbeat_ts))*60*60+ extract (minute from (heartbeat_received_ts - incoming_heartbeat_ts))*60+ extract (second from (heartbeat_received_ts - incoming_heartbeat_ts)) Total_Lag from ggadmin.gg_heartbeat_history order by heartbeat_received_ts desc;
Which returned the following in my test environment:
This now allows me to see the overall health of my entire replication stream. For example, in the data I see a total lag of 250.241 seconds was recorded at 15:28 on 8 Dec 2017. If this lag exceeds the service level agreement requirements for this environment, I can then do some further analysis to determine why the Replicat had issues due to the 244 second read lag recorded for that process.
In this article we introduced you to the internal Oracle GoldenGate heartbeat, detailed how to enable heartbeat across the replication environment, and provided information about the tables and views that store heartbeat statistics and how to obtain that information via the GGSCI lag command and direct database queries.
For more information on what other articles are available for Oracle GoldenGate please view our index page.
Previous Post
Next Post