X

Best Practices from Oracle Development's A‑Team

Oracle GoldenGate: Understanding OGG-01161 Bad Column Index Error

Introduction

The "OGG-01161 Bad Column Index" error is possibly one of the most reviled and misunderstood Oracle GoldenGate (OGG) error messages ever created. In this article, we shall attempt to demystify this error message.

Main Article

Bad column index errors tend to occur in OGG heterogeneous; i.e., unlike database, environments and most frequently where change control procedures do not require an OGG sign-off for database maintenance tasks. To demonstrate, I have an environment where I replicate a table from Oracle Database 11.2.0.4 to Microsoft SQL Server 2008. The source and target tables are defined as follows:

Oracle SQL Server
create table ogg01161 (
pkey_id number(11) NOT NULL,
first_name varchar2(50),
surname    varchar2(50),
date_added timestamp,
last_modified timestamp,
PRIMARY KEY (pkey_id) using index
);
create table ogg01161 (
pkey_id decimal(11,0) not null,
first_name varchar(50),
surname varchar(50),
date_added datetime,
last_modified datetime,
primary key (pkey_id)
);

 

With these table definitions, data replication runs fine but then all of a sudden Replicat fails with an error:

2014-05-19 13:32:37  ERROR   OGG-01161  Bad column index (5) specified for table EAST.OGG01161, max columns = 5.

Viewing the Replicat discard file, I do not see anything that tells me there is a data issue:

Oracle GoldenGate Delivery for SQL Server process started, group R_ORA12 discard file opened: 2014-05-19 13:20:19.816000 Process Abending : 2014-05-19 13:32:37

Ok, so lets look at the error message again. It says "max columns = 5", which I interpret to mean there are 5 columns defined for this table. My table definition lists 5 columns for this table, so why am I getting this error?

Lets look at the error message again. The first part of the message says "Bad column index (5)". The key word here is "index".

Replicat creates a pre-compiled SQL statement for each table defined as a MAP statement in the parameter file. This statement is created once for each table as it processes data from the OGG Trail by executing a meta data query. When the meta data query is executed, you will see a statement in the Replicat parameter file similar to this:

  map "EAST"."OGG01161", target dbo.OGG01161; Using following columns in default map by name:   pkey_id, first_name, surname, date_added, last_modified

This column map statement is maintained in Replicat's memory space as an array. For non-computer programmers, an array is a variable that holds multiple values of the same type, and is referenced by an index. The first element of an array is always at index number 0.

So in the memory space associated with this Replicat, there will be an array that contains the following data:

Array Index Data Value
0 pkey_id
1 first_name
2 surname
3 date_added
4 last_modified

 

According to this analysis, the max column index is 4, so why is Replicat telling me it is 5?

Let's look at the data coming from the source via the Logdump utility.

When the Replicat failed, it reported its read location at RBA (relative byte address) 4584 in the trail ./dirdat/pe000001. The report file shows:

Reading ./dirdat/pe000001, current RBA 4584

So in Logdump, I open the trail for read, set "detail on" so I can see all of the columnar data, and position to the relative byte address just before the record that caused the failure. Then I can use the scanforheader command to view the failure record.

Logdump 12 >open ./dirdat/pe000001 Current LogTrail is C:\OGG12\dirdat\pe000001 Logdump 13 >ghdr on Logdump 14 >detail data Logdump 15 >pos 4580 Reading forward from RBA 4580 Logdump 16 >sfh ___________________________________________________________________ Hdr-Ind    :     E  (x45)     Partition  :     .  (x04) UndoFlag   :     .  (x00)     BeforeAfter:     A  (x41) RecLength  :   119  (x0077)   IO Time    : 2014/05/19 13:32:28.513.201 IOType     :     5  (x05)     OrigNode   :   255  (xff) TransInd   :     .  (x03)     FormatType :     R  (x52) SyskeyLen  :     0  (x00)     Incomplete :     .  (x00) AuditRBA   :        110       AuditPos   : 150935568 Continued  :     N  (x00)     RecCount   :     1  (x01) 2014/05/19 13:32:28.513.201 Insert               Len   119 RBA 4584 Name: EAST.OGG01161 After  Image:                                             Partition 4   G  s  0000 000a 0000 0000 0000 0000 0003 0001 000b 0000 | ....................  0007 4368 6172 6c65 7300 0200 0900 0000 054a 6f6e | ..Charles........Jon  6573 0003 001f 0000 3230 3134 2d30 352d 3139 3a31 | es......2014-05-19:1  333a 3332 3a32 372e 3430 3532 3136 3030 3000 0400 | 3:32:27.405216000...  1fff ff31 3930 302d 3031 2d30 313a 3030 3a30 303a | ...1900-01-01:00:00:  3030 2e30 3030 3030 3030 3030 0005 0003 0000 4b   | 00.000000000......K Column     0 (x0000), Len    10 (x000a)  0000 0000 0000 0000 0003                          | .......... Column     1 (x0001), Len    11 (x000b)  0000 0007 4368 6172 6c65 73                       | ....Charles Column     2 (x0002), Len     9 (x0009)  0000 0005 4a6f 6e65 73                            | ....Jones Column     3 (x0003), Len    31 (x001f)  0000 3230 3134 2d30 352d 3139 3a31 333a 3332 3a32 | ..2014-05-19:13:32:2  372e 3430 3532 3136 3030 30                       | 7.405216000 Column     4 (x0004), Len    31 (x001f)  ffff 3139 3030 2d30 312d 3031 3a30 303a 3030 3a30 | ..1900-01-01:00:00:0  302e 3030 3030 3030 3030 30                       | 0.000000000 Column     5 (x0005), Len     3 (x0003)  0000 4b                                           | ..K

Logdump shows me the record, which is a source table insert operation and contains data for six table columns. So it appears my source table was altered, which can be verified via sqlplus.

SQL> describe ogg01161;  Name                                      Null?    Type  ----------------------------------------- -------- ----------------------------  PKEY_ID                                   NOT NULL NUMBER(11)  FIRST_NAME                                         VARCHAR2(50)  SURNAME                                            VARCHAR2(50)  DATE_ADDED                                         TIMESTAMP(6)  LAST_MODIFIED                                      TIMESTAMP(6)  MIDDLE_INITIAL                                     CHAR(1)

Reviewing the definition for my tables, I see a column "MIDDLE_INITIAL" was added to the source that does not exist in the target. Now that we know the source of our Replicat failure, we can fix the problem.

The first thing we need to do is to generate a new set of defines for the source. Execute the OGG DEFGEN utility and copy the output to the target location. You may need to update the Replicat parameter file if the defines file name is different that what is specified by the Replicat SOURCEDEFS parameter setting.

Next, we need to determine if this new column data needs to be replicated to the target. If not, we can restart Replicat and the column will be ignored. Otherwise, alter the target table and add the new column with a data type that corresponds with the source and restart Replicat.

Summary

OGG Bad Data Index errors can be very confusing for novice users. In this article we presented details about the error and provided an example for troubleshooting and correcting the error condition.

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.Captcha