OIM Clustering: Keeping separate environments separate

April 17, 2014 | 7 minute read

Oracle Identity Manager 11g incorporates several clustering technologies
to ensure high availability across its different components.
Several of these technologies use multicast to discover other cluster
nodes on the same subnet. For testing and development purposes, it is
common to have multiple distinct OIM environments co-existing on the
same subnet. In that scenario, it is essential that the distinct
environments utilise separate multicast addresses, so that they do not
talk to each other. If they do, they will confuse one another, and many
things can go wrong. This problem is less common with production
environments, since best practice dictates that the production
environment should be on a separate subnet from development and test,
and multicast traffic cannot traverse subnet boundaries without
special configuration.
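
As a rough illustration of the rule above, the following Python sketch takes an inventory of environments and reports any multicast address shared by two or more environments on the same subnet. The environment names, subnets and addresses are invented for the example, and the inventory format is an assumption; nothing here reads a real OIM installation.

```python
import ipaddress
from collections import defaultdict

# Hypothetical inventory: environment name -> (subnet, multicast address).
# All names and addresses below are illustrative, not taken from a real site.
ENVIRONMENTS = {
    "DEV1": ("10.0.1.0/24", "239.192.0.1"),
    "DEV2": ("10.0.1.0/24", "239.192.0.1"),  # same subnet, same address: collision
    "DEV3": ("10.0.1.0/24", "239.192.0.3"),
    "PROD": ("10.0.9.0/24", "239.192.0.1"),  # same address, but a different subnet
}


def find_collisions(environments):
    """Return {(subnet, address): [names]} for addresses shared within a subnet."""
    groups = defaultdict(list)
    for name, (subnet, address) in environments.items():
        # Reject anything outside the multicast range before grouping.
        if not ipaddress.ip_address(address).is_multicast:
            raise ValueError(f"{name}: {address} is not a multicast address")
        groups[(subnet, address)].append(name)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


for (subnet, address), names in find_collisions(ENVIRONMENTS).items():
    print(f"{address} on {subnet} is shared by: {', '.join(names)}")
```

In this made-up inventory only DEV1 and DEV2 are reported. PROD reuses the same address but sits on a different subnet, so it is not flagged.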

Overview of OIM Clustering

Here’s a rough diagram of the clustering components inside OIM:

[Diagram: Quartz Scheduler Cluster; Data Caching Cluster, comprising EclipseLink (11.1.2.0.x and earlier only) and OSCache; Application Server Cluster (WebLogic or WebSphere)]

There are three basic layers of clustering in OIM:

  • Application Server Clustering: This is the
    clustering layer of the underlying Java EE Application Server (Oracle
    WebLogic or IBM WebSphere). This is responsible for replication of the
    JNDI tree, EJBs, HTTP sessions, etc.
  • Data Caching: This provides in-memory caching of
    data to improve performance, while ensuring that database updates made
    on one node are propagated promptly to the others. OIM uses OSCache
    (OpenSymphony Cache) as the underlying technology for this.
  • Scheduler Clustering: This is used to ensure that,
    in a cluster, each execution of a scheduled job runs on only one node.
    Otherwise, if a job is scheduled to start at 9am, every node in the
    cluster might try to start it at the same time, resulting in multiple
    simultaneous executions of that job.

Clustering layers present in older versions only

  • In OIM 11gR1 and the 11gR2 base release, OIM used EclipseLink data
    caching, which included its own multicast clustering layer. From OIM
    11.1.2.1.0 onwards, while EclipseLink is still being used for data
    access, its caching features are no longer used, so this form of
    multicast clustering is no longer present.
  • As well as using JGroups for OSCache, OIM 9.x also used JGroups for a
    couple of additional functions (forcibly stopping scheduled tasks and
    the diagnostic dashboard JMS test). In OIM 11g, JGroups is now used for
    OSCache only.

Underlying technologies used

Different clustering components in OIM use different technologies:

Component: Application Server Cluster
  • Technology: Unicast or Multicast
  • Details: Consult the Application Server documentation.

Component: EclipseLink (OIM 11.1.2.0.x and earlier only)
  • Technology: Multicast for node discovery; T3 JNDI for node-to-node
    communication (WebLogic); RMI for node-to-node communication (WebSphere)
  • Details: Multicast is only used to find other nodes
    in the cluster. With WLS, JNDI connections are opened between the nodes
    for the cache coordination traffic. On WebSphere, RMI is used instead.

Component: OSCache
  • Technology: Multicast using the JGroups package

Component: Quartz Scheduler
  • Technology: Database tables
  • Details: Unlike the other clustering components, Quartz
    does not use direct network communication between the nodes. Database
    tables are used for communication between the nodes of the cluster.
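
To make the idea of coordinating through database tables concrete, here is a deliberately simplified Python sketch using an in-memory SQLite database. It is not Quartz's schema or locking scheme, which this does not attempt to reproduce. The table name `job_firings`, the function names and the job name are all hypothetical. Each node tries to insert a row for a given firing, and the primary key guarantees that only one of them succeeds.

```python
import sqlite3


def make_db():
    """Create a shared table with one row per (job, fire time). Hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE job_firings ("
        " job_name TEXT NOT NULL,"
        " fire_time TEXT NOT NULL,"
        " claimed_by TEXT NOT NULL,"
        " PRIMARY KEY (job_name, fire_time))"
    )
    return conn


def try_claim(conn, job_name, fire_time, node):
    """Return True if this node claimed the firing, False if another node already did."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO job_firings VALUES (?, ?, ?)",
                (job_name, fire_time, node),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key already exists: some other node got there first.
        return False


conn = make_db()
winners = [
    node
    for node in ("node1", "node2", "node3")
    if try_claim(conn, "Example Job", "2014-04-17T09:00", node)
]
print(winners)
```

The nodes never talk to each other directly; the database alone decides which one runs the 9am firing, and the other two simply skip it.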

Relevant Configuration Settings

I’m only going to talk about the OIM-specific clustering settings
here. So I won’t go into the configuration of the WebLogic/WebSphere
clustering layer, only the data cache and scheduler clustering layers.
All configuration relevant to these can be found in the
/db/oim-config.xml file in MDS. So let’s discuss the settings in this
file which are relevant to clustering.

Setting: <cacheConfig clustered="…">
  • Explanation: Must be set to true in a clustered install, and false for a
    single-instance install. This controls whether OSCache operates in a
    clustered mode.

Setting: <cacheConfig>/<xLCacheProviderProps multicastAddress="">
  • Explanation: Multicast address which is used for
    OSCache. (Also used by EclipseLink in versions 11.1.2.0.x and earlier;
    the same address is used for both.) Make sure this address is unique
    for each distinct OIM environment on the same subnet.

Setting: <xLCacheProviderProps>/<properties>
  • Explanation: Can be used to manually override the JGroups configuration
    used by OSCache. Not recommended.

Setting: <schedulerConfig clustered="…">
  • Explanation: Must be set to true in a clustered install, and false for a
    single-instance install.

Setting: <schedulerConfig multicastAddress="…">
  • Explanation: In OIM 9.x, JGroups was used to forcibly
    stop jobs. In OIM 11g, a different mechanism is used instead. This
    configuration setting is a left-over from OIM 9.x, and is now ignored.
    However, to avoid confusion, it is recommended to set this to the same
    multicastAddress as the xLCacheProviderProps above.

Setting: <deploymentConfig>/<deploymentMode>
  • Explanation: In a clustered install, should be set to clustered; in a
    single-instance install, should be set to simple. This is used to
    control whether EclipseLink operates in a clustered mode.

Setting: <SOAConfig>/<username>
  • Explanation: As its name implies, this is the username used by OIM to
    log in to SOA. However, in OIM 11.1.2.0.0 and earlier, it also serves
    an additional purpose: on WebLogic, this username is used by
    EclipseLink clustering for inter-node communication. By default, this
    is weblogic; if you have renamed the weblogic user, you must change it.
    You are free to use another user if you wish, so long as they are a
    member of the Administrators group. (On WebSphere, this user is used
    for OIM-SOA integration only, not for EclipseLink clustering.) To
    change this, see “2.6 Optional: Updating the WebLogic Administrator
    Server User Name in Oracle Enterprise Manager Fusion Middleware Control
    (OIM Only)”. (If step 11 in those steps gives you a permissions error,
    just skip that step.)

Setting: <SOAConfig>/<passwordKey>
  • Explanation: This is the name of the CSF Credential
    which stores the password for the <SOAConfig> user. You should
    never change this setting in oim-config.xml from its default of
    SOAAdminPassword, but you will need to change the corresponding CSF
    entry whenever you change that user’s password.
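
The consistency rules in these settings can be checked mechanically. The Python sketch below runs against a cut-down, made-up fragment: the `<oimConfig>` wrapper element, the exact nesting and the addresses are assumptions for illustration, and the real oim-config.xml exported from MDS contains far more and may use XML namespaces that would need handling. The function name `lint_cluster_settings` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment. Only the element and attribute names
# discussed above are modelled; the wrapper element is invented.
SAMPLE_CONFIG = """
<oimConfig>
  <cacheConfig clustered="false">
    <xLCacheProviderProps multicastAddress="239.192.0.1"/>
  </cacheConfig>
  <schedulerConfig clustered="false" multicastAddress="239.192.0.1"/>
  <deploymentConfig>
    <deploymentMode>clustered</deploymentMode>
  </deploymentConfig>
</oimConfig>
"""


def lint_cluster_settings(xml_text, is_cluster):
    """Return a list of human-readable problems; an empty list means consistent."""
    root = ET.fromstring(xml_text.strip())
    expected_flag = "true" if is_cluster else "false"
    expected_mode = "clustered" if is_cluster else "simple"
    problems = []

    # Both clustered="…" attributes must match the kind of install.
    for tag in ("cacheConfig", "schedulerConfig"):
        element = root.find(tag)
        actual = element.get("clustered") if element is not None else None
        if actual != expected_flag:
            problems.append(f"<{tag} clustered> is {actual!r}, expected {expected_flag!r}")

    # deploymentMode must be "clustered" or "simple" accordingly.
    mode = (root.findtext("deploymentConfig/deploymentMode") or "").strip()
    if mode != expected_mode:
        problems.append(f"<deploymentMode> is {mode!r}, expected {expected_mode!r}")

    # Recommended: the scheduler address mirrors the cache provider address.
    cache_props = root.find("cacheConfig/xLCacheProviderProps")
    scheduler = root.find("schedulerConfig")
    cache_addr = cache_props.get("multicastAddress") if cache_props is not None else None
    sched_addr = scheduler.get("multicastAddress") if scheduler is not None else None
    if cache_addr != sched_addr:
        problems.append(f"multicastAddress differs: cache {cache_addr!r}, scheduler {sched_addr!r}")

    return problems


for problem in lint_cluster_settings(SAMPLE_CONFIG, is_cluster=False):
    print(problem)
```

The sample describes a single-instance install whose deploymentMode was left at clustered, so the check reports that one setting and nothing else.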

What can go wrong

As I’ve mentioned, it is important that you have the correct
clustering configuration for your environment. If you do not, many
things can go wrong. I don’t propose to provide an exhaustive list of
potential problems in this blog post, but just give one example I
recently encountered at a customer site.

This customer was preparing to go live with Oracle Identity Manager
11.1.2.0. As part of their pre-production activities, they needed to
document and test the procedure for periodic change of the weblogic
password. They began their testing by changing the weblogic password
in one of their development environments. Restarting the OIM managed
server, they saw this message multiple times in their WebLogic log: <Authentication of user weblogic failed because of invalid password>. They also found that the WEBLOGIC user in OIM was locked.

What went wrong here? Well, several things were wrong in this environment:

  • They had <SOAConfig>/<username> set to weblogic,
    but they had not updated the SOAAdminPassword credential in CSF to the
    new weblogic password. This customer does not currently use any of the
    OIM functionality which requires SOA, so they normally leave their SOA
    server down, including for this test. You would think therefore that
    the <SOAConfig> settings would not be relevant to them; but, as I have
    pointed out above, they are also used for EclipseLink clustering.
  • Even though their development environments were single-instance
    installs, they all had <deploymentConfig>/<deploymentMode> set to
    clustered instead of simple. As a result, EclipseLink clustering was
    active even though it did not need to be.
  • <cacheConfig>/<xLCacheProviderProps multicastAddress="">
    was set to the same address in multiple development environments on the
    same subnet. As a result, even though these environments were meant to
    be totally separate, they formed a single EclipseLink cluster.

What happened was this: at startup, this environment (let’s call it DEV1)
would initialise EclipseLink clustering, since
<deploymentConfig>/<deploymentMode> was set to clustered. It would then
join the multicast group configured in
<cacheConfig>/<xLCacheProviderProps multicastAddress="">.
At this point, DEV1 became visible to the other development
environments (say DEV2 and DEV3). DEV2 and DEV3 would each try to log in
to DEV1 over T3, using the <SOAConfig>/<username> user (weblogic) and
the SOAAdminPassword password from CSF. However, since the weblogic
password on DEV1 had changed, both DEV2 and DEV3 would receive an invalid
credential error, and DEV1 would log <Authentication of user weblogic failed because of invalid password>. Setting <deploymentConfig>/<deploymentMode> to simple resolved this.

Simon Kissane

