Oracle Identity Manager 11g incorporates several clustering technologies to ensure high availability across its different components. Several of these technologies use multicast to discover other cluster nodes on the same subnet. For testing and development purposes, it is common to have multiple distinct OIM environments co-existing on the same subnet. In that scenario, it is essential that the distinct environments use separate multicast addresses, so that they do not talk to each other. If they do, they will confuse one another, and many things can go wrong. This problem is less common with production environments, since best practice dictates that the production environment should be on a separate subnet from development and test, and multicast traffic cannot traverse subnet boundaries without special configuration.
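Here’s a rough diagram of the clustering components inside OIM:
[Diagram: Quartz Scheduler Cluster, Data Caching Cluster, Application Server Cluster]
There are three basic layers of clustering in OIM: the application server cluster, the data caching cluster, and the Quartz scheduler cluster.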
Different clustering components in OIM use different technologies:
|Application Server Cluster||Unicast or Multicast||Consult Application Server documentation:|
(OIM 18.104.22.168.x and earlier only)
| ||Multicast is only used to find other nodes in the cluster. With WLS, JNDI connections are opened between the nodes for the cache coordination traffic. On WebSphere, RMI is used instead.|
|Quartz Scheduler|| ||Unlike the other clustering components, Quartz does not use direct network communication between the nodes. Database tables are used for communication between the nodes instead.|
I’m only going to talk about the OIM-specific clustering settings here. So I won’t go into the configuration of the WebLogic/WebSphere clustering layer, only the data cache and scheduler clustering layers. All configuration relevant to these can be found in the /db/oim-config.xml file in MDS. So let’s discuss the settings in this file which are relevant to clustering.
|<cacheConfig clustered="...">||Must be set to true in a clustered install, and false for a single-instance install. This controls whether OSCache operates in a clustered mode.|
|<cacheConfig>/<xLCacheProviderProps multicastAddress="...">||Multicast address which is used for OSCache. (Also used by EclipseLink in versions 22.214.171.124.x and earlier; the same address is used for both.) Make sure this address is unique for each distinct OIM environment on the same subnet.|
|<xLCacheProviderProps>/<properties>||Can be used to manually override JGroups configuration used by OSCache. Not recommended.|
|<schedulerConfig clustered="...">||Must be set to true in a clustered install, and false for a single-instance install.|
|<schedulerConfig multicastAddress="...">||In OIM 9.x, JGroups was used to forcibly stop jobs. In OIM 11g, a different mechanism is used instead. This configuration setting is a left-over from OIM 9.x, and is now ignored. However, to avoid confusion, it is recommended to set this to the same multicastAddress as the xLCacheProviderProps above.|
|<deploymentConfig>/<deploymentMode>||In a clustered install, should be set to clustered; in a single instance, should be set to simple. This is used to control whether EclipseLink operates in a clustered mode.|
|<SOAConfig>/<username>||As its name implies, this is the username used by OIM to log in to SOA. However, in OIM 126.96.36.199.0 and earlier, it also serves an additional purpose: on WebLogic, this username is used by EclipseLink clustering for inter-node communication. By default, this is weblogic; if you have renamed the weblogic user, you must change it. You are free to use another user if you wish, so long as they are a member of the Administrators group. (On WebSphere, this user is used for OIM-SOA integration only, not for EclipseLink clustering.) To change this, see “2.6 Optional: Updating the WebLogic Administrator Server User Name in Oracle Enterprise Manager Fusion Middleware Control (OIM Only)”. (If step 11 in those steps gives you a permissions error, just skip that step.)|
|<SOAConfig>/<passwordKey>||This is the name of the CSF Credential which stores the password for the <SOAConfig> user. You should never change this setting in oim-config.xml from its default of SOAAdminPassword, but you will need to change the corresponding CSF entry whenever you change that user’s password.|
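To make the settings above a bit more concrete, here is a minimal Python sketch of a consistency check you could run against a copy of oim-config.xml exported from MDS. It is only an illustration under some assumptions: the element and attribute names come from the table above, but the root element name (oimConfig) and the multicast address in the sample are made up, and the real file may use an XML namespace (which the sketch simply ignores by matching on local names).

```python
import ipaddress
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any XML namespace, e.g. '{urn:x}cacheConfig' -> 'cacheConfig'."""
    return tag.rsplit("}", 1)[-1]


def find_first(root, name):
    """Return the first element (at any depth) with this local name, else None."""
    for elem in root.iter():
        if local_name(elem.tag) == name:
            return elem
    return None


def check_clustering(xml_text, clustered_install):
    """Return a list of problems found in the clustering-related settings."""
    root = ET.fromstring(xml_text)
    problems = []
    expected_flag = "true" if clustered_install else "false"
    expected_mode = "clustered" if clustered_install else "simple"

    # cacheConfig and schedulerConfig should both match the install type.
    for name in ("cacheConfig", "schedulerConfig"):
        elem = find_first(root, name)
        actual = elem.get("clustered") if elem is not None else None
        if actual != expected_flag:
            problems.append(f"{name} clustered is {actual!r}, expected {expected_flag!r}")

    # deploymentMode controls whether EclipseLink clustering is initialised.
    mode = find_first(root, "deploymentMode")
    actual_mode = (mode.text or "").strip() if mode is not None else None
    if actual_mode != expected_mode:
        problems.append(f"deploymentMode is {actual_mode!r}, expected {expected_mode!r}")

    # The cache multicast address should at least be a valid multicast address.
    props = find_first(root, "xLCacheProviderProps")
    addr = props.get("multicastAddress") if props is not None else None
    try:
        if not ipaddress.ip_address(addr or "").is_multicast:
            problems.append(f"multicastAddress {addr!r} is not a multicast address")
    except ValueError:
        problems.append(f"multicastAddress {addr!r} is missing or not an IP address")
    return problems


# Hypothetical single-instance config with deploymentMode left as clustered.
SAMPLE = """
<oimConfig>
  <cacheConfig clustered="false">
    <xLCacheProviderProps multicastAddress="231.121.212.133"/>
  </cacheConfig>
  <schedulerConfig clustered="false" multicastAddress="231.121.212.133"/>
  <deploymentConfig>
    <deploymentMode>clustered</deploymentMode>
  </deploymentConfig>
</oimConfig>
"""

for problem in check_clustering(SAMPLE, clustered_install=False):
    print(problem)
```

The check deliberately treats the three settings together, because a mismatch between them (for example, a single-instance install with deploymentMode still set to clustered) is exactly the kind of thing that is easy to miss when reading the file by eye.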
As I’ve mentioned, it is important that you have the correct clustering configuration for your environment. If you do not, many things can go wrong. I don’t propose to provide an exhaustive list of potential problems in this blog post, but I’ll just give one example I recently encountered at a customer site.
This customer was preparing to go live with Oracle Identity Manager 188.8.131.52. As part of their pre-production activities, they needed to document and test the procedure for periodically changing the weblogic password. They began their testing by changing the weblogic password in one of their development environments. After restarting the OIM managed server, they saw this message multiple times in their WebLogic log: <Authentication of user weblogic failed because of invalid password>. They also found that the WEBLOGIC user in OIM was locked.
What went wrong here? Well, several things were wrong in this environment:
What happened was this: at startup, this environment (let’s call it DEV1) would initialise EclipseLink clustering, since <deploymentConfig>/<deploymentMode> was set to clustered. It would then add itself to the multicast group configured in <cacheConfig>/<xLCacheProviderProps multicastAddress="...">. At that point, DEV1 became visible to the other development environments (say DEV2 and DEV3), which would each try to log in to DEV1 over T3, using the <SOAConfig>/<username> user (weblogic) and the SOAAdminPassword password from CSF. Because the weblogic password on DEV1 had changed, both DEV2 and DEV3 received an invalid credential error, and DEV1 logged <Authentication of user weblogic failed because of invalid password>. Setting <deploymentConfig>/<deploymentMode> to simple resolved this.
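Since the root of this incident was several environments sharing one multicast address, a simple inventory check can catch it early. Here is a minimal Python sketch, assuming you keep a mapping of environment name to the multicastAddress value copied from each environment’s oim-config.xml. The environment names follow the example above; the addresses and the TEST1 entry are made up for illustration.

```python
from collections import defaultdict


def shared_multicast_addresses(env_addresses):
    """Return {address: [environments]} for every address used by more than one environment."""
    by_address = defaultdict(list)
    for env, addr in env_addresses.items():
        by_address[addr.strip()].append(env)
    return {addr: sorted(envs) for addr, envs in by_address.items() if len(envs) > 1}


# Hypothetical inventory: environment name -> multicastAddress from its oim-config.xml.
ENVIRONMENTS = {
    "DEV1": "231.121.212.133",
    "DEV2": "231.121.212.133",
    "DEV3": "231.121.212.133",
    "TEST1": "231.121.212.134",
}

for address, envs in shared_multicast_addresses(ENVIRONMENTS).items():
    print(f"{address} is shared by: {', '.join(envs)}")
```

Any address reported here deserves a closer look if the environments listed against it sit on the same subnet.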