This post is part of a larger series on Oracle Access Manager 11g called Oracle Access Manager Academy. An index to the entire series with links to each of the separate posts is available.
I was recently working in one of my virtual environments, which had three servers and ran OAM 11gR2PS2, though this could happen with pretty much any version of OAM. My browser requested a protected resource and I was presented with the login page as expected, but after logging in I got an HTTP 404 right at the page I wanted to reach. I ran an HTTP trace and realized the browser was caught in a redirect loop that stopped abruptly. I was able to fix it quickly, but I know this can be very puzzling for many people: the cause is simple, yet it is hard to spot. If this has ever happened to you, let me explain what is going on, how to fix it, and how to prevent it, along with other problems that stem from the same root cause.
To get to the crux of it, the root cause was that the system clock on one of the OAM or WebGate servers was out of sync. In my case one server was accidentally set to a different time zone, but the same thing could very well happen if the clocks are off by even a minute.
When the time is out of sync on a WebGate or OAM Server, the time stamp in the OAMAuthnCookie token looks expired. OAM therefore rejects the token and redirects the browser back to the OAM server to get a new one, again and again; hence the infinite loop.
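To make the loop concrete, here is a minimal Python sketch. It does not reproduce OAM's real token format or validation logic: the `Token` class, the 60-second tolerance, and the limit of 20 redirects are all assumptions chosen for illustration. The point is only that a constant clock difference makes every freshly issued token look stale, so retrying never helps.

```python
from dataclasses import dataclass
from typing import Optional

MAX_SKEW_SECONDS = 60   # assumed tolerance, not an actual OAM setting
MAX_REDIRECTS = 20      # hypothetical point at which a browser gives up


@dataclass
class Token:
    issued_at: float  # time stamp written using the issuing server's clock


def token_accepted(token: Token, validator_clock: float) -> bool:
    """Accept the token only if its time stamp is close to the validator's clock."""
    return abs(validator_clock - token.issued_at) <= MAX_SKEW_SECONDS


def simulate_login(issuer_clock: float, validator_clock: float) -> Optional[int]:
    """Return the number of redirects needed, or None if the browser gives up."""
    for redirects in range(MAX_REDIRECTS):
        token = Token(issued_at=issuer_clock)  # a fresh token on every attempt
        if token_accepted(token, validator_clock):
            return redirects
        # Rejected: the browser is sent back for yet another token.
    return None


print(simulate_login(1_000_000.0, 1_000_010.0))  # 10 s apart -> 0
print(simulate_login(1_000_000.0, 1_003_600.0))  # one hour apart -> None
```

The second call mirrors the time zone mistake: the two clocks are an hour apart, so no number of round trips produces a token the validating side will accept.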
The fix is pretty simple. All that is required is to reset the date and time on all the WebGate and OAM Servers so that they are in sync within 60 seconds of each other. You may need to restart the web servers hosting the WebGates, or the OAM Servers, but it is fairly painless to solve.
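Before resetting anything, it helps to know which server is actually off. The sketch below assumes you have already collected one epoch time stamp per host at roughly the same moment (for example by running `date +%s` on each machine); the host names and the helper `find_skewed_hosts` are hypothetical, and the 60-second default comes from the guidance above.

```python
from typing import Dict


def find_skewed_hosts(
    readings: Dict[str, float],
    reference_host: str,
    tolerance: float = 60.0,
) -> Dict[str, float]:
    """Return {host: offset in seconds} for hosts too far from the reference host."""
    reference = readings[reference_host]
    return {
        host: ts - reference
        for host, ts in readings.items()
        if abs(ts - reference) > tolerance
    }


# Hypothetical readings: webgate2 is an hour ahead, as with a wrong time zone.
readings = {
    "oam1": 1_700_000_000.0,
    "webgate1": 1_700_000_012.0,
    "webgate2": 1_700_003_600.0,
}
print(find_skewed_hosts(readings, "oam1"))  # {'webgate2': 3600.0}
```

Comparing against a single reference host keeps the output easy to read. The time it takes to collect the readings adds a little error of its own, so treat offsets near the tolerance with some caution.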
When the system clocks are out of sync on any of the WebGate or OAM Servers, other odd behavior can bubble up, such as:
Setting up NTP (Network Time Protocol) on all the WebGate and OAM Servers is one way to ensure the clocks stay in sync. If you set up OAM for MDC (Multi Data Center), syncing the clocks across the data centers is just as critical. You may think there is some product defect, but OAM is actually doing its job: making certain that a rogue user is not replaying an old token to try to get in. Part of the SSO token in the OAMAuthnCookie is a time stamp, and the WebGates and OAM Servers pay very close attention to it. If a token looks old, OAM automatically rejects it; hence all the symptoms I described.
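If you want a quick reading of how far a single server's clock has drifted, a bare-bones SNTP query is enough. The following Python sketch is only a diagnostic, not a replacement for a real NTP daemon; the default server name `pool.ntp.org`, the five-second timeout, and the function names are assumptions, and the offset is a rough estimate based on the midpoint of the request's round trip.

```python
import socket
import struct
import time

NTP_UNIX_DELTA = 2_208_988_800  # seconds between the NTP epoch (1900) and Unix epoch (1970)


def build_request() -> bytes:
    """Build a 48-byte SNTP client request: LI=0, version=3, mode=3 (client)."""
    return b"\x1b" + 47 * b"\0"


def parse_transmit_time(packet: bytes) -> float:
    """Extract the server's transmit time stamp (bytes 40-47) as Unix seconds."""
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_UNIX_DELTA + fraction / 2**32


def clock_offset(server: str = "pool.ntp.org", timeout: float = 5.0) -> float:
    """Estimate (server time - local time) in seconds; requires UDP port 123 access."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sent = time.time()
        sock.sendto(build_request(), (server, 123))
        packet, _ = sock.recvfrom(512)
        received = time.time()
    # Compare the server's time stamp with the midpoint of the round trip.
    return parse_transmit_time(packet) - (sent + received) / 2
```

Calling `clock_offset()` on each WebGate and OAM Server gives a number you can compare against the 60-second limit. In an environment without outbound access, point `server` at whatever internal time source your servers are meant to follow.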