Thursday, August 23, 2018

KEY_MISMATCH as Last AgentURL is null in "emoms.trc"


Every once in a while, when I'm reviewing a log file or trace file in search of one thing, I come across something else that doesn't look quite right.  As the title of this post suggests, I found a frequently repeating error in our development OEM's "emoms.trc" file, specifically:

[2018-08-23T17:26:18.819-04:00] [EMGC_OMS1] [ERROR:32] [] 
[oracle.sysman.core.pbs.receiver.AbstractOMSHandshake] [tid: [ACTIVE].ExecuteThread: '15' for queue: 
'weblogic.kernel.Default (self-tuning)'] [userId: <anonymous>] [ecid: 
0000MJPAUGO1JfV0U3J7Fs1RMaUD00000B,0:7540:1] [APP: empbs] [URI: /empbs/upload] OMSHandshake 
failed.(AGENT URL = https://<agent host>:3872/emd/main/)(ERROR = KEY_MISMATCH as Last AgentURL is 
null and the Agent Key doesn't exist in repos)

Two things caught my attention in this message.  First, the error came from an agent on a server that I believed had been decommissioned, or at least removed from OEM.  Second, the error was arriving every minute, which seemed like a real waste since the agent shouldn't have existed anymore.

It turns out that not only did the server still exist, but the management agent was up.  Apparently, somewhere between our team shutting down all Oracle processes and removing the targets from OEM, and the final step of decommissioning the hardware, the server suffered a reboot, which caused all processes to start automatically.  From then on, the management agent tried every minute to upload information to the repository, but the repository couldn't find a match for the agent.

Shutting down the management agent stopped the flow of error messages to "emoms.trc" (this is the domain server's file under "/user_projects/domains/GCDomain/…/sysman/log").  To avoid any repeat of this in the future, the quickest fix was to also rename the agent's top-level directory, so the agent can't start automatically after another reboot.
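
If you want to check whether other forgotten agents are doing the same thing, a short script can tally these handshake failures per agent URL.  What follows is only a rough sketch in Python, not anything that ships with OEM: the function names are my own, the pattern is based solely on the message text quoted above, and it assumes the trace file is small enough to read into memory in one go.

```python
import re
import sys
from collections import Counter

# Pattern built from the message text quoted in this post (an assumption:
# other OEM versions may word the message differently). The \s* and \s+
# pieces tolerate a message that wraps across lines.
KEY_MISMATCH = re.compile(
    r"OMSHandshake\s+failed\.\s*"
    r"\(AGENT\s+URL\s*=\s*(?P<url>[^)]*?)\s*\)\s*"
    r"\(ERROR\s*=\s*KEY_MISMATCH"
)


def count_key_mismatches(trace_text):
    """Return a Counter of agent URL -> number of KEY_MISMATCH failures."""
    return Counter(m.group("url") for m in KEY_MISMATCH.finditer(trace_text))


def count_in_file(path):
    """Read a trace file (hypothetical helper) and tally failures per agent."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return count_key_mismatches(fh.read())


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print the noisiest agents first.
    for url, hits in count_in_file(sys.argv[1]).most_common():
        print(f"{hits:6d}  {url}")
```

Saved under a name of your choosing and run with the path to "emoms.trc" as its only argument, it prints one line per agent URL with the number of failed handshakes.  Any agent that shows up here and isn't supposed to exist is a candidate for the same shutdown-and-rename treatment.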