Fixing OVM server “Starting” state

Environment

Oracle VM Server 3.4 installed on Cisco UCS blade

Oracle VM Manager 3.4

Problem Summary

The server is stuck in the “Starting” state in Oracle VM Manager, and refreshing it fails with a server discover conflict (OVMAPI_4021E). The error message lists three possible causes:
1) This server has the same IP address as another server. Please correct that on the servers. Or,
2) The SMBIOS UUID of this server has changed due to a server motherboard change. Please delete the server from the manager and then re-discover it. Or,
3) The SMBIOS UUID has changed due to moving the blade in the chassis and there is an incorrect blade chassis SMBIOS UUID setting which allows the UUID of the server to change with the slot. Please update the blade chassis’s SMBIOS UUID settings and re-discover.

In our situation, we had replaced a failed RAM module. I don’t know why the server started with its original UUID instead of the one from the attached Service Profile.

[root@ovm-manager ~]# ssh admin@localhost -p 10000
admin@localhost’s password:
OVM> list server
Command: list server
Status: Success
Time: 2019-10-10 09:56:20,186 EDT
Data:
id:12:a8:01:4c:e9:ab:e1:11:00:00:00:00:00:00:00:01 name:ovm01.domain.com
id:12:a8:02:4c:e9:ab:e1:11:00:00:00:00:00:00:00:02 name:ovm02.domain.com
id:12:a8:03:4c:e9:ab:e1:11:00:00:00:00:00:00:00:03 name:ovm03.domain.com
id:12:a8:04:4c:e9:ab:e1:11:00:00:00:00:00:00:00:04 name:ovm04.domain.com

OVM> refresh server name=ovm04.domain.com
Command: refresh server name=ovm04.domain.com
Status: Failure
Time: 2019-10-10 10:01:41,685 EDT
JobId: 1570716101312
Error Msg: Job failed on Core: OVMAPI_6000E Internal Error: OVMAPI_4021E Server discover conflict at IP address: 10.10.10.10. The manager already has a server: ovm04.domain.com, at this IP address, with SMBIOS UUID: 12:a8:04:4c:e9:ab:e1:11:00:00:00:00:00:00:00:04 .
But the server now being discovered: unknown, at that same IP address, has a different SMBIOS UUID: 34:b9:15:5d:f0:cd:f2:22:00:00:00:00:00:00:00:05. This can happen in these cases:
1) This server has the same IP address as another server. Please correct that on the servers. Or,
2) The SMBIOS UUID of this server has changed due to a server motherboard change. Please delete the server from the manager and then re-discover it. Or,
3) The SMBIOS UUID has changed due to moving the blade in the chassis and there is an incorrect blade chassis SMBIOS UUID setting which allows the UUID of the server to change with the slot. Please update the blade chassis’s SMBIOS UUID settings and re-discover.
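
To compare the two UUIDs without picking them out of the message by eye, they can be extracted with a small shell helper. This is only a sketch: extract_uuids is a hypothetical name, and it assumes the error text is saved to a file or piped in and that grep supports the -o option (GNU grep does).

```shell
# Hypothetical helper (not part of the OVM tooling): print every
# OVM-style UUID (16 colon-separated hex bytes) found on stdin,
# one per line, in the order they appear.
extract_uuids() {
    grep -Eo '([0-9a-fA-F]{2}:){15}[0-9a-fA-F]{2}'
}
```

Fed the error message above, it prints the UUID the manager has on record first and the UUID the server now reports second.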

Solution

We need to pin the UUID of the OVM Server so that it does not change, irrespective of any hardware or network changes.
1. Get the OVM Server UUID from OVM Manager: choose the “Info” perspective and look under the Advanced section.
2. Add the UUID to the line starting with “fakeuuid” in /etc/ovs-agent/agent.ini on the Oracle VM Server (in our case the line had no UUID set):
# cat /etc/ovs-agent/agent.ini
[server]
fakeuuid= 12:a8:04:4c:e9:ab:e1:11:00:00:00:00:00:00:00:04
3. Restart the ovs-agent service on the OVM Server:
# service ovs-agent restart
4. Refresh the server via the OVM CLI:
 4.1 Log in to the CLI from the manager server:
       # ssh admin@localhost -p 10000
 4.2 List the servers, then refresh the affected one:
       OVM> list server
       OVM> refresh server name=<name of the server as shown by “list server”>

The status of the server will change from “Starting” to “Running”.
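
Step 2 can also be scripted. The following is a minimal sketch, not part of the original procedure: set_fakeuuid is a hypothetical helper that assumes the “fakeuuid” line already exists in the file (as it did here), accepts only the colon-separated format that OVM Manager displays, and keeps a .bak copy before rewriting the line.

```shell
# Hypothetical helper: set the fakeuuid value in an agent.ini-style file.
# Usage: set_fakeuuid <ini-file> <uuid>
set_fakeuuid() {
    ini_file=$1
    uuid=$2

    # Accept only 16 colon-separated hex bytes, as shown by "list server".
    if ! printf '%s\n' "$uuid" | grep -Eq '^([0-9a-fA-F]{2}:){15}[0-9a-fA-F]{2}$'; then
        echo "not an OVM-style UUID: $uuid" >&2
        return 1
    fi

    # This sketch only edits an existing fakeuuid line; it does not add one.
    if ! grep -q '^fakeuuid[[:space:]]*=' "$ini_file"; then
        echo "no fakeuuid line found in $ini_file" >&2
        return 1
    fi

    # Keep a backup, then rewrite the line from the backup into the original.
    cp -p "$ini_file" "$ini_file.bak" || return 1
    sed "s/^fakeuuid[[:space:]]*=.*/fakeuuid=$uuid/" "$ini_file.bak" > "$ini_file"
}
```

On the OVM Server it would be called as set_fakeuuid /etc/ovs-agent/agent.ini 12:a8:04:4c:e9:ab:e1:11:00:00:00:00:00:00:00:04, followed by the ovs-agent restart from step 3.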

Links:
https://k10technical.blogspot.com/2018/10/oracle-vm-server-hangs-with-starting.html
