Saturday, October 13, 2012

Oracle Cloud Control

I recently had occasion to deploy an OVM stack at a customer site.  Initially my intention was to deploy only Enterprise Manager 12c and take advantage of the cloud control features within.  As it turned out, EM12c provides merely a "remote control" function for an existing OVM Manager.  With this in mind, I could not find many advantages to using EM12c for Oracle VM management.

The next hurdle was the hardware.  Of course, the physical servers we deployed the OVS 3.1.1 hypervisor on were not "Oracle Certified": they were Dell PowerEdge M620 blade servers.  (http://www.dell.com/us/enterprise/p/poweredge-m620/pd?~ck=anav)

The customer had selected the Broadcom® 57810S-k Dual Port 10Gb KR blade NDC onboard network adapter as well as the mezzanine Broadcom 5719 SerDes Quad Port 1Gb card, providing a total of six Ethernet ports, two of them 10Gb.

The storage array is an EqualLogic iSCSI SAN.  (Sorry, the exact make / model of the SAN escapes me right now.)

The OVM Server 3.1.1 installation went perfectly well.  We selected the 3rd port for management (the 1st port on the 1Gb mezzanine card), knowing that ports 1 and 2 were to be used for iSCSI and possibly Live Migration.

The blades were discovered into OVM Manager without difficulty.  Upon inspection, however, eth0 was showing as DOWN on all three blade servers.  The iDRAC console showed that the link was UP, but the OS State was reported as DOWN in each case.  I was able to successfully configure eth1, create a single-port bond with the appropriate IP address for the SAN, and discover the storage, just without redundancy at this stage.  Of course this had to be resolved before production.

I did some research and discovered that the OVM Server 3.1.1 hypervisor is based on RHEL / OL 5.7 and that the Broadcom driver (bnx2x) needed an option to force all ports back to the legacy (INTx) interrupt mode.

From: ftp://ftp.supermicro.com/CDR-X9_1.11_for_Intel_X9_platform/Broadcom/Server/Linux/Driver/README.bnx2x.TXT

Driver Parameters
=================

Several optional parameters can be supplied as a command line argument
to the insmod or modprobe command. These parameters can also be set in
modprobe.conf. See the man page for more information.

The optional parameter "int_mode" is used to force using an interrupt mode
other than MSI-X. By default, the driver will try to enable MSI-X if it is
supported by the kernel. In case MSI-X is not attainable, the driver will try
to enable MSI if it is supported by the kernel. In case MSI is not attainable,
the driver will use legacy INTx mode. In some old kernels, it's impossible to
use MSI if device has used MSI-X before and impossible to use MSI-X if device
has used MSI before, in these cases system reboot in between is required.

Set the "int_mode" parameter to 1 as shown below to force using the legacy
INTx mode on all NetXtreme II NICs in the system.

   insmod bnx2x.ko int_mode=1

or

   modprobe bnx2x int_mode=1


The way I achieved this was to add a file into /etc/modprobe.d as follows:

/etc/modprobe.d/bnx2x.conf
options bnx2x int_mode=1
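
For anyone scripting this across several blades, the step above could be sketched roughly as follows.  This is only a sketch: the function name write_bnx2x_options is my own invention, and the target directory is passed in as an argument so it can be tried against a scratch directory first.  Note that the module name on the options line has to match the driver name from the README (bnx2x), otherwise modprobe will not apply the option.

```shell
#!/bin/sh
# Write the bnx2x option file into the directory given as $1.
# (write_bnx2x_options is a hypothetical helper name, not part of OVM.)
write_bnx2x_options() {
    conf_dir="$1"
    # Force legacy INTx interrupt mode, as described in the Broadcom README.
    printf 'options bnx2x int_mode=1\n' > "$conf_dir/bnx2x.conf"
}

# Possible usage on a blade, as root:
#   write_bnx2x_options /etc/modprobe.d
#   modinfo -p bnx2x | grep int_mode   # check that this driver build lists the parameter
#   reboot
```

The modinfo check is there because the option is silently useless if the installed driver build does not know the parameter.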

Reboot the blades and voilà: eth0 shows status UP, ethtool eth0 reports 10Gb speed on the port, and the iDRAC console shows the port as healthy.
I then added eth0 to the bond I created earlier and left it in an active-backup configuration.
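
To double check the result from the command line, something along these lines may help.  It is a rough sketch: the helper names bond_mode and active_slave are made up for illustration, and bond0 is only an assumed bond name (use whatever name the bond was given on your servers).  The helpers simply read the status file that the Linux bonding driver publishes under /proc/net/bonding.

```shell
#!/bin/sh
# Print the bonding mode from a bonding status file ($1),
# e.g. "fault-tolerance (active-backup)".
bond_mode() {
    sed -n 's/^Bonding Mode: //p' "$1"
}

# Print the interface currently carrying traffic in an active-backup bond.
active_slave() {
    sed -n 's/^Currently Active Slave: //p' "$1"
}

# Possible usage on a blade (bond0 is an assumed name):
#   bond_mode /proc/net/bonding/bond0
#   active_slave /proc/net/bonding/bond0
#   ethtool eth0 | grep -i speed       # negotiated link speed
#   grep eth0 /proc/interrupts         # with int_mode=1 the interrupt type should no longer mention MSI-X
```

Pulling the cable (or disabling the switch port) for the active interface and running active_slave again is a simple way to confirm that failover actually happens.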

I was unable to make multipathing work through the GUI.  I suspect that OVM Manager only discovers a single path to storage and does not attempt anything further.  I am confident that the active-backup bond configuration will provide sufficient redundancy for the storage pool, server pool repository and PVM guest LUNs.

I hope this helps anyone who is also struggling with this issue.