EqualLogic PS4100 and ESXi 5.5 VMkernel Setup

Read the first blog post here about Dell MEM and ESXi 5.5

The EqualLogic MEM setup.pl script normally creates a new vSwitch and its iSCSI VMkernel ports, but because that script is not currently compatible with ESXi 5.5 U2 we had to create the connections manually. Below is the setup we used, along with a few differences from the Dell EqualLogic best practice document.

The document states that the heartbeat connection is no longer required for ESXi servers running version 5.1 or above. I had issues with this: if the switch that one of the adapters connected to failed, pings to the storage device would stop, ESXi would report all paths down (APD) for the EqualLogic array, and the host would lose access to the datastore. After adding a heartbeat connection I was able to keep a path active when an adapter failed. Below I cover the settings I have in place and the tests I ran to confirm successful failover in various scenarios.

For reference, the wording in the document is: "The Storage Heartbeat VMkernel port is no longer required for ESX servers running version 5.1 and greater"

vSwitch Setup

  • Create a new standard vSwitch with a VMkernel port on your host and assign two or more network adapters to it (the adapters that connect to the storage network where the EqualLogic sits)
  • Label the VMkernel port “EQL Heartbeat”
  • Assign an IP to the EQL Heartbeat VMkernel port
  • Finish the setup screen
  • Find the vSwitch in the Networking window and select Properties
  • Add a new VMkernel port to the vSwitch; press Add, select VMkernel, label it EQL_1
  • Assign an IP address (same subnet as your heartbeat port), finish the setup
  • Repeat the process and label the new VMkernel port EQL_2; this also needs to be in the same subnet as the other two VMkernel ports
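The same layout can also be sketched with esxcli, for hosts where clicking through the client is impractical. Treat this as a rough outline rather than a tested procedure: the vSwitch name (vSwitch1), the uplinks (vmnic2, vmnic3), the vmk numbers and the 10.10.10.x addresses are all assumed placeholders, not values from this setup. By default the script only prints the commands (DRY_RUN=1) so they can be reviewed before anything is changed.

```shell
#!/bin/sh
# Sketch: build the iSCSI vSwitch and its three VMkernel ports with esxcli.
# Every name and address below is an assumed placeholder.
DRY_RUN="${DRY_RUN:-1}"
VSWITCH="vSwitch1"
UPLINKS="vmnic2 vmnic3"
NETMASK="255.255.255.0"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create one port group and a VMkernel interface with a static IP on it.
# usage: add_vmk <portgroup> <vmk> <ip>
add_vmk() {
    run esxcli network vswitch standard portgroup add --portgroup-name="$1" --vswitch-name="$VSWITCH"
    run esxcli network ip interface add --interface-name="$2" --portgroup-name="$1"
    run esxcli network ip interface ipv4 set --interface-name="$2" --ipv4="$3" --netmask="$NETMASK" --type=static
}

build_vswitch() {
    run esxcli network vswitch standard add --vswitch-name="$VSWITCH"
    for nic in $UPLINKS; do
        run esxcli network vswitch standard uplink add --uplink-name="$nic" --vswitch-name="$VSWITCH"
    done
    # Heartbeat port first, then the two iSCSI ports, as in the steps above.
    add_vmk "EQL Heartbeat" vmk1 10.10.10.11
    add_vmk "EQL_1" vmk2 10.10.10.12
    add_vmk "EQL_2" vmk3 10.10.10.13
}

build_vswitch
```

Note that the dry-run output is not shell-quoted, so the "EQL Heartbeat" label would need quoting if a printed line were pasted back into a shell. Running with DRY_RUN=0 executes the commands directly.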

Once you have done that you should have something that looks similar to this;

[Image: vswitch1]

We need to make a few more changes to this vSwitch;

  • Edit each of the items in the screen shot to have an MTU of 9000 (your switch must also support this!)
  • Edit vSwitch > NIC teaming tab, all adapters should be active
  • Edit EQL Heartbeat > NIC teaming tab, all adapters should be active (this will most likely be grayed out as it inherits its settings from the vSwitch)
  • Edit EQL_1 > NIC teaming tab, under Failover Order tick Override switch failover order, keep one adapter in Active and move the other to Unused
  • Edit EQL_2 > NIC teaming tab, under Failover Order tick Override switch failover order, keep one adapter in Active and move the other to Unused; the active adapter must be different to the one used in EQL_1
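As a rough command-line equivalent of the changes above, the sketch below sets the MTU and the per-port failover order with esxcli, then checks jumbo frames end to end with vmkping. The names vSwitch1, vmk1 to vmk3, vmnic2/vmnic3 and the address 10.10.10.10 (standing in for the group IP) are assumptions for illustration. The commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: jumbo frames plus one active uplink per iSCSI port group.
# vSwitch1, vmk1-3, vmnic2/vmnic3 and 10.10.10.10 are assumed placeholders.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Largest ICMP payload that fits in one frame:
# MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header.
ping_payload() {
    echo $(( $1 - 28 ))
}

configure_teaming() {
    run esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000
    for vmk in vmk1 vmk2 vmk3; do
        run esxcli network ip interface set --interface-name="$vmk" --mtu=9000
    done
    # Only one uplink is listed as active for each port group; the intent is
    # that the other ends up unused (neither active nor standby).
    run esxcli network vswitch standard portgroup policy failover set --portgroup-name=EQL_1 --active-uplinks=vmnic2
    run esxcli network vswitch standard portgroup policy failover set --portgroup-name=EQL_2 --active-uplinks=vmnic3
}

configure_teaming
# -d sets "don't fragment", so the ping fails if any hop lacks jumbo frames.
run vmkping -d -s "$(ping_payload 9000)" 10.10.10.10
```

It is worth confirming the resulting failover order afterwards, either in the NIC teaming tab or with `esxcli network vswitch standard portgroup policy failover get --portgroup-name=EQL_1`.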

[Images: eql_1, eql_2]

iSCSI Software Adapter

If you don’t yet have an iSCSI software adapter, now is the time to add one. Once you have done that, select it and press Properties. Under Network Configuration press Add and select the two VMkernel ports you just configured (EQL_1 and EQL_2).

While you are here, also add your EqualLogic group IP address to the Dynamic Discovery tab.
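For reference, the adapter steps can be outlined with esxcli as well. This is a sketch under assumptions: vmhba33 as the software adapter name, vmk2/vmk3 as the interfaces behind EQL_1 and EQL_2, and 10.10.10.10 as the group IP are all placeholders. Only the two iSCSI ports are bound, not the heartbeat port, matching the steps above. Commands are printed rather than run unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: enable software iSCSI, bind the two iSCSI VMkernel ports and add
# the group IP for dynamic discovery.
# vmhba33, vmk2/vmk3 and 10.10.10.10 are assumed placeholders.
DRY_RUN="${DRY_RUN:-1}"
HBA="vmhba33"
GROUP_IP="10.10.10.10"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

bind_iscsi() {
    run esxcli iscsi software set --enabled=true
    for vmk in vmk2 vmk3; do
        run esxcli iscsi networkportal add --adapter="$HBA" --nic="$vmk"
    done
    # 3260 is the standard iSCSI port.
    run esxcli iscsi adapter discovery sendtarget add --adapter="$HBA" --address="$GROUP_IP:3260"
    run esxcli storage core adapter rescan --adapter="$HBA"
}

bind_iscsi
```

The adapter name differs between hosts, so check it with `esxcli iscsi adapter list` before substituting it into HBA.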

Setup VMware Round Robin

The next steps are documented in the post Dell MEM v1.2 and ESXi 5.5 U2, so I won't cover them again here. Those steps cover the remaining recommendations from Dell.

Testing Failover

To confirm the above settings I ran the following tests. At no point during these tests should you have all paths down.
These tests make the assumption that you have at least two switches, two network adapters and two controllers in your EqualLogic, and that you return to default before running the next test.

  • Remove a network cable from the host (which connects to the storage network)
  • Remove the other network cable
  • Power off 1 switch
  • Power off other switch
  • Disconnect controller
  • Disconnect other controller
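While running through the tests above it helps to see how many paths each volume still has. The helper below is a small sketch that summarises the output of `esxcli storage core path list`, assuming that listing has a `Device:` line and a `State:` line for each path. It reads from stdin so it can be tried against saved output first. A device showing active=0 is the all-paths-down case.

```shell
#!/bin/sh
# Sketch: count active and total paths per device.
# Expects `esxcli storage core path list` output on stdin.
path_summary() {
    awk '
        # "Device: naa.xxx" names the device this path belongs to.
        # ("Device Display Name:" does not match because of the colon.)
        /^ *Device: / { dev = $2 }
        # One "State:" line per path; anything other than "active" is down.
        /^ *State: /  { total[dev]++; if ($2 == "active") up[dev]++ }
        END {
            for (d in total)
                printf "%s active=%d total=%d\n", d, up[d], total[d]
        }'
}

# On a host this would be used as:
#   esxcli storage core path list | path_summary
```

Running it before and after each test makes it easy to spot a volume that lost more paths than expected.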

If you do encounter an APD scenario, double-check your cabling against this diagram from Dell. Along with the missing heartbeat adapter, our cabling also caused a problem, which was resolved with the help of the diagram.

[Image: ps4100-cabling]

 

Dell MEM v1.2 and ESXi 5.5 U2

It would seem that Dell MEM v1.2 will not install on ESXi 5.5 U2 using esxcli. It can apparently be deployed using vMA or VUM, but as we don't use either of these we had to look for an alternative.

We raised the issue with Dell, who pointed us towards the following best practice document:
http://en.community.dell.com/cfs-file/__key/telligent-evolution-components-attachments/13-4491-00-00-20-43-46-01/TR1091-Best-Practices-with-EqualLogic-and-VMware-1.1.pdf?forcedownload=true

In summary they recommend the following.
– Use Round Robin
– Change from 1000 IOs per path to 3 IOs per path
– Change iSCSI Timeout values from default to 60 seconds
– Disable Delayed ACK

Dell have provided some handy scripts to get you started, here they are for ESXi 5.x (they can all be found in the PDF attached earlier in the post so please refer to this for full details);

Set all EqualLogic volumes to Round Robin and set IOPS value to 3 – this must be run on all hosts;

esxcli storage nmp satp set --default-psp=VMW_PSP_RR --satp=VMW_SATP_EQL ; for i in `esxcli storage nmp device list | grep EQLOGIC|awk '{print $7}'|sed 's/(//g'|sed 's/)//g'` ; do esxcli storage nmp device set -d $i --psp=VMW_PSP_RR ; esxcli storage nmp psp roundrobin deviceconfig set -d $i -I 3 -t iops ; done

Set a default so that new EQL volumes will inherit the correct settings;

esxcli storage nmp satp rule add -s "VMW_SATP_EQL" -V "EQLOGIC" -M "100E-00" -P "VMW_PSP_RR" -O "iops=3"

The above command requires a restart of the host before it takes effect. Once you have restarted, you can verify the settings using the command below:

esxcli storage nmp device list

The output will be similar to what is shown below; the values to look for are listed after it:

naa.6090a098703e5059e3e2e483c401f002
   Device Display Name: EQLOGIC iSCSI Disk (naa.6090a098703e5059e3e2e483c401f002)
   Storage Array Type: VMW_SATP_EQL
   Storage Array Type Device Config: SATP VMW_SATP_EQL does not support device configuration.
   Path Selection Policy: VMW_PSP_RR
   Path Selection Policy Device Config: {policy=iops,iops=3,bytes=10485760,useANO=0;lastPathIndex=3: NumIOsPending=0,numBytesPending=0}

VMW_SATP_EQL (indicates it's an EqualLogic volume)
VMW_PSP_RR (path selection is set to Round Robin)
policy=iops,iops=3 (shows the IOPS value has been set to 3)
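With more than a handful of volumes, checking those three values by eye gets tedious. The helper below is a sketch that reads `esxcli storage nmp device list` output on stdin and prints OK or CHECK for each EqualLogic device. It assumes the field layout matches the sample output above and that volumes appear with naa. identifiers, so treat a CHECK as a prompt to look at the full listing rather than as proof of a problem.

```shell
#!/bin/sh
# Sketch: flag EqualLogic devices that are not Round Robin with iops=3.
# Expects `esxcli storage nmp device list` output on stdin.
check_eql_nmp() {
    awk '
        # A line starting with naa. begins a new device block.
        /^ *naa\./                { dev = $1 }
        /Storage Array Type: /    { eql = ($4 == "VMW_SATP_EQL") }
        /Path Selection Policy: / { rr  = ($4 == "VMW_PSP_RR") }
        # The device config line comes last, so report here.
        /Path Selection Policy Device Config: / {
            if (eql) {
                ok = (rr && $0 ~ /policy=iops,iops=3,/)
                print dev, (ok ? "OK" : "CHECK")
            }
        }'
}

# On a host this would be used as:
#   esxcli storage nmp device list | check_eql_nmp
```

Devices claimed by a different SATP are skipped, so other arrays on the same host do not show up in the result.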

Next up is to change the default iSCSI timeout values. “By default, the MEM configuration script will make an attempt to set each of these timeout values to 60 seconds which is the recommendation.”, so we will replicate that with the command below:

esxcli iscsi adapter param set --adapter=vmhba## --key=LoginTimeout --value=60

Replace ## with that of your iSCSI software adapter, for example vmhba33.
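If you are unsure which vmhba number to use, the adapter list can be filtered for the software initiator. This is a sketch that assumes the software iSCSI adapter is reported with the driver name iscsi_vmk in the second column of `esxcli iscsi adapter list`; check the raw listing if nothing is returned.

```shell
#!/bin/sh
# Sketch: pick out software iSCSI adapters from `esxcli iscsi adapter list`.
# Assumes the driver column reads iscsi_vmk for the software initiator.
sw_iscsi_adapters() {
    awk '$2 == "iscsi_vmk" { print $1 }'
}

# On a host, the timeout could then be set and read back like this:
#   for hba in $(esxcli iscsi adapter list | sw_iscsi_adapters); do
#       esxcli iscsi adapter param set --adapter="$hba" --key=LoginTimeout --value=60
#       esxcli iscsi adapter param get --adapter="$hba" | grep LoginTimeout
#   done
```

Reading the parameter back with `param get` confirms that the new value was accepted.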

After you have done the above you will need to disable Delayed ACK. Dell don’t provide a command line for disabling this, so it needs to be done through the vCenter GUI; please refer to page 10 of the PDF for the steps.

Dell do list a few other recommendations, around such things as LRO and SIOC, but I have not applied these so I won't go into detail on what they mention.

Hopefully this post will come in handy for a few people, as I know it held us back a bit!

Please check the commands before you apply them, and work with your vendor if you are unsure what implications some of the settings listed above will have for other storage arrays, etc. 🙂

I will soon be posting about the VMkernel/NIC setup for this, as we have had some issues relating to that too, including events such as APD! I will update this post with a link once it is ready.

UPDATE: Post relating to iSCSI setup is here!