Friday, January 31, 2014

The case of the disappearing LUNs!

We ran into an interesting situation at work this week. For one cluster of ESXi hosts, LUNs were disappearing seemingly at random. Now, we had been using this cluster as a 'Jump' cluster to facilitate moving from one environment to another, so it had lots and lots of LUNs attached (162, in fact).

We found that every time we ran a 'Rescan' for new storage, some LUNs would be discovered and others would disappear. It seemed pretty obvious that we were running into some sort of storage maximum, but we were well below the published ESXi maximums of 256 attached LUNs and 1024 physical paths. We had previously run into another issue with the VMFS heap, so every host in this cluster already had VMFS3.MaxHeapSizeMB set to its maximum. A real stumper.

One call to VMware later, we found that we were, in fact, running into a heap issue, but it was not the VMFS heap that was filling. The lvmdriver also has a heap setting, which by default is 42000KB (~42MB). This lvmdriver heap was completely full. From the hosts' consoles, this was easily fixed by running an esxcli-module command. That's great, but it's not going to work in an environment with hundreds of hosts. PowerCLI to the rescue!!

For today's fun, I wrote a quick function to get the lvmdriver heap size for ESXi 5.0 hosts.  Since I know the default from the VMware case, if the option in the lvmdriver is not explicitly set then the function returns the default value. 

function Get-lvmDriverHeapSize {
<#
.SYNOPSIS
 Gets the lvmDriverHeapSize configured for a 5.0 host
.DESCRIPTION
 This function will get the lvmDriverHeapSize for a 5.0 host if it
 has been configured.
.NOTES
 Author: Cheryl L. Barnett
.PARAMETER VMhost
 The host object to check
.EXAMPLE
 PS> Get-lvmDriverHeapSize -VMhost (Get-VMHost Test01.fqdn.local)
.EXAMPLE
 PS> Get-VMHost Test01.fqdn.local | Get-lvmDriverHeapSize
#>
  param(
      [parameter(valuefrompipeline = $true, mandatory = $true,
     HelpMessage = "Enter a VMHost entity")]
    [VMware.VimAutomation.ViCore.Impl.V1.Inventory.VMHostImpl] `
        $VMhost)
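  process{
    #Note that if multiple options are set then this entire function breaks
    #Use $VMhost rather than $_ so that -VMhost and pipeline input both work.
    #-like "5.0*" avoids the regex match "5.0" also matching versions such as 5.5.0
    if ($VMhost.ExtensionData.Config.Product.Version -notlike "5.0*")
        {
        Throw "$($VMhost) is not an ESXi 5.0 host! Please use this function for 5.0 hosts only!"
        }
    #The [string] cast turns an empty options value into "" instead of failing
    $maxHeapSizeString = [string](Get-VMHostModule -VMHost $VMhost -Name lvmdriver).Options
    if ($maxHeapSizeString -match "maxHeapSizeKB=")
        {
        $maxHeapSizeKB = [int]($maxHeapSizeString.Split("="))[1]
        }
    else
        {
        #Option not explicitly set, so return the default of 42000KB
        $maxHeapSizeKB = 42000
        }
    $maxHeapSizeKB
    }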
}
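
For what it's worth, here is a rough sketch of how a sweep could be wrapped up. The Get-lvmDriverHeapReport name is made up for illustration, and the sketch assumes you already have a Connect-VIServer session and that Get-lvmDriverHeapSize is loaded. It skips anything that isn't 5.0 so the Throw never stops the pipeline.

```powershell
# Hypothetical helper (not from the original function): builds a small
# per-host report by calling Get-lvmDriverHeapSize for each 5.0 host.
function Get-lvmDriverHeapReport {
  param(
    [parameter(ValueFromPipeline = $true, Mandatory = $true)]
    $VMhost)
  process {
    # Skip non-5.0 hosts so the Throw in Get-lvmDriverHeapSize never fires
    if ($VMhost.Version -notlike "5.0*") { return }
    New-Object PSObject -Property @{
      Name       = $VMhost.Name
      HeapSizeKB = Get-lvmDriverHeapSize -VMhost $VMhost
    }
  }
}

# Example sweep (assumes an existing Connect-VIServer session):
# Get-VMHost | Get-lvmDriverHeapReport | Sort-Object HeapSizeKB | Format-Table Name, HeapSizeKB -AutoSize
```

Returning objects rather than bare numbers keeps the host name next to its heap size, which makes it easier to spot hosts still sitting at the default.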


So, with this function you can sweep through your environment relatively quickly and get the configured lvmdriver heap size. Note the caveat in the code: if the lvmdriver has multiple options set then this code will break, because it splits the whole options string on "=" and tries to convert the second piece to an integer. That's not an issue I want to tackle right now, since it's way beyond the scope of what I need to do to fix my problem. Also, watch the line breaks there if you copy this. I think I put all the backticks in the right place, but no guarantees.
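
If you do need to handle multiple options, one possible approach is to capture just the digits after maxHeapSizeKB= with a regex instead of splitting on "=". This is only a sketch and has not been run against real hosts. The Get-HeapSizeFromOptions name and the otherOption=1 value in the usage line are made up for illustration, and it assumes the options are returned as a single string.

```powershell
# Hypothetical helper (not part of the original function): pulls maxHeapSizeKB
# out of an options string even when other options are present.
function Get-HeapSizeFromOptions {
  param(
    [string]$Options,
    [int]$DefaultKB = 42000)   # default quoted in the VMware case
  # Capture only the digits that follow maxHeapSizeKB=
  if ($Options -match 'maxHeapSizeKB=(\d+)') {
    [int]$Matches[1]
  }
  else {
    # Option not present, so fall back to the default
    $DefaultKB
  }
}

# Usage with a made-up options string:
# Get-HeapSizeFromOptions -Options "maxHeapSizeKB=64000 otherOption=1"
```

The same pattern could replace the Split("=") line inside Get-lvmDriverHeapSize if multiple options ever turn up.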

If all goes well, next week I'll have a function to set the lvmdriver heap size.