Sunday, May 31, 2015

CIMC Storage log filled with Unexpected sense errors

We have many standalone Cisco UCS C240 M3S hosts, and on all of them the CIMC storage log is being filled with 'Unexpected sense' errors every 5 minutes.

According to the release notes (http://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/release/notes/OL-32046-01.html), this happens because VMware treats all storage devices the same way, regardless of whether they are SAS disks or just enclosures. The notes list a manual workaround that stops these messages from being logged in vmkernel.log and in the CIMC storage log; it involves SSHing into the host and running an esxcli command.

Of course, I don't want to turn on SSH and remote into each host to run the commands one at a time - PowerCLI to the rescue!

These steps will "disconnect" the enclosure from the viewpoint of ESXi. The parameter set is correct for ESXi 5.5; you may need to adjust the parameters for other ESXi versions.

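# Connect directly to the host
connect-viserver myhost.mylab.com
$esxcli = get-esxcli -vmhost myhost.mylab.com
# Find the enclosure's device name (assumes the host reports a single enclosure)
$device = ($esxcli.storage.core.device.list() | where-object {$_.DeviceType -match "enclosure"}).device
# Set the enclosure's state to "off"; the argument order is the one for ESXi 5.5
$esxcli.storage.core.device.set($null,$device,$false,$null,$false,$null,$null,$null,"off")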

As always, watch the line breaks when you copy/paste. From here, it should be easy to script an automated solution for an entire environment. Note that if you have lockdown mode enabled there may be other steps required to allow you to connect via get-esxcli.
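
Extending this to a whole environment could look something like the sketch below. It assumes you are connected to a vCenter Server rather than a single host, and it reuses the ESXi 5.5 argument list from above unchanged. The function names Select-EnclosureDevice and Set-EnclosureDeviceOff are made up for this example. A host might report more than one enclosure, so the devices are handled one at a time.

```powershell
# Hypothetical helper: returns the device names of all enclosure entries
# from the output of $esxcli.storage.core.device.list().
function Select-EnclosureDevice {
    param([Parameter(ValueFromPipeline = $true)] $DeviceEntry)
    process {
        if ($DeviceEntry.DeviceType -match 'enclosure') {
            $DeviceEntry.Device
        }
    }
}

# Hypothetical wrapper: needs PowerCLI and an existing Connect-VIServer session.
function Set-EnclosureDeviceOff {
    param([Parameter(Mandatory = $true)] $VMHost)
    foreach ($h in $VMHost) {
        $esxcli = Get-EsxCli -VMHost $h
        $devices = $esxcli.storage.core.device.list() | Select-EnclosureDevice
        foreach ($device in $devices) {
            # Same positional arguments as the single-host example (ESXi 5.5 only)
            $esxcli.storage.core.device.set($null, $device, $false, $null, $false, $null, $null, $null, 'off')
        }
    }
}

# Usage (not run here): Set-EnclosureDeviceOff -VMHost (Get-VMHost)
```

Try it against a single test host first, since the positional arguments are specific to the ESXi version.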

Saturday, May 30, 2015

Get boot time of ESXi host in local time with PowerCLI

A co-worker came to me yesterday with a request that seemed quite easy but came with an unexpected twist. He needed the boot time for a group of ESXi hosts. Finding the code for this wasn't hard with a quick Google search.

For a single host:
PowerCLI C:\script> (get-vmhost esxhost01.example.local).extensiondata.runtime.boottime

Tuesday, March 31, 2015 1:13:07 AM

Using select makes it easier to report on a group of hosts:

PowerCLI C:\script> get-vmhost | select name,@{Name="BootTime";Expression={$_.extensiondata.runtime.boottime}}

Name                                          BootTime              
----                                          --------          
esxhost01.example.local                       5/1/2015 2:57:32 PM 
esxhost02.example.local                       5/29/2015 1:54:30 PM


Piece of cake! Except this was UTC time, and it makes more sense for us to report our boot time in local time. A [datetime] object in PowerShell has a ToUniversalTime() method, and its counterpart ToLocalTime() performs the reverse conversion for values that are marked as UTC. The approach I used here works the offset out by hand instead.

Use New-TimeSpan to determine your offset from UTC, then apply it with the AddHours() method of the [datetime] object. Your offset from UTC can be calculated as:

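# Capture the time once so both ends of the span use the same instant; totalhours keeps fractional offsets
$now = get-date; $offset = (new-timespan -start $now.touniversaltime() -end $now).totalhours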
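
One thing to watch when reading an offset from a [timespan]: Hours is only the whole-hours component, while TotalHours is the full span expressed in hours. A small illustration with made-up values (no PowerCLI needed):

```powershell
# Made-up offset of +5:30 for illustration
$span = [timespan]'05:30:00'
$span.Hours        # 5   (whole-hours component only)
$span.TotalHours   # 5.5 (full span in hours)

# A span just short of -5 hours, as can happen when Get-Date is called
# twice and the clock ticks between the two calls
$almost = [timespan]'-04:59:59.999'
$almost.Hours      # -4, not -5
```

So an offset built from Hours can be wrong for time zones with a half-hour offset, and can occasionally be an hour off for zones behind UTC.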

For one host:
$now = get-date; (get-vmhost esxhost01.example.local).extensiondata.runtime.boottime.addhours((new-timespan -start $now.touniversaltime() -end $now).totalhours)

For a group of hosts:
$now = get-date; get-vmhost | select name,@{Name="BootTime";Expression={$_.extensiondata.runtime.boottime.addhours((new-timespan -start $now.touniversaltime() -end $now).totalhours)}}

Sorry for the line breaks :)
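
For comparison, [datetime] also has a ToLocalTime() method, which applies the time zone rules in effect on the date being converted (including daylight saving time) rather than today's offset. The sketch below assumes the boot time is a UTC value. ConvertTo-LocalBootTime is a name invented for this example, and the sample timestamp is the one from the single-host output above.

```powershell
# Hypothetical helper: converts a UTC boot time to the local time zone.
# ToLocalTime() treats values whose Kind is Utc or Unspecified as UTC.
function ConvertTo-LocalBootTime {
    param([Parameter(Mandatory = $true)] [datetime] $BootTimeUtc)
    $BootTimeUtc.ToLocalTime()
}

# Stand-in for (Get-VMHost ...).ExtensionData.Runtime.BootTime
$sampleUtc = [datetime]::SpecifyKind([datetime]'2015-03-31T01:13:07', [System.DateTimeKind]::Utc)
$sampleLocal = ConvertTo-LocalBootTime -BootTimeUtc $sampleUtc
$sampleLocal

# With PowerCLI connected, the same call fits in a calculated property (not run here):
# Get-VMHost | Select-Object Name, @{Name='BootTime'; Expression={ ConvertTo-LocalBootTime $_.ExtensionData.Runtime.BootTime }}
```

The printed value depends on the time zone of the machine running the script.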

Tuesday, May 12, 2015

One-Liner to get the VMs restarted by HA after an ESXi host PSOD

PSODs suck.  We know that.  We've been getting the occasional PSOD after going to 5.5 U2, and when a prod host crashes the first thing everyone wants to know is: Which VMs were affected?

I put together a one-liner to get this information based on the event that is recorded on the VM when HA restarts it on another host.

get-vmhost ProdHost01.example.local | get-cluster | get-vm | Get-VIEvent -Start (Get-Date).AddHours(-24) -types warning | where-object {$_.fullformattedmessage -match "restarted"} | select objectname,createdtime,fullformattedmessage

Breakdown:
get-vmhost : gets the host object for the host that went down with a PSOD
get-cluster : gets the cluster that host is in
get-vm : gets all the VMs in that cluster
Get-VIEvent -Start (Get-Date).AddHours(-24) -types warning : gets all events of type 'warning' from the past 24 hours (adjust the number according to how long ago the PSOD was)
where-object {$_.fullformattedmessage -match "restarted"} : keeps only the events whose message matches 'restarted'
select objectname,createdtime,fullformattedmessage : limits the output to those three properties

I'm working on turning this into an actual function that would save some typing.
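
A rough sketch of what such a function might look like follows. The names Select-HARestartEvent and Get-HARestartedVM and the -Hours parameter are invented here; the pipeline inside is the one-liner from above, so it needs PowerCLI and a vCenter connection.

```powershell
# Hypothetical filter: keeps events whose message mentions a restart and
# returns the same three properties as the one-liner.
function Select-HARestartEvent {
    param([Parameter(ValueFromPipeline = $true)] $VIEvent)
    process {
        if ($VIEvent.FullFormattedMessage -match 'restarted') {
            $VIEvent | Select-Object ObjectName, CreatedTime, FullFormattedMessage
        }
    }
}

# Hypothetical wrapper around the one-liner; -Hours sets how far back to look.
function Get-HARestartedVM {
    param(
        [Parameter(Mandatory = $true)] [string] $VMHostName,
        [int] $Hours = 24
    )
    Get-VMHost -Name $VMHostName | Get-Cluster | Get-VM |
        Get-VIEvent -Start (Get-Date).AddHours(-$Hours) -Types Warning |
        Select-HARestartEvent
}

# Usage (not run here): Get-HARestartedVM -VMHostName ProdHost01.example.local -Hours 48
```

Get-VIEvent caps the number of events it returns unless -MaxSamples is raised, so on a busy cluster the wrapper may need that parameter added.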