Finally upgraded vSphere ESXi to 7.0u3c (build 19193900)
Upgraded vCenter Server Appliance from 7.0.2.00100 to latest 7.0.3.0030 (vcsa 7.0u3c)
Upgraded Veeam from 22.214.171.1247 to 126.96.36.1991, which supports the latest vSphere 7.0u3
Upgraded ESXi from 6.7u1ep06 to 7.0u3c...
The ESXi upgrade was harder than I thought it would be. I had to research and document a recovery plan in case the upgrade failed, since I had never recovered a failed ESXi host before. I ended up retrying over and over, each time recovering and restoring the config back to ESXi v6.7u1 (it felt like a million times) before finally succeeding with ESXi v7.0u3c.
I did this in my off hours over the course of about two weeks. Before each day was over I'd recover back to v6.7u1, just in case I couldn't make time to get back to it for a while. I've written myself a nice book/doc with detailed notes and screenshots of the entire ordeal, including the failures, diagnosis, Supermicro support communication, and the final success of the upgrade. If anyone is interested I can post that doc here later. I'll just summarize for now.
- My first attempt was to use the VCSA (vCenter Server Appliance) Lifecycle Manager (Update Manager in older versions). I found out I cannot use that method: remediation said I cannot use VCSA to perform the ESXi upgrade if that VCSA is running as a VM on the ESXi host being upgraded.
- My second attempt was to use the ESXi shell command line. That looked a lot better, but then...
After rebooting, ESXi v7.0u3c started up (as viewed from the IPMI HTML5 console) and showed the new version on the yellow ESXi console. But then I noticed at the back of the server that the LEDs on the onboard Intel x722 10GBASE-T ports were off and the links were down. I could not get to the ESXi GUI login page, vCenter, or any VMs.
IPMI still let me interact with the yellow ESXi console. The management network showed as down, and the two NICs showed as deactivated.
I shelled into the ESXi command line from the ESXi console.
Below are two screenshots: one of the old v6.7u1 with working adapters, the other showing the adapters after the v7.0u3c upgrade.
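For anyone following along without the screenshots, the same link information can be pulled from the ESXi shell. This is only a minimal sketch: it assumes the usual column layout of `esxcli network nic list` (two header lines, link status in the fifth column), and `down_nics` is a hypothetical helper name, not something ESXi ships.

```shell
# List the names of physical NICs whose Link Status column reads "Down".
# Assumption: default `esxcli network nic list` layout, i.e. two header
# lines, then one row per NIC with Link Status as the fifth field.
down_nics() {
  awk 'NR > 2 && $5 == "Down" { print $1 }'
}

# On the host (not run here):
#   esxcli network nic list | down_nics
#   esxcli network nic get -n vmnic0   # per-NIC detail, incl. driver and firmware
```

In my case this would have printed both vmnic0 and vmnic1 after the upgrade.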
Checking the VMware Compatibility Guide, I thought the firmware version v3.33 (firmware, not the driver) might be too low.
We can see in the above chart that the v7.0u3 VMware inbox driver is 188.8.131.52, and that is also what showed as my i40en driver version after the upgrade. Notice that the firmware version for that driver in the chart is N/A (Not Applicable). But I was worried that the link being down might be caused by firmware 3.33 being too low, so I opened a ticket with Supermicro and they sent me an Intel NVM firmware update for the x722 that would bring it to firmware v4.11.
The utility sent from Supermicro is a small .zip file containing UEFI shell scripts. I was instructed to format a USB thumb drive as FAT32 and copy the scripts to it (it doesn't have to be bootable), then reboot into the BIOS, open the UEFI shell, and run the scripts per their instructions...
But that didn't solve the problem.
I then removed the i40en v184.108.40.206 driver and downgraded to driver version 220.127.116.11... then to 1.10.6... all the way down to 1.8.6. None of the drivers worked with firmware 4.11 (or with the old 3.33 firmware). The link status remained down. In other words, the adapters didn't think they had cables connected. Really weird.
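When swapping drivers like this, it helps to confirm which version is actually installed after each reboot. A rough sketch, assuming the first two columns of `esxcli software vib list` are the VIB name and its version; `vib_version` is a made-up helper for illustration.

```shell
# Print the version column for one VIB from `esxcli software vib list` output.
# Assumption: column 1 is the VIB name and column 2 is its version.
vib_version() {
  awk -v name="$1" '$1 == name { print $2 }'
}

# On the host (not run here):
#   esxcli software vib list | vib_version i40en
```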
Then, about ready to recover back to v6.7 again, I figured: why not try a fresh install instead of an upgrade? Perhaps it wasn't a driver or firmware issue. Perhaps it was something else in my v6.7 configuration that the v7 vmnic0 and vmnic1 physical adapter settings didn't like.
Instead of doing a fresh install, the fastest way to get back to vanilla ESXi default settings is to reset the system configuration from the yellow ESXi console via IPMI HTML5.
That doesn't mess up the VMs or datastores or anything; it just clears out customized configuration, such as a screwed-up network.
That fixed the link status!!! From there I decided to restore back to ESXi v6.7u1ep06 and take screenshots of ALL the ESXi GUI screens, especially the vNetworks, the vSwitches, and the physical adapter configs for vmnic0 and vmnic1 (the two onboard x722 10GBASE-T Ethernet ports).
Then I did a fresh install again. The network adapters remained up. I began to rebuild my ESXi configuration from the screenshots. This is what I found.
NETWORK DIAGNOSIS AND SOLUTION:
In the old ESXi v6.7 configuration, there were two link speeds I could choose from for the 10GBASE-T adapters:
I had set the vmnic0 and vmnic1 link speed to 10000Mbps (10Gbps), but I only have a 1000Mbps (1Gbps) physical switch. v6.7 is okay with that selection and automatically negotiates down to 1000Mbps even though I selected the 10000Mbps link speed for those ports.
*The reason I had set it to 10000Mbps in v6.7 was that I hope to eventually get a switch that can handle 10Gbps. Note: v6.7 did not have a specific "Auto-Negotiate" option.
In ESXi v7, there is a specific option for Auto-Negotiate link speed. In v6.7 there wasn't, and I had it set to 10000Mbps. When I upgrade from v6.7 to v7, that configuration is carried over, so v7 sets the physical NIC to a link speed of "10000Mbps". The i40en driver will NOT auto-negotiate down to 1Gbps because the configuration isn't explicitly set to Auto-Negotiate; it is fixed at 10000Mbps. And since that speed cannot be achieved with a 1Gb hardware switch, the driver sets the link status to DOWN.
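If the host is reachable only through the console, the link speed setting can also be changed from the ESXi shell rather than the GUI. A minimal sketch, assuming the fix is to put the NIC back on auto-negotiation and that the NIC names are vmnic0 and vmnic1 as above; `set_auto_negotiate` is a hypothetical wrapper around two real `esxcli network nic` subcommands.

```shell
# Switch one physical NIC from a forced speed back to auto-negotiation,
# then print its details so the link state can be checked.
# Assumption: the NIC name passed in (e.g. vmnic0) exists on your host.
set_auto_negotiate() {
  nic="$1"
  esxcli network nic set -n "$nic" -a && esxcli network nic get -n "$nic"
}

# On the host (not run here):
#   set_auto_negotiate vmnic0
#   set_auto_negotiate vmnic1
```

Doing this on v6.7 before the upgrade would not have been possible for me through the GUI, since that version only offered the two fixed speeds.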
Once that was figured out, I ended up with a fully working v7.0u3c. Honestly, I don't notice any difference between v6.7 and v7 yet. Was it worth it?