VM shutting down because of passthrough device


Octopuss

Active Member
Jun 30, 2019
412
62
28
Czech republic
I have a FreeNAS VM with an LSI 9217-8i card that's set to passthrough.
I've had this server for months and it's been running flawlessly until very recently, when I found out the VM had gone offline a few times.
Only today I started digging into it, found out where ESXi stores the VM logs, and I am not happy at all.

Code:
2020-03-07T17:55:40.569Z| vmx| I125: [msg.log.error.unrecoverable] VMware ESX unrecoverable error: (vmx)
2020-03-07T17:55:40.569Z| vmx| I125+ PCI passthru device 0000:02:00.0 caused an IOMMU fault type 6 at address 0xc0000000.  Powering off the virtual machine.  If the problem persists please contact the device's vendor.
I updated FreeNAS about a week ago, but I have no idea whether this started happening before that or not.
I also have absolutely no idea what to do now.

The server is running ESXi 6.7U3 or whatever the last version is.
Any ideas? :(
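
Editor's note: when the VM goes down repeatedly, it can help to pull every fault out of the VM log and see when they happened. Below is a minimal Python sketch, assuming the log uses the same line format as the two lines quoted above. The function name `find_iommu_faults` and the sample data are illustrative only and are not part of any VMware tooling.

```python
import re
from datetime import datetime

# Pattern modelled only on the fault line quoted in the post above;
# other ESXi versions may word the message differently (assumption).
FAULT_RE = re.compile(
    r"^(?P<ts>\S+)\|.*PCI passthru device (?P<dev>[0-9a-fA-F:.]+) "
    r"caused an IOMMU fault type (?P<type>\d+) "
    r"at address (?P<addr>0x[0-9a-fA-F]+)"
)


def find_iommu_faults(lines):
    """Return one dict per IOMMU fault line found in an iterable of log lines."""
    faults = []
    for line in lines:
        match = FAULT_RE.search(line.lstrip())
        if match is None:
            continue
        faults.append({
            # Timestamps in the quoted log look like 2020-03-07T17:55:40.569Z
            "time": datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S.%fZ"),
            "device": match.group("dev"),
            "fault_type": int(match.group("type")),
            "address": int(match.group("addr"), 16),
        })
    return faults


# The two lines from the post, used here as sample input.
SAMPLE_LINES = [
    "2020-03-07T17:55:40.569Z| vmx| I125: [msg.log.error.unrecoverable] "
    "VMware ESX unrecoverable error: (vmx)",
    "2020-03-07T17:55:40.569Z| vmx| I125+ PCI passthru device 0000:02:00.0 "
    "caused an IOMMU fault type 6 at address 0xc0000000.  "
    "Powering off the virtual machine.",
]

for fault in find_iommu_faults(SAMPLE_LINES):
    print(fault["time"].isoformat(), fault["device"],
          "type", fault["fault_type"], hex(fault["address"]))
```

To scan a real log, pass an open file object instead of `SAMPLE_LINES`. Comparing the resulting timestamps against update dates or periods of heavy I/O may show whether the faults line up with a particular change or with load.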
 

Rand__

Well-Known Member
Mar 6, 2014
6,634
1,767
113
I noticed yesterday that one of my FreeNAS boxes had rebooted a couple of times. The timing matches the upgrade to 11.3, or is very close to it.
I have not looked closer; I just wanted to provide a data point.
 

Octopuss

Hm, could be related, but since you're running standalone box(es) rather than a virtualized setup, who knows.
I guess I can't do anything but wait, because this is not very troubleshootable.
 

Rand__

So there are some issues reported over at the FreeNAS forum regarding some SAS3 HBAs/SSDs and reboots, related to SMART.

It's not what you have, but it might be worth checking out.
 

vangoose

Active Member
May 21, 2019
326
104
43
Canada
I saw that people have issues with SSDs on LSI. Mine is stable with spindle disks only.
 

cooldude919

New Member
Sep 8, 2016
28
13
3
39
FYI, I am running into the same issue: "caused an IOMMU fault type 6 at address 0xc0000000". It just started happening recently (the server is from 2017). I had been messing with passing a GTX 1050 Ti to another VM on the same server. I took it out and put everything back, and it's still happening. I was on 11.2 U8 from May, so my FreeNAS version didn't change recently. VMware version is 6.5, board is a Natex S2600CP. Any ideas would be great; I've tried lots of things and haven't found a solution yet.
 

cooldude919

I updated to the latest 11.3 release. I also took the LSI HBA card out, took the heatsink off, cleaned everything up, and put new Arctic Silver on. The issue seemed to be load related: pushing heavy data to it seemed to trigger it. That's why I read one of the other threads about it possibly being something with the HBA itself, or the HBA overheating. So far the issue has not happened since then; I will continue to monitor and see.
 

Octopuss

So this started happening again out of the blue after two years, and it's just a mess. The TrueNAS VM won't even stay up for a few hours now.
I don't know what to do :( Someone help!
 

Octopuss

Should I maybe buy a different HBA? There must be some sort of incompatibility, I guess, but I don't understand why I had no problems for two years.
 

Rand__

It's worth a try.
But have you made any changes recently? New TNC release? BIOS update? New drives?
 

Octopuss

I don't even know if ESXi drivers are updatable, so no.
I did a BIOS update on the server at some point recently, but I don't think it started right after that.
No idea what TNC is.

I currently have an LSI 2308-based HBA in the server (not sure how to find out the exact card model without pulling it out; it's probably an LSI 9207-something). What else can I get that's cheap enough and works the same?
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,641
2,058
113
If you're buying again, you may as well try the slightly newer SAS3 version, the LSI 3008.
 

Rand__

TNC (TrueNAS Core) is the newer version of FreeNAS; I was assuming you had updated to that by now ;)

You wrote: "I have a FreeNAS VM with LSI 9217-8i card that's set to passthrough."
So you'll probably still have that ;)

You could check if there is a newer BIOS for the card.
Getting a new one depends on what cheap is for you.

The 3008, as @T_Minus just said, is an option. You'll need new cables too then, but it's more future-proof (for a couple of years at least).

Where are you located? (to know which Ebay location to look at)
 

Octopuss

Ah, in that case yes, I updated to TrueNAS 12 a few months ago.

New cables would suck. It's not long ago that I swapped the WD Reds for SAS drives, which I got off eBay relatively cheaply, and I had to buy new cables then too.
I'm from the Czech Republic.
 

Rand__

You could try TNC 13 of course, but I am not sure it will make a difference. Newer HBA firmware might, though, unless you are already running the recommended level?

Usually there is a recommended firmware version matching the drivers in the respective release (not sure whether they alert you on a mismatch nowadays).

Otherwise, HBAs seem to be quite expensive at the moment :/