VMware Mega Build

loopback14

New Member
Jan 2, 2018
8
0
1
36
Hey man, did you ever get this thing running? I am planning a similar build with an X11SPL-F for my networking lab (I presume you are too, for the CCIE?)...
 

Kryax

Member
Oct 14, 2017
44
1
8
loopback14 said: Hey man, did you ever get this thing running? I am planning a similar build with an X11SPL-F for my networking lab (I presume you are too, for the CCIE?)...
Unfortunately the processor has been backordered for a while (since mid-December). I would have cancelled and ordered elsewhere, but I was promised twice that it would be here in early January. I finally got a shipment confirmation and it should be here Monday. I have everything else, though, with photos below:

IMG_20180114_162812.jpg IMG_20180114_162843.jpg IMG_20180114_162858.jpg

It might be a few more weeks before I get everything operational, though, and it will be in stages. I have to start with hardware testing, then set up ESXi, and then the FreeNAS VM and storage before I get to other things such as Cisco VIRL. I have been getting by in my studies on my current desktop with Cisco VIRL and GNS3. Once I start studying for my CCIE lab I will be using the new VIRL setup quite a bit.
 

loopback14

Thanks for the reply Kryax! That's a beastly setup you got there :) Can you enable PM on your profile so I don't spam the forum with the networking CCIE stuff and lab setups...? :)
 

K D

Well-Known Member
Dec 24, 2016
1,439
320
83
30041
Any questions you ask will definitely not be spam. Having them out here in the forums will add to the vast knowledge already here. I for one learn a lot of stuff just by browsing through the forums.
 

Kryax

So thought I would provide an update along with some of the current issues I am running into:

- Initial hardware testing has mostly been successful (CPU, memory, motherboard, chassis).
- System boots and recognizes almost all of the other hardware (2 x Intel 900P, Intel NIC).
- Ran into an issue where the system went into a constant boot loop when I installed the LSI 9300-8i.
  A. Could not get IPMI or even a desktop display to show the bootup screen for the LSI.
  B. Installed the LSI 9300-8i in a regular desktop; the card was recognized and booted into the LSI menu.
  C. Changed the BIOS settings on the Supermicro X11SPL-F so the PCI slot was set to EFI.
  D. The machine then booted fine, but the IPMI KVM did not show a boot option for the LSI BIOS.
- Successfully installed ESXi 6.5u1.
- Installed the vCenter Server VM.
- Set up the FreeNAS VM.
  A. Successfully passed through the LSI 9300-8i.
  B. Booted the FreeNAS VM and it recognized all 24 drives.
  C. Created 1 pool with 3 vdevs (8 drives per vdev).
- Started setup of the FreeNAS share.
- Began transferring data from the network to the new share for testing.
  A. Started getting UDMA errors for random drives as I was writing data to the share (see screenshot).

The last step is where I am stuck: I am trying to figure out what is causing the errors. I tried the following to troubleshoot:

1. Connected just 1 cable to start, and then 2 cables, between J49 on the BPN-SAS3-846EL backplane and the LSI 9300-8i.
2. When I did the passthrough of the LSI 9300, I made sure that memory was reserved.
3. Rebuilt the storage pool.

Things I am considering trying, or that could be a limitation:

1. Faulty backplane, cables, or LSI 9300.
2. Still need to perform SMART tests and burn-in of the drives.
3. Need to adjust settings in the LSI 9300 BIOS?
4. Hardware incompatibility between the 9300-8i and a BPN-SAS3-846EL backplane?
5. FreeNAS tuning?
6. Oversaturating the SAS cable bandwidth with the pool speed and causing dropouts?

If anyone has any suggestions it would be much appreciated.
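
On the cable saturation question, a rough back-of-the-envelope check may help narrow things down. The sketch below is only an estimate under stated assumptions: the 200 MB/s per-drive figure is a guess for a 7200 rpm disk and not a measurement from this build, and the HBA-to-backplane link rate is not confirmed in the thread, so both 6 Gb/s and 12 Gb/s lanes are shown. The function names are made up for illustration.

```python
# Rough SAS link budget. All per-drive and per-lane figures are assumptions
# for illustration, not measurements from this build.

LANES_PER_CABLE = 4            # one SFF-8643 cable carries four SAS lanes
ENCODING_EFFICIENCY = 8 / 10   # 8b/10b line coding used by 6 and 12 Gb/s SAS


def cable_bandwidth_mb_s(lane_gbit_s, cables=1):
    """Approximate payload bandwidth in MB/s for `cables` four-lane cables."""
    lane_mb_s = lane_gbit_s * 1000 * ENCODING_EFFICIENCY / 8
    return lane_mb_s * LANES_PER_CABLE * cables


def pool_demand_mb_s(drives, per_drive_mb_s):
    """Worst case: every drive streaming at full sequential speed at once."""
    return drives * per_drive_mb_s


if __name__ == "__main__":
    demand = pool_demand_mb_s(drives=24, per_drive_mb_s=200.0)  # 200 MB/s assumed
    for gbit in (6.0, 12.0):
        for cables in (1, 2):
            link = cable_bandwidth_mb_s(gbit, cables)
            print(f"{gbit} Gb/s lanes, {cables} cable(s): "
                  f"{link:.0f} MB/s link vs {demand:.0f} MB/s worst-case demand")
```

Under these assumptions one four-lane cable works out to roughly 2400 MB/s at 6 Gb/s and 4800 MB/s at 12 Gb/s. A test copy arriving over the network is likely to sit well below either figure, which would point away from saturation as the cause, though that depends on the real network and drive speeds.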
 

azev

Well-Known Member
Jan 18, 2013
768
251
63
I would try replacing the cable and see if that helps. Another thing to look into is the firmware version on your 9300; I think P20 is the recommended and supported version for FreeNAS.
 

Kryax

azev said: I would try replacing the cable and see if that helps. Another thing to look into is the firmware version on your 9300; I think P20 is the recommended and supported version for FreeNAS.

I will mess around with updating the firmware and ordering some other-branded cables to test this weekend. The P20 you reference was probably the version for older cards, though. The LSI website shows P15 as the newest firmware for the SAS 9300-8i Host Bus Adapter. What version are others running who use FreeNAS with a 3008-based controller?

I also found this post (message #5) with possible simple instructions, albeit for firmware P9:
SAS3 flash
I wanted to verify that these instructions are still valid if I just adjust the PXX firmware number based on what people recommend. Instructions below:

1. Search Support Documents and Downloads for 12 Gb/s SAS Host Bus Adapters > SAS 9300-8i Host Bus Adapter > Firmware.
2. Download 9300_8i_Package_P9_IR_IT_Firmware_BIOS_for_MSDOS_Windows.
3. Download Installer_P9_for_UEFI.
4. Format USB to MS-DOS (FAT).
5. Copy 9300_8i_Package_P9_IR_IT_Firmware_BIOS_for_MSDOS_Windows/Firmware/SAS9300_8i_IT/SAS9300_8i_IT.bin into USB.
6. Copy Installer_P9_for_UEFI/sas3flash_udk_uefi_x64_rel/sas3flash.efi into USB.
7. Attach USB to server and boot up, press F11 to enter the boot menu, and boot into the UEFI Shell.
8. Clean flash (erase everything except manufacturing area) to avoid “Cannot downgrade NVDATA version” with command:
sas3flash.efi -o -e 6
9. Finally, do the flash with command sas3flash.efi -fwall SAS9300_8i_IT.bin
 

Kryax

Decided to move forward with both the GNS3 and VIRL installations until my new cables come in to troubleshoot FreeNAS. I set up both; here are some screenshots of Cisco VIRL running 20 x CSRv routers. Quite the RAM hog!

Cisco VIRL Maestro.jpg Cisco VIRL Resources.jpg Cisco VIRL SuperPutty.jpg
 

Kryax

So I got back to working on the FreeNAS VM this weekend. Updated the firmware on the LSI 9300-8i to P15 and replaced the cables with a different brand. Booted up FreeNAS, did another test transfer, and was getting the UDMA/SCSI dropout errors even more frequently. Did more research and saw that, since I am connecting SATA 6Gb/s drives to the backplane, the cable spec is tighter than for SAS 12Gb/s drives. The recommendation was cabling of 50cm or less; I went back and checked what I originally ordered, which were 60cm cables. Just put in an order for 25cm cables, which should be long enough since right now I have too much slack in the SAS cable. I will try the cable replacement sometime next week, and I hope it fixes the issue.

Still unsure about connecting 1 or 2 cables to the J49 jumper on the backplane of my chassis. From what I read in the documentation, this is for redundancy and not for aggregating bandwidth. Can anyone tell me, for the BPN-SAS3-846EL1 backplane, which jumper to use (J49 vs J50) and whether I should use 1 or 2 cables to connect to the 9300-8i? From what I understand, the J50 jumper is used to connect to another external backplane, but I could be misunderstanding that.
 

Kryax

New cables did not fix the issue. I read up on the FreeNAS forums a bit more but did not really come across anyone who had the same type of issue with HGST SATA drives (I did see a few posts about Seagate 10TB enterprise and IronWolf drives causing sync cache and other errors). Below are some more of the outputs:

# sas3flash -list
Code:
Avago Technologies SAS3 Flash Utility
Version 15.00.00.00 (2016.11.17)
Copyright 2008-2016 Avago Technologies. All rights reserved.

        Adapter Selected is a Avago SAS: SAS3008(C0)

        Controller Number              : 0
        Controller                     : SAS3008(C0)
        PCI Address                    : 00:03:00:00
        SAS Address                    : 500605b-0-0898-a680
        NVDATA Version (Default)       : 0e.00.00.07
        NVDATA Version (Persistent)    : 0e.00.00.07
        Firmware Product ID            : 0x2221 (IT)
        Firmware Version               : 15.00.02.00
        NVDATA Vendor                  : LSI
        NVDATA Product ID              : SAS9300-8i
        BIOS Version                   : N/A
        UEFI BSD Version               : N/A
        FCODE Version                  : N/A
        Board Name                     : SAS9300-8i
        Board Assembly                 : N/A
        Board Tracer Number            : N/A

        Finished Processing Commands Successfully.
        Exiting SAS3Flash.
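
For anyone comparing firmware levels across several cards, the listing above is regular enough to parse. The following is a minimal sketch, not a supported tool: the function names are hypothetical, and reading the leading number of the version string as the "P" phase (15 for P15) is an assumption based only on the output shown here.

```python
import re

# A few lines copied from the `sas3flash -list` output above.
SAMPLE = """\
        Adapter Selected is a Avago SAS: SAS3008(C0)

        Controller                     : SAS3008(C0)
        PCI Address                    : 00:03:00:00
        Firmware Product ID            : 0x2221 (IT)
        Firmware Version               : 15.00.02.00
        BIOS Version                   : N/A
"""

# "Key   : Value" with whitespace on both sides of the colon.
FIELD = re.compile(r"^\s*(.+?)\s+:\s+(.*\S)\s*$")


def parse_listing(text):
    """Collect 'Key : Value' lines into a dict; banner lines are skipped."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def firmware_phase(fields):
    """Assumption: the leading number of the version is the phase (15 -> P15)."""
    return int(fields["Firmware Version"].split(".")[0])


def is_it_mode(fields):
    """True when the product ID line carries the (IT) tag."""
    return "(IT)" in fields.get("Firmware Product ID", "")


if __name__ == "__main__":
    info = parse_listing(SAMPLE)
    print("phase:", firmware_phase(info), "IT mode:", is_it_mode(info))
```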
# smartctl -a /dev/da1
Code:
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     HGST HDN728080ALE604
Serial Number:   
LU WWN Device Id: 5 000cca 261cceb34
Firmware Version: A4GNW91X
User Capacity:    8,001,563,222,016 bytes [8.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Feb 12 20:10:24 2018 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  101) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (1108) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   132   132   054    Pre-fail  Offline      -       112
  3 Spin_Up_Time            0x0007   146   146   024    Pre-fail  Always       -       451 (Average 452)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       11
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   128   128   020    Pre-fail  Offline      -       18
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       252
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       11
 22 Unknown_Attribute       0x0023   100   100   025    Pre-fail  Always       -       100
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       29
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       29
194 Temperature_Celsius     0x0002   166   166   000    Old_age   Always       -       36 (Min/Max 22/37)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       6

SMART Error Log Version: 1
ATA Error Count: 6 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 6 occurred at disk power-on lifetime: 247 hours (10 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 00 20 60 26 44 40 00      00:51:27.143  WRITE FPDMA QUEUED
  2f 00 01 10 00 00 00 00      00:51:27.143  READ LOG EXT
  61 50 30 60 28 44 40 00      00:51:27.141  WRITE FPDMA QUEUED
  61 00 28 60 27 44 40 00      00:51:27.141  WRITE FPDMA QUEUED
  61 00 18 60 25 44 40 00      00:51:27.141  WRITE FPDMA QUEUED

Error 5 occurred at disk power-on lifetime: 247 hours (10 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 20 00 e8 21 40 40 00      00:38:37.673  WRITE FPDMA QUEUED
  2f 00 01 10 00 00 00 00      00:38:37.673  READ LOG EXT
  61 08 00 e0 21 40 40 00      00:38:37.672  WRITE FPDMA QUEUED
  61 08 00 d8 21 40 40 00      00:38:37.672  WRITE FPDMA QUEUED
  ea 00 00 00 00 00 00 00      00:38:33.575  FLUSH CACHE EXT

Error 4 occurred at disk power-on lifetime: 247 hours (10 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 58 00 18 7b 40 40 00      00:37:55.387  WRITE FPDMA QUEUED
  2f 00 01 10 00 00 00 00      00:37:55.387  READ LOG EXT
  61 58 00 c0 7a 40 40 00      00:37:55.386  WRITE FPDMA QUEUED
  61 28 00 18 22 40 40 00      00:37:55.386  WRITE FPDMA QUEUED
  61 58 00 68 7a 40 40 00      00:37:55.386  WRITE FPDMA QUEUED

Error 3 occurred at disk power-on lifetime: 245 hours (10 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 00 00 a0 d7 43 40 00      00:11:05.278  WRITE FPDMA QUEUED
  2f 00 01 10 00 00 00 00      00:11:05.278  READ LOG EXT
  61 c8 10 a0 d9 43 40 00      00:11:05.277  WRITE FPDMA QUEUED
  61 00 08 a0 d8 43 40 00      00:11:05.277  WRITE FPDMA QUEUED
  61 e8 10 b8 d6 43 40 00      00:11:05.275  WRITE FPDMA QUEUED

Error 2 occurred at disk power-on lifetime: 245 hours (10 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 80 00 a0 e0 40 40 00      00:10:45.710  WRITE FPDMA QUEUED
  2f 00 01 10 00 00 00 00      00:10:45.710  READ LOG EXT
  61 20 00 80 e0 40 40 00      00:10:45.709  WRITE FPDMA QUEUED
  61 c8 08 b8 df 40 40 00      00:10:45.699  WRITE FPDMA QUEUED
  61 00 00 b8 de 40 40 00      00:10:45.699  WRITE FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%       134         -
# 2  Extended offline    Completed without error       00%       111         -
# 3  Extended offline    Aborted by host               60%        93         -
# 4  Short offline       Completed without error       00%        84         -
# 5  Short offline       Completed without error       00%        60         -
# 6  Short offline       Completed without error       00%        36         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
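
One way to read the SMART output above: the media-related counters (IDs 5, 197, 198) are all zero while UDMA_CRC_Error_Count (ID 199) is 6, and every logged error has error register 0x84, which smartctl prints as ICRC, ABRT. That pattern is commonly taken to point at the link (cable, backplane, connector) and not the platters, though it is not proof. The sketch below pulls those values out of `smartctl -a` text. The helper names are made up, and only the error-register bits whose names are well established are labeled.

```python
import re

# Rows copied from the attribute table above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       6
"""

# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw value.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+\d+\s+\d+\s+\d+"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)"
)

# ATA error register bits; unnamed bits are reported generically.
ERROR_BITS = {7: "ICRC", 6: "UNC", 4: "IDNF", 2: "ABRT"}


def parse_raw_values(smartctl_text):
    """Map attribute ID -> (name, leading integer of RAW_VALUE)."""
    raw = {}
    for line in smartctl_text.splitlines():
        match = ROW.match(line)
        if match:
            raw[int(match.group(1))] = (match.group(2), int(match.group(3)))
    return raw


def decode_error_register(value):
    """List the flags set in an ATA error register byte, high bit first."""
    return [ERROR_BITS.get(bit, f"bit{bit}")
            for bit in range(7, -1, -1) if value & (1 << bit)]


def looks_like_link_problem(raw):
    """Heuristic only: clean media counters plus a nonzero CRC count."""
    media_clean = all(raw.get(i, ("", 0))[1] == 0 for i in (5, 197, 198))
    return media_clean and raw.get(199, ("", 0))[1] > 0


if __name__ == "__main__":
    values = parse_raw_values(SAMPLE)
    print(values[199], decode_error_register(0x84), looks_like_link_problem(values))
```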
Errors when doing a 1 x 3 GB file transfer from a PC to the FreeNAS VM:
Code:
(da11:mpr0:0:18:0): WRITE(10). CDB: 2a 00 40 41 e1 20 00 01 00 00 length 131072 SMID 1013 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
(da11:mpr0:0:18:0): WRITE(10). CDB: 2a 00 40 41 e4 20 00 00 38 00 length 28672 SMID 282 terminated ioc 804b loginfo 31120303 (da11:mpr0:0:18:0): WRITE(10). CDB: 2a 00 40 41 e1 20 00 01 00 00
scsi 0 state c xfer 0
(da11:mpr0:0:18:0): WRITE(10). CDB: 2a 00 40 41 e3 20 00 01 00 00 length 131072 SMID 330 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
(da11:mpr0:0:18:0): WRITE(10). CDB: 2a 00 40 41 e2 20 00 01 00 00 length 131072 SMID 989 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
(da11:mpr0:0:18:0): CAM status: CCB request completed with an error
(da11:mpr0:0:18:0): Retrying command
(da11:mpr0:0:18:0): WRITE(10). CDB: 2a 00 40 41 e4 20 00 00 38 00
(da11:mpr0:0:18:0): CAM status: CCB request completed with an error
(da11:mpr0:0:18:0): Retrying command
(da11:mpr0:0:18:0): WRITE(10). CDB: 2a 00 40 41 e3 20 00 01 00 00
(da11:mpr0:0:18:0): CAM status: CCB request completed with an error
(da11:mpr0:0:18:0): Retrying command
(da11:mpr0:0:18:0): WRITE(10). CDB: 2a 00 40 41 e2 20 00 01 00 00
(da11:mpr0:0:18:0): CAM status: CCB request completed with an error
(da11:mpr0:0:18:0): Retrying command
(da11:mpr0:0:18:0): WRITE(10). CDB: 2a 00 40 41 e1 20 00 01 00 00
(da11:mpr0:0:18:0): CAM status: SCSI Status Error
(da11:mpr0:0:18:0): SCSI status: Check Condition
(da11:mpr0:0:18:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da11:mpr0:0:18:0): Retrying command (per sense data)
(da20:mpr0:0:27:0): WRITE(10). CDB: 2a 00 40 43 5e 18 00 00 b8 00 length 94208 SMID 292 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
(da20:mpr0:0:27:0): WRITE(10). CDB: 2a 00 40 43 5d 18 00 01 00 00 length 131072 SMID 434 terminated ioc 804b loginfo 31120303(da20:mpr0:0:27:0): WRITE(10). CDB: 2a 00 40 43 5e 18 00 00 b8 00
 scsi 0 state c xfer 0
(da20:mpr0:0:27:0): CAM status: CCB request completed with an error
(da20:mpr0:0:27:0): Retrying command
(da20:mpr0:0:27:0): WRITE(10). CDB: 2a 00 40 43 5d 18 00 01 00 00
(da20:mpr0:0:27:0): CAM status: CCB request completed with an error
(da20:mpr0:0:27:0): Retrying command
(da20:mpr0:0:27:0): WRITE(10). CDB: 2a 00 40 43 5d 18 00 01 00 00
(da20:mpr0:0:27:0): CAM status: SCSI Status Error
(da20:mpr0:0:27:0): SCSI status: Check Condition
(da20:mpr0:0:27:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da20:mpr0:0:27:0): Retrying command (per sense data)
(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 16 48 00 01 00 00 length 131072 SMID 333 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 19 48 00 01 00 00 length 131072 SMID 297 terminated ioc 804b loginfo 31120303(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 16 48 00 01 00 00
 scsi 0 state c xfer 0
(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 1a 48 00 00 f0 00 length 122880 SMID 624 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 15 48 00 01 00 00 length 131072 SMID 663 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 14 48 00 01 00 00 length 131072 SMID 337 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 13 48 00 01 00 00 length 131072 SMID 806 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 17 48 00 01 00 00 length 131072 SMID 251 terminated ioc 804b loginfo 31120303(da13:mpr0:0:20:0): CAM status: CCB request completed with an error
(da13:mpr0:0:20:0): Retrying command
(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 19 48 00 01 00 00
(da13:mpr0:0:20:0): CAM status: CCB request completed with an error
(da13:mpr0:0:20:0): Retrying command
(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 1a 48 00 00 f0 00
 scsi 0 state c xfer 0
(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 18 48 00 01 00 00 length 131072 SMID 311 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
(da13:mpr0:0:20:0): CAM status: CCB request completed with an error
(da13:mpr0:0:20:0): Retrying command
(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 15 48 00 01 00 00
(da13:mpr0:0:20:0): CAM status: CCB request completed with an error
(da13:mpr0:0:20:0): Retrying command
(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 14 48 00 01 00 00
(da13:mpr0:0:20:0): CAM status: CCB request completed with an error
(da13:mpr0:0:20:0): Retrying command
(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 13 48 00 01 00 00
(da13:mpr0:0:20:0): CAM status: CCB request completed with an error
(da13:mpr0:0:20:0): Retrying command
(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 17 48 00 01 00 00
(da13:mpr0:0:20:0): CAM status: CCB request completed with an error
(da13:mpr0:0:20:0): Retrying command
(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 18 48 00 01 00 00
(da13:mpr0:0:20:0): CAM status: CCB request completed with an error
(da13:mpr0:0:20:0): Retrying command
(da3:mpr0:0:10:0): WRITE(10). CDB: 2a 00 40 47 42 00 00 01 00 00 length 131072 SMID 386 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
(da3:mpr0:0:10:0): WRITE(10). CDB: 2a 00 40 47 43 00 00 00 e8 00 length 118784 SMID 961 terminated ioc 804b loginfo 31120303 (da3:mpr0:0:10:0): WRITE(10). CDB: 2a 00 40 47 42 00 00 01 00 00
scsi 0 state c xfer 0
(da3:mpr0:0:10:0): CAM status: CCB request completed with an error
(da3:mpr0:0:10:0): Retrying command
(da3:mpr0:0:10:0): WRITE(10). CDB: 2a 00 40 47 43 00 00 00 e8 00
(da3:mpr0:0:10:0): CAM status: CCB request completed with an error
(da3:mpr0:0:10:0): Retrying command
(da3:mpr0:0:10:0): WRITE(10). CDB: 2a 00 40 47 42 00 00 01 00 00
(da3:mpr0:0:10:0): CAM status: SCSI Status Error
(da3:mpr0:0:10:0): SCSI status: Check Condition
(da3:mpr0:0:10:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da3:mpr0:0:10:0): Retrying command (per sense data)
(da13:mpr0:0:20:0): WRITE(10). CDB: 2a 00 40 45 13 48 00 01 00 00
(da13:mpr0:0:20:0): CAM status: SCSI Status Error
(da13:mpr0:0:20:0): SCSI status: Check Condition
(da13:mpr0:0:20:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da13:mpr0:0:20:0): Retrying command (per sense data)
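
To see whether the dropouts cluster on particular drives or slots, the log above can be tallied per device. The sketch below is illustrative only: the regular expression is fitted to the `mpr` lines shown in this post, not to any documented log format, and the helper names are invented. It also decodes each WRITE(10) CDB (bytes 2-5 are the big-endian LBA, bytes 7-8 the block count) and checks it against the logged length, assuming 512-byte logical sectors as reported by smartctl.

```python
import re
from collections import Counter

# Lines copied from the log above.
SAMPLE = """\
(da11:mpr0:0:18:0): WRITE(10). CDB: 2a 00 40 41 e1 20 00 01 00 00 length 131072 SMID 1013 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
(da11:mpr0:0:18:0): WRITE(10). CDB: 2a 00 40 41 e2 20 00 01 00 00 length 131072 SMID 989 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
(da20:mpr0:0:27:0): WRITE(10). CDB: 2a 00 40 43 5e 18 00 00 b8 00 length 94208 SMID 292 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
(da3:mpr0:0:10:0): WRITE(10). CDB: 2a 00 40 47 42 00 00 01 00 00 length 131072 SMID 386 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
"""

TERMINATED = re.compile(
    r"\((da\d+):mpr\d+:\d+:\d+:\d+\): WRITE\(10\)\. "
    r"CDB: ((?:[0-9a-f]{2} ){9}[0-9a-f]{2}) "
    r"length (\d+) SMID \d+ terminated ioc ([0-9a-f]+) loginfo ([0-9a-f]+)"
)


def decode_write10(cdb_hex):
    """Return (lba, block_count) from a 10-byte WRITE(10) CDB in hex."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")
    blocks = int.from_bytes(cdb[7:9], "big")
    return lba, blocks


def summarize(log_text, block_size=512):
    """Count terminated writes per device and list any length mismatches."""
    per_device = Counter()
    mismatches = []
    for dev, cdb, length, _ioc, _loginfo in TERMINATED.findall(log_text):
        per_device[dev] += 1
        _lba, blocks = decode_write10(cdb)
        if blocks * block_size != int(length):
            mismatches.append((dev, cdb, int(length)))
    return per_device, mismatches


if __name__ == "__main__":
    counts, bad = summarize(SAMPLE)
    for dev, n in counts.most_common():
        print(dev, n)
    print("length mismatches:", bad)
```

If the terminated writes spread evenly across many devices, as they appear to in the excerpt, that would fit a shared component (HBA, cable, expander) better than a single bad drive.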
I may try a fresh install from the FreeNAS 9.10 train instead of 11.1 to test further. I may also test downgrading to firmware version P5 on the LSI 9300.

I have an inkling that SATA over SAS with an LSI 9300 card to this backplane was not meant to be. The SMART output does show the drives are correctly connecting as SATA at 6Gb/s, though. I may have to resort to buying a 9200 card just to test, but would rather not spend the money if I am going to run into the same issue.
 

sth

Active Member
Oct 29, 2015
379
91
28
OOC, try the same hardware with Linux and/or OmniOS/Napp-it.
 

Kryax

sth said: OOC, try the same hardware with Linux and/or OmniOS/Napp-it.
I am not familiar with running ZFS on anything other than FreeNAS. Once I have tried the things mentioned above I may resort to it. I assume I can build the initial pool in FreeNAS, detach the pool, then install another OS and import said pool. But again, I am not as familiar with ZFS on those other OSes and mostly use the FreeNAS GUI to get by, although I can use some command line.

I did try testing various firmware versions on the 9300-8i. P5 resulted in far more errors and dropouts than P15, and you also lose the ability to see SMART data for any drive. P14 showed about the same number of dropouts as P15.

Also tried various ZFS configurations:
RaidZ2 3 x vdev (8 drives per vdev)
RaidZ2 4 x vdev (6 drives per vdev)
RaidZ2 1 x vdev (6 drives per vdev)

All had the same dropouts, so I am leaning towards ruling out cable saturation from 24 connected drives as the issue. I will probably continue with this over the weekend by trying the FreeNAS 9.10 train instead of 11.1 to see if I get the same errors.
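
For reference, the three layouts tested differ a lot in how many drives are active, which is why the single 6-drive vdev is a useful data point. A minimal sketch of the arithmetic follows; it assumes 8 TB drives (per the smartctl output) and ignores ZFS metadata, padding and TB/TiB differences, so the results are rough upper bounds. The function names are invented for illustration.

```python
# Rough RAIDZ capacity arithmetic; ignores ZFS overhead and TB vs TiB.

def raidz_usable_tb(vdevs, drives_per_vdev, drive_tb, parity=2):
    """Data capacity: each vdev gives up `parity` drives' worth of space."""
    if drives_per_vdev <= parity:
        raise ValueError("a vdev needs more drives than parity devices")
    return vdevs * (drives_per_vdev - parity) * drive_tb


def drives_in_use(vdevs, drives_per_vdev):
    """Total drives that take part in the pool."""
    return vdevs * drives_per_vdev


if __name__ == "__main__":
    layouts = [(3, 8), (4, 6), (1, 6)]  # (vdevs, drives per vdev) as tested
    for vdevs, width in layouts:
        print(f"{vdevs} x {width}-drive RAIDZ2: "
              f"{drives_in_use(vdevs, width)} drives, "
              f"about {raidz_usable_tb(vdevs, width, 8.0):.0f} TB before overhead")
```

With only 6 drives in the last layout, the aggregate load on the SAS link is a quarter of the 24-drive cases, yet the dropouts were the same, which supports setting saturation aside.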
 

Kryax

Well, tried the following with still no luck:

1. Installed FreeNAS 9.10u6. Created the various RAIDZ2 configurations mentioned in my previous post, which still resulted in the exact same errors.
2. Tried verifying settings in the Supermicro X11 BIOS. Changed PCI-E slot 7 (the slot the LSI 9300 is installed in) from EFI to Legacy. No change, same errors.
3. Edited the FreeNAS VM settings.
  A. CPU > "Expose hardware assisted virtualization to the guest"
  B. VM Options > Boot Options > Firmware > changed from BIOS to EFI. Tried to reinstall FreeNAS the UEFI route, but the system would not even get past the initial install screen.

So I am starting to run out of ideas for troubleshooting the current setup. The last thing I might try is setting up FreeNAS on bare metal rather than as a VM under ESXi; I have another boot drive I can load. If that doesn't work, I am not sure I want to spend the time learning another ZFS-compatible OS, so I may just try another LSI HBA. Can anyone recommend an LSI HBA that would be compatible with the following:

- 24 x HGST 8TB SATA drives
- 1 x BPN-SAS3-846EL1 (24-port 4U SAS3 12Gbps single-expander backplane, support up to 24x 3.5-inch SAS3/SATA3 HDD/SSD)
- Capable of passthrough in ESXi
 

sth

You should be good with most LSI cards. I've successfully used a mixture of 9207, 9211, 9300 and 9305s, all of which worked without issue in IT mode with SAS2 backplanes and a mixture of drives from Seagate & WD.
EDIT: If you are going to buy another card to test, go with the 9211, which is about the most popular and tested card there is.

I'd suggest keeping things simple... install FreeNAS on bare metal, create a 2-drive mirror and validate that it works before incrementally adding hardware / complexity.

I'd also suggest again that there is value in throwing a Linux system on your hardware to validate too.... I've had the same issues as this with one model of Seagate drives under FreeNAS that I just couldn't get to the bottom of; performance also sucked when it was stable enough to run. It's a single zpool create command to create an array, so it really isn't very complex or time-consuming to do. It appears to be the one thing you haven't checked yet, and arguably the cheapest and easiest.
 

Kryax

sth said: You should be good with most LSI cards. I've successfully used a mixture of 9207, 9211, 9300 and 9305s, all of which worked without issue in IT mode with SAS2 backplanes and a mixture of drives from Seagate & WD.

I'd suggest keeping things simple... install FreeNAS on bare metal, create a 2-drive mirror and validate that it works before incrementally adding hardware / complexity.

I'd also suggest again that there is value in throwing a Linux system on your hardware to validate too.... I've had the same issues as this with one model of Seagate drives under FreeNAS that I just couldn't get to the bottom of; performance also sucked when it was stable enough to run. It's a single zpool create command to create an array, so it really isn't very complex or time-consuming to do. It appears to be the one thing you haven't checked yet, and arguably the cheapest and easiest.
Will do a bare-metal FreeNAS install first, but I am willing to try another OS. If you have any good tutorial sites or even videos that show the commands to create a pool, allow guest access to a share, and enable SMB so a Windows machine can access it, I am willing to do it.
 

sth

There was a link on page one to a Napp-it all-in-one VM you can deploy, which makes things trivial if you wish to stay with a GUI approach. Learning some ZFS CLI skills wouldn't be a total waste of time, though, if you go the Linux route.
 

Kryax

Bare-metal FreeNAS install exhibited the exact same issues.

sth said: There was a link on page one to a Napp-it all-in-one VM you can deploy, which makes things trivial if you wish to stay with a GUI approach. Learning some ZFS CLI skills wouldn't be a total waste of time, though, if you go the Linux route.
Did you mean the homepage or one of the forums? I tried checking a few places and did not see what you were referencing.
 

sth

K D's post on page 1 of this thread. He put together a nice guide for Napp-it in ESXi.
 

Rand__

Well-Known Member
Mar 6, 2014
6,626
1,767
113
Can you direct-attach the drives to the mainboard while you run FreeNAS on bare metal? Just two or four.
That would rule out the HBA...
 

sth

If you have no other drives connected to your motherboard, you can pass through the onboard controller the same as you would a PCI-based HBA. Make sure it's configured for non-RAID mode; it might be listed under a sub-system ID. I recall mine is a Wellsburg AHCI controller, FYI.
esxipassthrough.png

Passing through individual disks isn't recommended.