Supermicro X8SI6-F LSI SAS2008 IT mode (JBOD) problems


rossetyler

New Member
Oct 9, 2011
5
0
0
I am having trouble getting my Supermicro X8SI6-F motherboard to use SATA disks connected to its onboard LSI SAS2008 SAS/SATA (RAID) controller in IT (Initiator Target) mode, i.e. as Just a Bunch Of Disks (JBOD).

I am using Linux (Fedora 14) with the latest kernel (2.6.35.14-95.fc14.x86_64).
I have downloaded and installed the latest LSI SAS2008 firmware and BIOS from Supermicro (ftp://ftp.supermicro.com/driver/SAS/LSI/2008/IT/Firmware/PH10.zip).
I have also downloaded, compiled and installed the latest mpt2sas driver (both from Supermicro and LSI - they are the same).

I connected 2 Western Digital RE4-GP 2TB drives
http://www.wdc.com/en/products/products.aspx?id=40
(I have tried some other old Maxtor 160GB drives as well, with similar results)

I checked that the LSI BIOS reported them as directly attached.
I successfully formatted both drives from the BIOS.

The kernel appears to detect them (sd{f,g} at boot time):

> Sep 16 20:39:41 server kernel: [ 6.037674] mpt2sas version 10.00.00.00 loaded
> Sep 16 20:39:41 server kernel: [ 6.038140] scsi6 : Fusion MPT SAS Host
> Sep 16 20:39:41 server kernel: [ 6.038715] mpt2sas 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> Sep 16 20:39:41 server kernel: [ 6.039028] mpt2sas0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (8187144 kB)
> Sep 16 20:39:41 server kernel: [ 6.039638] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 51
> Sep 16 20:39:41 server kernel: [ 6.039902] mpt2sas0: iomem(0x00000000fb43c000), mapped(0xffffc90011a40000), size(16384)
> Sep 16 20:39:41 server kernel: [ 6.040393] mpt2sas0: ioport(0x000000000000c000), size(256)
> Sep 16 20:39:41 server kernel: [ 6.114914] mpt2sas0: sending message unit reset !!
> Sep 16 20:39:41 server kernel: [ 6.117934] mpt2sas0: message unit reset: SUCCESS
> Sep 16 20:39:41 server kernel: [ 6.167613] mpt2sas0: Allocated physical memory: size(8051 kB)
> Sep 16 20:39:41 server kernel: [ 6.167884] mpt2sas0: Current Controller Queue Depth(3577), Max Controller Queue Depth(3712)
> Sep 16 20:39:41 server kernel: [ 6.168356] mpt2sas0: Scatter Gather Elements per IO(128)
> Sep 16 20:39:41 server kernel: [ 6.228980] mpt2sas0: LSISAS2008: FWVersion(10.00.02.00), ChipRevision(0x02), BiosVersion(07.19.00.00)
> Sep 16 20:39:41 server kernel: [ 6.229459] mpt2sas0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
> Sep 16 20:39:41 server kernel: [ 6.230700] mpt2sas0: sending port enable !!
> Sep 16 20:39:41 server kernel: [ 6.233564] mpt2sas0: host_add: handle(0x0001), sas_addr(0x5003048000771f30), phys(8)
> Sep 16 20:39:41 server kernel: [ 6.241981] mpt2sas0: port enable: SUCCESS
> Sep 16 20:39:41 server kernel: [ 6.249474] scsi 6:0:0:0: Direct-Access ATA WDC WD2002FYPS-0 1G01 PQ: 0 ANSI: 6
> Sep 16 20:39:41 server kernel: [ 6.250033] scsi 6:0:0:0: SATA: handle(0x0009), sas_addr(0x4433221100000000), phy(0), device_name(0x50014ee25b264649)
> Sep 16 20:39:41 server kernel: [ 6.250505] scsi 6:0:0:0: SATA: enclosure_logical_id(0x5003048000771f30), slot(0)
> Sep 16 20:39:41 server kernel: [ 6.251204] scsi 6:0:0:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y)
> Sep 16 20:39:41 server kernel: [ 6.254996] scsi 6:0:0:0: serial_number( WD-WCAVY6988280)
> Sep 16 20:39:41 server kernel: [ 6.255255] scsi 6:0:0:0: qdepth(32), tagged(1), simple(0), ordered(0), scsi_level(7), cmd_que(1)
> Sep 16 20:39:41 server kernel: [ 6.262965] scsi 6:0:1:0: Direct-Access ATA WDC WD2002FYPS-0 1G01 PQ: 0 ANSI: 6
> Sep 16 20:39:41 server kernel: [ 6.263445] scsi 6:0:1:0: SATA: handle(0x000a), sas_addr(0x4433221101000000), phy(1), device_name(0x50014ee205d20340)
> Sep 16 20:39:41 server kernel: [ 6.263926] scsi 6:0:1:0: SATA: enclosure_logical_id(0x5003048000771f30), slot(1)
> Sep 16 20:39:41 server kernel: [ 6.264528] scsi 6:0:1:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y)
> Sep 16 20:39:41 server kernel: [ 6.268400] scsi 6:0:1:0: serial_number( WD-WCAVY6979293)
> Sep 16 20:39:41 server kernel: [ 6.268670] scsi 6:0:1:0: qdepth(32), tagged(1), simple(0), ordered(0), scsi_level(7), cmd_que(1)
> Sep 16 20:39:41 server kernel: [ 6.269570] sd 6:0:0:0: Attached scsi generic sg5 type 0
> Sep 16 20:39:41 server kernel: [ 6.270045] sd 6:0:1:0: Attached scsi generic sg6 type 0
> Sep 16 20:39:41 server kernel: [ 6.273153] sd 6:0:0:0: [sdf] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
> Sep 16 20:39:41 server kernel: [ 6.273726] sd 6:0:1:0: [sdg] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
> Sep 16 20:39:41 server kernel: [ 6.292464] sd 6:0:0:0: [sdf] Write Protect is off
> Sep 16 20:39:41 server kernel: [ 6.292931] sd 6:0:1:0: [sdg] Write Protect is off
> Sep 16 20:39:41 server kernel: [ 6.299462] sd 6:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Sep 16 20:39:41 server kernel: [ 6.299963] sd 6:0:1:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Sep 16 20:39:41 server kernel: [ 6.329423] sdf:
> Sep 16 20:39:41 server kernel: [ 6.329597] sdg: unknown partition table
> Sep 16 20:39:41 server kernel: [ 6.330713] unknown partition table
> Sep 16 20:39:41 server kernel: [ 6.393289] sd 6:0:1:0: [sdg] Attached SCSI disk
> Sep 16 20:39:41 server kernel: [ 6.396883] sd 6:0:0:0: [sdf] Attached SCSI disk
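
Since the rest of this thread turns on which driver and firmware versions are in play, here is a minimal sketch of pulling them out of a boot log like the one above. The function name and the sample text are only illustrative; the regular expressions assume the exact banner wording shown in these log lines and may not match other driver releases.

```python
import re

# Two banner lines copied from the boot log above (syslog prefix removed).
BOOT_LOG = """\
mpt2sas version 10.00.00.00 loaded
mpt2sas0: LSISAS2008: FWVersion(10.00.02.00), ChipRevision(0x02), BiosVersion(07.19.00.00)
"""

def mpt2sas_versions(log_text):
    """Return the driver, firmware and BIOS versions found in log_text.

    Hypothetical helper: the patterns assume the banner format seen in
    this thread's log and are not taken from any driver documentation.
    """
    patterns = {
        "driver": r"mpt2sas version (\S+) loaded",
        "firmware": r"FWVersion\(([^)]+)\)",
        "bios": r"BiosVersion\(([^)]+)\)",
    }
    found = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, log_text)
        if match:
            found[name] = match.group(1)
    return found

print(mpt2sas_versions(BOOT_LOG))
```

Feeding it the full output of dmesg or /var/log/messages should work the same way, as long as the banner lines are present.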


However, when writing to them
> dd if=/dev/zero bs=4M of=/dev/sdf
things go wrong.
I get the following messages over and over:

> Sep 16 20:41:19 server kernel: [ 115.875188] sd 6:0:0:0: attempting task abort! scmd(ffff88021ee01400)
> Sep 16 20:41:19 server kernel: [ 115.875194] sd 6:0:0:0: [sdf] CDB: Write(10): 2a 00 00 00 0c 00 00 00 08 00
> Sep 16 20:41:19 server kernel: [ 115.875211] scsi target6:0:0: handle(0x0009), sas_address(0x4433221100000000), phy(0)
> Sep 16 20:41:19 server kernel: [ 115.875216] scsi target6:0:0: enclosure_logical_id(0x5003048000771f30), slot(0)
> Sep 16 20:41:19 server kernel: [ 115.875226] mpt2sas0: sending diag reset !!
> Sep 16 20:41:19 server kernel: [ 116.026998] mpt2sas0: diag reset: FAILED
> Sep 16 20:41:19 server kernel: [ 116.027005] sd 6:0:0:0: task abort: FAILED scmd(ffff88021ee01400)

until it finally gives up and says, over and over:

> Sep 16 20:41:24 server kernel: [ 121.190971] sd 6:0:0:0: Device offlined - not ready after error recovery

then
> Sep 16 20:41:24 server kernel: [ 121.191104] sd 6:0:0:0: [sdf] Unhandled error code
> Sep 16 20:41:24 server kernel: [ 121.191109] sd 6:0:0:0: [sdf] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> Sep 16 20:41:24 server kernel: [ 121.191117] sd 6:0:0:0: [sdf] CDB: Write(10): 2a 00 00 00 0c 00 00 00 08 00
> Sep 16 20:41:24 server kernel: [ 121.191137] end_request: I/O error, dev sdf, sector 3072
> Sep 16 20:41:24 server kernel: [ 121.191144] Buffer I/O error on device sdf, logical block 384
> Sep 16 20:41:24 server kernel: [ 121.191149] lost page write due to I/O error on sdf

and over and over
> Sep 16 20:41:24 server kernel: [ 121.191167] sd 6:0:0:0: rejecting I/O to offline device

and
> Sep 16 20:41:24 server kernel: [ 121.206643] sd 6:0:0:0: [sdf] Unhandled error code
> Sep 16 20:41:24 server kernel: [ 121.206646] sd 6:0:0:0: [sdf] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
> Sep 16 20:41:24 server kernel: [ 121.206651] sd 6:0:0:0: [sdf] CDB: Write(10): 2a 00 00 00 64 10 00 00 08 00
> Sep 16 20:41:24 server kernel: [ 121.206662] end_request: I/O error, dev sdf, sector 25616
> Sep 16 20:41:24 server kernel: [ 121.206666] Buffer I/O error on device sdf, logical block 3202
> Sep 16 20:41:24 server kernel: [ 121.206668] lost page write due to I/O error on sdf
> Sep 16 20:41:24 server kernel: [ 121.206692] sd 6:0:0:0: [sdf] Unhandled error code
> Sep 16 20:41:24 server kernel: [ 121.206695] sd 6:0:0:0: [sdf] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> Sep 16 20:41:24 server kernel: [ 121.206699] sd 6:0:0:0: [sdf] CDB:
> Sep 16 20:41:24 server kernel: [ 121.206701] sd 6:0:0:0: rejecting I/O to offline device
> Sep 16 20:41:24 server kernel: [ 121.206704] Write(10): 2a 00 00 00
> Sep 16 20:41:24 server kernel: [ 121.206709] sd 6:0:0:0: rejecting I/O to offline device
> Sep 16 20:41:24 server kernel: [ 121.206711] 08 00 00 04 00 00
> Sep 16 20:41:24 server kernel: [ 121.206717] end_request: I/O error, dev sdf, sector 2048
> Sep 16 20:41:24 server kernel: [ 121.206720] Buffer I/O error on device sdf, logical block 256
> Sep 16 20:41:24 server kernel: [ 121.206722] lost page write due to I/O error on sdf
> Sep 16 20:41:24 server kernel: [ 121.206726] Buffer I/O error on device sdf, logical block 257
> ...
> Sep 16 20:41:24 server kernel: [ 121.206898] sd 6:0:0:0: [sdf] Unhandled error code
> Sep 16 20:41:24 server kernel: [ 121.206901] sd 6:0:0:0: [sdf] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
> Sep 16 20:41:24 server kernel: [ 121.206905] sd 6:0:0:0: [sdf] CDB: Write(10): 2a 00 00 00 04 00
> Sep 16 20:41:24 server kernel: [ 121.206914] sd 6:0:0:0: rejecting I/O to offline device

and on and on.
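
For anyone reading the error lines above: the CDB bytes can be decoded to confirm which sectors the failed writes were aimed at. This is a small sketch using the standard Write(10) layout (opcode 0x2a, bytes 2-5 = logical block address, bytes 7-8 = transfer length in blocks). The helper name is made up for illustration, and the "divide by 8" step for the buffer block number is inferred from the numbers in this log (512-byte sectors, 4096-byte buffer blocks), not from any documentation.

```python
def decode_write10(cdb_hex):
    """Decode a SCSI Write(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte Write(10) CDB")
    return {
        # Bytes 2-5: starting logical block address, big-endian.
        "lba": int.from_bytes(cdb[2:6], "big"),
        # Bytes 7-8: number of blocks to transfer, big-endian.
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }

# The two complete CDBs quoted in the error messages above.
for cdb_hex in ("2a 00 00 00 0c 00 00 00 08 00",
                "2a 00 00 00 64 10 00 00 08 00"):
    decoded = decode_write10(cdb_hex)
    # Assumption: one 4096-byte buffer block covers 8 sectors of 512 bytes.
    buffer_block = decoded["lba"] // 8
    print(cdb_hex, "->", decoded, "buffer block", buffer_block)
```

The decoded addresses come out as sectors 3072 and 25616 (buffer blocks 384 and 3202), each an 8-block write, which agrees with the end_request and Buffer I/O error lines. So the commands themselves look like ordinary small writes near the start of the disk; nothing about the requests stands out.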
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,516
5,828
113
Very odd. I haven't seen this one yet; maybe someone else on here has. When you search, also look at the LSI 9211-8i (or 4i, really) for others with the same issue. The X8SI6-F is essentially one of those cards permanently mounted to the PCB, with the PCIe lanes attached directly instead of through a slot interface. That may help in your search for a fix.
 

rossetyler

Does anyone know of a Linux live CD that works with this controller?
Booting one should be a simple way to tell whether the problem is in the OS/driver or not.
 

mobilenvidia

Moderator
Sep 25, 2011
1,952
213
63
New Zealand
Ubuntu should be a good place to start testing.

Also, what did the BIOS format it to?
You are best off removing all partitions and then letting the OS format the HDD.

Does the onboard LSI controller have IR or IT BIOS options?
An IR BIOS needs each drive set up as a logical device before the OS will see/use it; IT will pass each drive straight through.
 

rossetyler

Do you know of an Ubuntu Live CD version that works?
I would like to simply pop in a CD, boot and test.

I assume the LSI SAS2008 format just commanded the disk drive to do a low-level format.
I did not format (e.g. partition) the drive any further; my tests write directly to the raw drive (e.g. /dev/sdf).
I believe my problems are independent of this, as the other Maxtor 160GB drives, which I did not re-format, suffer from the same problem.

I am using LSI IT (Initiator Target) mode, not IR (Integrated RAID), as I am planning (eventually) to use Linux software RAID.
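
For the raw-device tests mentioned above, a read-only counterpart to the dd write test may be useful when trying a live CD, since it cannot destroy anything on the disk. This is only a rough sketch: the function name, block size and block count are arbitrary choices, the device path must be passed in by hand, and it needs root to open a raw device. Note that this thread only reports failures on writes, so a clean read does not prove the driver/firmware combination is healthy.

```python
import sys

def read_blocks(path, block_size=4 * 1024 * 1024, count=16):
    """Read up to `count` blocks of `block_size` bytes from `path`.

    Returns the number of bytes actually read. An I/O error on the
    device surfaces as OSError, which is left to propagate.
    """
    total = 0
    # buffering=0 gives unbuffered binary reads straight from the file.
    with open(path, "rb", buffering=0) as dev:
        for _ in range(count):
            chunk = dev.read(block_size)
            if not chunk:
                break  # end of device or file
            total += len(chunk)
    return total

if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python read_test.py /dev/sdf   (run as root)
    print("read", read_blocks(sys.argv[1]), "bytes from", sys.argv[1])
```

While it runs, watching the kernel log in another terminal shows whether the same "attempting task abort" messages appear.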
 

mobilenvidia

I've not tried it myself (yet).
But surely the SAS2x08 chipset should be supported; it's used in Intel, IBM and Dell servers.

I can't test this for a week or so.

But at least everything else looks like it should work.
 

rossetyler

Thank you very much for the tip.

I downloaded and booted from ubuntu-11.10-desktop-amd64.iso (live) and it works!
So, at least my hardware/firmware seems to be working and that is great news.
Now I just need to find/fix the software/OS part.

I noted that the Ubuntu mpt2sas driver version (08.100.00.2) is much newer than the one bundled with the Fedora 14 kernel (05.100.00.02) but much older than the latest provided at both the Supermicro and LSI sites (10.00.00.00).
Since 10 is the latest and that is what I have already upgraded my firmware to, 10 should be the best match.
Also, firmware 08 is not available any more at the Supermicro site.
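
To keep the driver versions straight, here is a small sketch that orders the three versions mentioned above. It assumes the dotted fields can be compared numerically, field by field, and that this reflects release order; that is a guess about LSI's numbering scheme, not something confirmed in this thread. The labels and the helper name are just for illustration.

```python
def version_key(version):
    """Turn a dotted version string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))

# Driver versions as quoted in this thread.
DRIVERS = {
    "Fedora 14 kernel (bundled)": "05.100.00.02",
    "Ubuntu 11.10 live CD": "08.100.00.2",
    "Supermicro/LSI download": "10.00.00.00",
}

# Oldest first, under the numeric-comparison assumption above.
ordered = sorted(DRIVERS.items(), key=lambda item: version_key(item[1]))
for source, version in ordered:
    print(version, source)
```

Comparing as integers rather than as strings matters here because the fields are not padded consistently ("00.2" versus "00.02").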

Based on my experience, Fedora 14 kernel 2.6.35.14-95.fc14.x86_64 and mpt2sas driver 10.00.00.00 don't work.
I see that there is a later kernel (2.6.35.14-96.fc14.x86_64).
I will try different kernel/driver combinations to see if I can get something to work and report back.

Thanks again!
 

rossetyler

I decided to wait until Fedora 16 came out.
The Fedora-16-x86_64-Live-Desktop.iso bundles mpt2sas version 09.100.00.00, which is newer than the Ubuntu version (08.100.00.2) that worked above.
Alas, this Fedora version fares about as well as the others: it doesn't work.
Time to consider going to Ubuntu for my file server.