Adaptec ASR-78165 SAS controller - external tape libraries show only a single LUN and cause lockup in HBA mode (Linux)


gargravarr

Member
Jul 1, 2021
Hi folks,

I've now had this issue with two different types of tape library, so I know it's the card.

When connecting to a multi-LUN device (i.e. a tape library with robot), only the tape drive shows as a SCSI device. The robot is not visible. After a while, the entire card locks up and crashes, and the machine has to be rebooted. This is a particular problem because I chose this card for its internal + external connectors - I have internal SAS HDDs and want to be able to connect directly to the tape system from the same machine. It's a mini-ITX NAS so I cannot run multiple cards.

When connecting to a library, I see the following in the logs:
Code:
May 23 14:29:53 excalibur kernel: [1626593.952191] scsi 0:3:0:0: Sequential-Access IBM      ULTRIUM-HH6      E4J1 PQ: 0 ANSI: 6
May 23 14:29:53 excalibur kernel: [1626593.955555] scsi 0:3:0:0: Attached scsi generic sg8 type 1
May 23 14:29:53 excalibur kernel: [1626593.967263] st: Version 20160209, fixed bufsize 32768, s/g segs 256
May 23 14:29:53 excalibur kernel: [1626593.967458] st 0:3:0:0: Attached scsi tape st0
May 23 14:29:53 excalibur kernel: [1626593.967460] st 0:3:0:0: st0: try direct i/o: yes (alignment 4 B)
As you can see, this is just the tape drive; there is no SCSI device for the robot. If I plug the same library into a different machine with a Dell H200 SAS HBA, both the drive and the robot appear (I don't have logs for this). The library is fully usable in that machine, which runs the same OS and kernel. The only real difference is the HBA.
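
For anyone who wants to run the same check on a saved kernel log, here is a minimal sketch. It looks for "Attached scsi generic" lines and reports whether any of them is a medium changer. The function names are my own and not part of any kernel or Adaptec tool. The type codes are the standard SCSI peripheral device types (1 = sequential-access, 8 = medium changer).

```python
import re

# Standard SCSI peripheral device types relevant to a tape library.
TAPE_DRIVE = 1       # sequential-access device
MEDIUM_CHANGER = 8   # the library robot

# Matches kernel lines such as:
#   scsi 0:3:0:0: Attached scsi generic sg8 type 1
ATTACH_RE = re.compile(
    r"scsi (\d+:\d+:\d+:\d+): Attached scsi generic (sg\d+) type (\d+)"
)

def attached_devices(log_text):
    """Return (address, sg_name, device_type) for each attach line found."""
    return [(m.group(1), m.group(2), int(m.group(3)))
            for m in ATTACH_RE.finditer(log_text)]

def has_robot(log_text):
    """True if any attached device reports the medium changer type."""
    return any(dev_type == MEDIUM_CHANGER
               for _, _, dev_type in attached_devices(log_text))

# The attach line from the log above.
sample = ("May 23 14:29:53 excalibur kernel: [1626593.955555] "
          "scsi 0:3:0:0: Attached scsi generic sg8 type 1")

if __name__ == "__main__":
    print(attached_devices(sample))
    print("robot present:", has_robot(sample))
```

On a machine where the library is detected properly, I would expect a second attach line with type 8 alongside the tape drive.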

A while later this happens:
Code:
May 23 14:40:51 excalibur kernel: [1627251.956510] st 0:3:0:0: [sg8] tag#713 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_SENSE cmd_age=0s
May 23 14:40:51 excalibur kernel: [1627251.956512] st 0:3:0:0: [sg8] tag#713 Sense Key : Not Ready [current]
May 23 14:40:51 excalibur kernel: [1627251.956514] st 0:3:0:0: [sg8] tag#713 Add. Sense: Medium not present
May 23 14:40:51 excalibur kernel: [1627251.956515] st 0:3:0:0: [sg8] tag#713 CDB: Test Unit Ready 00 00 00 00 00 00
May 23 14:40:51 excalibur kernel: [1627252.489558] st 0:3:0:0: [sg8] tag#740 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_SENSE cmd_age=0s
May 23 14:40:51 excalibur kernel: [1627252.489560] st 0:3:0:0: [sg8] tag#740 Sense Key : Not Ready [current]
May 23 14:40:51 excalibur kernel: [1627252.489562] st 0:3:0:0: [sg8] tag#740 Add. Sense: Medium not present
May 23 14:40:51 excalibur kernel: [1627252.489563] st 0:3:0:0: [sg8] tag#740 CDB: Test Unit Ready 00 00 00 00 00 00
May 23 14:41:56 excalibur kernel: [1627317.068587] st 0:3:0:0: [sg8] tag#742 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
May 23 14:41:56 excalibur kernel: [1627317.068593] st 0:3:0:0: [sg8] tag#742 CDB: Test Unit Ready 00 00 00 00 00 00
May 23 14:42:01 excalibur kernel: [1627322.290794] st 0:3:0:0: [sg8] tag#704 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
May 23 14:42:01 excalibur kernel: [1627322.290796] st 0:3:0:0: [sg8] tag#704 CDB: Test Unit Ready 00 00 00 00 00 00
May 23 14:42:06 excalibur kernel: [1627327.562551] st 0:3:0:0: [sg8] tag#716 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
May 23 14:42:06 excalibur kernel: [1627327.562559] st 0:3:0:0: [sg8] tag#716 CDB: Test Unit Ready 00 00 00 00 00 00
May 23 14:42:12 excalibur kernel: [1627332.848259] st 0:3:0:0: [sg8] tag#749 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
May 23 14:42:12 excalibur kernel: [1627332.848267] st 0:3:0:0: [sg8] tag#749 CDB: Test Unit Ready 00 00 00 00 00 00
May 23 14:42:17 excalibur kernel: [1627338.120115] st 0:3:0:0: [sg8] tag#752 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
May 23 14:42:17 excalibur kernel: [1627338.120121] st 0:3:0:0: [sg8] tag#752 CDB: Test Unit Ready 00 00 00 00 00 00
May 23 14:42:22 excalibur kernel: [1627343.391946] st 0:3:0:0: [sg8] tag#744 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
May 23 14:42:22 excalibur kernel: [1627343.391949] st 0:3:0:0: [sg8] tag#744 CDB: Test Unit Ready 00 00 00 00 00 00
May 23 14:42:27 excalibur kernel: [1627348.663764] st 0:3:0:0: [sg8] tag#744 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
May 23 14:42:27 excalibur kernel: [1627348.663772] st 0:3:0:0: [sg8] tag#744 CDB: Test Unit Ready 00 00 00 00 00 00
May 23 14:42:33 excalibur kernel: [1627353.935580] st 0:3:0:0: [sg8] tag#742 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
May 23 14:42:33 excalibur kernel: [1627353.935588] st 0:3:0:0: [sg8] tag#742 CDB: Test Unit Ready 00 00 00 00 00 00
May 23 14:42:38 excalibur kernel: [1627359.207407] st 0:3:0:0: [sg8] tag#794 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s
May 23 14:42:38 excalibur kernel: [1627359.207416] st 0:3:0:0: [sg8] tag#794 CDB: Test Unit Ready 00 00 00 00 00 00
Around the same time, writes to the internal disks on the same card start failing:
Code:
May 23 14:41:44 excalibur kernel: [1627304.952590] sd 0:1:0:0: [sda] tag#726 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=2s
May 23 14:41:44 excalibur kernel: [1627304.952593] sd 0:1:0:0: [sda] tag#726 CDB: Write(16) 8a 00 00 00 00 02 fc 66 a9 e8 00 00 00 08 00 00
May 23 14:41:44 excalibur kernel: [1627304.952594] blk_update_request: I/O error, dev sda, sector 12824521192 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
May 23 14:41:44 excalibur kernel: [1627304.953341] zio pool=z2 vdev=/dev/disk/by-path/pci-0000:01:00.0-scsi-0:1:0:0-part1 error=5 type=2 offset=6566153801728 size=4096 flags=180880
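
To make the failure sequence easier to see in a long log, the excerpts above can be tallied by device address and hostbyte. This is only a rough sketch: `tally_failures` is a name I made up, and the regular expression assumes the "FAILED Result:" line format shown in these logs.

```python
import re
from collections import Counter

# Matches lines such as:
#   st 0:3:0:0: [sg8] tag#713 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_SENSE
RESULT_RE = re.compile(
    r"(\w+) (\d+:\d+:\d+:\d+): \[(\w+)\] tag#\d+ FAILED Result: "
    r"hostbyte=(\w+) driverbyte=(\w+)"
)

def tally_failures(log_text):
    """Count failed commands per (SCSI address, hostbyte) pair."""
    counts = Counter()
    for m in RESULT_RE.finditer(log_text):
        _driver, address, _dev, hostbyte, _driverbyte = m.groups()
        counts[(address, hostbyte)] += 1
    return counts

# Three lines copied from the logs above.
sample = "\n".join([
    "May 23 14:40:51 excalibur kernel: [1627251.956510] st 0:3:0:0: [sg8] "
    "tag#713 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_SENSE cmd_age=0s",
    "May 23 14:41:56 excalibur kernel: [1627317.068587] st 0:3:0:0: [sg8] "
    "tag#742 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=0s",
    "May 23 14:41:44 excalibur kernel: [1627304.952590] sd 0:1:0:0: [sda] "
    "tag#726 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=2s",
])

if __name__ == "__main__":
    for (address, hostbyte), n in sorted(tally_failures(sample).items()):
        print(address, hostbyte, n)
```

Run against the full log, this should show the tape drive at 0:3:0:0 moving from DID_ERROR to DID_NO_CONNECT while the disk at 0:1:0:0 on the same host starts returning DID_ERROR.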
In dmesg, I see the following after the card crashes:
Code:
[1627419.946693] aacraid: Host bus reset request. SCSI hang ?
[1627419.947388] aacraid 0000:01:00.0: Adapter health - 255
And the Adaptec utility detects no cards on the system:
Code:
root@excalibur:~# /usr/Arcconf/arcconf getconfig 0
Controllers found: 0
Invalid controller number.
The card is running the latest firmware available as of today (the firmware hasn't been updated since 2018):
Code:
Firmware                                 : 7.5-0 (32118)
The card does run pretty hot but I have an 80mm fan sitting on top (can't get the official Adaptec fan kit, sold out everywhere).

This only occurs with tape libraries. If I connect to a plain SAS tape drive, it works perfectly (including writing an entire LTO-5 tape at once). I also have zero problems with SAS HDDs. I've seen this happen with a Dell TL4000 (detailed in another thread) and a Quantum SuperLoader 3. I can connect to both drives in a Dell PV114X simultaneously with no problem.

The card is set in HBA mode in the BIOS. I can't see any options for multiple LUNs. The rest of the system is an Asus P11C-i ITX motherboard with a Core i3 9100T CPU and 32GB DDR4 ECC memory. It runs Devuan 4 (Debian 11 without systemd).

If this card has known problems, can anyone recommend a card with the following specs:
  • Minimum SAS-2
  • Minimum 2x internal SAS connectors (8 drive slots)
  • Minimum 2x external SAS connectors (support for 2 external tape drives)
  • Ideally mini-SAS HD connectors
  • No RAID functionality required/can be easily disabled (I run ZFS) - I really don't like cross-flashing LSI cards (have had lots of trouble) so I'd prefer something else
Many thanks.