PCI SATA controllers and dummy unused ports on Linux

kibibyte

New Member
Apr 17, 2021
8
0
1
Linköping, Sweden
Hi! My workplace bought a Supermicro server with an H11SSW-NT (EPYC) motherboard a year ago, and there's a small annoyance whose cause I've been trying to find ever since:

When Linux boots and initializes the AMD FCH SATA controller, every empty drive bay ends up as a DUMMY ata port. As far as I can tell, that means a drive installed in that bay later can't be used without rebooting. The usual rescan, writing "- - -" to /sys/class/scsi_host/hostN/scan, does nothing, because the whole port is disabled.
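For reference, the usual rescan can be scripted. This is only a minimal Python sketch, assuming it runs as root and that the standard sysfs layout is present; the function name rescan_scsi_hosts and its sysfs_root parameter are made up for illustration. On a port that came up as DUMMY it does not help, which is exactly the problem described above.

```python
from pathlib import Path


def rescan_scsi_hosts(sysfs_root="/sys/class/scsi_host"):
    """Ask every SCSI host under sysfs_root to rescan its bus.

    Writes the wildcard "- - -" (channel, target, LUN) to each hostN/scan
    file and returns the names of the hosts that were asked to rescan.
    """
    rescanned = []
    for host in sorted(Path(sysfs_root).glob("host*")):
        scan_file = host / "scan"
        if not scan_file.exists():
            # Not every entry is guaranteed to expose a scan file.
            continue
        scan_file.write_text("- - -\n")
        rescanned.append(host.name)
    return rescanned

# Typical use on a live system (needs root):
#     rescan_scsi_hosts()
```

The sysfs_root parameter is there only so the function can be pointed at a scratch directory for a dry run.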

I searched through the kernel code today, and I think I've found a clue in libata-sff.c: "/* Discard disabled ports. Some controllers show their unused channels this way. Disabled ports are made dummy. */"

So apparently this is hardware-specific. My question now is mainly how common this is. The motherboard in question has a whole bunch of SlimSAS NVMe ports, of which two can instead be configured in SATA (AHCI) mode, each supporting eight drives. And second, does anyone know if there's a way to re-initialize the disabled ports after all?
 

UhClem

Member
Jun 26, 2012
88
34
18
NH, USA
kibibyte said: "Hi! My workplace bought a Supermicro server with an H11SSW-NT (EPYC) motherboard a year ago, and there's a small annoyance whose cause I've been trying to find ever since:"
Small? I'd say that having to reboot a system, in this day and age, just to add a Sata drive, in a hot-swap bay no less, is a royal PITA. There is, however, a half-ass workaround which IS (only) a small annoyance. Get in the habit of putting drive(s) in those bay(s) which you'll want to use as hot-plug locations, WHEN you boot the system. Once the system has booted, you can remove those drives. Henceforth, those bays will be normal, full-function, hot-swap locations (until you reboot ... [rinse/repeat]).

Your "pursuit of the truth" was laudable, but only worth that (dastardly?) "A for effort". While the comment you found in libata-sff.c looks to be spot-on-topic, it is, in fact, completely off-topic. That module is for legacy IDE devices, and the code/comment you cited would come into play, for example, with a controller card, such as the Highpoint Rocket 620, which uses a Marvell 9120 chip. The chip itself contains a 2-port Sata controller AND a one-port PATA/IDE controller. But the 620 card ONLY utilizes, and "activates", the Sata. The kernel recognizes this "exists in theory, but not in practice" situation, and dummies the PATA. That's my take, anyway ... (but I haven't mucked with the Unix kernel for 45 years.)

Small favor, please. Could you do
Code:
lspci -vv | gzip > H11_lspci_vv.gz
and attach it to a reply. Thanks.
 

kibibyte

Thanks for the reply! Well, I figured the DUMMY from dmesg had to come from ata_host_register() in libata-core.c, where it checks ata_port_is_dummy(). That is an inline function that checks whether the struct ata_port's ops points at a special ata_dummy_port_ops structure. The ops pointer is set that way in only a few places, and I guess the most likely one here is ahci_init_one() in the ahci module. Either way, doesn't it seem that it's the hardware that decides that the port is disabled, not the kernel?

EPYC has an integrated SATA controller, which I assumed is what's used here (and hence that I'd see the same behaviour regardless of motherboard), but the lspci output also mentions Super Micro:

Code:
45:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51) (prog-if 01 [AHCI 1.0])
        Subsystem: Super Micro Computer Inc FCH SATA Controller [AHCI mode]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 92
        NUMA node: 0
        Region 5: Memory at b0400000 (32-bit, non-prefetchable) [size=2K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [64] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed unknown, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed unknown, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkCtl2: Target Link Speed: Unknown, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
                         EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
        Capabilities: [a0] MSI: Enable+ Count=16/16 Maskable- 64bit+
                Address: 00000000fee00000  Data: 0000
        Capabilities: [d0] SATA HBA v1.0 InCfgSpace
        Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [150 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
        Capabilities: [270 v1] #19
        Capabilities: [2a0 v1] Access Control Services
                ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
                ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
        Capabilities: [400 v1] #25
        Capabilities: [410 v1] #26
        Capabilities: [440 v1] #27
        Kernel driver in use: ahci
        Kernel modules: ahci
There are four of these, matching the block diagram (there's also an ASMedia SATA controller that handles the SATA DOMs, which are ata1 and ata2).

Curiously though, the devices at 87:00.0 and 88:00.0, which have to correspond to the unused SlimSAS connector, each have a port with a SATA link that is reported down (ata3 and ata4). (Finally, there's ata13, which is also down, and I think that has to be the unused SD Card slot.)
 

EffrafaxOfWug

Radioactive Member
Feb 12, 2015
1,383
491
83
An interesting problem - certainly not one I've seen myself on my own (non-Epyc) AMD kit nor the HP Epyc kit at work (but that isn't using SATA in any case). In fact I don't think I've ever seen a port named as dummy.

Not that it helps any, but the FCH SATA ports on my 3700X/X470D2U show up virtually identical to yours, including showing the OEM:
Code:
31:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51) (prog-if 01 [AHCI 1.0])
    Subsystem: ASRock Incorporation FCH SATA Controller [AHCI mode]
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 54
    Region 5: Memory at f7c00000 (32-bit, non-prefetchable) [size=2K]
    Capabilities: [48] Vendor Specific Information: Len=08 <?>
    Capabilities: [50] Power Management version 3
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [64] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
        DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported-
            RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #0, Speed unknown, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed unknown, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: Unknown, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
             EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
    Capabilities: [a0] MSI: Enable+ Count=1/16 Maskable- 64bit+
        Address: 00000000fee00000  Data: 0000
    Capabilities: [d0] SATA HBA v1.0 InCfgSpace
    Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
    Capabilities: [270 v1] #19
    Capabilities: [370 v1] Transaction Processing Hints
        Device specific mode supported
        Steering table in TPH capability structure
    Capabilities: [400 v1] #25
    Capabilities: [410 v1] #26
    Capabilities: [440 v1] #27
    Kernel driver in use: ahci
    Kernel modules: ahci
If you don't hook a port up to a drive bay and instead try direct-attached SATA hot-swap does that change matters any? Do you get any kernel messages at all when you plug in a drive? Are there any SATA hot-swap options in the BIOS?
 

UhClem

Thank you!

I probably should have explained my request (but thought the "favor" part would be a clue). The lspci was completely off-topic--I'm just curious about the "indoor plumbing" of a full-featured current-gen mobo like yours. And, I apologize that it [in/se]duced you to take that journey.

Back to the real/original issue:
kibibyte said: "So apparently this is hardware-specific. My question now is mainly how common this is."
Well, it's been annoying me on a Dell T30 (C236 chipset) and an HP ML30 G10 (C246), but I don't think it is really hardware-specific, at least in those two cases: both of those mobos use the native Sata ports in the chipset. Not having perused the relevant AMD doc on your FCH, I can't speak for your mobo. It's hard for me to imagine that Intel (or AMD) would design the Sata ports on its chipset to have the observed PITA characteristic. And I don't believe that Linux would be so hard-headed either. (Strictly conjecture, but) I think it's the BIOS that is enforcing this (misguided?) policy of "If a Sata port is not alive & active when I start, then it's just dead."

Have you considered reaching out to SM support, and asking them about this "hot-swappable but not hot-pluggable" behavior?

Lastly, you mentioned "... the DUMMY from dmesg ..."
If you'd like to pursue/discuss that, please do a
Code:
dmesg | grep -C 5 -i dummy
and paste it in.
 

UhClem

EffrafaxOfWug said: "An interesting problem - certainly not one I've seen myself on my own (non-Epyc) AMD kit nor the HP Epyc kit at work (but that isn't using SATA in any case). In fact I don't think I've ever seen a port named as dummy."
So far, this looks like it is just inference and speculation (pending a view of the dmesg|grep).
EffrafaxOfWug said: "If you don't hook a port up to a drive bay and instead try direct-attached SATA hot-swap does that change matters any? Do you get any kernel messages at all when you plug in a drive? Are there any SATA hot-swap options in the BIOS?"
I'll throw my observations in here as (possibly relevant) data points. All of my instances are direct-attached, and there is absolutely no reaction (other than audible spin-up); no kernel msg. If I explicitly attempt to provoke/awaken it (via
Code:
echo "- - -" > /sys/class/scsi_host/hostN/scan
), all I get is a kernel msg line of
Code:
ataN: SATA link down (SStatus 4 SControl 300)
exactly the same as that at kernel initialization/startup.
[Edit:] I could never find any BIOS settings ... but keep hoping that I overlooked them :).
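A note on reading those SStatus values: as far as I know the kernel prints the register in hex, and the SATA spec splits it into three 4-bit fields (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8). The Python sketch below just decodes that layout; the function name and the wording of the descriptions are illustrative, not taken from the kernel.

```python
# Field meanings paraphrased from the SATA SStatus register description.
DET = {
    0: "no device detected, PHY communication not established",
    1: "device detected, PHY communication not established",
    3: "device detected, PHY communication established",
    4: "PHY offline (interface disabled or in loopback)",
}
SPD = {0: "no negotiated speed", 1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}
IPM = {0: "not present", 1: "active", 2: "partial", 6: "slumber", 8: "devsleep"}


def decode_sstatus(hex_text):
    """Split an SStatus value, given as printed in dmesg (hex), into fields."""
    value = int(hex_text, 16)
    det = value & 0xF
    spd = (value >> 4) & 0xF
    ipm = (value >> 8) & 0xF
    return {
        "det": det,
        "spd": spd,
        "ipm": ipm,
        "text": "; ".join([
            DET.get(det, "reserved DET value"),
            SPD.get(spd, "reserved SPD value"),
            IPM.get(ipm, "reserved IPM value"),
        ]),
    }


if __name__ == "__main__":
    for raw in ("133", "4", "0"):
        print(raw, "->", decode_sstatus(raw)["text"])
```

Read this way, "SStatus 4" is DET=4, a PHY that is offline or disabled, whereas DET=0 would only mean that nothing was detected on an otherwise enabled port.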
 

kibibyte

I'll include a bit more than just the lines in the absolute vicinity of the "DUMMY" ones.
Code:
[    2.833156] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 247)
[    3.501413] SCSI subsystem initialized
[    3.509262] libata version 3.00 loaded.
[    3.578625] ahci 0000:c9:00.0: version 3.0
[    3.578722] ahci 0000:c9:00.0: SSS flag set, parallel bus scan disabled
[    3.578752] ahci 0000:c9:00.0: AHCI 0001.0200 32 slots 2 ports 6 Gbps 0x3 impl SATA mode
[    3.578754] ahci 0000:c9:00.0: flags: 64bit ncq sntf stag led clo pmp pio slum part ccc sxs
[    3.579053] scsi host0: ahci
[    3.579957] scsi host1: ahci
[    3.580008] ata1: SATA max UDMA/133 abar m512@0xb8700000 port 0xb8700100 irq 77
[    3.580009] ata2: SATA max UDMA/133 abar m512@0xb8700000 port 0xb8700180 irq 77
[    3.580209] ahci 0000:87:00.0: AHCI 0001.0301 32 slots 1 ports 6 Gbps 0x1 impl SATA mode
[    3.580210] ahci 0000:87:00.0: flags: 64bit ncq sntf ilck pm led clo only pmp fbs pio slum part
[    3.580358] scsi host2: ahci
[    3.580405] ata3: SATA max UDMA/133 abar m2048@0xf0100000 port 0xf0100100 irq 79
[    3.580570] ahci 0000:88:00.0: AHCI 0001.0301 32 slots 1 ports 6 Gbps 0x1 impl SATA mode
[    3.580571] ahci 0000:88:00.0: flags: 64bit ncq sntf ilck pm led clo only pmp fbs pio slum part
[    3.647443] scsi host3: ahci
[    3.647508] ata4: SATA max UDMA/133 abar m2048@0xf0000000 port 0xf0000100 irq 81
[    3.648148] ahci 0000:45:00.0: AHCI 0001.0301 32 slots 5 ports 6 Gbps 0xda impl SATA mode
[    3.648149] ahci 0000:45:00.0: flags: 64bit ncq sntf ilck pm led clo only pmp fbs pio slum part
[    3.648151] ahci 0000:45:00.0: both AHCI_HFLAG_MULTI_MSI flag set and custom irq handler implemented
[    3.648735] scsi host4: ahci
[    3.648856] scsi host5: ahci
[    3.649695] scsi host6: ahci
[    3.649787] scsi host7: ahci
[    3.649907] scsi host8: ahci
[    3.649999] scsi host9: ahci
[    3.650099] scsi host10: ahci
[    3.650242] scsi host11: ahci
[    3.650277] ata5: DUMMY
[    3.650279] ata6: SATA max UDMA/133 abar m2048@0xb0400000 port 0xb0400180 irq 93
[    3.650279] ata7: DUMMY
[    3.650281] ata8: SATA max UDMA/133 abar m2048@0xb0400000 port 0xb0400280 irq 95
[    3.650282] ata9: SATA max UDMA/133 abar m2048@0xb0400000 port 0xb0400300 irq 96
[    3.650283] ata10: DUMMY
[    3.650284] ata11: SATA max UDMA/133 abar m2048@0xb0400000 port 0xb0400400 irq 98
[    3.650286] ata12: SATA max UDMA/133 abar m2048@0xb0400000 port 0xb0400480 irq 99
[    3.894844] ata3: SATA link down (SStatus 0 SControl 300)
[    3.963889] ata4: SATA link down (SStatus 0 SControl 300)
[    4.056311] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    4.057403] ata1.00: ATA-9: SuperMicro SSD, SOB20R, max UDMA/133
[    4.057406] ata1.00: 30932992 sectors, multi 1: LBA48 NCQ (depth 32), AA
[    4.058348] ata1.00: configured for UDMA/133
[    4.058572] scsi 0:0:0:0: Direct-Access     ATA      SuperMicro SSD   0R   PQ: 0 ANSI: 5
[    4.124310] ata12: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    4.124344] ata11: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    4.124370] ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    4.124398] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    4.124427] ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    4.131192] ata6.00: ATA-10: ST2000NM0105-1YY104, NN02, max UDMA/133
[    4.131195] ata6.00: 488378646 sectors, multi 2: LBA48 NCQ (depth 32), AA
[    4.131417] ata9.00: ATA-10: ST2000NM0055-1V4104, TN02, max UDMA/133
[    4.131420] ata9.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 32), AA
[    4.132735] ata9.00: configured for UDMA/133
[    4.132741] ata6.00: configured for UDMA/133
[    4.135117] ata8.00: ATA-10: ST2000NM0055-1V4104, TN02, max UDMA/133
[    4.135119] ata8.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 32), AA
[    4.136619] ata8.00: configured for UDMA/133
[    4.227035] ata12.00: ATA-10: WDC WD2005FBYZ-01YCBB2, RR07, max UDMA/133
[    4.227038] ata12.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 32), AA
[    4.229951] ata11.00: ATA-10: WDC WD2005FBYZ-01YCBB2, RR07, max UDMA/133
[    4.229953] ata11.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 32), AA
[    4.246299] ata12.00: configured for UDMA/133
[    4.249456] ata11.00: configured for UDMA/133
[    4.532306] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    4.533384] ata2.00: ATA-9: SuperMicro SSD, SOB20R, max UDMA/133
[    4.533387] ata2.00: 30932992 sectors, multi 1: LBA48 NCQ (depth 32), AA
[    4.534323] ata2.00: configured for UDMA/133
[    4.534578] scsi 1:0:0:0: Direct-Access     ATA      SuperMicro SSD   0R   PQ: 0 ANSI: 5
[    4.535057] scsi 5:0:0:0: Direct-Access     ATA      ST2000NM0105-1YY NN02 PQ: 0 ANSI: 5
[    4.535430] scsi 7:0:0:0: Direct-Access     ATA      ST2000NM0055-1V4 TN02 PQ: 0 ANSI: 5
[    4.535695] scsi 8:0:0:0: Direct-Access     ATA      ST2000NM0055-1V4 TN02 PQ: 0 ANSI: 5
[    4.535885] scsi 10:0:0:0: Direct-Access     ATA      WDC WD2005FBYZ-0 RR07 PQ: 0 ANSI: 5
[    4.536092] scsi 11:0:0:0: Direct-Access     ATA      WDC WD2005FBYZ-0 RR07 PQ: 0 ANSI: 5
[    4.652532] ahci 0000:46:00.0: failed stop FIS RX (-16)
[    4.652543] ahci 0000:46:00.0: AHCI 0001.0301 32 slots 1 ports 6 Gbps 0x1 impl SATA mode
[    4.652545] ahci 0000:46:00.0: flags: 64bit ncq sntf ilck pm led clo only pmp fbs pio slum part
[    4.652911] scsi host12: ahci
[    4.653001] ata13: SATA max UDMA/133 abar m2048@0xb0300000 port 0xb0300100 irq 118
[    4.661051] sd 0:0:0:0: [sda] Attached SCSI disk
[    4.661057] sd 1:0:0:0: [sdb] Attached SCSI disk
[    4.670683] sd 11:0:0:0: [sdg] Attached SCSI disk
[    4.673552] sd 10:0:0:0: [sdf] Attached SCSI disk
[    4.681234] sd 8:0:0:0: [sde] Attached SCSI disk
[    4.702707] sd 5:0:0:0: [sdc] Attached SCSI disk
[    4.713272] sd 7:0:0:0: [sdd] Attached SCSI disk
[    4.966788] ata13: SATA link down (SStatus 0 SControl 300)
[    8.936864] sd 0:0:0:0: Attached scsi generic sg0 type 0
[    8.936914] sd 1:0:0:0: Attached scsi generic sg1 type 0
[    8.936960] sd 5:0:0:0: Attached scsi generic sg2 type 0
[    8.937003] sd 7:0:0:0: Attached scsi generic sg3 type 0
[    8.937049] sd 8:0:0:0: Attached scsi generic sg4 type 0
[    8.937095] sd 10:0:0:0: Attached scsi generic sg5 type 0
[    8.937130] sd 11:0:0:0: Attached scsi generic sg6 type 0
The following lines are extra interesting. Didn't notice them before:
Code:
[    3.648148] ahci 0000:45:00.0: AHCI 0001.0301 32 slots 5 ports 6 Gbps 0xda impl SATA mode
[    3.648151] ahci 0000:45:00.0: both AHCI_HFLAG_MULTI_MSI flag set and custom irq handler implemented
Note that it says "5 ports" (we have 5 drives installed).
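The "0xda impl" part of that line looks like the more telling bit. It is the controller's port map, one bit per port, and if I read ahci_init_one() correctly, a port whose bit is clear in that map is the one that gets ata_dummy_port_ops. Here is a small Python sketch of the decoding; split_ports and first_ata_id are names I made up, and the assumption that bit 0 of this controller corresponds to ata5 comes from the ata5..ata12 numbering in the dmesg output above.

```python
def split_ports(port_map):
    """Return (implemented, dummy) port indices for an AHCI port map.

    Only ports up to the highest set bit are considered, since anything
    above that is not allocated at all.
    """
    n_slots = port_map.bit_length()
    implemented = [i for i in range(n_slots) if (port_map >> i) & 1]
    dummy = [i for i in range(n_slots) if not (port_map >> i) & 1]
    return implemented, dummy


if __name__ == "__main__":
    first_ata_id = 5  # assumption: bit 0 of 0000:45:00.0 is ata5
    implemented, dummy = split_ports(0xDA)
    print("implemented:", [f"ata{first_ata_id + i}" for i in implemented])
    print("dummy:      ", [f"ata{first_ata_id + i}" for i in dummy])
```

0xda is 11011010 in binary, so bits 1, 3, 4, 6 and 7 are set (five ports) and bits 0, 2 and 5 are clear, which lines up with ata5, ata7 and ata10 being the DUMMY ones. Since the port map is normally programmed before the kernel ever sees the controller, that would point at the firmware rather than at Linux, but treat that as my reading rather than a confirmed fact.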

Here's part of /sys/block for overview:
Code:
lrwxrwxrwx 1 root root 0 apr 20 20:39 sda -> ../devices/pci0000:c0/0000:c0:03.4/0000:c9:00.0/ata1/host0/target0:0:0/0:0:0:0/block/sda
lrwxrwxrwx 1 root root 0 apr 20 20:39 sdb -> ../devices/pci0000:c0/0000:c0:03.4/0000:c9:00.0/ata2/host1/target1:0:0/1:0:0:0/block/sdb
lrwxrwxrwx 1 root root 0 apr 20 20:39 sdc -> ../devices/pci0000:40/0000:40:08.2/0000:45:00.0/ata6/host5/target5:0:0/5:0:0:0/block/sdc
lrwxrwxrwx 1 root root 0 apr 20 20:39 sdd -> ../devices/pci0000:40/0000:40:08.2/0000:45:00.0/ata8/host7/target7:0:0/7:0:0:0/block/sdd
lrwxrwxrwx 1 root root 0 apr 20 20:39 sde -> ../devices/pci0000:40/0000:40:08.2/0000:45:00.0/ata9/host8/target8:0:0/8:0:0:0/block/sde
lrwxrwxrwx 1 root root 0 apr 20 20:39 sdf -> ../devices/pci0000:40/0000:40:08.2/0000:45:00.0/ata11/host10/target10:0:0/10:0:0:0/block/sdf
lrwxrwxrwx 1 root root 0 apr 20 20:39 sdg -> ../devices/pci0000:40/0000:40:08.2/0000:45:00.0/ata12/host11/target11:0:0/11:0:0:0/block/sdg
I didn't give all details before, but the backplane is BPN-SAS3-825TQ, which has a separate SATA-type SAS connector per bay, connected to the motherboard via a Slimline SAS to 8 SATA breakout cable.
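In case anyone wants to build the same disk-to-port overview on their own box, here is a rough Python sketch that pulls the PCI address, ata port and SCSI host out of one of those /sys/block symlink targets. The function name and the regular expression are my own, written against the path shapes shown above; other layouts (port multipliers, non-libata disks) are not handled and simply return None.

```python
import re

# Matches ".../<pci address>/ataN/hostM/.../block/<disk>" as in the listing above.
_LINK = re.compile(
    r"/(?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])"
    r"/ata(?P<ata>\d+)/host(?P<host>\d+)/.*/block/(?P<disk>\w+)$"
)


def parse_block_link(target):
    """Extract controller, ata port, SCSI host and disk name from a symlink target."""
    match = _LINK.search(target)
    if match is None:
        return None
    return {
        "disk": match.group("disk"),
        "pci": match.group("pci"),
        "ata": int(match.group("ata")),
        "host": int(match.group("host")),
    }

# On a live system the targets can be read with os.readlink(), e.g.
#     parse_block_link(os.readlink("/sys/block/sdc"))
```

With the listing above this gives sdc on ata6, sdd on ata8, sde on ata9, sdf on ata11 and sdg on ata12, all behind 0000:45:00.0.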
 

UhClem

Sorry for delay; I got sucked in to exploring the maze of your system's "plumbing"; it was good that you posted the more expansive segment of kernel messages (better too much than too little), but I got distracted (ie: 45: & 46: ; also 87: & 88: ; etc.--along with your mobo's block diagram). All I was expecting was evidence of the "relevance" of dummy, and that is confirmed.

At this point, although you & I share the same primary "symptom" (unexpected failure of Sata hotplug), we have different "diseases". For yours, you are on the right path with backtracking from ata_host_register(). On that journey, set your radar to look for any clue that appears to be specific to your hardware (maybe that backplane, as @EffrafaxOfWug inferred). For example, is there any relevance/clue to be gleaned from this discussion [Link]?

[I feel strongly that the FCH itself does not inherently have this hotplug misfeature. There must be a bad actor on the "outside".]

Good luck. (and pls follow-up when you figure it out, or just to "think out loud":))
 

UhClem

kibibyte said: "I guess I could disconnect one of the empty bays, reboot, and see if that makes a difference."
Yes, but ...
That will (most likely) tell you that you're on the right track--that port's DUMMY line will probably be absent from messages.

However, if, instead, you disconnect the SEPARATE (your word) connector, from an empty bay, that might** (!!) just (half-assedly) fix the issue, and give you all the ammunition you need to enlist Supermicro's support for a full & proper solution.

rsvp

** [Edit @1450et] iff the DUMMY line @startup goes away; else this hunch is bogus
 

kibibyte

I'm not sure what you meant to do differently. I think what you're thinking is exactly what I was thinking. Anyway, I tried booting with one empty bay unconnected, and it made no difference.