[SOLVED] SRP connection fails with "SRP_LOGIN_REQ because target port ibp33s0f1_1 has not yet been enabled" but it is :(


naptastic

New Member
Jan 27, 2023
21
3
3
Output from targetcli ls

o- srpt ............................................................................................................. [Targets: 1]
| o- ib.fe800000000000005849560e53b70b01 ........................................................................... [no-gen-acls]
|   o- acls .......................................................................................................... [ACLs: 1]
|   | o- ib.fe800000000000005849560e59150301 .................................................................. [Mapped LUNs: 1]
|   |   o- mapped_lun0 .............................................................................. [lun0 ramdisk/swap (rw)]
|   o- luns .......................................................................................................... [LUNs: 1]
|     o- lun0 .............................................................................. [ramdisk/swap (default_tg_pt_gp)]


The only "enable" I can find is already enabled. I tried turning it off and back on, to no avail:

[root]@[southpark][07:06:44][/sys/kernel/config/target/srpt]# cat 0xfe800000000000005849560e53b70b01/tpgt_1/enable
1


The initiator is Debian Bookworm; the target is Debian Trixie. Both are using Mellanox Connect-IB HCAs and an SX6005 switch.

They pass VXLAN traffic just fine. iSCSI is a different set of problems. NVMe-oF doesn't do ramdisks. Fibre Channel just isn't very fast.

dmesg on the initiator:

[611882.283783] scsi host11: ib_srp: REJ received
[611882.283787] scsi host11: ib_srp: SRP LOGIN from fe80:0000:0000:0000:5849:560e:5915:0301 to fe80:0000:0000:0000:5849:560e:53b7:0b01 REJECTED, reason 0x00010006
[611882.283794] scsi host11: ib_srp: Connection 0/12 to fe80:0000:0000:0000:5849:560e:53b7:0b01 failed


dmesg on the target:

[5433244.260520] ib_srpt Received SRP_LOGIN_REQ with i_port_id 5849:560e:53b7:0b09:5849:560e:53b7:0b01, t_port_id 5849:560e:53b7:0b01:5849:560e:53b7:0b01 and it_iu_len 8260 on port 1 (guid=fe80:0000:0000:0000:5849:560e:53b7:0b09); pkey 0xffff
[5433244.288102] ib_srpt rejected SRP_LOGIN_REQ because target port ibp33s0f1_1 has not yet been enabled
[5433244.301979] ib_srpt Rejecting login with reason 0x10001
[5433244.315893] scsi host14: ib_srp: REJ received
[5433244.315902] scsi host14: ib_srp: SRP LOGIN from fe80:0000:0000:0000:5849:560e:53b7:0b01 to fe80:0000:0000:0000:5849:560e:53b7:0b09 REJECTED, reason 0x00010001
[5433244.343554] scsi host14: ib_srp: Connection 0/16 to fe80:0000:0000:0000:5849:560e:53b7:0b09 failed


ALSO, how do I put an SRP ACL in a specific partition? It looks like srp_daemon is scanning all defined partitions, which isn't exactly what I want.

Thanks,
nap
 

necr

Active Member
Dec 27, 2017
190
63
28
125
Been a while...did you do an echo command like this:


Code:
echo "id_ext=0002c903009f4480,ioc_guid=0002c903009f4480,dgid=fe800000000000000002c903009f4481" > /sys/class/infiniband_srp/srp-mlx4_0-1/add_target
or an srpt command like this:
/srpt create 0xfe80000000000000001175000077dd7e


Both are based on the target's ibstat output. Reference: "Proxmox VE 4.1 setup with scst srpt & srptools over Infiniband".
 

naptastic

Yes. I figured out that I was looking at the wrong error in dmesg. The "port is not turned on" was for the port I'm not using as a target; the target port is turned on and showing a different error:

[ 4700.620836] ib_srpt Received SRP_LOGIN_REQ with i_port_id 5849:560e:53b7:0b01:5849:560e:5915:0301, t_port_id 5849:560e:53b7:0b01:5849:560e:53b7:0b01 and it_iu_len 8260 on port 1 (guid=fe80:0000:0000:0000:5849:560e:53b7:0b01); pkey 0xffff
[ 4700.648157] infiniband ibp33s0f0: create_qp:3111:(pid 2403): Create QP type 2 failed
[ 4700.663927] ib_srpt Rejected login for initiator 5849:560e:5915:0301: ret = -13.
[ 4700.691018] ib_srpt Rejecting login with reason 0x10006
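Since the confusion here came from reading a rejection that belonged to the other port, it can help to filter the target's log by the port each login arrived on. The following is a minimal sketch, not part of ib_srpt or srptools; the function name `srpt_logins_on_port` is made up for illustration, and it only assumes the "Received SRP_LOGIN_REQ ... (guid=...)" line format shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read kernel log lines on stdin and keep only the
# SRP_LOGIN_REQ lines that arrived on one target port.
# $1 is the port GID exactly as ib_srpt prints it after "guid=".
srpt_logins_on_port() {
    grep -F 'ib_srpt Received SRP_LOGIN_REQ' | grep -F "(guid=$1)"
}

# Usage on the target (GID copied from the log above):
#   dmesg | srpt_logins_on_port fe80:0000:0000:0000:5849:560e:53b7:0b01
```

The rejection lines that follow a login do not repeat the GID, so the timestamps of the filtered lines are what tie a given "Rejecting login" message to the right port.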


My searches suggest that there might be a library mismatch or something more deeply wrong with my installation.
 

necr

What about basic tests like ibping/ib_send_bw? You can play with QP types and find out what the problem is.
 

naptastic

Everything there works except for atomic tests, which I think my hardware doesn't support.

I suspect there are mismatched libraries between the two systems. I'm going to try a different initiator.
 

naptastic

After many red herrings, we've arrived at the actual problem: the ACL is not effective.

Here's what srp_daemon is able to find on the fabric once I disallow targets I'm not going to use:

# srp_daemon -c -v -o -p 1
configuration report
------------------------------------------------
Current pid : 6363
Device name : "ibp1s0f0"
IB port : 1
Mad Retries : 3
Number of outstanding WR : 10
Mad timeout (msec) : 5000
Prints add target command : 1
Executes add target command : 0
Print also connected targets : 0
Report current targets and stop : 1
Reads rules from : /etc/srp_daemon.conf
Do not print initiator_ext
No full target rescan
Retries to connect to existing target after 20 seconds
------------------------------------------------
id_ext=5849560e53b70b01,ioc_guid=5849560e53b70b01,dgid=fe800000000000005849560e53b70b01,pkey=ffff,service_id=5849560e53b70b01



The actual connection attempt:

# echo 'id_ext=fe80000000000000,ioc_guid=5849560e53b70b01,dgid=fe800000000000005849560e53b70b01,pkey=ffff,service_id=5849560e53b70b01' > /sys/class/infiniband_srp/srp-ibp1s0f0-1/add_target
-bash: echo: write error: Connection reset by peer
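The target descriptor can also be passed to add_target exactly as srp_daemon reports it instead of being retyped. This is a rough sketch under that assumption; `srp_target_lines` is a hypothetical helper name, and the sysfs path in the usage comment is the one from this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read an srp_daemon report on stdin and keep only the
# target descriptor lines (the ones starting with "id_ext="), dropping the
# configuration report that -v prints.
srp_target_lines() {
    grep '^id_ext='
}

# Usage sketch on the initiator (flags and path as used in this thread):
#   srp_daemon -c -v -o -p 1 | srp_target_lines | while read -r target; do
#       echo "$target" > /sys/class/infiniband_srp/srp-ibp1s0f0-1/add_target
#   done
```

A login that the target rejects will still fail the write with the same "Connection reset by peer" error, so this only removes transcription mistakes from the picture.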


Here's dmesg on the initiator:

[ 3523.496285] scsi host4: ib_srp: REJ received
[ 3523.496301] scsi host4: ib_srp: SRP LOGIN from fe80:0000:0000:0000:5849:560e:5366:0101 to fe80:0000:0000:0000:5849:560e:53b7:0b01 REJECTED, reason 0x00010006
[ 3523.496396] scsi host4: ib_srp: Connection 0/6 to fe80:0000:0000:0000:5849:560e:53b7:0b01 failed


Here's dmesg on the target:

[24621.522361] ib_srpt Received SRP_LOGIN_REQ with i_port_id 0000:0000:0000:0000:5849:560e:5366:0101, t_port_id 5849:560e:53b7:0b01:5849:560e:53b7:0b01 and it_iu_len 8260 on port 1 (guid=fe80:0000:0000:0000:5849:560e:53b7:0b01); pkey 0xffff
[24621.548165] infiniband ibp33s0f0: create_qp:3138:(pid 41285): Create QP type 2 failed
[24621.563117] ib_srpt Rejected login for initiator 5849:560e:5366:0101: ret = -13.
[24621.589725] ib_srpt Rejecting login with reason 0x10006


The answer is definitely "permission denied" (ret = -13 is -EACCES), i.e. the target behaves as if the ACL doesn't exist.

But it definitely exists. It's in targetcli, it's in sysfs... it just doesn't work.
 

naptastic

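Gah, the ACL name wants 0000 instead of fe80 for the first 16 bits. That matches the i_port_id that ib_srpt logs for the login (0000:0000:0000:0000:5849:560e:5366:0101), not the initiator's GID.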

edit: So take a look at this targetcli:
/srpt/ib.fe80...53b70b01/acls> ls
o- acls .................................................................................................................. [ACLs: 4]
  o- ib.00000000000000005849560e53660101 ........................................................................ [Mapped LUNs: 1]
  | o- THIS ACL WILL WORK ............................................................................... [lun0 ramdisk/swap (rw)]
  o- ib.fe800000000000005849560e53660101 ........................................................................ [Mapped LUNs: 1]
  | o- THIS ACL WILL FAIL ............................................................................... [lun0 ramdisk/swap (rw)]
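One way to avoid guessing the ACL name is to derive it from what the target actually logged. The sketch below assumes, based only on what was observed in this thread, that the working ACL name is "ib." followed by the i_port_id from the "Received SRP_LOGIN_REQ" line with the colons removed; `srpt_acl_from_login` is a hypothetical helper name, not a targetcli or ib_srpt feature.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read ib_srpt "Received SRP_LOGIN_REQ" lines on stdin
# and print, for each one, the ACL name in the form targetcli shows
# (ib. followed by 32 hex digits), taken from the logged i_port_id.
srpt_acl_from_login() {
    sed -n 's/.*i_port_id \([0-9a-f:]*\),.*/\1/p' | tr -d ':' | sed 's/^/ib./'
}

# Usage on the target after a failed login attempt:
#   dmesg | grep 'Received SRP_LOGIN_REQ' | tail -n 1 | srpt_acl_from_login
```

The printed name can then be compared with the entries under acls in targetcli before creating a new one.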


...and with that set up correctly, we get pretty decent speed to a ramdisk:

[root]@[shark][00:22:46][~]# dd if=/dev/sdb of=/dev/null bs=4M
4096+0 records in
4096+0 records out
17179869184 bytes (17 GB, 16 GiB) copied, 5.38771 s, 3.2 GB/s

[root]@[shark][00:23:02][~]# dd if=/dev/zero of=/dev/sdb bs=4M
4097+0 records in
4096+0 records out
17179869184 bytes (17 GB, 16 GiB) copied, 13.7431 s, 1.3 GB/s
 