I have an InfiniBand network providing a SAN to my workstation. The server runs Debian 10 (and opensm); the workstation runs Ubuntu 20.04.
Without a partitions.conf file, everything works and IPoIB pings fine, except that ibdiagnet gives an error message that the group speed is only 10Gb while the node speed is 40Gb.
Searching around (from here), I found that creating a partitions.conf file with the line below and then restarting opensm makes the ibdiagnet warning go away:
Code:
Default=0x7fff, ipoib, mtu=5, rate=7, defmember=full : ALL=full, ALL_SWITCHES=full,SELF=full;
All fine so far.
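For reference, here is a minimal sketch of how I read the fields of that line. The lookup tables follow the mtu and rate codes as I understand them from the opensm partition configuration documentation (mtu=5 meaning 4096 bytes, rate=7 meaning 40 Gb/s); the function name decode_partition_line is just something made up for illustration, and the parser only handles a single-line entry like the one above.

```python
# Codes as I understand them from the opensm partition configuration docs.
MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}
RATE_GBPS = {2: 2.5, 3: 10, 4: 30, 5: 5, 6: 20, 7: 40, 8: 60, 9: 80, 10: 120}


def decode_partition_line(line: str) -> dict:
    """Split one partitions.conf entry into name, P_Key, flags and members.

    Hypothetical helper: assumes the entry fits on one line and has a P_Key.
    """
    definition, _, members = line.strip().rstrip(";").partition(":")
    head, *flags = [part.strip() for part in definition.split(",")]
    name, _, pkey = head.partition("=")
    result = {
        "name": name.strip(),
        "pkey": int(pkey, 16),  # int() accepts the 0x prefix with base 16
        "ipoib": "ipoib" in flags,
    }
    for flag in flags:
        key, sep, value = flag.partition("=")
        if sep and key in ("mtu", "rate"):
            result[key] = int(value)
    # Translate the numeric codes into human-readable values (None if absent).
    result["mtu_bytes"] = MTU_BYTES.get(result.get("mtu"))
    result["rate_gbps"] = RATE_GBPS.get(result.get("rate"))
    result["members"] = [m.strip() for m in members.split(",") if m.strip()]
    return result


LINE = ("Default=0x7fff, ipoib, mtu=5, rate=7, defmember=full : "
        "ALL=full, ALL_SWITCHES=full,SELF=full;")
info = decode_partition_line(LINE)
print(info["name"], hex(info["pkey"]), info["mtu_bytes"], info["rate_gbps"])
```

If the documentation's default of rate=3 (10 Gb/s) applies when no partitions.conf is present, that would be consistent with the 10Gb group speed ibdiagnet complains about, but that is my reading rather than something I have confirmed.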
However, at boot, it's not possible to IPoIB ping a different node. The way to get everything working again is to disable the partitions.conf config, restart opensm, ping another node (which is then possible), restore the conf file, and restart opensm again. After that, everything works without errors.
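The manual sequence above can be sketched roughly as follows. This is not my exact script: the config path /etc/opensm/partitions.conf, the service name opensm, the ".disabled" suffix and the peer address are all assumptions to adjust, and the function names are made up. It defaults to a dry run that only prints the steps.

```python
import shutil
import subprocess
from pathlib import Path

# Assumed locations and names; adjust to the actual installation.
PARTITIONS_CONF = Path("/etc/opensm/partitions.conf")
DISABLED_CONF = PARTITIONS_CONF.with_name(PARTITIONS_CONF.name + ".disabled")
RESTART_OPENSM = ["systemctl", "restart", "opensm"]


def recovery_plan(peer_ip: str) -> list:
    """Return the manual workaround as an ordered list of steps."""
    return [
        ("move", PARTITIONS_CONF, DISABLED_CONF),   # disable the partition config
        ("run", RESTART_OPENSM),                    # opensm falls back to defaults
        ("run", ["ping", "-c", "3", peer_ip]),      # IPoIB ping now succeeds
        ("move", DISABLED_CONF, PARTITIONS_CONF),   # put the config back
        ("run", RESTART_OPENSM),                    # back to the 40Gb group rate
    ]


def execute(plan: list, dry_run: bool = True) -> None:
    """Print the steps, or carry them out when dry_run is False (needs root)."""
    for step in plan:
        if dry_run:
            print(step)
        elif step[0] == "move":
            shutil.move(str(step[1]), str(step[2]))
        else:
            subprocess.run(step[1], check=True)


plan = recovery_plan("10.0.0.2")  # hypothetical IPoIB address of another node
execute(plan)                     # dry run: only prints the five steps
```

A real run would probably also need a short pause after each opensm restart so the subnet manager can finish its sweep before the ping.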
I've made a few scripts to expedite this and tried every permutation of the settings line I could come up with, but I cannot get the system to boot directly into a state where it is both running at 40Gb and capable of IPoIB communication.
My experience with InfiniBand is limited, and I'm hoping someone has an idea of what's going wrong.