Patching Intel X520 EEPROM to unlock all SFP+ transceivers

Discussion in 'Networking' started by NathanA, May 29, 2019.

  1. NathanA

    I recently had the need to put together a cheap 10Gbit/s lab, including a server with a NIC. I really only needed a single interface on the server, and I preferred it to be SFP+ rather than twisted-pair for flexibility (I figured I could always stuff a TP module into an SFP+ slot if need be). While researching the various PCIe NIC options available, I arrived at the conclusion that although the X520 might be neither the absolute best performing sub-25gig adapter on the market nor the cheapest 10gig adapter on the market (second-hand or otherwise), it seemed to strike the best balance of performance, stability, wide first-party (and *ongoing*) platform support, and price of any of the ones I looked at.

    There was only one potential wrinkle: SFP+ module compatibility. Word on the street was that some of these 82599ES-based cards were known to be "picky". And not necessarily because there was an actual interop issue with modules that were discovered not to work, but because Intel apparently was (at least as the story goes) merely afraid of having to support untested modules that _might_ prove to have interop issues, so they artificially blocked all but the ones on their tested-and-supported list. (*sigh* There always has to be some catch, right? Other old cards of a similar vintage by the likes of Chelsio and Mellanox are long since EOL'd but will supposedly take just about any transceiver you throw at them. With Intel, though, you get widespread platform/OS compatibility with continued vendor support, only to be forced to either play "optics roulette" or go out of your way to source known-compatible modules.) But it was very difficult to ascertain specifically which cards were "problem" cards (all of the 82599s, just the Intel-branded ones, just some of the Intel-branded ones...?).

    If you were a Linux user with one of these cards, and discovered that the SFP you wanted to populate your card with was either (seemingly) artificially "blacklisted" or "not whitelisted", you were lucky: the Linux driver actually has a parameter you can pass to it that will override that check. However, other platforms, such as ESXi and Windows, are apparently _not_ so lucky.

    But the mere existence of that override parameter in the Linux driver was interesting, and got me thinking. And after taking delivery of a card, playing with it, and doing some additional research and testing, I think I may have discovered a way to instruct any X520 that rejects non-officially-supported transceivers by default to instead allow them, regardless of which driver you are using on your operating system of choice.

    My problem is that right now I lack the time and resources to fully test this theory. So instead, I'm going to document my findings here and explain how (I think) this works, and hopefully others here who would like to be able to use their cards with unsupported optics on non-Linux OSes will stumble upon this, be willing to give this a shot, post their findings, and thus either prove this theory right or lay it to rest.

    One of the reasons that I can't conclusively test this theory is because it turns out that the card I bought, which is a second-hand, Yottamark-verified genuine Intel X520-DA1, has no issues with any SFP+ module I feed it, even when I don't supply the Linux driver with the allow_unsupported_sfp=1 parameter. So I seemingly lucked out and got one of the "good out-of-box" ones. The thing I was able to accomplish that I feel halfway proves my theory, though, is that I managed to "convert" this unlocked card to a locked version -- a card that refused to work with some of my modules _until_ I supplied that parameter to the driver -- and then convert it back to an unlocked card again. What I don't know for sure is if the mechanism described here is applicable to all X520s that are transceiver-restricted or not...apparently some (the X520-SRx and -LRx, which are basically the -DAx pre-populated with an Intel transceiver) are locked to specific Intel-manufactured SFPs while some of the -DAx models are still restricted but will accept a wider range of approved modules? It's all still a bit unclear to me. (I also haven't been able to test my card on an OS other than Ubuntu Server LTS, either.)

    Anyway, on with the show...

    ----

    The key to "unlocking" an X520 appears to be an undocumented bit within the card EEPROM. Most of the functions of the EEPROM are described in the 82599 datasheet (https://www.intel.com/content/dam/w...asheets/82599-10-gbe-controller-datasheet.pdf), but this one seems to be completely undocumented. The clue came from reading the sources to the Linux ixgbe driver: there is a bitfield in the EEPROM that the driver checks, which the driver source calls IXGBE_DEVICE_CAPS. (DEVICE_CAPS == "device capabilities") So it would seem that the card uses this bitfield to inform the driver about some of its features (which presumably the OEM of each 82599-based card decides on). The individual features represented by this bitfield have their own preprocessor #defines, all named IXGBE_DEVICE_CAPS_*; one of them is IXGBE_DEVICE_CAPS_ALLOW_ANY_SFP, which is the first/least-significant bit.

    So it seems reasonable to assume that if one could permanently flip that bit within the EEPROM of an X520 that rejects non-whitelisted SFP+ modules, then you might be able to permanently unlock that card. One of the reasons for assuming this is that the code in the Linux driver (which Intel themselves wrote large portions of) that checks the IXGBE_DEVICE_CAPS_ALLOW_ANY_SFP bit in the EEPROM long predates the "allow_unsupported_sfp" option in the driver, which was a relatively late addition. (You can see this for yourself by reading through the discussion in the thread at Intel Ethernet Drivers and Utilities / [E1000-devel] [PATCH RFC] ixgbe: Module param "allow_any_sfp" for allowing unsupported SFP+ modules, which is interesting reading anyway, if only for people's reactions after they learned what Intel was doing.) And if Intel snuck a check of this undocumented field into the Linux driver, it also seems reasonable to assume that they make similar checks in the drivers they have written for other platforms. So the Intel-written drivers for other platforms will *likely* honor the "ALLOW_ANY_SFP" bit in a given card's EEPROM if it is set, even if that driver has no way of allowing the user to override the check.

    So what's the exact offset of the bit in question, and how do we change it?

    This post from the Intel Linux driver development mailing list gives us some clues: Intel Ethernet Drivers and Utilities / [E1000-devel] ixgbe fcoe disabled by eeprom

    As mentioned, the offset of IXGBE_DEVICE_CAPS is indeed 0x2C: Linux source code: drivers/net/ethernet/intel/ixgbe/ixgbe_type.h (v5.1.5) - Bootlin

    The device_caps field from the EEPROM is read in by this function: Linux source code: drivers/net/ethernet/intel/ixgbe/ixgbe_common.c (v5.1.5) - Bootlin

    From this, we can see that EEPROM data on the Intel NICs consists of 16-bit words. It turns out that they are words stored in little-endian format (least-significant byte first).

    On Linux, we can usually use the 'ethtool' utility to dump out the EEPROM of a supported ethernet card either in whole or in part...here are the first 96 bytes of my X520-DA1:

    root@server:~# ethtool -e enp1s0
    Offset Values
    ------ ------
    0x0000: 60 07 00 00 00 00 40 00 6d 00 fd 00 8d 01 a3 01
    0x0010: a9 01 af 01 b7 01 bf 01 c7 01 cf 01 09 02 38 05
    0x0020: ff f7 ff ff ff ff ff ff ff ff fa fa 10 0e 48 02
    0x0030: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x0040: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x0050: 0a 0e 93 00 ff ff bc 0c fd ff 01 00 11 5e ff ff
    [...]

    Note that the offsets displayed here are byte offsets, not word offsets. So to find the *byte* offset that needs to be changed, we take the word offset 0x2C and multiply it by 2 to get 0x58. At word 0x2C (byte 0x58), as you can see above, we have word value (if displayed in big-endian order) 0xFFFD.
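The offset arithmetic above can be sketched in a few lines of shell. This is only an illustration: the function names `word_to_byte_offset` and `le_word` are made up for this example and are not part of ethtool or the ixgbe driver.

```shell
# Hypothetical helpers, named for illustration only.

# Convert an EEPROM word offset (16-bit words) into the byte offset
# that `ethtool -e` displays: each word occupies two bytes.
word_to_byte_offset() {
    printf '0x%02x\n' "$(( $1 * 2 ))"
}

# Assemble a 16-bit word from two bytes given in the order ethtool
# prints them (little-endian: low byte first, then high byte).
le_word() {
    printf '0x%04x\n' "$(( (0x$2 << 8) | 0x$1 ))"
}
```

For the dump shown earlier, `word_to_byte_offset 0x2c` prints `0x58`, and `le_word fd ff` prints `0xfffd`.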

    Since the word is in fact stored in little-endian order, the byte that we're really interested in is the one immediately at offset 0x58, with value 0xFD. If we represent that as bits (11111101), we can see that bit 1 (the last bit, the one in the "1s place"), which is the bit that supposedly controls whether the SFP+ slot will allow for "any" module or not, is already "on". This is exactly the state that I found my card in when it arrived.
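Checking that bit by eye is easy to get wrong, so here is a minimal sketch of the same test in shell. It assumes, as described above, that the "allow any SFP" flag is the least-significant bit (mask 0x1) of the byte at offset 0x58; the function name `sfp_lock_state` is hypothetical.

```shell
# Hypothetical helper: report whether the least-significant bit of the
# given byte value is set. Takes a hex value such as 0xfd.
sfp_lock_state() {
    if [ "$(( $1 & 0x1 ))" -eq 1 ]; then
        echo unlocked
    else
        echo locked
    fi
}
```

With the values seen in this post, `sfp_lock_state 0xfd` prints `unlocked` and `sfp_lock_state 0xfc` prints `locked`.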

    Now, if I wanted to *lock* the SFP+ slot so that it only allows you to use the transceivers that Intel officially *supports*, I'd have to write a new byte value to offset 0x58 which would change bit 1 to "off". If bit 1 is the only bit of that byte that I change, the hex value of that would be 0xFC. It turns out that I can, in fact, also modify the EEPROM using the 'ethtool' utility (assuming the driver supports it, which in the case of the ixgbe driver, it does...in fact, the driver will even recompute the EEPROM checksum for you after making the changes you ask, and update it for you!). In the case of the ixgbe driver, it wants you to also supply a "magic" (a.k.a. "secret") value at the time you request the modification in order to confirm the change, and this "magic" value turns out to be the PCI device ID and vendor ID concatenated together, in that order, so you'll want to verify this first with 'lspci', though Intel's PCI vendor ID is 8086 and odds are good that device ID for most X520 models is going to be 10FB:

    root@server:~# lspci -nn
    [...]
    01:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
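To avoid typos when building the "magic" value by hand, the concatenation described above (device ID first, then vendor ID) can be sketched like this. The function name `ethtool_magic` is made up for this example; the input is the `vendor:device` pair exactly as `lspci -nn` prints it inside the square brackets.

```shell
# Hypothetical helper: turn a "vendor:device" pair from `lspci -nn`
# (e.g. 8086:10fb) into the magic value the ixgbe driver expects,
# which is the device ID followed by the vendor ID.
ethtool_magic() (
    vendor=${1%%:*}   # text before the colon
    device=${1##*:}   # text after the colon
    printf '0x%s%s\n' "$device" "$vendor"
)
```

For the card shown above, `ethtool_magic 8086:10fb` prints `0x10fb8086`.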

    Now we know all that we need to in order to make this change (which will need to be done as root, naturally), so first we will modify the EEPROM with the -E parameter, and then read back the EEPROM again with -e in order to verify that the change took:

    root@server:~# ethtool -E enp1s0 magic 0x10fb8086 offset 0x58 value 0xfc
    root@server:~# ethtool -e enp1s0
    Offset Values
    ------ ------
    0x0000: 60 07 00 00 00 00 40 00 6d 00 fd 00 8d 01 a3 01
    0x0010: a9 01 af 01 b7 01 bf 01 c7 01 cf 01 09 02 38 05
    0x0020: ff f7 ff ff ff ff ff ff ff ff fa fa 10 0e 48 02
    0x0030: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x0040: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x0050: 0a 0e 93 00 ff ff bc 0c fc ff 01 00 11 5e ff ff
    [...]

    ...and as you can see above, byte 0x58 now shows a value of 0xFC. Success. If I reboot now with an "unauthorized" transceiver still in the slot, I will no longer have network connectivity and I will see the infamous "failed to load because an unsupported SFP+ module type was detected" in my dmesg output.

    Re-unlocking it (in my case) would be as simple as writing 0xFD back to the same offset.

    In order to unlock your locked card so that it (perhaps? hopefully? maybe?) will work under any OS/driver with any reasonable SFP+ module, you would just need to follow the same basic steps: first, read back byte 0x58 from your card's EEPROM:

    root@server:~# ethtool -e enp1s0 offset 0x58 length 1
    Offset Values
    ------ ------
    0x0058: fe

    ...then second, do the math on the existing value to change the first bit from 0 to 1, and finally, write the new value back to the card's EEPROM:

    root@server:~# ethtool -E enp1s0 magic 0x10fb8086 offset 0x58 value 0xff

    I would advise you to NOT simply re-use the "known-working" values for this byte shown here (e.g., 0xFD or 0xFF) on your card...since you don't know what behavior in the card's firmware the other bits change, you shouldn't change the value of any bit *other* than the first on your particular card. Also, don't simply assume that you should add 0x01 to the existing value to achieve what you want. Let's say bit 1 was *already* set "on", and byte 0x58 has, say, a value of 0xEF, and your transceiver isn't working for some other reason: if you make an incorrect assumption and change that to 0xF0 then you have just (unintentionally) changed a whole range of bits, and at the same time turned the first bit "off", too. Actually read the current byte back and do the math correctly.
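One way to "do the math correctly" is to let the shell do it with a bitwise OR, which sets the lowest bit and leaves every other bit as it was. The sketch below is only a dry run under the assumptions of this post (byte offset 0x58, flag in the least-significant bit). The function names are hypothetical, and nothing here calls ethtool: `unlock_command` only prints the command so it can be reviewed before being run by hand.

```shell
# Hypothetical helpers, named for illustration only.

# Read `ethtool -e IFACE offset 0x58 length 1` output on stdin and
# print the byte value as 0xNN.
read_caps_byte() {
    awk '$1 == "0x0058:" { print "0x" $2 }'
}

# Set only the least-significant bit; all other bits keep their value.
with_allow_any_sfp() {
    printf '0x%02x\n' "$(( $1 | 0x1 ))"
}

# Print (do not execute) the write command.
# Arguments: interface name, magic value, current byte value.
unlock_command() {
    printf 'ethtool -E %s magic %s offset 0x58 value %s\n' \
        "$1" "$2" "$(with_allow_any_sfp "$3")"
}
```

For example, `with_allow_any_sfp 0xfe` gives `0xff` and `with_allow_any_sfp 0xfc` gives `0xfd`, while `0xef` stays `0xef` because the bit is already set. A typical use would be `unlock_command enp1s0 0x10fb8086 "$(ethtool -e enp1s0 offset 0x58 length 1 | read_caps_byte)"`, followed by checking the printed command before running it as root.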

    If you aren't planning to use the card in a Linux box, and don't even have a Linux host to throw your card in temporarily in order to make the modification, I'm not (yet) sure how you would accomplish the same thing in Windows, but if your card is installed in an ESXi host, it appears that the local/remote "Tech Support" ESXi shell also has 'ethtool' and that it works exactly the same way as the Linux one does, as far as I have been able to tell. Other open-source *nix-like or Unix-derived OSes such as the *BSDs likely have 'ethtool' as well or something similar. (Also, macOS X apparently has 'ethtool', too, for what it's worth.)

    ----

    As far as I can tell, the only kinds of modules that a "locked" X520 blocks are unapproved, active SFP+ modules, while virtually all SFP modules and passive SFP+ DACs from any vendor will work even in a card with DEVICE_CAPS bit 1 set to 0.

    One thing I'm hoping to establish through both my experimentation and yours is exactly which Intel parts have this bit flipped "off", and whether flipping this bit "on" is a complete unlock in all circumstances or whether other things factor in. /u/eruffini posted in this Reddit thread (Intel X520-DA2 10G NIC ESXi 6u2 woes. : vmware) that he'd been told that more recently manufactured X520s are the ones with the problem. On the other hand, we also know that X520-SR1/2 and X520-LR1/2 came with Intel optics and are intended to only be used with those. Is this bit in the EEPROM what is used to lock those models down? Does setting this bit to "on" essentially convert an X520-SR1 to an X520-DA1, for example? Or are there X520-DA1/2 cards that are also locked down from the factory? Is the enforcement to use Intel optics in the SR/LR cards done via the same mechanism or a totally separate one? Are 82599-based cards manufactured by other OEMs known to restrict SFP+s in the same way, and using the same mechanism? Etc.

    The other scenario that I'm hoping doesn't end up being the case -- but that is certainly within the realm of possibility -- is that Intel *could* have written and released a driver that ignores this bit in the EEPROM completely and always enforces SFP+ restrictions regardless. If the driver on the host has enough influence over the card's firmware that the SFP+ check can be overridden *despite* that bit (e.g. "allow_unsupported_sfp" on Linux), it is certainly plausible that this could cut both ways...

    Definitely looking forward to everybody's feedback and reports, & good luck!

    -- Nathan
     
  2. iceisfun
    This is awesome. I looked into this problem just enough to become frustrated and switch to MNX QSFP cards, and I now have a pile of cards on a shelf that do not accept "locked"/unapproved SFP+ modules for SM LR 10k.

    I felt like I had done something wrong here and just needed to take more time on my debugging. I'll set this back up and test later.
     
  3. JustinClift
    From a different direction, this is a util that flashes the EEPROM on SFP+ modules: sonicepk/sfppi

    Never got around to doing anything with it, but it might prove useful here. :)

    Oh, a similar effort for unlocking Intel cards, but on FreeBSD: bu7cher/ixl_unlock
     
  4. daniele99
    Many thanks to NathanA. I followed all the steps (using an Ubuntu live image) for my X520-DA2 and changed (in my case) the “fc” value to “fd”. Now I can use non-Intel transceivers (I use Fs ones). Thanks again.
     
  5. GuybrushThreepwood
    Anyone tried this on the X710 cards?
     
  6. Juan C
    I successfully "unlocked" an X520 card I picked up off eBay a few days ago. I used a bootable USB installation of Ubuntu 19.04, but had to apt-get install a few utilities before ethtool would run. In my case the byte value at 0x58 was identical to the one shown in @NathanA's post, so following the rest of the tutorial was easy. After making the modifications, the card is happy to accept Finisar SFP+ modules and is working perfectly with ESXi.
     
  7. vanfawx
    Can we get this copied/moved to the "Guides" section? This is good stuff!
     
  8. NathanA
    So glad to hear that people are actually having success with this!

    For anybody who had a previously-locked card that they managed to unlock using this procedure, would you be so kind as to post any identifying markers on your cards, like model numbers or the like? I'm curious what differentiates your factory-locked cards from my apparently factory-unlocked one.

    -- Nathan
     
  9. vanfawx
    Here's a question @NathanA - If I've loaded the driver with "allow_unsupported_sfp=1", does that make these instructions invalid? When I look at those bytes you show, I'm seeing FC instead of FE.

    # ethtool -e eth2 offset 0x58 length 1
    Offset Values
    ------ ------
    0x0058: fc

    The card in question is a Dell "Intel 2P X520/2p i350 rNDC".

    Thanks!
     
  10. NathanA
    0xFC represents a value with the "allow all SFPs" bit flipped off. 0xFE still has that bit flipped off, and compared to 0xFC it flips on a different bit (the second-to-last one, not the last one).

    It would be changing from 0xFC > 0xFD that would flip the correct bit on for this.
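Writing the three values out in binary makes the difference visible. Here is a small sketch for doing that in shell; the function name `to_bits` is made up for this example.

```shell
# Hypothetical helper: print an 8-bit value as a string of binary
# digits, most-significant bit first.
to_bits() (
    n=$(( $1 ))
    bits=
    i=7
    while [ "$i" -ge 0 ]; do
        bits="$bits$(( (n >> i) & 1 ))"
        i=$(( i - 1 ))
    done
    echo "$bits"
)
```

`to_bits 0xfc` prints `11111100`, `to_bits 0xfd` prints `11111101`, and `to_bits 0xfe` prints `11111110`. Only 0xFD has the last digit set, which is the bit discussed in this thread.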

    This is why it is important to double-check your binary math. So far from the reports that have rolled in, the most common values for that byte are 0xFC and 0xFD. Nobody has reported any byte value where the second-to-last bit is on, and since these are undocumented I have ABSOLUTELY no idea what the consequences would or could be if someone tried flipping that bit to "on" on their particular card. (Linux driver source code may or may not shine some light on that bit...I'd have to dive into it again.)

    General rule of thumb so far is that 0xFC restricts the SFP+ modules you can use, and 0xFD removes the restriction. However, I caution people against merely blindly setting that byte to 0xFD if it is already set to something OTHER than 0xFC, because they will be flipping MORE than just the ONE bit needed to unlock all SFP modules. So far, we have not seen anybody report that they have a card where that byte is a value other than 0xFC or 0xFD, but we also have a VERY small sample size so far.

    Yes, if you are using this card in Linux and loading the kernel module with the allow_unsupported_sfp parameter set to 1, then your SFP will work regardless of what that bit in the EEPROM is set to. It's nice that the Linux driver has this option, but most other platforms (e.g., Windows, ESXi) do not. So these instructions are really meant to help people who want to work around this restriction on their card in an OS OTHER than Linux. (Of course, you can flip the bit to "on" if you are using Linux, too, and then at that point you can stop bothering with setting allow_unsupported_sfp=1. But as far as I can tell, there is no real benefit to doing that, other than if you are, say, dual-booting between OSes on the machine this card is installed in and you want your "unsupported" SFP to work under all of them.)
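For Linux users who would rather keep the driver parameter than touch the EEPROM, one common way to make it persistent is a modprobe options file. The sketch below only writes the options line to whatever path it is given. The function name `write_ixgbe_option` and the file name used in the usage note are assumptions for this example, and your distribution may also need its initramfs rebuilt before the setting takes effect at boot.

```shell
# Hypothetical helper: write the ixgbe module option to the file
# named by the first argument (a modprobe.d-style config file).
write_ixgbe_option() {
    printf 'options ixgbe allow_unsupported_sfp=1\n' > "$1"
}
```

As root, something like `write_ixgbe_option /etc/modprobe.d/ixgbe.conf` followed by reloading the ixgbe module would be the usual pattern. This only affects the Linux driver and does nothing for other operating systems.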

    Hope this helps,

    -- Nathan
     
  11. vanfawx
    Thank you for the detailed explanation! I'll be trying this out on 3 servers in the near future and will update with my findings.
     
  12. ChasW
    Does this work just for Intel branded X520 cards?
    Has anybody tried this with a HPE NC560SFP for example?
     
  13. tomaash
    Are you having issues? HPE 560sfp+ (and also FLR) have been working great for me with non-HP branded Finisar SFP+s.
     
  14. ChasW
    Not having trouble, but I am looking to upgrade an existing network to 10GbE using HPE NC560SFP Adapters and I have not decided on one of the media connections yet. There is an existing cat6 run that is technically partially outdoor as it is in part run through a crawlspace that is 100 feet. I am considering trying a pair of SFP+ to RJ45 transceivers with this cable. Still doing research, but so far it seems that the transceivers that are Intel compatible will cost about as much as outdoor rated OM3 MMF LC to LC. If I have to do a new run I will, but that said, if I knew that the compatibility check on the card could be bypassed, it might pay off to test a few different SFP+ to RJ45 transceivers. Thoughts?
     
  15. nerdalertdk
    I think HPE doesn’t care; I use a Cisco DAC cable between my HPE 530 and my EdgeSwitch.
     
  16. ChasW
    Good to know. Thank you.
     
  17. NathanA
    Not familiar with them, but I just looked those up, and physically they look nearly indistinguishable from their Intel-branded counterparts. I have nothing to back this up, but my *suspicion* is that the 82599ES behaves identically regardless (in terms of how it interprets and acts on EEPROM contents), but that a card from an OEM other than Intel has less incentive to "lock out" certain SFP+ modules, so HP cards probably just shipped with that bit set to 1. Of course, I have nothing but a hunch to back this up.

    Unfortunately, this probably doesn't mean anything...as I mentioned in the original post, my research thus far has led me to conclude that even locked cards do not care about what DAC cables you use. It's only enforcing a whitelist for laser transceivers (and only for SFP+ transceivers! it doesn't care about 1Gbit SFPs, just 10Gbit+ ones). So even if you had a card that had shipped with the lock bit set in EEPROM, the fact that your DAC cable works with it tells us nothing.
     
  18. tomaash
    So to remove any doubt on HP(E) 560SFP+, it's not locked:

    root@florian:~# ethtool -e ens1f0
    Offset Values
    ------ ------
    ...
    0x0050: 9b 14 20 40 13 20 4d 13 fd ff 35 08 00 80 48 02
    ...
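If you only have a full dump like the one above, the byte at 0x58 is the ninth byte on the row that starts at 0x0050. Here is a minimal sketch that picks it out and reports the state of the lowest bit. It assumes the 16-bytes-per-row layout that `ethtool -e` prints in this thread, and the function name `dump_lock_state` is hypothetical.

```shell
# Hypothetical helper: read a full `ethtool -e IFACE` dump on stdin,
# take the byte at offset 0x58 (field 10 of the 0x0050 row, since
# field 1 is the row label), and report whether its lowest bit is set.
dump_lock_state() (
    byte=$(awk '$1 == "0x0050:" { print "0x" $10 }')
    if [ -z "$byte" ]; then
        echo "row 0x0050 not found" >&2
        exit 1
    fi
    if [ "$(( $byte & 0x1 ))" -eq 1 ]; then
        echo "$byte unlocked"
    else
        echo "$byte locked"
    fi
)
```

Usage would be `ethtool -e ens1f0 | dump_lock_state`. For the row shown above this prints `0xfd unlocked`.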

    In a simple test with a Mikrotik S+RJ10 transceiver I can get a link; however, I only have a 1GbE switch at hand right now.

    100ft (~30m) with cat6 seems to be just the limit for the Mikrotik S+RJ10 to work at full speed. https://i.mt.lv/cdn/rb_files/sfp_splusrj10-190211155011.pdf
    Maybe other transceivers have longer reach over cat6.
     