Recent content by julianmh

  1. J

    amd epyc rome on h122ssl-c random crashes

    It was the NVMe drives. One time I sat by the server watching it die and had the opportunity to open Event Viewer. Both Samsung 990 drives showed the same issue, so I guess it's a driver or firmware problem (yes, the newest firmware was installed). After swapping the disks there were no more issues.
  2. J

    amd epyc rome on h122ssl-c random crashes

    Just a note: Secure Boot is disabled.
  3. J

    amd epyc rome on h122ssl-c random crashes

    The second run finished too, with 0 errors.
  4. J

    amd epyc rome on h122ssl-c random crashes

    Finished memtest: 0 errors, 0 ECC errors. Sigh!
  5. J

    amd epyc rome on h122ssl-c random crashes

    I'm out of ideas.
  6. J

    amd epyc rome on h122ssl-c random crashes

    Two complete PCs in different locations. One has a Zippy redundant PSU, the other a normal 650 watt one. I did not swap them.
  7. J

    amd epyc rome on h122ssl-c random crashes

    Yes. I had read that it is enabled by default but was unsure, so I wrote the settings to disk and looked at the eccpoll setting in mt86.cfg, which is set to 1 by default. TL;DR: yes, it is.
  8. J

    amd epyc rome on h122ssl-c random crashes

    Nothing so far; it is currently at 64% with the old BIOS.
  9. J

    amd epyc rome on h122ssl-c random crashes

    Same again with 2.1: the UEFI firmware bug still persists. When the block move test runs with 64-byte blocks, it starts immediately.
  10. J

    amd epyc rome on h122ssl-c random crashes

    Yes, I'm getting [UEFI Firmware error] could not start CPU 4, along with 5 and 7. Now trying to downgrade to 2.1.
  11. J

    amd epyc rome on h122ssl-c random crashes

    OK, I just poked around to check whether anything is unusually hot. If you look at page 10 here https://www.supermicro.com/manuals/motherboard/EPYC7000/MNL-2314.pdf the heatsink next to LEDSAS is extremely hot.
  12. J

    amd epyc rome on h122ssl-c random crashes

    It was the exhaust fan, and optimal speed went into "reporting" because it was not spinning fast enough. Full speed just yanks up all the fans, and standard speed makes it green again. I don't think that overheating is an issue, as the server is hardly under load and temps were always OK when I...
  13. J

    amd epyc rome on h122ssl-c random crashes

    Just wanted to say: I appreciate you, thanks! OK, that I could test with a BIOS downgrade. I'll let this run the last 40%, then see if it's still there after the downgrade and test with multi-core again.
  14. J

    amd epyc rome on h122ssl-c random crashes

    I have not checked yet. The idea was to rule out ECC issues first and then do a second run with all cores/threads enabled. (I'm not sure whether it is correct that memtest reports 16 CPUs, or whether it is supposed to count both the cores and the threads; it's on my list to look that up.)
  15. J

    amd epyc rome on h122ssl-c random crashes

    By this you mean the LSI SAS logs and the IPMI health logs, right? The LSI SAS logs have nothing, and the IPMI health logs show some fan issues (fan 5 and two at reduced speed), but only after the crash; maybe that has something to do with it then dropping to the UEFI shell. Anyway, there is this: nothing which leads me into any...