overclocking Xeon E7 8894 v4 (x8) inside Lenovo X3950 X6 - "all core boost" ?


ootronicsnazleaki

New Member
Jan 9, 2023
The link you posted does not say anything about that part (or any part) not supporting v4.
I believe the link is saying that those part numbers do not support firmware older than a certain version, not that they fail with newer firmware.

My parts, both top and bottom I/O books: 00FN847
I am running Xeon V3.
The link you provided lists that part as V2 only.

You listed your firmware as latest, but you are running an older version than what I listed.

While I could be wrong, it was my impression that getting the latest firmware installed on the machine prior to purchase would allow me to run the V4, which is why they flashed it.
I have not verified that yet since I only have V3s.

I think you will have good results if you upgrade your firmware to:

UEFI Pri:A9E164AUS
v5.30 Date:2022/09/14

UEFI Alt:A9E154AUS
v4.80 Date:2020/04/09
 

RussianE39

New Member
May 22, 2021
It's not that bad; SMI almost doubles the speed.

E5 v0: 1600MHz factory
E5 v2: 1866MHz factory
E5 v2 overclocked (1650v2/1680v2): 2133MHz (Samsung 1866MHz DDR3 ECC)
E5 v3: 2133/2400MHz factory (~2666MHz overclocked DDR4)
E7 v2: 2666MHz (SMI)
E7 v4: 3200MHz (SMI2)

The trick is that you need 2 DIMMs per SMI channel, so the memory itself runs at 1333MHz max on v2 and at 1600MHz when connected via SMI2 to a v4 CPU.
So the E7 has 4 fast SMI channels (that is, 8 slower "DDR" channels, but 8 rather than 4 as on the E5).
I would recommend reading "White Paper
FUJITSU Server PRIMERGY & PRIMEQUEST
Memory performance of Xeon E7-8800 / 4800 v2 (Ivy Bridge-EX) based systems".
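
A rough sketch of the arithmetic above, using only the figures from this post. It assumes each SMI link feeds 2 DDR channels that each run at half the link's rate; the function and constant names are just for illustration.

```python
# Assumption (from the post): each SMI link feeds 2 DDR channels,
# and each DDR channel runs at half the SMI link's rate.
DDR_CHANNELS_PER_SMI = 2
SMI_LINKS_PER_CPU = 4


def ddr_rate_behind_smi(smi_rate: int) -> int:
    """Rate of each DDR channel behind one SMI link (same units as smi_rate)."""
    return smi_rate // DDR_CHANNELS_PER_SMI


def ddr_channels_per_cpu() -> int:
    """Total DDR channels per E7 CPU: 4 SMI links x 2 channels each."""
    return SMI_LINKS_PER_CPU * DDR_CHANNELS_PER_SMI


for label, smi_rate in [("E7 v2 (SMI)", 2666), ("E7 v4 (SMI2)", 3200)]:
    print(f"{label}: SMI {smi_rate} -> DDR {ddr_rate_behind_smi(smi_rate)}, "
          f"{ddr_channels_per_cpu()} DDR channels per CPU")
```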
 

chrgrose

Active Member
Jul 18, 2018
I updated the UEFI to v5.3 and IMM to v5.9. No change: firmware hangs or CPU bus errors. Also, the claims on the webpage are:

With each release of newly supported CPU families, the Standard I/O Book replacement part number and VPD contents will change.
and
Replacement part number 00FN847 supports processors only in the Intel Xeon E7-x8xx V2 family
and
Replacement part number 00FN850 supports processors only in the Intel Xeon E7-x8xx V2 and Intel Xeon E7-x8xx V3 families.
This is literally saying that the parts only support the stated families.

I would expect this to be written differently, or to find something else on the page, indicating that a firmware update on the part would fix this, if it did. I agree it is odd that you have parts 00FN847 and are running v3 Xeons. But didn't you mention that your system wasn't working correctly before? Is your system working with no issues now? Perhaps v4 support for 00FN850 is more crippling than v3 support in part 00FN847.
 

ootronicsnazleaki

New Member
Jan 9, 2023
While I cannot state for certain that 00FN847 can run V4, I can say with 100% certainty that it will run V3.
I have 8 DDR4 compute books, all 00D0402.
Three books had issues.
One book had a voltage fault and so I could only run 6 books in the machine.
Another book had an SMI Link error preventing 2 DIMMs from being used.
And a third book had SMI Link errors preventing 4 dimms from being used.
So I was able to run 6, but am currently only running 4 while I await replacement books.

I had believed 2 things.
1) That this was only possible after the UEFI & IMM firmware upgrades.
2) That the same upgrade made both v3 & v4 possible.

I could be wrong on both counts.
It's unfortunate that you are having this issue.

One thing to check: do you have the same firmware on both primary and secondary storage books?
Also, I am curious, how did you upgrade the firmware while having these bus errors?

I see that there are a couple of E7-8867 v4 cpus on ebay for under $20, I am tempted to get them to verify if my machine can really do v4.
I think I can run separated nodes and just 2 cpu in primary node for the experiment.
The only cpus I am interested in though are E7-8895v3, E7-8891v4, E7-8894v4.
It looks like E7-8891v4 would likely be slower than the E7-8895v3, but the E7-8894v4 is still too expensive for me.
 

chrgrose

Active Member
Jul 18, 2018
I've been updating firmware through the IMM, and updating both top/bottom storage books equally, at least the primary firmware (backup is different version sometimes). I know you can update with Xclarity, but I've had problems with it. The IMM seems all fine, but when I try to power the system on it runs into the CPU bus error or the firmware hangs.

I've purchased a set of four 8895v3 CPUs, so hopefully that will let me nail down whether or not it is truly a problem with v4 compatibility.

Thanks for your tips and advice so far :)

By the way, if you go for v4 Xeons, I suggest looking into the 8890v4. The 8894v4, at least on the eBay market, is terrible right now (about a month ago they could be had for $200-250 a unit, but they are all gone now). On the other hand, the 8890v4, which I went with, can be had for half that. I got my set of 8 from an offer of $95 a unit.
 

chrgrose

Active Member
Jul 18, 2018
I just replaced my 8890v4's with 8895v3's and it booted into UEFI immediately.

I'm guessing the I/O books must be the source of the incompatibility.
 

ootronicsnazleaki

New Member
Jan 9, 2023
I guess you bought up the remaining 8895's off ebay ;)
One quick thing though, you said you have the v4 books by part number, but are they the books where the first dimm socket is white and the second and third dimm sockets are black?
Also, are you putting DDR4 in there? Or are you using DDR3?

You could try to remove all the dimms except one per book and remove the half height io books and basically anything that is not minimally required and see if that has any different startup ability.
 

MichalPL

Active Member
Feb 10, 2019
I would recommend reading "White Paper
FUJITSU Server PRIMERGY & PRIMEQUEST
Memory performance of Xeon E7-8800 / 4800 v2 (Ivy Bridge-EX) based systems".
Good document, I also recommend this one:
Optimizing Memory Performance of Lenovo Servers Based on Intel Xeon E7 v3 Processors
My opinion about SMI 2666/3200 is the same: there is no sense in using other memory modes, so this is the maximum for these platforms.


I just replaced my 8890v4's with 8895v3's and it booted into UEFI immediately.
Means 15 minutes later ;), after pressing the power button :p
I can't shut down my server right now. Can I check the model numbers of the books or other important components via IMM?

Mine was definitely configured for v3 CPUs (factory thermal paste and 8x 8880v3 inside when bought), and I upgraded to v4 without any SF/FW update.

Also, are you putting DDR4 in there? Or are you using DDR3?
I think Lenovo have just 2 versions:

DDR3 for v2/v3
DDR4 for v3/v4 (this is the model in my server); both v3 and v4 are fully working, fully confirmed, without any SF/FW upgrades.
 

chrgrose

Active Member
Jul 18, 2018
MichalPL,
Yeah not immediately in that sense, haha. Well, actually I think the last power cycle it went through on the first boot couldn't have been more than a few minutes, but I only have 4x32GB in 4 installed compute books at the moment.

I don't think you can check part numbers in the IMM, but if you power it off anytime soon I'd be interested to know. I'm also curious if #00FN856 can for sure do v4's or not, since that one can be had for not too much at the moment.

ootronicsnazleaki,
yes I've tried booting it with only 1 dimm in each book, and I don't have any secondary i/o books (just the standard i/o books, storage books, and compute books). No luck with the v4's. I am using the DDR4 books which have the white/black/black dimm slots.

Hope I can be part of the happy x3950 x6 owners club soon ;)
 

ootronicsnazleaki

New Member
Jan 9, 2023
Sad thing - how ugly x3950 x6 looks with a full size PCI-EX book with GPU inside :)
I purchased one of the full length io books.
The video card height is too much for it, though: any card that is taller than the top of the bracket will not fit without taking a cutter to the case to make room, which I intend to do.
 

chrgrose

Active Member
Jul 18, 2018
Quick report: Installed 00YA701 standard i/o books. Now works perfectly fine! Now, I just have to work on my codes to make them work on 8 NUMA nodes :)
 

ootronicsnazleaki

New Member
Jan 9, 2023
Congratulations!

What's that phrase? "benchmarks or it didn't happen", something like that :D

Now I'm worried I will need to upgrade my io boards, maybe my v4 upgrade path got lost in the convo when I had it put together.
But I won't have to do that until E7-8894 V4 prices go down, if ever.

For now, anyone have any experience trying to hack the bios to be not extremely dumb?

A couple things:
1) I mean, does the BIOS not use the actual Xeons to do the ECC memory zeroing?
It should not take 10 minutes to do these 64 dimms, it should take the time it takes to do 8 dimms and do 8 in parallel.

2) It should not go through the ecc, then enter bios then check 'power adjustment' or whatever, only to reboot then go through the ecc again, then the power adjustment.

3) it should not overflow the error log with pointless messages about not official ram, thus occluding any 'real' log messages there may have been.
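
Point 1 as back-of-envelope arithmetic. This is only a sketch under two loud assumptions: that the roughly 10 minutes is spent zeroing the 64 DIMMs one after another, and that the time scales linearly with DIMM count. I have no insight into what the firmware really does, and the function name is made up.

```python
def init_minutes(total_dimms: int, serial_minutes: float, parallel_groups: int) -> float:
    """Estimated init time if DIMMs are split into groups handled in parallel.

    Assumes (hypothetically) that serial_minutes is the time to process all
    DIMMs one at a time and that every DIMM takes the same time.
    """
    minutes_per_dimm = serial_minutes / total_dimms
    dimms_per_group = total_dimms / parallel_groups
    return minutes_per_dimm * dimms_per_group


# 64 DIMMs, ~10 minutes observed, 8 compute books each doing their own 8 DIMMs
print(init_minutes(64, 10, 8))  # minutes, under the assumptions above
```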

I am willing to do a hardware pcb dongle add on somewhere if feasible, but I probably do not have the ability to access the information needed to do such a thing.

I think I read some comments over at coreboot saying it would take an expert years to reverse engineer getting their bios to work on such a system due to all the proprietary hoops to navigate.

If I could trick the system into falsifying the ECC memory zeroing, that would be a huge start.

If the bios isn't running on the xeons, it would seem fine to get to the bios screen with a flag saying reboot without the falsification setting, thus eliminating some of the ECC boot delays.

The BIOS is also insanely slow. Is it running everything through some 10 Hz I2C bus or something?
 

chrgrose

Active Member
Jul 18, 2018
I haven't done much 'proper' benchmarking of my system yet, but I have played with it a bit. Here's a shot with a few benchy things displayed:

[Attachment: lores.jpg]

I installed 16x600GB 12Gb/s HDDs, and the front face of the machine certainly looks more dignified with the Lenovo trays and activity lights.

[Attachment: s-l1600.jpg]

The Home Depot dolly is also convenient to haul it into the rooms that need heating.

The Cinebench runs were both on 96 cores. 128 cores seems to give no improvement or depresses performance for unclear reasons. The Passmark tests seem to all be perfectly in line with results from MichalPL, adjusted down by about 10%, since his 8894v4's are clocked about 10% higher across the board.

The two CrystalDiskMark results are for a 1TB Corsair MP510 M.2 NVME (top) and one of the arrays of 8 HDD's in RAID-5 (below). The read performance on the array seems pretty nice.

My main interest at the moment is power usage evaluation. I currently have 6TB DDR4 installed, and DC power usage reported by IMM is

idle: ~130W (CPU) & ~310W (RAM)
Cinebench R23 on all cores of the main node: ~515W (CPU) & ~585W (RAM)

Here's the IMM on the main node at idle:
[Attachment: 1678407222390.png]

and during an R23 run:
[Attachment: 1678406848061.png]

So presumably this would be about 2.2kW DC power for a fully loaded system. However, I'm a little confused about the readings, as they conflict with the system power displayed on the front LCD panel (environmental information). The front panel report is generally much lower, displaying about 280W at idle and 580W on load, for each top/bottom nodes.
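
For reference, the 2.2kW figure is just the IMM readings above added up. A minimal sketch, assuming both nodes draw the same as the main node under full load (the names are only for illustration):

```python
# IMM-reported DC power on the main node, in watts (figures from this post)
IMM_IDLE = {"cpu": 130, "ram": 310}
IMM_LOAD = {"cpu": 515, "ram": 585}
NODES = 2  # top and bottom node; assumed to draw the same under full load


def node_watts(reading: dict) -> int:
    """CPU + RAM power for one node."""
    return sum(reading.values())


def system_watts(reading: dict, nodes: int = NODES) -> int:
    """Whole-system estimate: one node's power times the number of nodes."""
    return nodes * node_watts(reading)


print("idle per node:", node_watts(IMM_IDLE), "W")
print("load per node:", node_watts(IMM_LOAD), "W")
print("load, whole system:", system_watts(IMM_LOAD), "W")
```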

The simple way I'm going to figure out the power thing for sure is that I've ordered two wall power meters, which should arrive over the weekend.

In either case the power draw of 192x 2Rx4 32GB DIMMs must be substantial, so I plan to later test a configuration with 32x 8GB 1Rx8 DIMMs, which seems like a good minimum capacity configuration that optimizes power and bandwidth. I will probably end up using a lower memory configuration until I get back to developing my codes that actually need a huge memory footprint.

I'll try to do a proper time test of the boot time with 6TB DDR4. I have also noticed the UEFI is pretty slow, but no idea how to fix that haha. Things are better when I'm in remote desktop.
 

ootronicsnazleaki

New Member
Jan 9, 2023
Looks good. If you read the memory documents posted earlier in this thread, they describe an optimal minimum memory configuration of 64 DIMMs.

You might be losing more than 30% performance by using 32 single-rank DIMMs.
You are using V4, but this document still applies; it shows DDR4 configurations.

All channels should have at least one DIMM.
All channels should be populated identically.
An even rank count (2, 4, 6) should be populated per channel.

For me that equates to 2 ranks on each channel, 64 x 16GB 2Rx4 DIMMs.
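
The three rules above can be sketched as a quick check for one compute book. This is a simplification: "populated identically" is reduced to "same rank count on every channel", and the function name and the 8-channel layout per CPU are my assumptions for the example, not anything from the Lenovo document.

```python
from typing import List, Sequence


def check_population(ranks_per_channel: Sequence[int]) -> List[str]:
    """Return the rule violations for one compute book.

    ranks_per_channel[i] is the total number of ranks on DDR channel i
    (0 means the channel is empty).
    """
    problems = []
    if any(r == 0 for r in ranks_per_channel):
        problems.append("empty channel")
    if len(set(ranks_per_channel)) > 1:
        problems.append("channels not populated identically")
    if any(r % 2 != 0 for r in ranks_per_channel):
        problems.append("odd rank count on a channel")
    return problems


# 64 x 2Rx4 over 8 books: one dual-rank DIMM on each of the 8 channels
print(check_population([2] * 8))
# 32 x 1Rx8 over 8 books: four single-rank DIMMs, four channels left empty
print(check_population([1, 1, 1, 1, 0, 0, 0, 0]))
```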
 

chrgrose

Active Member
Jul 18, 2018
Thanks for the info. From the documentation, it appears that ranks per channel does not significantly change performance, but yes all 8 channels should be populated. I guess I'll try an 8x8GB config and see if power differs much from 4x8GB config, in each book.

Anyway, I got the wall power meters:

Each node draws about 440W at idle, 1120 during R23, from the wall. An AIDA64 CPU/memory stress test draws about 880W from the node with all cores working.
 

ootronicsnazleaki

New Member
Jan 9, 2023
For the rank, going from single to double is a 5.26% uptick.
I wonder why the node front display watts is so different from the wall meter. I can understand the temp being off, since it is just a sensor on the compute book rather than core readings.
I guess I'll have to measure mine with the wall meter, I thought I was getting 180 to 240 watts idle per node but I guess maybe not.
I only use 4 of the PSUs, living dangerously but might help with efficiency there.
 

chrgrose

Active Member
Jul 18, 2018
Yeah I suppose 5% can matter. Though, the almost 40% loss from populating only 4 of 8 channels could be major.

It looks like the front panels are reliably displaying almost exactly half of the actual power usage on each node. No idea why. I haven't tried to use fewer PSU's in a while. When I was first playing with it it got mad at me when I tried to use fewer. Maybe I'll try again.
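
A quick way to put numbers on that, using the front-panel and wall-meter figures quoted earlier in the thread. Those front-panel numbers were noted before the wall meters arrived, so treat the comparison as approximate; the names are just for the example.

```python
# Per-node power in watts, as quoted earlier in the thread
FRONT_PANEL = {"idle": 280, "load": 580}
WALL_METER = {"idle": 440, "load": 1120}


def panel_to_wall_ratio(state: str) -> float:
    """Front-panel reading as a fraction of the wall-meter reading."""
    return FRONT_PANEL[state] / WALL_METER[state]


for state in ("idle", "load"):
    print(f"{state}: {panel_to_wall_ratio(state):.2f}")
```

With these particular readings the load ratio comes out close to one half, while the idle ratio is somewhat higher.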
 

chrgrose

Active Member
Jul 18, 2018
Just got around to looking more into power consumption on the X6 for lower memory configs. This is AC power from the wall. The 4 PSU's from each node are taking power from separate outlets, each outlet has a power meter on it.

CB23: power from the one (main) node whose 96 cores are running Cinebench R23.
idle: power from the same node about a minute after CB23 stopped running.

6TB (192x32GB 2Rx8)
idle: 440W/node
CB23: 1120W/node

2TB(64x32GB 2Rx8)
idle: 380W/node
CB23: 1035W/node

256GB(28x8GB 1Rx8 + 4x8GB 2Rx4)
idle: 315W/node
CB23: 940W/node

I might have to do this again with only 1 DIMM installed. This is not an acceptable configuration for most any work, but it would show how much of the memory power draw is due to the number of DIMMs versus the DIMM types. If it's the number of DIMMs, the power could go way down with just 1 installed, but that would at least show that there is no point in using low capacity DIMMs, or DIMMs with fewer chips, to get lower power draw.
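
In the meantime, a rough marginal watts-per-DIMM estimate can be pulled from the three configs above. A sketch only: it assumes the DIMMs are split evenly between the two nodes, and the 2TB to 256GB step also changes DIMM type, so that number mixes both effects. The names are illustrative.

```python
NODES = 2  # assumption: DIMMs are split evenly between the two nodes

# (label, DIMMs in the whole system, idle W/node, CB23 W/node)
CONFIGS = [
    ("6TB", 192, 440, 1120),
    ("2TB", 64, 380, 1035),
    ("256GB", 32, 315, 940),
]
IDLE, CB23 = 2, 3  # column indices into a config tuple


def watts_per_dimm(bigger: tuple, smaller: tuple, column: int) -> float:
    """Marginal wall watts per extra DIMM on one node between two configs."""
    extra_dimms = (bigger[1] - smaller[1]) / NODES
    return (bigger[column] - smaller[column]) / extra_dimms


for bigger, smaller in zip(CONFIGS, CONFIGS[1:]):
    print(f"{smaller[0]} -> {bigger[0]}: "
          f"idle {watts_per_dimm(bigger, smaller, IDLE):.2f} W/DIMM, "
          f"CB23 {watts_per_dimm(bigger, smaller, CB23):.2f} W/DIMM")
```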

Note that my tests may not be good for figuring the effect of memory population on power draw in other important situations, because Cinebench doesn't really hit memory.

As an aside, I timed the boot for the 256GB config at 16 minutes to Windows. I didn't do this for the 6TB config, but I don't recall it being very much longer than this. This could be because a huge amount of the boot time is other system initialization processes, aside from memory training.