Can I escape ThreadRipper PRO with AM5?


Pakna

Member
May 7, 2019
50
3
8
I am looking at a mid-tier AM5 board (something like the Asus Prime X670-P) and am trying to understand whether 24 PCIe lanes will be enough. I am really tired of fighting over lanes on X99 with my i7-5820K and would like to avoid that in the future, so I am hoping someone can confirm my understanding or explain how this actually works.

To give some further context - I don't care about PCIe 5.0, 4K gaming, CrossFire, copper-based LAN, Wi-Fi, sound or USB 3.2+. My work revolves mostly around (software) engineering with casual gaming sprinkled here and there (stuff like Deus Ex MD, X-Com, Stellaris or Elite: Dangerous). I'd like to retain my existing 1080 Ti as it serves me well. Ideally, I'd be looking for a board that can host a modern graphics card, a pair of NVMe SSDs and a 10/25 G fibre NIC, and still have some PCIe expansion capacity without having to turn off anything that's already connected over PCIe.

From what I understand, no consumer GPU actually needs (or cares for) anything more than PCIe 3.0 x8. Electrically these cards are wired for PCIe 3.0 x16, but if you halve the slot to x8, frame rates are impacted only minimally.
State-of-the-art consumer SSDs are mostly PCIe 4.0 x4, and the rest of the expansion componentry should be more than OK with anything equivalent to PCIe 3.0 x4/x8.
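For reference, here is the rough per-lane math I'm basing these assumptions on (a quick sketch; theoretical rates only, real throughput is a bit lower due to protocol overhead):

```python
# Theoretical per-lane PCIe throughput by generation (GB/s, one direction,
# after line encoding; real-world numbers are lower due to packet overhead).
PER_LANE_GBPS = {
    2.0: 5.0 * (8 / 10) / 8,      # 5 GT/s,  8b/10b    -> ~0.50 GB/s
    3.0: 8.0 * (128 / 130) / 8,   # 8 GT/s,  128b/130b -> ~0.99 GB/s
    4.0: 16.0 * (128 / 130) / 8,  # 16 GT/s, 128b/130b -> ~1.97 GB/s
}

def link_gbps(gen, lanes):
    """Theoretical one-direction bandwidth of a PCIe link in GB/s."""
    return PER_LANE_GBPS[gen] * lanes

for gen, lanes in [(3.0, 8), (3.0, 16), (4.0, 4), (2.0, 8)]:
    print(f"PCIe {gen} x{lanes}: ~{link_gbps(gen, lanes):.1f} GB/s")
# PCIe 3.0 x8 (~7.9 GB/s) is plenty for my 1080 Ti, and PCIe 4.0 x4 (~7.9 GB/s)
# is what a current SSD expects; 2.0 x8, 3.0 x4 and 4.0 x2 all land around
# ~4 GB/s, which is where my bandwidth-equivalence assumption in the questions
# below comes from.
```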

Given that, with an AMD board [like the Prime X670], I should be able to:
- populate the 1x PCIe 4.0 x16 with GPU
- populate the 1x PCIe 4.0 x16 with a 10G+ fiber optics NIC
- still have one extra slot fully functional for, say an expansion card with 4x M.2 NVMe SSD slots
- still have at least two M.2 NVMe SSD on-board slots functional even with everything populated

Questions:
- are my assumptions above true or false?
- do I need to free up any of the lanes by downgrading the, e.g. GPU slot to PCIe 4.0 x8, in order to have everything functional at the same time
- can you suggest a better motherboard than the one I had considered?
- if a given adapter requires less speed (e.g. a single-port 10G NIC requires a PCIe 2.0 x8), I would assume that it will occupy that number of lanes and no more? To extend the example, a PCIe 2.0 x8 == PCIe 3.0 x4 == PCIe 4.0 x2, thus even though the NIC is plugged-in to the PCIe 4.0 x16 port, I would expect it to use only x2 (i.e. two lanes) thus leaving 14 from that port available. Does lane occupancy get determined like that or is there more that I don't know?

NB, there is a motherboard that does all of this, and it's of course the Asus WRX80 Sage but if at all possible I'd like to avoid having to jump to the expensive Threadripper PRO - even though I am changing workstations roughly once every 7 or 8 years (on average).

Any thoughts/suggestions are more than welcome - appreciate your time and TIA.
 

i386

Well-Known Member
Mar 18, 2016
4,245
1,546
113
34
Germany
To extend the example, a PCIe 2.0 x8 == PCIe 3.0 x4 == PCIe 4.0 x2, thus even though the NIC is plugged-in to the PCIe 4.0 x16 port, I would expect it to use only x2 (i.e. two lanes) thus leaving 14 from that port available.
That's not how PCIe works - every PCIe lane is its own serial connection between the two endpoints.
- are my assumptions above true or false?
They are a mix of both. One x16 slot comes directly from the CPU; the other two come from the chipset and are electrically only x4 (check the mainboard's specifications & datasheet).
Any thoughts/suggestions are more than welcome - appreciate your time and TIA.
I think you should make a list with "wants" vs "needs" and then reconsider what workstation you want to build.
If a GPU (8 or 16 lanes) + a 10+ GbE NIC (8 lanes) + 4+ M.2 drives (16 lanes, not considering PLX-based stuff) are hard requirements, then AM5 (24 lanes) is the wrong platform in my opinion...
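A back-of-the-envelope version of that lane math (a sketch only, assuming x4 per M.2 and no PLX/switch cards):

```python
# Naive CPU-lane budget for the wishlist vs. AM5's 24 usable CPU lanes.
AM5_CPU_LANES = 24

wishlist = {
    "GPU": 16,               # or 8 if you accept running it at x8
    "10G+ NIC": 8,           # typical for the older cards discussed here
    "4x M.2 carrier": 16,    # x4 per drive, no PLX switch on the card
    "2x onboard M.2": 8,     # x4 each
}

total = sum(wishlist.values())
print(f"requested {total} lanes vs {AM5_CPU_LANES} available from the CPU")
# requested 48 lanes vs 24 available from the CPU -> everything that doesn't
# fit has to hang off the chipset's single x4 uplink (or be dropped).
```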
- can you suggest a better motherboard than the one I had considered?
supermicro m12swa :p
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,641
2,058
113
The sad part is the M12SWA costs the same or less than high-end consumer boards at the moment, it's insane... $999 for top end consumer boards... WHAT... CPU whole different thing.... just need those single core passmark scores in the 4000 range and we're good :D :D LOL
 

mattventura

Active Member
Nov 9, 2022
447
217
43
Probably the wrong platform. You get 24 lanes, and typically that would be arranged such that the two main x16 slots share 16 lanes (i.e. x16/disabled or x8/x8), and the other 8 lanes go to M.2 slots. Everything else runs off the chipset.

Sadly, it's difficult to find good benchmarks on exactly how much performance you lose when running a high-speed NIC or NVMe drive off chipset lanes.

- do I need to free up any of the lanes by downgrading the, e.g. GPU slot to PCIe 4.0 x8, in order to have everything functional at the same time
This typically happens automatically. If the MB detects something plugged into the second slot, it will switch from x16/disabled to x8/x8.

- if a given adapter requires less speed (e.g. a single-port 10G NIC requires a PCIe 2.0 x8), I would assume that it will occupy that number of lanes and no more? To extend the example, a PCIe 2.0 x8 == PCIe 3.0 x4 == PCIe 4.0 x2, thus even though the NIC is plugged-in to the PCIe 4.0 x16 port, I would expect it to use only x2 (i.e. two lanes) thus leaving 14 from that port available. Does lane occupancy get determined like that or is there more that I don't know?
No, it doesn't work like that. Typically the NIC will be sized such that it will have just enough bandwidth. Example: a single-port 25G NIC might have a PCIe 3.0 x4 link (32gb/s nominally). If you plugged it into a PCIe 5.0 port, it wouldn't "downsize" to a PCIe 5.0 x1 link - it would still be stuck at 3.0 speeds and would try to use the full x4 link if available.

The only time you can really downsize without bottlenecking is if you're only running a single port on a dual/quad-port card.
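A minimal sketch of how the link ends up where it does (hypothetical helper, theoretical rates): both sides simply agree on the lower generation and the narrower width they support.

```python
# Minimal model of PCIe link training: the negotiated link is the lower
# generation and the narrower width of the slot and the card.
PER_LANE_GBPS = {2.0: 0.5, 3.0: 0.985, 4.0: 1.969, 5.0: 3.938}

def negotiate(slot_gen, slot_width, dev_gen, dev_width):
    gen = min(slot_gen, dev_gen)
    width = min(slot_width, dev_width)
    return gen, width, PER_LANE_GBPS[gen] * width

# The 25G NIC example: a PCIe 3.0 x4 card in a PCIe 5.0 x16 slot.
gen, width, bw = negotiate(slot_gen=5.0, slot_width=16, dev_gen=3.0, dev_width=4)
print(f"link trains at PCIe {gen} x{width} (~{bw:.1f} GB/s)")
# -> PCIe 3.0 x4, ~3.9 GB/s. It never "shrinks" to 5.0 x1; the card can't
#    signal faster than gen 3, and the slot's unused 12 lanes just sit idle.
```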

NB, there is a motherboard that does all of this, and it's of course the Asus WRX80 Sage but if at all possible I'd like to avoid having to jump to the expensive Threadripper PRO - even though I am changing workstations roughly once every 7 or 8 years (on average).

Any thoughts/suggestions are more than welcome - appreciate your time and TIA.
There might be boards out there with PCIe switches to provide more effective slots (or DIY routes). If you have, say, a PCIe switch with a PCIe 5.0 x16 connection to the CPU, it would theoretically be able to provide 4x PCIe 3.0 x16 connections without any bottlenecking. But the chipset is already a PCIe switch and sort of does that (AM5 uses a PCIe 4.0 x4 uplink to the chipset, which in bandwidth terms is roughly 8 lanes' worth of 3.0), so you might be able to make use of that. The only issue is that most of these boards send those lanes to either x4/x1 slots, M.2 slots, or onboard devices.
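Putting rough numbers on the chipset-as-a-switch idea (a sketch with theoretical rates; the PCIe 4.0 x4 uplink figure is X670's):

```python
# The chipset is effectively a PCIe switch: it can fan out to many downstream
# lanes, but everything behind it shares the upstream link's bandwidth.
PER_LANE_GBPS = {3.0: 0.985, 4.0: 1.969}

uplink = PER_LANE_GBPS[4.0] * 4           # X670(E) uplink: PCIe 4.0 x4
gen3_equiv = uplink / PER_LANE_GBPS[3.0]  # how many gen-3 lanes that equals

print(f"uplink ~{uplink:.1f} GB/s, i.e. ~{gen3_equiv:.0f} lanes of PCIe 3.0")
# ~7.9 GB/s, roughly 8 lanes' worth of gen 3: lots of downstream *lanes* are
# possible, but that single x4 uplink is the ceiling when everything behind
# the chipset is busy at once.
```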

You could go for an older gen Threadripper or TR pro, but you're also getting a downgrade from AM5's DDR5 back to DDR4, not to mention an older CPU. The extra channels and PCIe lanes on TR could very well outweigh that depending on your application.
 

Pakna

Member
May 7, 2019
50
3
8
When it comes to Threadripper PRO, I am actually not that bothered by motherboard prices (which are expensive but also sophisticated) - but the CPUs are a kick in the gut. A 16-core 5955WX Chagall is 1800 CAD - that's the price of a full top-end 7950X AM5 system, sans DDR5, only to get extra PCIe lanes and memory bandwidth! Back in the day, I was pretty miffed with how Intel penny-pinched PCIe lanes on X99, but AMD is taking this to an even greater extreme. What I dislike here are the binary absolutes - it's either poverty-spec PCIe lanes on AM5 or the ridiculous lavishness of WRX80. I mean, would you honestly run something like 4 nVidia A80s on this board, or would you use an Epyc as the platform? Or an array of 28 NVMe SSDs? Can't we have something like 40 PCIe lanes for some in-between price? Why is that impossible?

I am just so disappointed with what feels like a giant step back - the i7-5930K had 40 PCIe lanes.....in 2015.

The M12SWA looks like a great board, but I am kind of concerned about running a server board in something like a Fractal Torrent - not sure whether that will be enough airflow to dissipate the VRM heat without making it unbearable to be near the chassis. ASRock dropped the Intel NICs in Rev 2.0 of the WRX80 Creator - looks like a drop in the price as well. Still, the CPU price is insane....and it looks like zero price delta between the 3955WX and 5955WX - that is a surprise.

No, it doesn't work like that. Typically the NIC will be sized such that it will have just enough bandwidth. Example: a single-port 25G NIC might have a PCIe 3.0 x4 link (32gb/s nominally). If you plugged it into a PCIe 5.0 port, it wouldn't "downsize" to a PCIe 5.0 x1 link - it would still be stuck at 3.0 speeds and would try to use the full x4 link if available.
I am still not sure if I understand how this works. Let me see if I do with an extreme example - a 2012 SFP+ NIC uses PCIe 2.0 x8, thus it'll electrically (physically) occupy no less than 8 PCIe 4.0 lanes? And by restricting the number of PCIe 4.0 lanes on that port to 4 (say with bifurcation setting), I'll immediately drop the NIC speed as well, since it'll be operating effectively with PCIe 2.0 x4 settings?
 

mattventura

Active Member
Nov 9, 2022
447
217
43
I am just so disappointed with what feels like a giant step back - the i7-5930K had 40 PCIe lanes.....in 2015.
Sadly it's par for the course for desktop platforms - you have to jump up to HEDT if you want more lanes, since 99% of users are throwing a GPU in their single x16 slot and maybe an M.2 drive or two. It used to be worse - 16 lanes from the CPU and that's it.

Now, there's still chipset lanes as an option, but the two issues I've noticed are:
1. They usually go to x4, x1, or M.2 slots, or to onboard devices, not to x8 or x16 slots
2. There is a lack of proper benchmarking of direct CPU lanes vs chipset lanes.

I'm actually in sort of the same bind - I'd like to upgrade, and I have a need for more PCIe bandwidth than desktop platforms provide (GPU+40GbE+NVMe), but it's unclear whether or not chipset lanes really make a difference for that kind of thing.

I am still not sure if I understand how this works. Let me see if I do with an extreme example - a 2012 SFP+ NIC uses PCIe 2.0 x8, thus it'll electrically (physically) occupy no less than 8 PCIe 4.0 lanes? And by restricting the number of PCIe 4.0 lanes on that port to 4 (say with bifurcation setting), I'll immediately drop the NIC speed as well, since it'll be operating effectively with PCIe 2.0 x4 settings?
That is correct. A PCIe 2.0 x8 device can downgrade to PCIe 1.0, and/or use fewer than 8 lanes, but it would always lose bandwidth by doing so.
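Worked numbers for that ConnectX-2-style example (a sketch; theoretical PCIe 2.0 rates, and the 20% headroom factor is just a rough allowance for protocol overhead):

```python
# A PCIe 2.0 x8 10G NIC forced down to fewer lanes by bifurcation.
PCIE2_PER_LANE = 0.5          # GB/s per lane (5 GT/s, 8b/10b encoding)
TEN_GBE = 10 / 8              # ~1.25 GB/s line rate for one 10GbE port
HEADROOM = 1.2                # rough allowance for PCIe protocol overhead

for lanes in (8, 4, 2):
    link = PCIE2_PER_LANE * lanes
    verdict = "fine" if link >= TEN_GBE * HEADROOM else "starts to choke"
    print(f"PCIe 2.0 x{lanes}: ~{link:.1f} GB/s -> {verdict} for one 10GbE port")
# x8 (~4 GB/s) and x4 (~2 GB/s) both cover a single 10G port; x2 (~1 GB/s)
# is where the link itself becomes the limit.
```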
 

Pakna

Member
May 7, 2019
50
3
8
Now, there's still chipset lanes as an option, but the two issues I've noticed are:
1. They usually go to x4, x1, or M.2 slots, or to onboard devices, not to x8 or x16 slots
2. There is a lack of proper benchmarking of direct CPU lanes vs chipset lanes.

I'm actually in sort of the same bind - I'd like to upgrade, and I have a need for more PCIe bandwidth than desktop platforms provide (GPU+40GbE+NVMe), but it's unclear whether or not chipset lanes really make a difference for that kind of thing.
Come to think of it, most GPUs at PCIe 3.0 x8 are within a few percent of their baseline x16 performance. If (and that is a big "if") the BIOS provides the option, that would immediately yield eight more lanes and solve most of our problems?

What motherboards would be most likely to have something like this - I am thinking Supermicro? EVGA might be another, but sadly I am only seeing a tentative AM5 board from them sometime next year (likely Q3 or later).
 

i386

Well-Known Member
Mar 18, 2016
4,245
1,546
113
34
Germany
Can't we have something like 40 PCIe lanes for some in-between price?
Did you look at Intel's stuff? They have CPUs with 64 PCIe lanes.
that would immediately yield eight more lanes and solve most of our problems?
I'm not sure; in order to use the additional lanes you would probably need PCIe switch(es).
Example of how it could be implemented (from the X12SPA-TF):
[attached screenshot from the X12SPA-TF documentation]
 

DaveLTX

Active Member
Dec 5, 2021
169
40
28
When it comes to Threadripper PRO, I am actually not that bothered by motherboard prices (which are expensive but also sophisticated) - but the CPUs are a kick in the gut. A 16-core 5955WX Chagall is 1800 CAD - that's the price of a full top-end 7950X AM5 system, sans DDR5, only to get extra PCIe lanes and memory bandwidth! Back in the day, I was pretty miffed with how Intel penny-pinched PCIe lanes on X99, but AMD is taking this to an even greater extreme. What I dislike here are the binary absolutes - it's either poverty-spec PCIe lanes on AM5 or the ridiculous lavishness of WRX80. I mean, would you honestly run something like 4 nVidia A80s on this board, or would you use an Epyc as the platform? Or an array of 28 NVMe SSDs? Can't we have something like 40 PCIe lanes for some in-between price? Why is that impossible?

I am just so disappointed with what feels like a giant step back - the i7-5930K had 40 PCIe lanes.....in 2015.

The M12SWA looks like a great board, but I am kind of concerned about running a server board in something like a Fractal Torrent - not sure whether that will be enough airflow to dissipate the VRM heat without making it unbearable to be near the chassis. ASRock dropped the Intel NICs in Rev 2.0 of the WRX80 Creator - looks like a drop in the price as well. Still, the CPU price is insane....and it looks like zero price delta between the 3955WX and 5955WX - that is a surprise.
Mainstream platforms have always been like that. More I/O means more power draw and more silicon.
If you went back (as I did back then), a 5960x was almost 2k USD (6950x nailed that home!) and wouldn't offer anything more than double the core count of mainstream platforms.

HEDT, which I'd say began with X38, offered more I/O, but back then it relied on a separate northbridge to provide it. With X58 the sockets split and the designs split as well.

In a sense I could argue that Intel did that deliberately so we can't get more I/O on the same socket?
Otherwise, HEDT has always been repurposed server designs.

If most people have no use for a gigantic I/O die, why should Intel or AMD build one?

Did you look at Intel's stuff? They have CPUs with 64 PCIe lanes.



I'm not sure; in order to use the additional lanes you would probably need PCIe switch(es).

Example of how it could be implemented (from the X12SPA-TF):

[attached screenshot from the X12SPA-TF documentation]


1) Intel hasn't updated HEDT yet - it's still Cascade Lake, so he would be going back many generations.



2) That's bifurcation.
Come to think of it, most GPUs at PCIe 3.0 x8 are within a few percent of their baseline x16 performance. If (and that is a big "if") the BIOS provides the option, that would immediately yield eight more lanes and solve most of our problems?

What motherboards would be most likely to have something like this - I am thinking Supermicro? EVGA might be another, but sadly I am only seeing a tentative AM5 board from them sometime next year (likely Q3 or later).
Not necessarily. A 1080 Ti? Sure. But if you were downgraded to PCIe 3.0 x8 on a 3080 or newer, it WILL hurt. Thankfully you're going to be running PCIe 4.0.
Also, the slot will stay at PCIe 4.0 even if the bifurcated device is 3.0, and the NIC will run at whatever version and lane count it negotiates - if the slot can be bifurcated at all. Many boards don't split the lanes anymore, for cost reasons.
 

MichalPL

Active Member
Feb 10, 2019
189
25
28
First - I fully agree that X79/X99/X299 were amazing platforms - my favorite was a Xeon 1660 v3 overclocked to 5.1GHz (4.8 all-core) ;) The joy ended when Ryzen 5000 arrived with a ~3700 single-core Passmark score (the 1650/1680 v2 could do ~2500, the 1660 v3 ~3000). And now that 13th gen can nearly double the performance of the 1650 v2, there's no sense in using them anymore.

To the merits: at work we had to solve the same issue - migrating from the 1650/1680 v2 and 1660 v3 to Ryzen 5000/7000 and Intel 12th/13th gen, with lots of NVMe drives and 10G/40G NICs.

Given that, with an AMD board [like the Prime X670], I should be able to:
- populate the 1x PCIe 4.0 x16 with GPU
- populate the 1x PCIe 4.0 x16 with a 10G+ fiber optics NIC
- still have one extra slot fully functional for, say an expansion card with 4x M.2 NVMe SSD slots
- still have at least two M.2 NVMe SSD on-board slots functional even with everything populated
Yes, easily (up to almost 40GbE, but not 100GbE) - the method is much simpler than mentioned above. Worth mentioning: the Intel 12th/13th gen platform (Z690/Z790 chipsets only) is MUCH better here (although AM4 with Ryzen 5000 was much better than Intel 11th gen).

Questions:
- are my assumptions above true or false?
- do I need to free up any of the lanes by downgrading the, e.g. GPU slot to PCIe 4.0 x8, in order to have everything functional at the same time
- can you suggest a better motherboard than the one I had considered?
- if a given adapter requires less speed (e.g. a single-port 10G NIC requires a PCIe 2.0 x8), I would assume that it will occupy that number of lanes and no more? To extend the example, a PCIe 2.0 x8 == PCIe 3.0 x4 == PCIe 4.0 x2, thus even though the NIC is plugged-in to the PCIe 4.0 x16 port, I would expect it to use only x2 (i.e. two lanes) thus leaving 14 from that port available. Does lane occupancy get determined like that or is there more that I don't know?
1. True
2. No
3. Yes
4. No

=====
4. First, the NICs:

Super old 10G NICs (like the Mellanox ConnectX-2) have 8 lanes of PCIe 2.0, but if you plan to use just one port at a time, x4 is enough and x2 is almost enough (you will achieve ~900MB/s instead of ~1100MB/s) - conclusion: you need just PCIe x4 electrically.

Modern 10G NICs (like the Mellanox ConnectX-3) have 4 lanes of PCIe 3.0. To achieve almost 10G (~900MB/s) you need just 1 lane; for full speed on one port, 2 lanes.

Modern 25G NICs (like the Mellanox ConnectX-4) have 8 lanes of PCIe 3.0. To achieve full 25GbE speed on one port you need 4 lanes of PCIe 3.0.

Old 40G NICs (like the Mellanox ConnectX-3) have 8 lanes of PCIe 3.0. To achieve full 40G you need all 8 lanes; for almost 40GbE (3700MB/s instead of 4420MB/s), just 4 lanes of PCIe 3.0 are OK.

Summary:
for 10, 2x10, 25 and almost-40 GbE, 4 lanes of PCIe 3.0 are OK (one exception: for 2x10GbE with LACP, a PCIe 3.0 card is needed to achieve the full 2x10GbE).
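The same rule of thumb as a tiny helper (a sketch; raw link rates only, so the minimums are slightly optimistic):

```python
# Smallest standard lane count whose raw PCIe bandwidth covers a NIC's
# Ethernet line rate (ignores TLP/protocol overhead, so slightly optimistic).
PER_LANE_GBPS = {2.0: 0.5, 3.0: 0.985, 4.0: 1.969}

def min_lanes(eth_gbit, pcie_gen):
    need = eth_gbit / 8                      # line rate in GB/s
    lanes = 1
    while PER_LANE_GBPS[pcie_gen] * lanes < need and lanes < 16:
        lanes *= 2
    return lanes

for eth in (10, 25, 40):
    print(f"{eth} GbE on PCIe 3.0: x{min_lanes(eth, 3.0)} minimum")
# 10 GbE -> x2, 25 GbE -> x4, 40 GbE -> x8 (a x4 link tops out around
# ~3.9 GB/s raw, i.e. the "almost 40G" case above).
```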

=====

Given that, with an AMD board [like the Prime X670], I should be able to:
It's almost perfect (ideal, as AM5 goes), losing only to the much faster Intel 12th/13th gen Z690/Z790 platform (twice the bandwidth to the chipset).

What you have here in Prime X670:

-Full PCIe 4.0 x16 for the graphics card
-Full PCIe 4.0 x4 (x16 physically) [chipset] for the network card (ideal for 10/25GbE and almost ideal for 40GbE; you can also put in a super expensive ConnectX-5 100GbE and it will do about 8000MB/s because it's PCIe 4.0).
-2x M.2 NVMe x4 connected directly to the CPU (one is PCIe 4.0, the second 5.0).
-Full PCIe 4.0 x4 (x16 physically) [chipset] for another NIC or whatever else, including NVMe on a PCIe card.
-1x M.2 NVMe x4 [chipset] for another NVMe drive.


===
The only issue compared to Intel: here the chipset connection is the equivalent of PCIe 4.0 x4, while on Intel (Z690/Z790 only) it is the equivalent of PCIe 4.0 x8, so there you can use an NVMe drive and a NIC at full speed at the same time. On the other hand, here you get 2 NVMe drives connected to the CPU (on Intel it's just 1, and only PCIe 4.0).

My favorite is:
Gigabyte Z690 GAMING X DDR4
and an i5/i7/i9 13th gen CPU (faster single-core, and DDR4)
also:
1x PCIe 4.0 x16
2x PCIe 3.0 x4 (PCIe 4.0 in Z790 variant)
and 4x NVMe PCIe 4.0 <- also the same ~21GB/s max in RAID0, but the chipset PCIe is better balanced.

Summary:

Yes, you are right.
No problem connecting 10/25G cards (even 2) and using them at full speed (the max is 2x ConnectX-5 100G used at ~80% of full speed each).
Also no problem using 2 NVMe PCIe 4.0 drives in RAID at full speed, ~14GB/s (at the same time).

You can even use 3 NVMe PCIe 4.0 drives in RAID at about ~21GB/s when not using the NIC at exactly the same time.
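A quick budget check of the chipset uplink for that scenario (a sketch; the 0.85 factor is just a rough protocol-overhead allowance, and it assumes both the NIC and the NVMe drive sit behind the chipset pushing in the same direction at once):

```python
# Does NIC + NVMe traffic behind the chipset fit in its uplink?
PCIE4_PER_LANE = 1.969                  # GB/s, theoretical
EFFECTIVE = 0.85                        # rough protocol-overhead factor

uplinks = {"AM5 X670 (PCIe 4.0 x4)": 4, "Intel Z690/Z790 (DMI 4.0 x8)": 8}
load = {"40GbE NIC": 40 / 8, "PCIe 4.0 x4 NVMe": 7.0}   # GB/s, one direction

demand = sum(load.values())
for name, lanes in uplinks.items():
    budget = PCIE4_PER_LANE * lanes * EFFECTIVE
    verdict = "fits" if budget >= demand else "shared, will bottleneck"
    print(f"{name}: ~{budget:.1f} GB/s budget vs ~{demand:.1f} GB/s demand -> {verdict}")
# AM5's x4 uplink (~6.7 GB/s) can't feed a 40GbE NIC and a fast NVMe drive
# flat-out in the same direction at once; Intel's x8-equivalent (~13.4 GB/s)
# can - which is the "twice the bandwidth to the chipset" point above.
```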
 

oneplane

Well-Known Member
Jul 23, 2021
845
484
63
Just to add some info that isn't written out explicitly: PCIe lanes don't work like memory allocation where you have a bunch of lanes and you can split them up however you like. They are hard-wired, and the only thing you can do is 'disable' lanes on a slot, not 'move' or 'add' them.

This means that the electrical connections are your upper limit, and any shared (so: double connected) lanes between the physical slots will be going via a chip that can cut connections. It will then have some firmware configuration somewhere that defines which lanes are cut from a slot, and it usually has very few profiles, like 16 lanes and 2 slots will maybe have (as written earlier) a x0/x16, x8/x8 and maybe x16/x0 if there is some reason to do that for physical slot locations (i.e. cooler overhang that might be 'above' or 'below' the slot). This sometimes also means that one slot is reversed (wrt. lane Ids) depending on the lane configuration chip used (if not done on the root complex itself). That is so that switching an entire bank of lanes will always only 'remove' the last 8 lanes so you don't end up with 8 lanes on the wrong end of the slot.

Depending on where the switching is done (CPU, Chipset, lane switcher, or actual PCIe switch), the amount of lanes, the firmware, the slots, the physical topology and features like bifurcation and bus pausing this all gets somewhat complicated.
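Put another way, the firmware just selects one of a handful of hard-wired profiles - something like this purely illustrative table (not any specific board):

```python
# Lanes aren't pooled and re-dealt; the shared group is hard-wired to a few
# fixed profiles and the mux/redriver chips simply cut one side or the other.
LANE_PROFILES = {
    #                   (slot 1 width, slot 2 width)
    "slot2_empty":      (16, 0),
    "slot2_populated":  (8, 8),
}

def slot_widths(slot2_card_present):
    """Return the profile the lane switch gets strapped to at boot."""
    key = "slot2_populated" if slot2_card_present else "slot2_empty"
    return LANE_PROFILES[key]

print(slot_widths(False))  # (16, 0) - the GPU alone keeps the full x16
print(slot_widths(True))   # (8, 8)  - populate slot 2 and both drop to x8
# There is no (12, 4) or (14, 2) option: only the combinations the lane
# switches are physically wired for.
```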
 

MichalPL

Active Member
Feb 10, 2019
189
25
28
Now, there's still chipset lanes as an option, but the two issues I've noticed are:
1. They usually go to x4, x1, or M.2 slots, or to onboard devices, not to x8 or x16 slots
2. There is a lack of proper benchmarking of direct CPU lanes vs chipset lanes.
1. On AMD AM5, on the good boards: 2 NVMe drives directly to the CPU; on Intel, one drive. The Gigabyte Z690 and ASUS Prime X670 have 2x x4 slots, which is almost OK for most cases (yes, x8 would be better, I know).

2. I have tested it: same speed, almost the same latency, no difference (on AM5 one PCIe 4.0 drive will saturate the chipset bandwidth (~8000MB/s); on Intel, two drives (~16000MB/s)).


I'm actually in sort of the same bind - I'd like to upgrade, and I have a need for more PCIe bandwidth than desktop platforms provide (GPU+40GbE+NVMe), but it's unclear whether or not chipset lanes really make a difference for that kind of thing.
It's no problem - you can almost do it on the Gigabyte Z690/Z790 GAMING X or Asus Prime X670, no bifurcation needed.
Except for one thing: 40GbE will run at around 3700MB/s instead of 4420MB/s. But NVMe can achieve 21GB/s in RAID0 in either case.
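Sanity-checking those two numbers (a sketch; the old 40G cards are PCIe 3.0 x8, the chipset slot only gives them x4, and the ~0.95 factor is a rough allowance for PCIe packet overhead):

```python
# Why an old 40GbE NIC tops out around ~3.7 GB/s in a chipset x4 slot.
PCIE3_PER_LANE = 0.985            # GB/s, theoretical
EFFECTIVE = 0.95                  # rough allowance for TLP/DLLP overhead

x8 = PCIE3_PER_LANE * 8 * EFFECTIVE   # card in a full x8 slot
x4 = PCIE3_PER_LANE * 4 * EFFECTIVE   # same card limited to x4
line_rate = 40 / 8                    # 40GbE -> 5.0 GB/s on the wire

print(f"PCIe 3.0 x8: ~{x8:.1f} GB/s -> the 40GbE side (~{line_rate:.1f} GB/s) is the limit")
print(f"PCIe 3.0 x4: ~{x4:.1f} GB/s -> now the bus is the limit")
# x8 is ~7.5 GB/s, so the ~4.4 GB/s figure is Ethernet maxing out (after its
# own overheads); x4 is ~3.7 GB/s, matching the reduced number quoted above.
```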
 

DaveLTX

Active Member
Dec 5, 2021
169
40
28
1. On AMD AM5, on the good boards: 2 NVMe drives directly to the CPU; on Intel, one drive. The Gigabyte Z690 and ASUS Prime X670 have 2x x4 slots, which is almost OK for most cases (yes, x8 would be better, I know).

2. I have tested it: same speed, almost the same latency, no difference (on AM5 one PCIe 4.0 drive will saturate the chipset bandwidth (~8000MB/s); on Intel, two drives (~16000MB/s)).




It's no problem - you can almost do it on the Gigabyte Z690/Z790 GAMING X or Asus Prime X670, no bifurcation needed.
Except for one thing: 40GbE will run at around 3700MB/s instead of 4420MB/s. But NVMe can achieve 21GB/s in RAID0 in either case.
It bears reminding that PCIe is bidirectional (dual simplex) - you can read and write at the same time, i.e. you can write to a drive while the network card is receiving. (I think - someone correct me if I'm wrong.) But if you're on a Windows platform, SMB will bite you first, before you reach anywhere near 40GbE.
 

mattventura

Active Member
Nov 9, 2022
447
217
43
First - I fully agree that X79/X99/X299 were amazing platforms - my favorite was a Xeon 1660 v3 overclocked to 5.1GHz (4.8 all-core) ;) The joy ended when Ryzen 5000 arrived with a ~3700 single-core Passmark score (the 1650/1680 v2 could do ~2500, the 1660 v3 ~3000). And now that 13th gen can nearly double the performance of the 1650 v2, there's no sense in using them anymore.
Stuck in kinda the same boat here. Waiting to see what the DDR5 HEDT offerings look like before I make a decision. I'm considering buying a 2nd-gen EPYC as a stopgap since I'd be able to repurpose it as a server later, but the single thread performance on those is just too big of a downgrade.

It bears reminding that PCIe is bidirectional (dual simplex) - you can read and write at the same time, i.e. you can write to a drive while the network card is receiving. (I think - someone correct me if I'm wrong.) But if you're on a Windows platform, SMB will bite you first, before you reach anywhere near 40GbE.
This is true, given a similarly-specced Samba host and Windows client, the client tends to bottleneck first (assuming the underlying storage can keep up).
 

MichalPL

Active Member
Feb 10, 2019
189
25
28
It bears reminding that PCIe is bidirectional (dual simplex) - you can read and write at the same time, i.e. you can write to a drive while the network card is receiving. (I think - someone correct me if I'm wrong.) But if you're on a Windows platform, SMB will bite you first, before you reach anywhere near 40GbE.
Sounds right - so using the full 21GB/s of bandwidth to read files into memory while at the same time sending, say, 3700MB/s out of the first 40GbE NIC and another 3700MB/s out of the second one should be possible!

And this is really the max of what AM5 can do right now ;) (until a next-gen chipset brings PCIe 5.0 speeds, including the CPU connection)

SMB: I don't agree.
 

MichalPL

Active Member
Feb 10, 2019
189
25
28
Stuck in kinda the same boat here. Waiting to see what the DDR5 HEDT offerings look like before I make a decision. I'm considering buying a 2nd-gen EPYC as a stopgap since I'd be able to repurpose it as a server later, but the single thread performance on those is just too big of a downgrade.
I think for a workstation they are not bad:

-GF 3090/4090 PCIe 4.0 x16 possible
-4x NVMe 4.0 (3 of them at full speed at the same time).
-2x 40GbE NIC (so ~7500MB/s using LACP)

For a NAS they are not good, whether for the drives or the network card.

But yes, 100GbE was possible on the almost-10-year-old 1650 v2 ;) also easily 8x NVMe PCIe 3.0 and ~20GB/s too ;)

This is true, given a similarly-specced Samba host and Windows client, the client tends to bottleneck first (assuming the underlying storage can keep up).
I don't agree. A long time ago I tested Mellanox ConnectX-3 40GbE between two E5-1620 computers with RAID0 across 4x Samsung 970 (the first one), and without any overclocking (4 cores at 3.9GHz) or SMB tricks they were able to achieve 4.42GB/s.

The problem is with 100GbE - yes (on those almost-10-year-old computers), but not with 40GbE.
A 13600KF or 7900X is fast enough to do 100GbE, even fast enough to do ZFS :) (while any Epyc is quite slooooooow).
 

MichalPL

Active Member
Feb 10, 2019
189
25
28
Stuck in kinda the same boat here. Waiting to see what the DDR5 HEDT offerings look like before I make a decision.
Same here - it looks like the ~20 P-core, ~48-lane part was cancelled and something faster is coming.
But the new TR could be amazing, and maybe it can beat the 13900KF in single-core too.

I'm considering buying a 2nd-gen EPYC as a stopgap since I'd be able to repurpose it as a server later, but the single thread performance on those is just too big of a downgrade.
As a workstation, a builder, or a NAS? (For the builder I bought an old server and put 8x 8894 v4 inside - not as bad in single-core compared to Epyc as I thought; it should be similar to the future 2x 96-core Epyc, at least I hope ;) )
 

DaveLTX

Active Member
Dec 5, 2021
169
40
28
Same here - it looks like the ~20 P-core, ~48-lane part was cancelled and something faster is coming.
But the new TR could be amazing, and maybe it can beat the 13900KF in single-core too.

As a workstation, a builder, or a NAS? (For the builder I bought an old server and put 8x 8894 v4 inside - not as bad in single-core compared to Epyc as I thought; it should be similar to the future 2x 96-core Epyc, at least I hope ;) )
34 P cores, supposedly Raptor Lake-S. But it will probably be repurposed Sapphire Rapids. Sapphire Rapids is PROPERLY expensive, I heard. And it's essentially Naples too - those weren't good for inter-CCX communication.
Also, a Genoa-based TR, if it happens, will never beat the 13900KF. Zen 4 cores weren't built to be as big as Alder/Raptor Lake.


You might be measuring with the wrong software. Rome, let alone Milan, is already well ahead of Broadwell.
 

bayleyw

Active Member
Jan 8, 2014
302
99
28
This board more or less maxes out what AM5 can do. The x16 root complex is bifurcated into x8 + x8, then 4 of the remaining 8 CPU lanes are routed to a third slot. The other 4 go to a pretty useless PCIe 5.0 M.2 slot.

The 4 chipset lanes route to the X670E, which muxes them out to ~16 downstream lanes. 4 of those go to the AQC113, and the other 12 go to 3 x4 M.2 slots, so all 3 of those PCIe 4.0 M.2 slots share x4 worth of upstream bandwidth. You can run 4 M.2 drives, but you only get the bandwidth of two.
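Rough math on that sharing (a sketch with theoretical PCIe 4.0 rates; the exact slot/NIC mix is as described above):

```python
# Three chipset M.2 slots (x4 each) plus the 10GbE AQC113 all funnel through
# the chipset's single PCIe 4.0 x4 uplink.
PCIE4_PER_LANE = 1.969                       # GB/s, theoretical

uplink = PCIE4_PER_LANE * 4                  # ~7.9 GB/s shared upstream
downstream = 3 * (PCIE4_PER_LANE * 4) + 1.25 # 3x M.2 at x4 + 10GbE NIC

print(f"uplink ~{uplink:.1f} GB/s, downstream demand up to ~{downstream:.1f} GB/s")
print(f"oversubscription ~{downstream / uplink:.1f}:1")
# ~3.2:1 - each chipset M.2 slot is a full x4 electrically, but if all three
# drives stream at once they split roughly one x4's worth of bandwidth.
```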

34 P cores, supposedly Raptor Lake-S. But it will probably be repurposed Sapphire Rapids. Sapphire Rapids is PROPERLY expensive, I heard. And it's essentially Naples too - those weren't good for inter-CCX communication.
It's not. SPR extends the mesh/ring out the chiplets and is supposed to be "logically monolithic". Intel uses EMIB to lash their dies together giving them much higher interconnect density than Naples, so they can afford to run the 1000+ pins of mesh between the tiles. Naples was just traces in an organic substrate, so it used high speed serial links (basically, PCIe) between the CCX dies, which incurred a latency penalty.

For the builder I bought an old server and put 8x 8894 v4 inside - not as bad in single-core compared to Epyc as I thought; it should be similar to the future 2x 96-core Epyc, at least I hope ;)
Octal socket is a bad meme, please folks don't buy an 8S system.
 

DaveLTX

Active Member
Dec 5, 2021
169
40
28
It's not. SPR extends the mesh/ring out the chiplets and is supposed to be "logically monolithic". Intel uses EMIB to lash their dies together giving them much higher interconnect density than Naples, so they can afford to run the 1000+ pins of mesh between the tiles. Naples was just traces in an organic substrate, so it used high speed serial links (basically, PCIe) between the CCX dies, which incurred a latency penalty.
I am very aware of that.