3090 driver handicap?

balnazzar

Active Member
Mar 6, 2019
205
25
28
Thanks! Do you think small heatsinks are better or larger ones? Bit difficult to fit the larger ones on cards packed 2 slots apart.
There are *wide* heatsinks with a modest height (<1cm).

Curiously I've had my Turbo crash my entire system (full power off with no ability to power on again) twice when mem temps stayed at 100C for more than 30 mins. Would need to let it rest for a couple of mins before it worked again. Core temps were just 56C. Keeping the mem temps 2C lower at 98C eliminates the problem entirely for some strange reason.
That shouldn't happen. Even if the vram hits 110C, it just throttles down. Nothing will crash.
Maybe the temperature is just a coincidence: have you connected the GPU to two separate power outputs on the PSU, each with its own cable? One is not supposed to daisy-chain the 8-pin connectors.
 

balnazzar

Active Member
Mar 6, 2019
205
25
28
is it a good idea to blow the CPU fan downwards to circulate air across the backplate of the first GPU?
If your cpu never rises above 50C, yes. My 8260M touches 80C, so in my case it won't be a good idea.
 

balnazzar

Active Member
Mar 6, 2019
205
25
28
It is a workstation and doesn't run GPU loads 24x7. The fans ramp up to the heat load as needed, automatically. Some of my workloads are CPU based too. Automatically managed fans are much better for a lot of reasons.
My board, despite being the successor of yours in some sense, doesn't allow any fan hysteresis (yes, even the cpu fan has to be set to a fixed rpm regimen).
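
For anyone unfamiliar with the term, fan hysteresis means the fan ramps up at one temperature and only ramps back down at a lower one, so it doesn't flip-flop around a single threshold. A minimal Python sketch of the idea follows; the class name, thresholds and RPM values are all made up for illustration and are not taken from any board's firmware.

```python
from dataclasses import dataclass


@dataclass
class HysteresisFan:
    """Two-speed fan with separate ramp-up and ramp-down thresholds."""
    on_temp_c: float = 70.0   # assumed: go to high speed at or above this
    off_temp_c: float = 60.0  # assumed: return to low speed at or below this
    low_rpm: int = 800        # assumed idle speed
    high_rpm: int = 2000      # assumed load speed
    high: bool = False        # current state, starts at low speed

    def update(self, temp_c: float) -> int:
        """Feed one temperature sample and return the target RPM."""
        if not self.high and temp_c >= self.on_temp_c:
            self.high = True
        elif self.high and temp_c <= self.off_temp_c:
            self.high = False
        # Between the two thresholds the previous state is kept,
        # which is what stops the fan from oscillating.
        return self.high_rpm if self.high else self.low_rpm


fan = HysteresisFan()
samples = [55, 65, 71, 65, 61, 59, 65]
rpms = [fan.update(t) for t in samples]
print(rpms)  # [800, 800, 2000, 2000, 2000, 800, 800]
```

Note how 65C gives 800 rpm on the way up but 2000 rpm on the way down: the output depends on the history, not just the current reading. A fixed-rpm setting like the one described above has no such behaviour at all.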

Also, no option to connect sensors.

TBH, I'm a bit disappointed by the scalable platform. Narrower choice of motherboards. Scalables run a lot hotter than E5s, while having more or less the same performance per core if you normalize the frequencies. Ok, 8 lanes more are a nice thing, but then there were the pcie switches.
And you did notice the noctua 92mm since it's the only one that fits. Even the 120mm would knock the first pcie slot out, and the same stood for the X11SPA, despite its EEB form factor.
 

funkywizard

mmm.... bandwidth.
Jan 15, 2017
748
313
63
USA
ioflood.com
I got a unit with local warranty, don't think I want to void the warranty on it by tearing out the heatsink. I also have a bit of PTSD from my WC loop leaking a couple of years ago and spraying coolant all over the parts. Not sure when I'll be comfortable getting back into it.
Reliability and cost are two major reasons not to do a custom water loop. The AIO GPUs I've used have been super reliable, perform great, amazing thermals, and only roughly $100 more than an air cooled card. Don't have to worry that I screwed something up building a custom loop, nor risk breaking the card installing the water block, or dealing with compatibility / availability issues between the exact model / manufacturer of GPU and the water block. No brainer.
 

balnazzar

Active Member
Mar 6, 2019
205
25
28
If you want an easy solution, you buy AIO watercooled cards.

If you want control over the final result AND you know what you are doing, you do a custom waterloop.

As for the cost, I'm not sure. AIO 3090s I have been able to find are all >2300eur.
 

funkywizard

mmm.... bandwidth.
Jan 15, 2017
748
313
63
USA
ioflood.com
Yeah the costs are tough right now since basically no RTX cards are available at retail.

From EVGA's site, the cheapest water cooled 3090, at $1699, is only $100 more than the cheapest air cooled 3090:

 

balnazzar

Active Member
Mar 6, 2019
205
25
28
From EVGA's site, the cheapest water cooled 3090, at $1699, is only $100 more than the cheapest air cooled 3090:
Yes but.. Will one be able to find it at that price? I don't think so. Will one be able to find it at all? Well, maybe.

Look, I understand your concerns about a custom loop, but let us examine pros and cons.

An AIO card will still have a pump that can fail. Pumps in AIOs are nothing special, generally just standard Asetek pumps. At least in a custom loop you can select the pump you like, and regulate it to inaudible noise levels.
More importantly, in a custom loop you can install two (or N) pumps in series, so that if one fails, there is still another one running.

You cannot select the tubing length in an AIO, so you have to be very careful with your measurements when selecting a case. If you want two cards, installing two separate 240mm radiators with fixed-length tubes becomes quite an endeavour. You can't install angled fittings either.

If something breaks in a custom loop, you can just take it out. If something happens in the AIO card (pump, fittings, leaks) you have to return the whole card.

With a custom loop you can install an intelligent controller that constantly monitors it and if something goes awry shuts the computer down immediately, the hard way (bridged atx adapter).
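
As a rough idea of the decision logic such a controller might run, here is a small Python sketch. Everything in it is hypothetical: the reading fields, the limit values and the function names are invented for illustration, and a real controller does this in its own firmware, with the actual power cut happening in hardware rather than in a script.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LoopReading:
    """One snapshot of loop sensors (hypothetical field names)."""
    coolant_temp_c: float
    flow_l_per_h: float
    pump_rpm: int


# Assumed limits, purely illustrative. Pick values that suit your own loop.
MAX_COOLANT_TEMP_C = 50.0
MIN_FLOW_L_PER_H = 60.0
MIN_PUMP_RPM = 1000


def faults(reading: LoopReading) -> List[str]:
    """Return a list of human-readable problems found in one reading."""
    problems = []
    if reading.coolant_temp_c > MAX_COOLANT_TEMP_C:
        problems.append("coolant too hot")
    if reading.flow_l_per_h < MIN_FLOW_L_PER_H:
        problems.append("flow too low")
    if reading.pump_rpm < MIN_PUMP_RPM:
        problems.append("pump stalled or too slow")
    return problems


def should_cut_power(reading: LoopReading) -> bool:
    """Any single fault is enough to trigger the hard shutdown."""
    return bool(faults(reading))


healthy = LoopReading(coolant_temp_c=35.0, flow_l_per_h=120.0, pump_rpm=2400)
dead_pump = LoopReading(coolant_temp_c=38.0, flow_l_per_h=0.0, pump_rpm=0)

print(should_cut_power(healthy), faults(healthy))
print(should_cut_power(dead_pump), faults(dead_pump))
```

The point of checking flow and pump speed separately from temperature is that a dead pump shows up immediately, long before the coolant itself gets hot.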

On the other hand, yes: if you don't know what you are doing with a custom loop, it will end up in a mess. You can even break the cards if you are not careful when disassembling them.
 

funkywizard

mmm.... bandwidth.
Jan 15, 2017
748
313
63
USA
ioflood.com
That's all true enough. I'll take the prebuilt over the artisanal water loop any day, as the pros far outweigh the cons.

As to the GPUs being available, that's a problem for both the air cooled and water cooled. They routinely cost 50-100% above MSRP right now.
 

balnazzar

Active Member
Mar 6, 2019
205
25
28
You got the same iron system scaffolding I got :)

What about dust for your all-open machine?? :rolleyes:
 

larrysb

Active Member
Nov 7, 2018
107
45
28
My board, despite being the successor of yours in some sense, doesn't allow any fan hysteresis (yes, even the cpu fan has to be set to a fixed rpm regimen).

Also, no option to connect sensors.

TBH, I'm a bit disappointed by the scalable platform. Narrower choice of motherboards. Scalables run a lot hotter than E5s, while having more or less the same performance per core if you normalize the frequencies. Ok, 8 lanes more are a nice thing, but then there were the pcie switches.
And you did notice the noctua 92mm since it's the only one that fits. Even the 120mm would knock the first pcie slot out, and the same stood for the X11SPA, despite its EEB form factor.

Mine is in a Corsair Carbide 540-Air case. Oldie, but goodie. Double-chambered as you noted. PSU and a bunch of drive cages on the other side of the case.

Yeah, I think Intel peaked at the Broadwell EP (Xeon E5-V4 family) and the X99 chipset that could do double-duty as either a Core i7 or Xeon platform with no problems at all. Skylake was kind of a turd overall, and Xeon Scalable never made any sense to me. Xeon Wxxxx was just the Skylake HEDT Core i9 Extreme with ECC turned on, for a lot more $$$. Then there's that whole weird NUMA thing with the "mesh" instead of the ring system. The Broadwell EPs have a single PCIe root complex, and SKUs with 10 or fewer cores are all on a single ring internally, which makes them great for GPU applications.

I have other workstations built on the Asus X99-E WS series boards. Very happy with them: self-managed cooling fans, and fairly easy to turn all the stupid stuff off to make them reliable. No baseboard management controller (BMC) though, so no remote management. 7 slots, 4 of which are x16 thanks to the PLX switches on the board.

They're actually pleasant to live with in the office environment. With a fully utilized CPU-only workload, the system is whisper quiet and not throttling at all. When the GPUs and CPU are all loaded and working at full tilt, there's a lot more air moving, but the noise isn't bad.

If I boot into Windows 10 and run the Quake RTX demo, it will maintain >100fps on a QHD monitor indefinitely, with the fans ramping up. (2560x1440)

My other favorite case is the Fractal Design Define 7. These Asus E-WS boards will fit in it fine. Blower cards are better in those, as the Fractal Design is a quiet case with a bit less airflow than the Carbide 540 (which is basically wide open mesh everywhere).

Long live x99 + Broadwell. I wish I could buy these great boards brand new. With all the Broadwell EP chips on the second-hand market now, they're quite a bargain. Heck, for that matter, even the Haswell chips work in these boards too with great results.
 

balnazzar

Active Member
Mar 6, 2019
205
25
28
My other favorite case is the Fractal Design Define 7
I have the Define 7 XL (the one in the pic above). The fine thing about that case is that it behaves equally well both on air and liquid.

Long live x99 + Broadwell
Fine processors and fine boards. I have an Asrock EPC612D8, since I absolutely needed IPMI. It is a fine board itself: nvme slot onboard, four x16 slots correctly spaced. No PLX, but if you run two GPUs, they'll still get 16 lanes each.

But look, if you want a cheap but capable dual-GPU platform today, and don't need >128GB of RAM, just buy a Ryzen 5000 and a B550 board with correctly spaced slots. Stellar performance, no headaches. Dual x8 gen4 slots, equivalent in bandwidth to dual gen3 x16.
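
The gen4 x8 versus gen3 x16 equivalence can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it uses the published per-lane signalling rates (8 GT/s for gen3, 16 GT/s for gen4) and the 128b/130b line encoding, gives theoretical one-direction figures rather than measured throughput, and the function and table names are mine.

```python
# Per-lane raw rate in GT/s and line-encoding efficiency for each generation.
PCIE_GEN = {
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
}


def link_bandwidth_gb_s(gen: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth of a PCIe link in GB/s."""
    gt_per_s, efficiency = PCIE_GEN[gen]
    gb_per_s_per_lane = gt_per_s * efficiency / 8  # 8 bits per byte
    return gb_per_s_per_lane * lanes


gen3_x16 = link_bandwidth_gb_s(3, 16)
gen4_x8 = link_bandwidth_gb_s(4, 8)
gen3_x8 = link_bandwidth_gb_s(3, 8)

print(f"gen3 x16: {gen3_x16:.2f} GB/s")  # 15.75 GB/s
print(f"gen4 x8:  {gen4_x8:.2f} GB/s")   # 15.75 GB/s
print(f"gen3 x8:  {gen3_x8:.2f} GB/s")   # 7.88 GB/s
```

The equivalence only holds when both the card and the slot actually negotiate gen4. A gen3 card in a gen4 x8 slot runs at gen3 x8, i.e. half of that figure.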
 

josh

Active Member
Oct 21, 2013
514
136
43
Is 3.0x16 even necessary for deep learning? DDR4 is still ridiculously expensive. A dual E5-2678V3 stacks with DDR3 for 1/4 the price of a Ryzen 5000 system.
 

funkywizard

mmm.... bandwidth.
Jan 15, 2017
748
313
63
USA
ioflood.com
Is 3.0x16 even necessary for deep learning? DDR4 is still ridiculously expensive. A dual E5-2678V3 stacks with DDR3 for 1/4 the price of a Ryzen 5000 system.
Faster pcie is certainly better, especially when training with multiple gpus. I would assume you can get by with x8 lanes per GPU but I certainly wouldn't want less than that, and x16 per GPU is best.

You don't need an expensive system to get x16 to each GPU for 3 GPUs, maybe 4. ASUS E5 workstation boards can do 4 gpus at x16 ea due to a pcie switch on the board. Other E5 boards can be found that'll do 3 double spaced gpus at x16 ea without a pcie switch.

Looks like the RTX 3080 / 3090 support PCIe 4.0, so to get the maximum speeds out of them you'll want an AMD board, preferably Threadripper or Epyc to get the most lanes, but Ryzen may work OK too.
 

balnazzar

Active Member
Mar 6, 2019
205
25
28
Is 3.0x16 even necessary for deep learning? DDR4 is still ridiculously expensive. A dual E5-2678V3 stacks with DDR3 for 1/4 the price of a Ryzen 5000 system.
Are you sure Haswell can use ddr3? I was not aware of that. I never heard of 2011-3 mobos with ddr3 support.

But you are right, ddr3 is more than sufficient for DL workloads. However, consider that one 32gb ddr4 rdimm can be bought brand new on ebay for 75eur. I paid 300 for 128gb.
Udimms are still relatively expensive (~130e for a 32gb ECC udimm).
 

funkywizard

mmm.... bandwidth.
Jan 15, 2017
748
313
63
USA
ioflood.com
Are you sure Haswell can use ddr3? I was not aware of that. I never heard of 2011-3 mobos with ddr3 support.

But you are right, ddr3 is more than sufficient for DL workloads. However, consider that one 32gb ddr4 rdimm can be bought brand new on ebay for 75eur. I paid 300 for 128gb.
Udimms are still relatively expensive (~130e for a 32gb ECC udimm).
Although theoretically there is some limited ddr3 support for 2011-3, I've never seen it in the wild.

DDR4 isn't all that expensive anymore, at least the slower kind: DDR4-2400 is ~$45 for a 16GB stick. A dual E5 v4 should have 8 sticks for maximum speed, so that's around $360 for 128GB of DDR4 RAM. Compared to the cost of GPUs that's pretty minimal. The higher speed RAM you'll want to use for AMD (DDR4-3200) does cost more, however.
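
The arithmetic behind that figure, as a quick Python sketch. The stick price and size are the ones quoted above; the channel count (four memory channels per E5 v4 socket, one stick per channel) is what makes 8 sticks the "maximum speed" configuration on a dual-socket board. Variable names are just for illustration.

```python
sockets = 2
channels_per_socket = 4   # E5-2600 v4 is quad-channel per socket
gb_per_stick = 16
usd_per_stick = 45        # approximate DDR4-2400 price quoted above

# One stick per channel fills every memory channel.
sticks = sockets * channels_per_socket
total_gb = sticks * gb_per_stick
total_usd = sticks * usd_per_stick

print(f"{sticks} sticks -> {total_gb} GB for about ${total_usd}")
# 8 sticks -> 128 GB for about $360
```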