Future data center power/cooling?


Y0s

Member
Feb 25, 2021
Hi,

I've seen hints in posts that future processors will have higher TDPs, that power density is generally going up, and even some references to liquid cooling!

I can see the power density with modern servers packed with NVMe drives, GPUs, and high-speed networking. I guess the future processor TDP is some open secret of which I'm unaware (Milan?)

What is everyone planning for their datacenters? We have plenty of cooling from the neighboring physical plant, but can currently only supply 5kW per rack.

Thanks!
 

Serverking

The quieter you are, the more you can hear...
Jan 6, 2019
We have plenty of cooling from the neighboring physical plant, but can currently only supply 5kW per rack.
Can we build our own power plants yet? 2 MW plant and I'm all set.
 

i386

Well-Known Member
Mar 18, 2016
Germany
I think the more interesting question is how many organizations will run their own datacenters and how many will move to the cloud(s)...
modern servers packed with NVMe drives, GPUs, and high-speed networking.
It's 2021 and like 95% of our customers (mostly small or medium businesses/organizations) are still not using GPU computing or networking beyond 10GbE. I don't think this will be a problem in the next few years.
 
  • Like
Reactions: Markess and T_Minus

Y0s

Member
Feb 25, 2021
Migrating to the cloud is a different question; I'd like to focus on datacenter supply. We've started to get requests for GPUs and deep 4U servers stuffed with disk. Sometimes we can't fill the rack within the current power limit.
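
To put rough numbers on that, here is a quick sketch of how little of a rack a 5kW budget actually covers. The per-box wattages and configurations are my own illustrative assumptions, not figures from this thread:

```python
# Rough sketch: how far a 5 kW rack budget goes with denser gear.
# All wattages and configurations below are assumed, for illustration only.
RACK_BUDGET_W = 5000
RACK_UNITS = 42

gear = {
    # name: (rack units, assumed wall power in watts)
    "4U GPU server (4x accelerators)": (4, 2000),
    "4U storage server stuffed with disk": (4, 800),
    "2U dual-socket compute node": (2, 600),
}

for name, (ru, watts) in gear.items():
    by_power = RACK_BUDGET_W // watts   # how many the power budget allows
    by_space = RACK_UNITS // ru         # how many physically fit in the rack
    print(f"{name}: limited to {min(by_power, by_space)} "
          f"(power allows {by_power}, space allows {by_space})")
```

With numbers anywhere in this ballpark, the power feed runs out long before the rack units do.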
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
Migrating to the cloud is a different question; I'd like to focus on datacenter supply. We've started to get requests for GPUs and deep 4U servers stuffed with disk. Sometimes we can't fill the rack within the current power limit.
I've contacted numerous datacenters that colo to the public/businesses; none can fill a rack due to power constraints, and most can't fill even half a rack with dense compute or GPU because of this. Add in the cooling requirement and a B-side feed that matches, and you probably have even fewer that can fulfill these needs.

I don't think this is an abnormal issue for general datacenters. For purpose-built / company-specific usage it's probably completely different, since they have staff engineers for this and build it for their own purposes. Higher density can be done, but it must be at very expensive datacenters I haven't asked :D
 

Y0s

Member
Feb 25, 2021
From today's "STH Q1 2021 Update A Letter from the Editor":

power budgets for servers are going up. 2021 will see a significant, but rather modest bump. Into 2022 and 2023, we are going to see fairly massive spikes in what we are asked to test. While we can use 5kW racks today, we are going to need to be several times that by 2023
Any rumors as to the reason(s)? Will CXL/Gen-Z require lots of power, or enable much higher density, or are the next-gen CPUs/GPUs TDP monsters?
 
  • Like
Reactions: Samir

alex_stief

Well-Known Member
May 31, 2016
On the CPU side of things, I see two driving factors for a rapid increase in power density: competition and chiplets.
Now that AMD is back in the game, the race for the fastest CPUs with the highest core counts is on again. In order to get ahead, maximum power efficiency seems to be getting less important: bumping up frequency by just another 100 MHz might be enough to come out in front in some benchmark, at the cost of a massively increased TDP.
And "gluing together" chiplets instead of using monolithic dies enables a much faster increase in total core count, driving TDP up further.
 
  • Like
Reactions: Samir

Patrick

Administrator
Staff member
Dec 21, 2010
The more you scale up nodes, the less you need in terms of fabric to tie everything together. Look at Cerebras as an example. There are benefits to scaling out larger nodes.

I can tell you, we put 3 nodes in the data center today. They are using 7.2kW. Two have four accelerators each; the other is CPU-only.
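
For context against the 5kW rack figure earlier in the thread, simple averaging of those numbers (the split between the accelerator nodes and the CPU-only node isn't given, so this is just the mean):

```python
# Average draw of the three nodes mentioned above; the per-node split is not known.
total_kw, nodes = 7.2, 3
print(total_kw / nodes)   # 2.4 kW average per node -> a 5 kW rack fits only two
```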
 
  • Like
Reactions: Samir and T_Minus

Evan

Well-Known Member
Jan 6, 2016
We are doing 20-25kW racks for sure; they are becoming way more common.
For this it is mostly one of three solutions:
- Purpose-built internal DCs. $$$ (the number has 8 digits, but if done well you get 12-15 years before updates, so let's say 3 replacement cycles, and after a refurb that life again)
- Outsource to a colo; heaps are able to air-cool 25kW racks, and it's not that expensive really. Power may even be cheaper than on-prem, I have found.
- Dedicated containment options, which can be as little as one or two racks of high density if you like. Not that cheap, but neither are 25kW-per-rack IT loads.
 

Samir

Post Liker and Deal Hunter Extraordinaire!
Jul 21, 2017
HSV and SFO
And the argument is that older servers use too much power, lol. Yes, there is less computing power per watt, but you could easily fit those in a 5kW rack back in the day. :)
 
  • Like
Reactions: Patrick

Evan

Well-Known Member
Jan 6, 2016
And the argument is that older servers use too much power, lol. Yes, there is less computing power per watt, but you could easily fit those in a 5kW rack back in the day. :)
Well, yes and no. Today the idle power usage is a lot less than in the past, and peak is up from days gone by. It's just that once we used bigger servers, or at the very least we didn't put so many servers in a rack. While 42U is still common, there are a lot more 47/48U racks now, and people are putting in a couple of 1U TOR switches and then 20+ 2U servers, or if SAN-connected (also not a bad idea) just 32-port TOR switches and then 30 x 1U servers.

Server racks like this are still not near 25kW though; that comes from GPU servers, or when you go to 2 x dual-CPU servers per 1U for compute power.
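
As a back-of-the-envelope check on those configurations (the per-box wattages are my own assumptions, purely for illustration):

```python
# Rough rack totals for the configurations described above.
# Per-server wattages are assumptions, not measurements.
def rack_total_w(count, watts_each, overhead_w=500):
    """Servers plus a small allowance for TOR switches and other overhead."""
    return count * watts_each + overhead_w

print(rack_total_w(30, 500))    # 30 x 1U dual-socket servers @ ~500 W -> ~15.5 kW
print(rack_total_w(20, 600))    # 20+ 2U servers @ ~600 W              -> ~12.5 kW
print(rack_total_w(10, 2500))   # 10 x 4U GPU servers @ ~2.5 kW        -> ~25.5 kW
```

Even a full rack of ordinary 1U servers stays well short of 25kW; the GPU boxes are what push it there.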
 
  • Like
Reactions: Patrick

jabuzzard

Member
Mar 22, 2021
What is everyone planning for their datacenters? We have plenty of cooling from the neighboring physical plant, but can currently only supply 5kW per rack.
That's really rather wimpy to be honest.

Rear door coolers are good for in the region of 30kW per rack. I think we are pulling 32kW on average on ours when they have compute nodes in. However, it is just brutal opening the rear door when everything is screaming. Good for a laugh for unsuspecting vendor service engineers. With the door closed you would not have a scooby how hot it is about to become :).

For the vast majority of people this is the way forward. It is basically just a large car radiator in the back door of the rack. I know when I was a small child car radiators were not terribly reliable, but I can't think of when I have known someone outside of a crash having a leaky car radiator or even hoses, and a datacentre is a far more benign environment. We have been using them for nearly a decade now. They also mean you don't need to bother with the hot/cold aisle stuff, and your PUE improves dramatically over traditional air cooling.
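
To give a feel for what the door is doing, here is a minimal sketch of the water flow a roughly 30kW rear-door heat exchanger needs; the 10°C water temperature rise is my assumption, not a vendor figure:

```python
# Water flow needed to carry a rack's heat out through a rear-door heat exchanger.
# Energy balance: Q = m_dot * c_p * dT  ->  m_dot = Q / (c_p * dT)
HEAT_LOAD_W = 30_000   # ~30 kW rack, as described above
CP_WATER = 4186        # J/(kg*K), specific heat of water
DELTA_T_K = 10         # assumed water temperature rise across the door

mass_flow_kg_s = HEAT_LOAD_W / (CP_WATER * DELTA_T_K)
litres_per_min = mass_flow_kg_s * 60   # roughly 1 litre of water per kg
print(f"{mass_flow_kg_s:.2f} kg/s ~ {litres_per_min:.0f} L/min")  # ~0.72 kg/s ~ 43 L/min
```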

The current limit is, I think, around 130kW per rack with on-die water cooling. But that's really specialist stuff. You have to use vendor-supplied heat exchangers to maintain the water quality in your rack row. It effectively has to be ultrapure, and yes, that's a real term for water quality.

However, I recall proposals for much higher limits for exascale, which my back-of-the-envelope calculations suggest would turn it all into a molten puddle of metal in a minute or so should the cooling ever fail. If I recall correctly it was around one megawatt per rack. Yep, I nearly fell off my chair when I heard that too. Personally I don't think HVAC is anywhere near reliable enough for that, because on a cooling failure you have one, maybe two seconds to react before everything in the rack is scrap, so there's not a chance in hell of an orderly shutdown; you are going to have to throw breakers based on the incoming cooling water temperature being out of spec and/or not flowing.
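
For what it's worth, here is the kind of back-of-the-envelope estimate being referred to; the rack mass and average specific heat are my own rough assumptions, just to get an order of magnitude:

```python
# How fast a 1 MW rack heats up if the cooling water stops flowing.
# Mass and specific heat are rough assumptions for an order-of-magnitude estimate.
POWER_W = 1_000_000       # ~1 MW per rack, as in the exascale proposals mentioned
RACK_MASS_KG = 1000       # assumed mass of metal, boards and silicon in the rack
AVG_SPECIFIC_HEAT = 500   # J/(kg*K), rough average for steel/copper/aluminium

heating_rate_k_per_s = POWER_W / (RACK_MASS_KG * AVG_SPECIFIC_HEAT)
print(f"{heating_rate_k_per_s:.1f} K/s")   # ~2 K/s averaged over the whole rack
```

Averaged over the whole rack that is roughly 2 K per second, so silicon blows past its thermal limits within a minute; the one-to-two-second reaction window comes from the much smaller thermal mass of the dies and heatsinks themselves.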

As for free cooling, the best proposal I have seen is to drill down and install a heat exchanger in the water of the abandoned, flooded coal mines below the data centre. Year-round free cooling from that.