20+ hard disks -> which psu?

brad

New Member
Dec 22, 2015
14
0
1
Dear STH community,

I hope this is the correct forum for my problem.
Recently I switched to a Supermicro X10SDV-7TP4F and added a few hard drives.
As a PSU, I got a be quiet! Straight Power 680W.

My problem: when attaching more than 14 spinning + 2 solid-state drives, the system becomes unstable, generating tons of I/O errors and/or other ZFS errors.

I know that the primary 12V rail of the PSU can't support all the drives, so I distributed them over two 12V rails. For 5V, there is just one source, rated at 28A. Is it possible that 16 drives draw too much power on the 5V rail?

Besides that the MB manual states:
It is strongly recommended that you use a high quality power supply that meets ATX power supply Specification 2.02 or above. It must also be SSI compliant. (For more information, please refer to the web site at Server System Infrastructure (SSI) Forum). Additionally, in areas where noisy power transmission is present, you may choose to install a line filter to shield the computer from noise. It is recommended that you also install a power surge protector to help avoid problems caused by power surges.

The X10SDV Flex ATX Series motherboard alternatively supports an 8-pin 12V DC input power supply for embedded applications. The 12V DC input is limited to 36A by design. It provides up to 432W power input to the motherboard. Please keep onboard power use within the power limits specified above. Over-current DC power use may cause damage to the motherboard!
I do not understand the difference. I assume if I attach the power via the 8 pin connector, the mainboard generates its other voltages itself?

Can you please recommend a PSU for my server? No redundancy needed, with support for ~25 spinning drives and the standard desktop form factor.

Thank you.
 

brad

thanks, but i do not need any rails for graphics cards. these would be totally wasted in my case.
besides that, the 5v rail is rated even lower (20A) than on my current psu.
i guess i gotta get a current clamp and measure how much these disks really draw on the 5v and 12v.
there is no info on this in the specs from seagate :(
 

EffrafaxOfWug

Radioactive Member
Feb 12, 2015
1,394
511
113
Is this definitely a PSU problem...? Even if you're using power-hungry HDDs that swallow 15W each on the 12V rail (the vast majority of the power drawn by the drives will be over the 12V rail), 16 of them shouldn't pull more than 250W, and if you've split the load over different rails then this shouldn't be an issue... Didn't see a manual on the BQ website, but the Hexus review here says each 12V rail should support a minimum of 18A, so that's at least 200W available on each 12V rail.
Review: be quiet! Straight Power E9 680W CM PSU - PSU - HEXUS.net
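
As a rough sanity check of the figures in this post, the arithmetic can be written out in a few lines of Python. The 15W per drive and 18A per rail come from the post; the even split across two rails is an assumption, since the thread does not say how the drives were actually distributed.

```python
# Back-of-the-envelope check of the 12V figures quoted in the post.
RAIL_VOLTAGE = 12.0        # volts
WATTS_PER_DRIVE = 15.0     # pessimistic per-drive figure from the post
DRIVES = 16
RAIL_LIMIT_AMPS = 18.0     # per-rail minimum quoted from the Hexus review
RAILS_USED = 2             # ASSUMPTION: drives split evenly over two rails

total_watts = DRIVES * WATTS_PER_DRIVE       # worst-case 12V power for all drives
total_amps = total_watts / RAIL_VOLTAGE      # the same load expressed as current
rail_watts = RAIL_LIMIT_AMPS * RAIL_VOLTAGE  # power available on one rail
amps_per_rail = total_amps / RAILS_USED      # load per rail under the even split

print(f"All drives: {total_watts:.0f} W = {total_amps:.1f} A at 12 V")
print(f"One rail can supply {rail_watts:.0f} W ({RAIL_LIMIT_AMPS:.0f} A)")
print(f"Load per rail if split evenly: {amps_per_rail:.1f} A")
```

This only covers steady running at the pessimistic 15W figure; it says nothing about spin-up current.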

Do you get any weird clicking from the drives when you're running them all? A shortage of power on the 12V will usually mean a drive can't spin up properly; with a shortage on the 3.3/5V you'll typically get intermittent clicking as it can't power the actuators properly, or just failure of the electronics, which would result in the drive "disappearing". What does the OS say is happening during all this?

Do you have access to a power meter so you can see what the power-draw at the wall is?

Ignore the bit about the DC input, that's only for datacentres (or weirdoes with their own DC power supply ;)).
 

brad

it is just my uneducated guess, that the psu might be the problem.
here you can find the specs from my psu.
all drives work fine individually.
they sound ... "normal". certainly no clicking noises.
the whole system draws ~0.5A @ 235V. during spinup ~1.2A. the mainboard has some sensors to measure its voltages. unfortunately i haven't managed to get them running under linux yet. i'll try to measure the 5v and 12v with a scope and check if these drop with all drives.
it does not matter which drive i add as the 15th+, some random drives start to fail with io errors. here are some lines from the syslog:
Code:
Jul 15 06:26:39 spike kernel: [14041.906006] mpt2sas_cm0: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
Jul 15 06:26:39 spike kernel: [14041.906035] scsi_io_completion: 1 callbacks suppressed
Jul 15 06:26:39 spike kernel: [14041.906049] sd 0:0:17:0: [sdq] tag#0 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
Jul 15 06:26:39 spike kernel: [14041.906059] sd 0:0:17:0: [sdq] tag#0 CDB: Read(16) 88 00 00 00 00 00 ab 36 da e0 00 00 00 10 00 00
Jul 15 06:26:39 spike kernel: [14041.906064] blk_update_request: 1 callbacks suppressed
Jul 15 06:26:39 spike kernel: [14041.906070] blk_update_request: I/O error, dev sdq, sector 2872498912
Jul 15 06:26:39 spike kernel: [14041.911105] sd 0:0:17:0: [sdq] tag#5 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul 15 06:26:39 spike kernel: [14041.911133] sd 0:0:17:0: [sdq] tag#5 Sense Key : Not Ready [current]
Jul 15 06:26:39 spike kernel: [14041.911141] sd 0:0:17:0: [sdq] tag#5 Add. Sense: Logical unit not ready, cause not reportable
Jul 15 06:26:39 spike kernel: [14041.911150] sd 0:0:17:0: [sdq] tag#5 CDB: Read(16) 88 00 00 00 00 00 00 00 1a 10 00 00 00 10 00 00
Jul 15 06:26:39 spike kernel: [14041.911156] blk_update_request: I/O error, dev sdq, sector 6672
Jul 15 06:26:39 spike kernel: [14041.911251] sd 0:0:17:0: [sdq] tag#7 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul 15 06:26:39 spike kernel: [14041.911256] sd 0:0:17:0: [sdq] tag#7 Sense Key : Not Ready [current]
Jul 15 06:26:39 spike kernel: [14041.911261] sd 0:0:17:0: [sdq] tag#7 Add. Sense: Logical unit not ready, cause not reportable
Jul 15 06:26:39 spike kernel: [14041.911266] sd 0:0:17:0: [sdq] tag#7 CDB: Read(16) 88 00 00 00 00 01 5d 50 9c 10 00 00 00 10 00 00
Jul 15 06:26:39 spike kernel: [14041.911269] blk_update_request: I/O error, dev sdq, sector 5860531216
Jul 15 06:26:39 spike kernel: [14041.911343] sd 0:0:17:0: [sdq] tag#10 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul 15 06:26:39 spike kernel: [14041.911348] sd 0:0:17:0: [sdq] tag#10 Sense Key : Not Ready [current]
Jul 15 06:26:39 spike kernel: [14041.911353] sd 0:0:17:0: [sdq] tag#10 Add. Sense: Logical unit not ready, cause not reportable
Jul 15 06:26:39 spike kernel: [14041.911358] sd 0:0:17:0: [sdq] tag#10 CDB: Read(16) 88 00 00 00 00 01 5d 50 9e 10 00 00 00 10 00 00
Jul 15 06:26:39 spike kernel: [14041.911361] blk_update_request: I/O error, dev sdq, sector 5860531728
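
The Read(16) CDBs in this log can be decoded to cross-check the sector numbers the kernel reports. The sketch below assumes the standard READ(16) layout (opcode 0x88, an 8-byte big-endian LBA in bytes 2-9, a 4-byte transfer length in bytes 10-13); `decode_read16` is a hypothetical helper name, not part of any tool mentioned in the thread.

```python
from typing import Tuple


def decode_read16(cdb_hex: str) -> Tuple[int, int]:
    """Decode a READ(16) CDB given as space-separated hex bytes.

    Returns (lba, transfer_length_in_blocks).
    """
    cdb = bytes.fromhex(cdb_hex)  # fromhex skips the spaces between bytes
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2..9: logical block address
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10..13: transfer length
    return lba, blocks


# CDBs copied from the syslog excerpt
for cdb in (
    "88 00 00 00 00 00 ab 36 da e0 00 00 00 10 00 00",
    "88 00 00 00 00 00 00 00 1a 10 00 00 00 10 00 00",
    "88 00 00 00 00 01 5d 50 9c 10 00 00 00 10 00 00",
):
    print(decode_read16(cdb))
```

The decoded LBAs agree with the `blk_update_request` sector numbers in the log, and the failing addresses are spread across the disk rather than clustered in one region.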
 

EffrafaxOfWug

A quick `aptitude install lm-sensors` (or whatever your distro's equivalent is) followed by running sensors-detect should get you some reliable voltage info fairly sharpish; alternatively `ipmitool sensor` should get you some good info from the BMC.

Have you run a quick SMART check on these drives to see if that mentions owt obvious?

How are your drives hooked up...? Are all 16 of them plumbed into the LSI chip? Does it make any difference if you try running any of those drives (e.g. the SSDs) off the motherboard SATA?
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,513
5,805
113
I have seen this with 20+ drives and lower-cost PSUs. Personally, I used Seasonic power supplies, but now I just use the server PSUs in the cases I buy.
 

pricklypunter

Well-Known Member
Nov 10, 2015
1,709
517
113
Canada
I don't think the power supply is faulty, but I do suspect the power supply can't cope with the inrush current demand on the 12V rail. Most of the cheaper supplies use the 12V rail as a reference for regulation; if that goes out of whack, the primary supply will fold back and try again, over and over until the overload is removed. With this number of disks, you should really be using a power supply designed for server use :)
 

EffrafaxOfWug

If the PSU is behaving according to spec then it shouldn't have any problem dealing with this number of HDDs. That's a big if of course, but I think there are other things to test before splashing out on a new PSU. Don't really have much experience with BQ's PSUs but I always thought they were a company without a bad rep for cheap PSUs.

Spin-up draw is usually much, much higher than active draw, so even if the 12V rail had problems, I'd expect it to crap out at startup rather than after boot. I think monitoring load on the lines as you add more hard drives should be the priority at the minute.

I'll second Patrick's recommendation of Seasonic though, if you do end up buying another one. Been buying nothing but them for over a decade (and the occasional badge-engineered Seasonic-under-the-hood), solid through and through.
 

brad

EffrafaxOfWug said:
A quick `aptitude install lm-sensors` (or whatever your distro's equivalent is) followed by running sensors-detect should get you some reliable voltage info fairly sharpish; alternatively `ipmitool sensor` should get you some good info from the BMC.
done that already. lm-sensors driver is not yet written and ipmitool does not find /dev/ipmi...
Code:
Driver `coretemp':
  * Chip `Intel digital thermal sensor' (confidence: 9)

Driver `to-be-written':
  * ISA bus, address 0xca0
    Chip `IPMI BMC KCS' (confidence: 4)
EffrafaxOfWug said:
Have you run a quick SMART check on these drives to see if that mentions owt obvious?
All SMART data is fine.

EffrafaxOfWug said:
How are your drives hooked up...?
1x m.2 ssd onboard
1x m.2 ssd pcie card
16x sata on lsi
1x sata onboard

Patrick said:
Personally, I used Seasonic power supplies, but now I just use the server PSUs in the cases I buy.
Any particular model you can recommend? As I wrote: no graphics card needed. ~25 drives will be the maximum. and standard form factor, not these thin and long psus.

pricklypunter said:
I don't think the power supply is faulty, but I do suspect the power supply can't cope with the inrush current demand on the 12V rail.
I suspect that too, since it works with 16 drives and starts to cause random problems after that.
 

EffrafaxOfWug

Nutsacks, didn't realise there wasn't a driver for it yet. Quick look through the manual says it's using a winbond chip but can't see what model.

Regarding ipmitool, you might need to manually load some modules via modprobe to get /dev/ipmi0 to appear; try this:
Code:
modprobe ipmi_msghandler
modprobe ipmi_devintf
modprobe ipmi_si
The last Seasonic PSU I got was the fanless 520W platinum model for my workstation (6-core Xeon and a middlin' graphics card), the SS-520FL2. It's capable of 43A on the 12V rail. Next model up power-wise is the 600W, which'll do 55A and has a fan.
 

brad

Thanks @EffrafaxOfWug, that worked flawlessly.
Code:
ipmitool sensor
CPU Temp         | 50.000     | degrees C  | ok    | 0.000     | 0.000     | 0.000     | 99.000    | 104.000   | 104.000
PCH Temp         | 42.000     | degrees C  | ok    | 0.000     | 5.000     | 16.000    | 90.000    | 95.000    | 100.000
System Temp      | 34.000     | degrees C  | ok    | -10.000   | -5.000    | 0.000     | 80.000    | 85.000    | 90.000
Peripheral Temp  | 32.000     | degrees C  | ok    | -10.000   | -5.000    | 0.000     | 80.000    | 85.000    | 90.000
SAS2 Temp        | 51.000     | degrees C  | ok    | -5.000    | 0.000     | 5.000     | 100.000   | 105.000   | 110.000
DIMMA1 Temp      | 40.000     | degrees C  | ok    | -5.000    | 0.000     | 5.000     | 80.000    | 85.000    | 90.000
DIMMA2 Temp      | na         |            | na    | na        | na        | na        | na        | na        | na
DIMMB1 Temp      | 39.000     | degrees C  | ok    | -5.000    | 0.000     | 5.000     | 80.000    | 85.000    | 90.000
DIMMB2 Temp      | na         |            | na    | na        | na        | na        | na        | na        | na
FAN1             | 1500.000   | RPM        | ok    | 300.000   | 500.000   | 700.000   | 25300.000 | 25400.000 | 25500.000
FAN2             | 2800.000   | RPM        | ok    | 300.000   | 500.000   | 700.000   | 25300.000 | 25400.000 | 25500.000
FAN3             | na         |            | na    | na        | na        | na        | na        | na        | na
FAN4             | 1700.000   | RPM        | ok    | 300.000   | 500.000   | 700.000   | 25300.000 | 25400.000 | 25500.000
FANA             | 2800.000   | RPM        | ok    | 300.000   | 500.000   | 700.000   | 25300.000 | 25400.000 | 25500.000
FANB             | 1800.000   | RPM        | ok    | 300.000   | 500.000   | 700.000   | 25300.000 | 25400.000 | 25500.000
12V              | 11.937     | Volts      | ok    | 10.173    | 10.299    | 10.740    | 12.945    | 13.260    | 13.386
5VCC             | 4.922      | Volts      | ok    | 4.246     | 4.298     | 4.480     | 5.390     | 5.546     | 5.598
3.3VCC           | 3.265      | Volts      | ok    | 2.789     | 2.823     | 2.959     | 3.554     | 3.656     | 3.690
VBAT             | 3.159      | Volts      | ok    | 2.375     | 2.487     | 2.599     | 3.775     | 3.887     | 3.999
Vcpu             | 1.800      | Volts      | ok    | 1.242     | 1.260     | 1.395     | 1.899     | 2.088     | 2.106
VDIMMAB          | 1.182      | Volts      | ok    | 0.948     | 0.975     | 1.047     | 1.344     | 1.425     | 1.443
VDIMMCD          | 1.200      | Volts      | ok    | 0.948     | 0.975     | 1.047     | 1.344     | 1.425     | 1.443
5VSB             | 4.896      | Volts      | ok    | 4.246     | 4.298     | 4.480     | 5.390     | 5.546     | 5.598
3.3VSB           | 3.231      | Volts      | ok    | 2.789     | 2.823     | 2.959     | 3.554     | 3.656     | 3.690
1.5V PCH         | 1.509      | Volts      | ok    | 1.320     | 1.347     | 1.401     | 1.644     | 1.671     | 1.698
1.2V BMC         | 1.227      | Volts      | ok    | 1.020     | 1.047     | 1.092     | 1.344     | 1.371     | 1.398
1.05V PCH        | 1.050      | Volts      | ok    | 0.870     | 0.897     | 0.942     | 1.194     | 1.221     | 1.248
Chassis Intru    | 0x0        | discrete   | 0x0000| na        | na        | na        | na        | na        | na
though this is currently for the stable system. i gotta do some tests with that.
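
For repeated tests while adding drives, output like the table above could be parsed so the voltage rows are easy to compare between runs. This is only a sketch: `voltage_margins` is a hypothetical helper, and the six trailing columns are taken to be the lower non-recoverable, lower critical, lower non-critical, upper non-critical, upper critical, and upper non-recoverable thresholds, which is the order ipmitool normally prints them in.

```python
# Three rows copied from the `ipmitool sensor` output in this post
SAMPLE = """\
12V              | 11.937     | Volts      | ok    | 10.173    | 10.299    | 10.740    | 12.945    | 13.260    | 13.386
5VCC             | 4.922      | Volts      | ok    | 4.246     | 4.298     | 4.480     | 5.390     | 5.546     | 5.598
FAN3             | na         |            | na    | na        | na        | na        | na        | na        | na
"""


def voltage_margins(text):
    """Map each voltage sensor to (reading, headroom above lower non-critical)."""
    margins = {}
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Keep only complete rows whose unit column says "Volts"
        if len(fields) != 10 or fields[2] != "Volts":
            continue
        if fields[1] == "na" or fields[6] == "na":
            continue  # sensor present but not reporting
        reading = float(fields[1])
        lower_non_critical = float(fields[6])
        margins[fields[0]] = (reading, round(reading - lower_non_critical, 3))
    return margins


for name, (reading, headroom) in voltage_margins(SAMPLE).items():
    print(f"{name}: {reading} V, {headroom} V above lower non-critical")
```

In practice the text would come from running `ipmitool sensor` once with the stable configuration and again with the extra drives attached, then comparing the 12V and 5VCC readings.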
 

fractal

Active Member
Jun 7, 2016
309
69
28
33
brad said:
the whole system draws ~.5A@235V. during spinup ~1.2A. the mainboard has some sensors to measure its voltages.
Making overly conservative assumptions you want a power supply capable of delivering 25 amps of 12 volts. Making reasonable assumptions it is less than that. In fact, making reasonable assumptions AND assuming you are drawing all of your +12V from one rail, you APPEAR to be hitting the 18 amp limit on the 12V rails.

The manual you linked confirms that you have 18 amps of 12v on +12v1 to run your drives and everything on your motherboard other than the processor and your fans. You have another 18 amps of 12v on +12v2 to run your processor. You have two more 12v rails to run your video card(s).

My suggestion is that you purchase a power supply rated for at least 350 watts, with a single +12V rail rated for at least 25 amps. Find one with enough plugs to hook up your stuff. The Seasonic you linked is twice the power supply you need and looks very expensive. But, it should work just fine.
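
The wall readings quoted at the top of this post can be turned into a rough 12V estimate. This is only a sketch: it treats volts times amps as real power (a power factor of 1), assumes 90% PSU efficiency, and pretends the whole load sits on 12V, none of which the thread confirms.

```python
MAINS_VOLTS = 235.0        # from the quoted measurement
STEADY_AMPS = 0.5          # wall current when running, from the quote
SPINUP_AMPS = 1.2          # wall current during spin-up, from the quote
ASSUMED_EFFICIENCY = 0.90  # ASSUMPTION: not stated anywhere in the thread
RAIL_LIMIT_AMPS = 18.0     # per-rail limit discussed in the thread


def dc_watts(wall_amps):
    """Estimated DC output power for a given wall current."""
    return wall_amps * MAINS_VOLTS * ASSUMED_EFFICIENCY


def amps_if_all_on_12v(wall_amps):
    """Equivalent 12V current if the whole DC load were on the 12V side."""
    return dc_watts(wall_amps) / 12.0


steady_12v = amps_if_all_on_12v(STEADY_AMPS)
spinup_12v = amps_if_all_on_12v(SPINUP_AMPS)

print(f"steady:  {dc_watts(STEADY_AMPS):.1f} W, about {steady_12v:.1f} A at 12 V")
print(f"spin-up: {dc_watts(SPINUP_AMPS):.1f} W, about {spinup_12v:.1f} A at 12 V")
print("spin-up exceeds one 18 A rail:", spinup_12v > RAIL_LIMIT_AMPS)
```

Under those assumptions the spin-up figure lands above a single 18A rail while the steady-state figure stays well below it.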
 

jgreco

New Member
Sep 7, 2013
28
16
3
fractal said:
My suggestion is that you purchase a power supply rated for at least 350 watts, with a single +12V rail rated for at least 25 amps. Find one with enough plugs to hook up your stuff.
You're suggesting a 350 watt power supply can run 20+ hard disks? That's crazy. The spinup current alone is in the neighborhood of 42 amps or around 500 watts. That's not powering the drive electronics, that's not powering the host system, that's not leaving any margin for error. If you were able to get PUIS configured, the 8 watts a typical drive requires times 20 is 160, plus the 150 watt peak a typical Xeon E3 system might require, plus spinup current for a single drive at a time, still puts you perilously close to that 350 watts, leaving no margin for derating or aging of components over time. Your PSU will be cooking.
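
The numbers in this post can be reproduced as follows. The 42A, 8W per drive, and 150W host figures are from the post; the per-drive spin-up current is derived by dividing 42A across 20 drives, which is an assumption about how the total was reached.

```python
DRIVES = 20
SPINUP_TOTAL_AMPS = 42.0  # from the post
RAIL_VOLTAGE = 12.0
RUNNING_WATTS = 8.0       # typical running power per drive, from the post
HOST_PEAK_WATTS = 150.0   # typical Xeon E3 system peak, from the post
PSU_WATTS = 350.0         # the supply size being disputed

# ASSUMPTION: the 42 A total is spread evenly over 20 drives
spinup_amps_per_drive = SPINUP_TOTAL_AMPS / DRIVES

# Case 1: every drive spins up at once (drives only, no host, no margin)
simultaneous_watts = SPINUP_TOTAL_AMPS * RAIL_VOLTAGE

# Case 2: PUIS, all drives running plus one drive's spin-up current at a time
puis_watts = (
    DRIVES * RUNNING_WATTS
    + HOST_PEAK_WATTS
    + spinup_amps_per_drive * RAIL_VOLTAGE
)

print(f"simultaneous spin-up, drives only: {simultaneous_watts:.0f} W")
print(f"PUIS worst case: {puis_watts:.1f} W of {PSU_WATTS:.0f} W")
print(f"headroom with PUIS: {PSU_WATTS - puis_watts:.1f} W")
```

Even in the PUIS case the estimate leaves only a few percent of headroom on a 350W supply, before any derating.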

I am very much in favor of slightly oversizing a PSU in order to provide a safer experience. It is dumb to try to cut costs on a PSU, because a damaged PSU is perfectly capable of ruining all your other expensive components.

We have a nice guide over in the FreeNAS forums that describes some much safer concepts, admittedly very conservative, but also showing some interesting stuff such as power intake during spinup for various drives. It doesn't go out as far as 20 drives, because once you get out around a dozen, different factors start to play a significant role.
 

fractal

jgreco said:
You're suggesting a 350 watt power supply can run 20+ hard disks? That's crazy. The spinup current alone is in the neighborhood of 42 amps or around 500 watts. That's not powering the drive electronics, that's not powering the host system, that's not leaving any margin for error.
You are right. I was only considering the steady state using his posted numbers. I failed to consider spinup current.

The 600 watt seasonic he linked is looking more attractive in that light.
 

wiretap

Active Member
Jul 14, 2015
128
88
28
Michigan
Spin-up is really what you have to size your power supply for when running a large number of hard drives. You can make a spreadsheet of the current draw in amps by looking up each hard drive on the manufacturer's website, then summing them all up. Staggered spinup can help a little bit if your motherboard or SAS/SATA controller supports it, but it doesn't help all that much: even with staggered spinup, the drives usually all come on relatively fast. The current draw upon startup is more of a logarithmic decay, so until the drives reach full speed, you're still drawing quite a few amps. It is best to go with a good quality power supply that supplies the rated amps you need, with minimal AC ripple at full load. I find Seasonic power supplies are my go-to for top quality power under maximum load.
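
The spreadsheet approach described here could be sketched in Python. The per-drive currents below are placeholders, not datasheet values, and the model is deliberately crude: a drive draws its spin-up current while its group is starting and its running current afterwards. That ignores the slow decay of the startup current, so it overstates the benefit of staggering.

```python
def peak_12v_amps(drives, group_size):
    """Peak 12V current when drives spin up `group_size` at a time.

    drives: list of (spinup_amps, running_amps) tuples, one per drive.
    Simplification: a group is assumed to have fully settled to its
    running current before the next group starts.
    """
    peak = 0.0
    settled = 0.0  # current drawn by drives that have already spun up
    for start in range(0, len(drives), group_size):
        group = drives[start:start + group_size]
        load = settled + sum(spinup for spinup, _ in group)
        peak = max(peak, load)
        settled += sum(running for _, running in group)
    return peak


# PLACEHOLDER figures: 20 identical drives, 2.0 A spin-up, 0.5 A running
drives = [(2.0, 0.5)] * 20

print("all at once:   ", peak_12v_amps(drives, group_size=20), "A")
print("groups of four:", peak_12v_amps(drives, group_size=4), "A")
```

Replacing the placeholder tuples with the 12V spin-up and running currents from each drive's datasheet gives a first estimate of the rail rating needed.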