need advice about quad GPU build for cuDNN

Discussion in 'DIY Server and Workstation Builds' started by fragar, Feb 4, 2019.

  1. fragar

    This is my first post here and my first computer build (as will soon become apparent :)). I work in deep learning and, after renting various servers for several years, have built the following machine for a specific workflow centered around cuDNN:

    Build’s Name: Server-1
    Operating System/ Storage Platform: Ubuntu Server 18.04
    CPU: Threadripper 2990WX
    Motherboard: X399 Taichi
    Chassis: ?? (need advice)
    Drives: Samsung SSD 970 EVO 1 TB M.2
    RAM: Corsair Vengeance LPX 4x16 GB 3200 MHz
    Graphics Cards: Quad (4x) GigaByte RTX 2080 Ti WindForce
    Power Supply: Corsair AX1600i

    For now I have it running on an open test bench in my office (pictures attached). My plan is to put it in a rack mount case and move it to a co-location facility once everything is done, but I have not yet purchased the rack mount case or case fans.

    My problem is with cooling the graphics cards. I made the mistake of getting "open-air"-style graphics cards instead of blower-style and now I am experiencing significant thermal throttling and would like some advice about what to do.
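
One way to put numbers on the throttling is to log temperature and SM clock per card while the workload runs. The following is a minimal sketch, not the setup used in this thread: it assumes `nvidia-smi` is on the PATH and that the query fields `index`, `temperature.gpu`, `clocks.sm` and `power.draw` are reported by the driver. The function names are made up for illustration.

```python
import csv
import io
import subprocess

# Query fields passed to nvidia-smi (assumed to be supported by the driver).
FIELDS = ["index", "temperature.gpu", "clocks.sm", "power.draw"]


def parse_gpu_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts."""
    rows = []
    for rec in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not rec:
            continue  # skip blank lines
        idx, temp, clock, power = (field.strip() for field in rec)
        rows.append({
            "gpu": int(idx),
            "temp_c": float(temp),
            "sm_mhz": float(clock),
            "power_w": float(power),
        })
    return rows


def query_gpus():
    """Run nvidia-smi once and return one dict per GPU."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)


def relative_sm_clock(rows):
    """SM clock of each GPU as a fraction of the fastest GPU in the sample."""
    top = max(r["sm_mhz"] for r in rows)
    return {r["gpu"]: r["sm_mhz"] / top for r in rows}
```

Calling `query_gpus()` every few seconds during a training run and comparing `relative_sm_clock(...)` across cards should show whether the middle cards are the ones dropping clocks. A field that the driver reports as `[N/A]` would make the `float(...)` conversion fail, so that case would need handling.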

    When I run the graphics cards with their fans and shields on, and point an external 16" house fan running at max speed over the open-air case, I get the following performance:

    GPU 0 (on the end, fans facing open air, backplate facing GPU 1): 100% (i.e. same as running just one GPU)
    GPU 1 (surrounded on both sides): 71% of one GPU
    GPU 2 (surrounded on both sides): 82%
    GPU 3 (on the end, fans facing GPU 2, backplate open): 91%

    When I remove the fans and shields of GPUs 1-3, as shown in the pictures, and use the same external house fan to get air flow, I get the following improved performance:

    GPU 0: 100%
    GPU 1: 77%
    GPU 2: 89%
    GPU 3: 96%

    So, removing the fans and shields from the graphics cards clearly helps. The performance still isn't great, but it's (I guess) acceptable. However, I am not sure if the airflow will be as good inside a case.
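
To compare the two configurations as a whole, the per-card figures above can be summed. A rough sketch follows, with two assumptions that are not from the measurements themselves: that the four percentages simply add up when the cards work independently, and that in synchronous data-parallel training the slowest card sets the pace for all four.

```python
# Per-GPU throughput relative to a single unthrottled card, from the lists above.
with_shrouds = {0: 100, 1: 71, 2: 82, 3: 91}
without_shrouds = {0: 100, 1: 77, 2: 89, 3: 96}  # fans and shields removed on GPUs 1-3


def summarize(relative_pct):
    """Aggregate relative throughput for one cooling configuration."""
    total = sum(relative_pct.values())
    n = len(relative_pct)
    return {
        # Cards working independently: throughputs add up.
        "gpu_equivalents": total / 100,
        "efficiency": total / (100 * n),
        # Assumption: lock-step training runs every card at the slowest card's rate.
        "sync_bound_gpu_equivalents": min(relative_pct.values()) * n / 100,
    }


for name, cfg in [("with shrouds", with_shrouds), ("without shrouds", without_shrouds)]:
    print(name, summarize(cfg))
```

Under the additive assumption this gives 3.44 GPU-equivalents (86%) with the shrouds on and 3.62 (90.5%) with them off, which is where the "10% or so" loss comes from. If the cards have to run in lock step, the bound drops to 2.84 and 3.08 respectively, so GPU 1 matters more than the average suggests.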

    My questions are:

    1. What rack mount server case would be best for the above components?
    2. Is there anything worth doing to try to improve the GPU cooling?

    It seems to me that I have the following options:

    1. Put this system inside something like a Rosewill RSV-L4500 case, get the best case fans (maybe Delta PFB1212UHE-F00 ??), and accept a 10% or so loss of performance.
    2. Split out the graphics cards using a PCI-e x16 extender, put the WindForce GPU fans back on, and move the cards to a separate section of the case where they have more spacing.
    3. Keep the graphics cards on the motherboard (where they have dual-slot spacing) but replace the GPU cooling with either water cooling, a better passive solution, or a custom blower.

    I am leaning towards #1 but am not sure. I also haven't been able to find any aftermarket passive or blower-style cooling solutions for the RTX 2080 Ti.
     

    Attached Files:

    #1
    Last edited: Feb 4, 2019
  2. MiniKnight

    #2
  3. maze

    Have you looked at Spotwoods open rig mining setups?

    Your alternative method could be to remove the fans, remove the PCI plates (or make them a lot more open), and simply get a 3U or so case that you can fit a few Gentle Typhoons or similar super-high-output fans in, to remove the heat by pure airflow.

    Edit:

    https://www.amazon.com/RAIJINTEK-MORPHEUS-Superior-High-end-Cooler/dp/B071VZ7M4K
    - could be an option, but the fins run the wrong way :/

    Or
    https://www.arctic.ac/eu_en/accelero-s3.html
    You could try and see if it's possible to make this fit with some copper heatsinks on the chips. With enough airflow it could be possible.
     
    #3
    Last edited: Feb 5, 2019
  4. fragar

    Thanks for the comments.

    Those solutions would need PCIe extenders, like this one:

    Thermaltake - TT Premium PCI-E 3.0 Extender – 600mm

    I wasn't able to find any really clear reports online about deep learning builds which use this approach. There were a few people discussing it in forums, but without clear conclusions. It's not clear if those cables can be routed through those cases, or how long the cables need to be (600mm may not be enough), or how good the performance will be.

    This approach is widely used by crypto miners, but those workloads are much less bandwidth-constrained, which allows the use of more flexible x1-to-x16 risers.
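
To illustrate the gap between an x1 riser and a full x16 link, here is a back-of-the-envelope calculation. It uses the PCIe 3.0 line rate of 8 GT/s per lane with 128b/130b encoding and ignores protocol overhead, so real throughput will be lower. The batch size is a made-up example, not a measurement from this build.

```python
PCIE3_GT_PER_S = 8.0      # PCIe 3.0 raw rate per lane, in gigatransfers per second
ENCODING = 128 / 130      # 128b/130b line encoding efficiency


def pcie3_gb_per_s(lanes):
    """Theoretical one-way PCIe 3.0 bandwidth in GB/s (protocol overhead ignored)."""
    return PCIE3_GT_PER_S * ENCODING / 8 * lanes


def transfer_ms(n_bytes, lanes):
    """Time to move n_bytes over the link at the theoretical rate, in milliseconds."""
    return n_bytes / (pcie3_gb_per_s(lanes) * 1e9) * 1e3


# Hypothetical batch: 256 images of 3 x 224 x 224 float32 values.
batch_bytes = 256 * 3 * 224 * 224 * 4

for lanes in (1, 16):
    print(f"x{lanes}: {pcie3_gb_per_s(lanes):.2f} GB/s, "
          f"{transfer_ms(batch_bytes, lanes):.1f} ms per batch")
```

That works out to roughly 0.98 GB/s for x1 against 15.75 GB/s for x16. For the example batch, this is on the order of 157 ms per host-to-device copy on x1 versus about 10 ms on x16, which is why mining-style x1 risers are not an obvious fit for training.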

    Can anyone comment on the viability of PCIe x16 extenders for deep learning builds?
     
    #4
  5. fragar

    Indeed, something like this would be my preferred approach. Strong front-to-back airflow can no doubt be achieved through a normal server case (like the Rosewill RSV-L4000) with a boatload of 120mm Delta 9000 rpm fans.

    In fact, this is pretty close to what I am doing now. I removed the fans and shielding from the GigaByte 2080 Ti WindForce cards and am blowing air over them with a large room fan. Photo attached.

    The problem with my approach, and with the Raijintek Morpheus, is indeed that the heatsink grooves run top-to-bottom. The S3 solves that problem, but it also has some drawbacks:

    1. It doesn’t officially support the 2080 Ti.
    2. It’s advertised as only dissipating up to 135W.
    3. It’s also not really aimed at server builds (e.g. the bumps on the backplate run top-to-bottom).
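
To put the 135W rating in perspective, a quick calculation follows. The 250 W board power used here is an assumption based on the reference RTX 2080 Ti specification, not a figure measured on these WindForce cards, and the rating presumably reflects passive use without the forced airflow of a server chassis.

```python
COOLER_RATED_W = 135.0         # advertised dissipation of the Accelero S3 (from the list above)
ASSUMED_BOARD_POWER_W = 250.0  # assumption: reference RTX 2080 Ti board power


def cooler_headroom(rated_w, board_w):
    """Compare a cooler's rated dissipation with the card's board power."""
    return {
        # Heat the cooler is not rated to handle at full board power.
        "shortfall_w": max(board_w - rated_w, 0.0),
        # Fraction of board power the cooler is rated for.
        "power_fraction": min(rated_w / board_w, 1.0),
    }


print(cooler_headroom(COOLER_RATED_W, ASSUMED_BOARD_POWER_W))
```

Under that assumption the cooler is rated for 54% of the card's power and falls 115 W short, so any remaining margin would have to come from strong case airflow.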

    Can anyone recommend any good passive coolers which can be installed on these cards for use in a server (i.e. with strong front-to-back airflow)?
     

    Attached Files:

    #5
    Last edited: Feb 5, 2019