Workstation for huge graphics files


Rand__

Well-Known Member
Mar 6, 2014
So your basic requirements for the acquisition box are:
- x16 slot for your data drive
- x8 slot (or onboard 10G) for the NIC
- 4 cores with high clock speed (or more depending on budget; prioritize clock over cores on clients)

Options:
E5-1630 v4 (for clock), X10SRi (w/ IPMI, for x16 slot) / X10SRA (no IPMI), DDR4-2400
Xeon W-2125, X11SRA (no IPMI), DDR4-2666
Xeon Gold 5122, X11SPM-TPF, DDR4-2666 (includes 10G fibre onboard)
 

aag

Member
Jun 4, 2016
The Win2016 server has to be a server mainly because (I was told) the tape backup software runs only on a server OS (and our IT department would be happy to take care of updates etc. at zero cost to my budget). Apart from that, there will be only 2-3 concurrent users. The computations will mostly involve Imaris, ImageJ/Fiji, and a bunch of Python scripts which we have developed in-house. Hence the requirements are not all that different from those of the clients.

I also like the idea of having very similar hardware, which would make it possible to swap machines or offload tasks if need be. But the question still remains: which of the thousand Xeons and motherboards would be a good match?

Some of the image analytics is extremely computationally intensive: just one process runs on the current platform for several days. However, I want to move away from doing these things on our PCs, and would rather rent processing time at the National Supercomputing Center (which we can access at subsidized low rates, actually almost-free).
 

Rand__

Well-Known Member
Mar 6, 2014
The question is whether your software can take advantage of new CPU features like AVX-512, which would then benefit from Xeon SPs, or not (in which case older tech is fine too). Then of course there is the question of whether the software is capable of multithreading =>
Multi Core/Processor Support
Imaris has extended support for multi core/processor machines since 6.2.0. Most image processing functions are multithreaded. Tests are done with up to 8 cores where Imaris still scales nicely. ImarisBatch scales even better because each job runs in its own 'Imaris'.
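That "each job runs in its own 'Imaris'" model is plain process-level parallelism, and the same trick applies to your in-house Python scripts. A minimal sketch, assuming a hypothetical process_stack() analysis step and a folder of TIFF stacks:

```python
# Job-level parallelism in the ImarisBatch style: each image stack is
# processed in its own worker process, so throughput scales with cores
# even if the per-stack code is single-threaded.
# process_stack() is a hypothetical stand-in for the real analysis step.
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

def process_stack(path: Path) -> str:
    # placeholder: run the real ImageJ/Fiji or Python analysis here
    return f"done: {path.name}"

if __name__ == "__main__":
    stacks = sorted(Path("/data/acquisitions").glob("*.tif"))
    # one worker per physical core; tune max_workers to the CPU you buy
    with ProcessPoolExecutor(max_workers=8) as pool:
        for result in pool.map(process_stack, stacks):
            print(result)
```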

Also, maybe you can use the GPU to speed things up? It's not entirely clear whether that's for display or for computation.

If the conversion to the NSC is easy then go for that, of course; if not, get a local solution first.
 

aag

Member
Jun 4, 2016
I doubt that any of our tasks use AVX-512, but I'd be happy to pay some overhead in order to delay obsolescence.
So, shall I spec the Supermicro X11SRA-RF together with a lesser Xeon (but which one)? This board is only offered in mid/mini-tower or 4U/tower chassis, which takes more space than a rackmount but of course offers more flexibility if we need to add stuff.
 

Rand__

Well-Known Member
Mar 6, 2014
As I said, if you are happy with 4 cores on clients, then "Xeon W-2125, X11SRA (no IPMI), DDR4-2666".

But that means no IPMI and no rack chassis by default, so a custom build, and it is hard to manage in the datacenter.
Get one from Dell/HPE with 10G onboard + an x16 slot for storage,

or move up to "Xeon Gold 5122, X11SPM-TPF, DDR4-2666 (includes 10G fibre onboard)", or a 6144 if you think you need more cores.
 

Blinky 42

Active Member
Aug 6, 2015
For new HW, you could pretty easily build very beefy workstations within your budget and leave a lot of $ on the table to enhance the central server / backup solutions. You obviously need a decent amount of main memory and should get a decent amount of NVMe PCIe storage as well.

Just a few minutes futzing on Thinkmate's site (since the configurator is fast):
Mid-tower HPX XF4-2460v4 with a dual E5-2600 v4 motherboard:
  • 2x E5-2637 v4 (for high-speed cores, 4 each @ 3.5GHz; if your apps do better with more cores, go for the E5-2650 v4 with 12 cores each @ 2.2GHz for similar $)
  • 8x 64GB 2400 DDR4 ECC = 512GB total
  • NVIDIA Quadro P1000 4GB GDDR5 (adjust up as needed...)
  • 1x Mellanox 40/56G MCX413A
  • 2x P4500 4TB NVMe SSD
That is only $16.5k list price. You get 1/2 TB of memory, 40G networking, and 8TB of in-workstation SSD. You can add 4x 10TB drives for lots of spinning-rust storage to make the workstation even more stand-alone. I went with older server CPUs because they are a bit cheaper right now, so you can spend more on the memory and SSDs.
If you are spending $ on networking, go for 40G+ (the configurator doesn't let you pick a 50G card, but I would go for an MCX416A or higher for 50G support and more future-proofing). You can get similar systems from all the major vendors as well. AMD systems and newer Intel systems work just as well and possibly even better, but the point is that you can get the large amount of memory and NVMe you need without spending a fortune.

If you are thinking $50k, you can get two of these workstations and then have $17k left to put into enhancing your central server/tape solution, or buy a third workstation for another person to work through your data sets and get more productivity, if you have the people available.
 

Rand__

Well-Known Member
Mar 6, 2014
I think he needs to add a switch to the bill as well.
Surprising that you don't have 10G in the datacenter though. Might want to double-check.
 

aag

Member
Jun 4, 2016
OK, you guys have convinced me that it may be worth exploring a 40Gb network. The Mellanox MCX416A adapter goes (theoretically) up to 100Gb/s and costs less than $1k; I can afford three of them without problems.

The question however is: which switch would be able to cope with this speed? I guess that it would make sense to buy a Mellanox switch as well. The SN2010 would do the job with 4 high-speed ports. At $5.5k, it would be within budget. However, I think that I need to buy the appropriate transceiver modules in addition, right? But which ones? The Mellanox web site is very confusing...
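To get a feel for what those line rates buy, here is a back-of-envelope sketch of transfer times for a 1 TB image set at theoretical line rate (real-world throughput will be lower due to protocol and storage overhead):

```python
# Back-of-envelope transfer times for a 1 TB image set at theoretical
# line rate; real throughput will be lower (protocol overhead, storage).
dataset_tb = 1.0
for name, gbit in [("1 GbE", 1), ("10 GbE", 10), ("40 GbE", 40), ("100 GbE", 100)]:
    seconds = dataset_tb * 8e12 / (gbit * 1e9)  # TB -> bits / link speed
    print(f"{name}: {seconds / 60:.1f} min")
# prints: 1 GbE: 133.3 min, 10 GbE: 13.3 min, 40 GbE: 3.3 min, 100 GbE: 1.3 min
```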
 

Rand__

Well-Known Member
Mar 6, 2014
You could inquire at Mellanox about a PoC setup of cards & switch with a rebate. If you do, let me know what they are asking for it ;)
 

aag

Member
Jun 4, 2016
Perfect! I am meeting our IT guys this Friday, and will propose the specifications above to them.

I am very happy to have updated my HW knowledge. While I have no ambition to become a real expert, ultimately I will be held responsible for getting results out of this system. Hence it is appropriate that I spend a few hours familiarizing myself with the details. Now, if the local experts propose a dramatically different configuration, I will at least be able to ask (somewhat) informed questions.

A million thanks to all discussants!
 

Samir

Post Liker and Deal Hunter Extraordinaire!
Jul 21, 2017
A little late to the party, but this is a fascinating project that parallels what I had to do, with much, much lower data volume.

My project consisted of getting paper converted to PDFs. Similar to your challenge, the image acquisition process is the slowest, and yet batchable. This provides room for pipelining the work, where one machine can scan (or batch scan) while another reviews. This can be done in a super-simple way: just have two workstations with appropriately fast local storage connected directly via 40G+ connections. Share the drives to each other, and one can act as a 'server' and the other as a client, or you can change it up as needed. A minimal sketch of that handoff follows.
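A sketch of the scan -> review handoff, assuming hypothetical share and folder names: a watcher on the review box pulls finished scans off the acquisition box's share.

```python
# Watch-folder sketch for the scan -> review pipeline: the acquisition
# box drops finished files into a shared folder; this script, running
# on the review box, moves them onto fast local storage as they settle.
# All paths are hypothetical placeholders.
import shutil
import time
from pathlib import Path

INCOMING = Path(r"\\SCANBOX\incoming")  # share on the acquisition machine
REVIEW = Path(r"D:\review_queue")       # fast local storage on the review box

def is_stable(path: Path, wait: float = 5.0) -> bool:
    """Treat a file as finished once its size stops changing."""
    size = path.stat().st_size
    time.sleep(wait)
    return path.stat().st_size == size

while True:
    for f in INCOMING.glob("*.tif"):
        if is_stable(f):
            shutil.move(str(f), REVIEW / f.name)
            print(f"queued for review: {f.name}")
    time.sleep(30)
```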

For backups, you could have another system/server designated for warm storage along with the tape backup. These backups could run at the end of the day when the workstations are not being used, or the system/server could just copy from the system that's scanning on a regular schedule--1hr, 6hr, etc. And you could just put two network cards in the server and two more in the workstations to get the equivalent of a switched network without the switch. Though if it costs about the same, a switch and just one network card each would be the more traditional approach. Something like the sketch below would cover the periodic copy.
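A minimal sketch of that periodic copy, assuming a Windows box and hypothetical share names; it just drives robocopy (which ships with Windows) and could be kicked off hourly from Task Scheduler:

```python
# Periodic copy to warm storage, driven by robocopy from Python.
# Paths are hypothetical placeholders; schedule via Task Scheduler.
import subprocess

SRC = r"D:\acquisitions"
DST = r"\\WARMSTORE\backups\acquisitions"

# /MIR mirrors the tree (including deletions, so the target tracks the
# source exactly); /FFT tolerates coarse filesystem timestamps; /R and
# /W limit retries so one locked file doesn't stall the whole run.
result = subprocess.run(["robocopy", SRC, DST, "/MIR", "/FFT", "/R:2", "/W:5"])

# robocopy exit codes below 8 all mean success (0 = nothing new to copy)
if result.returncode >= 8:
    raise RuntimeError(f"robocopy failed with exit code {result.returncode}")
```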

One important question that I saw missing in a lot of the posts: how large is your overall data set going to be, and does it all need to be online or nearline, or is archived okay? There's a huge difference between just needing a day or two's worth of data online versus the whole project online.

Also, you mentioned RDP as a way to access one of the systems if you're running it headless. My advice on this is: don't. Even in its fastest incarnation, any remote-control software will not be as fast as simply having another monitor and keyboard, or a KVM switch.

And I've noticed you haven't mentioned GPUs. While the processors will be fast for processing, the native video will be puny by comparison. Getting some top-of-the-line workstation or gaming GPUs with fast pixel and texture rates will help with screen drawing immensely. Even in my simple PDF project, an older gaming GPU beats the native video by a factor of at least 2-3x. Don't underestimate the impact of this part of your project, since you are dealing with a lot of images that ultimately need to be drawn on the screen.

I think almost everyone will be interested in hearing what you end up going with, so please do stop by and let us know how well it works. :)
 

aag

Member
Jun 4, 2016
Thank you for these thoughts.
  • Storage: we are accumulating at least 15 TB/month. We have bought a QNAP with 3 or 4 SSDs and 12 HDDs (14 TB each) configured as RAID6. For the time being, it is attached to a 1Gb/s network, and that is unacceptably slow.
  • We have ordered two of the Mellanox 40/100 Gb/s cards that were advised above. The intent is to create a direct link via fiber, just as you say. Once we have more workstations, we will buy the Mellanox switch ($5k).
best
AAG
 

aag

Member
Jun 4, 2016
I need advice on one more thing. The backup/archiving solution is not fully worked out yet.

  • LTO-7 tapes hold 6 TB and cost ca. $60 each, and a tape autoloader with 8-10 slots is ca. $3.5k.
  • For comparison, HDDs of that size cost 3x as much (ca. $180), but they do not require anything except a drive adapter and are easier to manage than tape (no special software etc.). They can also be shipped to our collaborators at Stanford, who may not have a tape device in their lab.
The financial break-even would be reached at about 30 HDDs, after which the HDD solution becomes more expensive than the tapes; see the arithmetic spelled out below. This corresponds to ca. 1 year of data. But of course one could buy higher-capacity HDDs and get some additional savings.
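The break-even arithmetic, spelled out with the prices above:

```python
# Tape-vs-HDD break-even, using the prices quoted above.
autoloader = 3500   # tape autoloader, USD (one-time)
tape_cost = 60      # LTO-7 tape, ~6 TB, USD each
hdd_cost = 180      # 6 TB HDD, USD each (~3x the tape price)

# Each 6 TB unit stored on HDD costs (180 - 60) = 120 USD more than on
# tape, so the autoloader pays for itself after 3500 / 120 ~= 30 units.
units = autoloader / (hdd_cost - tape_cost)
print(f"break-even after ~{units:.0f} media units "
      f"(~{units * 6:.0f} TB, about a year at 15 TB/month)")
```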

I am really at a loss here. In principle, I like the idea of the tapes - but I also see a lot of drawbacks. Any opinions?
 

gea

Well-Known Member
Dec 31, 2010
ok, check

You have around 50k USD for this project, which should include:
- 3 state-of-the-art Windows workstations (or Server edition), 256GB RAM, TBs of local and remote data storage for data acquisition/processing

- storage to work on or back up to, therefore a fast network (10G+)
- backup/archiving of data (say 50TB at least); the data seems critical

For a first concept i would calculate

- 5k per Windows workstation (does not matter if W10 or Server edition) = 15k
- You need a good 10G switch = 5k

- You need fast and secure storage; use ZFS. You want performance, data security, and at least 50TB of capacity.
A single RAID-6 seems not fast enough. You may also need secure/crash-resistant write behaviour.
I would use a 3 x 6 raid-z2 setup, which gives 3x the IOPS of a single RAID-6, with 128GB RAM and an Intel Optane SLOG for secure sync writes.

Storage server, 50 TB min usable, 128GB RAM, 20 disks of 6 or 8 TB (more disks is faster), Intel Optane SLOG = 10k
Use regular ZFS snaps (versioning and ransomware protection).

- You need a backup system that can sync even open files, work under load, and process 50TB of data.
The best way to achieve this is a second ZFS server with ZFS replication (see the sketch at the end of this post). You can reduce the amount of RAM and use a slower pool layout like 1 x raid-z2/3. Use regular ZFS scrub jobs to avoid long-term storage problems (bit rot).

Backup server 1, 50TB, 32 GB RAM, 10 x 8 TB disks = 6k

If the data is that critical, use a second backup server at a different location:
Backup server 2, 50TB, 32 GB RAM, 10 x 8 TB disks = 6k

Sum: 40-50k

Example:
Use SuperMicro systems based on the mainboard X11SPH-nCTPF | Motherboards | Products - Super Micro Computer, Inc. with a Xeon Silver. For the workstations, add an Intel Optane 900P for the best local performance (480 or 960 GB as system/base disk; optionally an additional SSD, but prefer working from the NAS, as this offers snaps and better security).

For Storage and Backup use
Supermicro | Products | SuperServers | 2U | 5029P-E1CTR12L with an Intel Xeon,
or a case with the same mainboard and more slots, best with a passive backplane (no expander), e.g. SC846BA-R1K28B | 4U | Chassis | Products | Super Micro Computer, Inc.

My suggestion for the storage OS:
Best for ZFS (fastest, best integration, most features, ZFS encryption): Oracle Solaris (commercial OS)
Next best: Solaris clones, e.g. OmniOS
For management you can use my napp-it on either.

Next best for ZFS are FreeBSD-based solutions (e.g. FreeNAS).
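A minimal sketch of the snapshot-plus-replication scheme above, wrapping the standard zfs CLI from Python; the pool/dataset names, the prior snapshot, and the backup host are hypothetical placeholders (napp-it and similar tools automate exactly this):

```python
# ZFS snapshot + incremental replication to a backup server, wrapping
# the standard zfs/ssh CLIs. All names are hypothetical placeholders.
import subprocess
from datetime import datetime

DATASET = "tank/microscopy"             # source dataset (hypothetical)
BACKUP_HOST = "backup1"                 # backup server (hypothetical)
REMOTE_DS = "tank/microscopy"           # target dataset on the backup server
LAST_SNAP = f"{DATASET}@auto-previous"  # prior snapshot, placeholder name

snap = f"{DATASET}@" + datetime.now().strftime("auto-%Y%m%d-%H%M")

# 1) take a snapshot (instant, copy-on-write)
subprocess.run(["zfs", "snapshot", snap], check=True)

# 2) send only the changes since the last snapshot to the backup box
send = subprocess.Popen(["zfs", "send", "-i", LAST_SNAP, snap],
                        stdout=subprocess.PIPE)
subprocess.run(["ssh", BACKUP_HOST, "zfs", "recv", "-F", REMOTE_DS],
               stdin=send.stdout, check=True)
send.stdout.close()
send.wait()
```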
 

Samir

Post Liker and Deal Hunter Extraordinaire!
Jul 21, 2017
Thank you. I would highly advise against RAID5/6 if the data is important to you. Drives will fail, and when they do, the rebuild will take time; as mentioned before, just another one or two drive failures during that window and all your data is gone.

If you don't necessarily need to have all the data on a single volume, I'd recommend RAID1, or at most RAID0+1, for smaller 15TB volumes that would hold a day's data. Then the drives themselves could be removed and become your archive, since, as you've noticed, drives can be pretty cheap relative to tape as prices drop.

Back in the early 1990s, we looked at many hard drive alternatives for archiving data, but we found that with all the hassle of using an alternative platform, even with the cost savings, it wasn't as cheap as just getting more drives in the end. And it's still about the same today.

If the cost of archiving is going to be an issue, have you considered compressing the originals using lossless compression? Depending on how these files compress, if you can achieve ratios like 10:1, a 1TB file can fit on a 100/128GB Blu-ray. It would be tedious to have a single file on a disc, but that might be cheaper than buying a hard drive or two every day. Of course, this adds another step to the whole workflow, as compressing the file and writing the Blu-ray will take some time. A quick way to test the achievable ratio is sketched below.
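A minimal sketch for measuring the achievable lossless ratio on a sample file before planning around the Blu-ray idea; the file name is a hypothetical placeholder, and microscopy data often compresses far worse than 10:1, so measure first:

```python
# Estimate the lossless compression ratio of a sample file with LZMA.
# Reads the whole file into memory, so use a manageable sample.
import lzma
from pathlib import Path

sample = Path("sample_stack.tif")  # hypothetical sample file
raw = sample.read_bytes()
packed = lzma.compress(raw, preset=6)
print(f"{sample.name}: {len(raw) / len(packed):.1f}:1 "
      f"({len(raw)} -> {len(packed)} bytes)")
```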
 

aag

Member
Jun 4, 2016
Thanks to everybody - this was a real eye-opener for me, totally educational. Now the deed is done: the Mellanox 40/100Gb cards and a bunch of high-end PCs were ordered. I will now open a new thread about backups, which are still unsolved, and where the potential for disasters is immense...
 