Search results

  1. B

    SXM2 over PCIe

    Titan V/Quadro GV100 are the last fp64-capable cards with display outputs so they have some value for scientific simulations, especially if you're a researcher running commercial software that doesn't like living on the cloud. For language modeling, in rough order of viability: 3090/3090 Ti...
  2. B

    Learning self hosted AI/machine learning, budget server build questions

    Language models can be partitioned across multiple GPUs *with the caveat* that only one GPU is active at any one time. This is a huge caveat, because for regular mortals (and even minor startups) this caps your memory bandwidth at about 1 Tbyte/sec and therefore puts an upper limit on your token...
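The bandwidth ceiling mentioned above translates directly into a decode-speed ceiling, since each generated token has to stream essentially all the model weights out of VRAM. A minimal sketch; the 70B model size, 4.5 bits/param quantization, and 1 TB/s figure are illustrative assumptions, not numbers from the post:

```python
# Back-of-envelope: single-active-GPU decode is roughly bandwidth-bound,
# because every generated token reads all model weights from VRAM once.

def max_tokens_per_sec(model_bytes: float, bandwidth_bytes_per_sec: float) -> float:
    """Upper bound on decode tokens/sec for a bandwidth-limited GPU."""
    return bandwidth_bytes_per_sec / model_bytes

# Illustrative numbers: a 70B-parameter model quantized to ~4.5 bits/param
# (~39 GB of weights) on a card with ~1 TB/s of memory bandwidth.
model_bytes = 70e9 * 4.5 / 8
bandwidth = 1e12
print(f"{max_tokens_per_sec(model_bytes, bandwidth):.1f} tokens/sec upper bound")
# -> 25.4 tokens/sec upper bound
```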
  3. B

    ES Xeon Discussion

    I have E0 running on an X13SEI and it works great.
  4. B

    ES Xeon Discussion

    is Q03J actually E0 like the listings say? all the specs feel like a D0 (35x turbo multiplier, no part number string), but $570 for a 60-core E0 would be a good deal - the cheapest octal channel C741 boards are several hundred dollars cheaper than a W790-SAGE.
  5. B

    Supermicro H11SSL-i not detecting all RAM

    given that fresh dimms worked, the original dimms are probably bad? I used to say it was impossible for six dimms to all be bad, but after receiving 3 out of 8 ddr5 rdimms with missing resistors in perfect packaging, I'm no longer so confident...
  6. B

    putting together a cheap xeon video editing system

    This is a hard question. I can say for certain that a 2011-3 Xeon is not the right choice because the single threaded performance is mediocre and Premiere Pro is not exactly known for its thread scalability. You can see here that going from 24 cores to 64 cores barely moves performance. The...
  7. B

    Nvidia CMP 100HX (CMP100-210) Tensor cores working?

    Yeah that's on par with P100 performance, which has no tensor cores.
  8. B

    Nvidia CMP 100HX (CMP100-210) Tensor cores working?

    I don't think the tensor cores are working. I get 17 it/s on SD 1.5 on a V100-16GB without TensorRT.
  9. B

    SXM2 over PCIe

    you're really underestimating the flakiness of taobao sellers, shipping, and cooling (and also apparently the flakiness of the pcie signals...)
  10. B

    Epyc Genoa Build Advice

    hard to make suggestions for a $40k build with no well defined workload profile. my generic suggestions: (1) epyc is not a workstation platform. if you're buying a $40k system with someone else's money, I'd suggest an oem TR PRO system with a return policy and vendor support. (2) profile your...
  11. B

    Learning self hosted AI/machine learning, budget server build questions

    image and video generation doesn't shard across multiple GPUs so the other three P40s will not be very useful. the 4x P40 thing is for people who want to get really big language models to run on a budget. in theory, you get 1.4 tbytes/sec of bandwidth across four cards on a bandwidth limited use...
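The ~1.4 TB/s figure in the snippet is just per-card bandwidth summed across the four GPUs; a quick sketch, assuming the commonly quoted ~347 GB/s memory bandwidth per P40:

```python
# Best-case aggregate bandwidth of N GPUs serving one sharded model:
# if all cards can stream their weight shards concurrently, the
# effective bandwidth is the sum of the per-card figures.
P40_BANDWIDTH_GB_S = 347  # commonly quoted P40 spec; an assumption here
n_gpus = 4
aggregate = n_gpus * P40_BANDWIDTH_GB_S
print(f"{aggregate} GB/s aggregate, ~{aggregate / 1000:.2f} TB/s")
# -> 1388 GB/s aggregate, ~1.39 TB/s
```

This is a theoretical ceiling; as the other reply in the thread notes, naive layer-by-layer partitioning keeps only one card busy at a time, so reaching it requires overlap across the cards.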
  12. B

    SXM2 over PCIe

    were the GPUs that were falling off the bus the ones connected to the slot further from the CPU by chance?
  13. B

    Proof that Asrock (and SM, others) are/were gatekeeping Milan on cheaper boards for no reason

    I wonder how stable it is. The 1st gen Epyc systems uniformly lack official Milan support, maybe there is some electrical gremlin that rears its head under heavy I/O load or something.
  14. B

    GPU recommendation for mass text processing, summarizing and data analysis, serving API requests etc

    the 22G 2080 Ti are on aliexpress, they are hard to find on ebay (there is one vendor who claims to be in palo alto on ebay, but he is frequently OOS). I shouldn't be so hard on amd, after all we need competition in the industry :rolleyes: and amd is getting better at supporting compute use...
  15. B

    GPU recommendation for mass text processing, summarizing and data analysis, serving API requests etc

    under $500: 2080 Ti 22G ($430); best deal: used 3090 (about $750). you want a relatively recent nvidia card with tensor cores, ideally ampere or newer but for now turing also works. you also need enough vram to hold the model, so about 5 bits per parameter, plus the KV cache which can vary...
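The sizing rule in the snippet (roughly 5 bits per parameter for the weights, plus room for the KV cache) can be sketched as a quick calculation; the 13B model size and the 2 GB KV-cache allowance are illustrative assumptions:

```python
def vram_needed_gb(n_params: float, bits_per_param: float = 5.0,
                   kv_cache_gb: float = 2.0) -> float:
    """Rough VRAM estimate: quantized weights plus a KV-cache allowance."""
    weights_gb = n_params * bits_per_param / 8 / 1e9
    return weights_gb + kv_cache_gb

# Example: a 13B model at ~5 bits/param with a modest KV cache.
print(f"{vram_needed_gb(13e9):.1f} GB")  # comfortably fits a 22G 2080 Ti
```

The KV-cache term is the variable part: it grows with context length and batch size, which is why the snippet says it "can vary".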
  16. B

    SXM2 over PCIe

    do you get any MCEs in dmesg before the devices drop off the bus? if so, that's definitely a telltale sign of bad PCIe signaling.
  17. B

    SXM2 over PCIe

    ok, thanks. so about 1200 all in for the GPUs and trimmings except the case, or 1900 if you add the waterblocks (are there any cheaper than the Bykskis?)
  18. B

    SXM2 over PCIe

    if you go by 'median trustworthy looking' ebay/taobao pricing it is 750 for the GPUs, 300 for the board not counting shipping, and 200ish for the heatsinks shipped from china (or 300 from the US), 1250 total. also, I am not sure if folks here have actually used an sxm2 based system (i've dealt...
  19. B

    SXM2 over PCIe

    800 for the GPUs, plus AOM-SXMV, risers, heatsinks, fans, and whatever contraption you build to mechanically hold it all together gets pretty close to 4x 2080Ti (the $150 V100s aren't repeatable right now), and I'd definitely pay a bit more to get 88GB instead of 64GB. for 60 bucks more you can...
  20. B

    SXM2 over PCIe

    somewhat off topic but speaking of mixed precision on a budget, check out these beauties (eBay link): for $165 you get 2x16+1x8 out of each socket at 3 slot spacing which lets you build hives of 2x NVLINK plus shared memory communication to a NIC (which is not as good as RDMA but at least you...