Hi everyone,
I'm building a GPU cluster and encountering a challenge with the GPU topology. Here's my setup:
- Motherboard: ASRock Rack ROMED8-2T
- GPU Configuration: 5 x RTX 3090 GPUs inserted into PCIe slots 1–4 and 6, connected with PCIe 4.0 risers.
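|      | GPU0 | GPU1 | GPU2 | GPU3 | GPU4 | CPU Affinity | NUMA Affinity | GPU NUMA ID |
|------|------|------|------|------|------|--------------|---------------|-------------|
| GPU0 | X    | NODE | NODE | NODE | NODE | 0-31         | 0             | N/A         |
| GPU1 | NODE | X    | NODE | NODE | NODE | 0-31         | 0             | N/A         |
| GPU2 | NODE | NODE | X    | NODE | NODE | 0-31         | 0             | N/A         |
| GPU3 | NODE | NODE | NODE | X    | PHB  | 0-31         | 0             | N/A         |
| GPU4 | NODE | NODE | NODE | PHB  | X    | 0-31         | 0             | N/A         |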
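For reference, this is the kind of matrix that `nvidia-smi topo -m` prints. Below is a minimal Python sketch, assuming the GPU-to-GPU part of the matrix is pasted in as plain text, that groups GPU pairs by link type so the odd pair stands out. The helper name `link_pairs` and the `TOPO` constant are my own, not part of any NVIDIA tool.

```python
# GPU-to-GPU part of the matrix above, pasted as whitespace-separated text.
TOPO = """\
GPU0 GPU1 GPU2 GPU3 GPU4
GPU0 X NODE NODE NODE NODE
GPU1 NODE X NODE NODE NODE
GPU2 NODE NODE X NODE NODE
GPU3 NODE NODE NODE X PHB
GPU4 NODE NODE NODE PHB X
"""

# Link types as described in the legend printed by nvidia-smi:
#   PHB  = connection traversing PCIe as well as a PCIe host bridge
#   NODE = connection traversing PCIe as well as the interconnect
#          between PCIe host bridges within a NUMA node


def link_pairs(matrix_text):
    """Group unordered GPU pairs by the link type shown in the matrix.

    Hypothetical helper: assumes the first line lists the GPU names and
    every following line starts with a GPU name followed by one cell per GPU.
    """
    rows = [line.split() for line in matrix_text.strip().splitlines()]
    gpus = rows[0]
    pairs = {}
    for i, row in enumerate(rows[1:]):
        links = row[1:1 + len(gpus)]
        # Only read the upper triangle so each pair is counted once.
        for j in range(i + 1, len(gpus)):
            pairs.setdefault(links[j], []).append((gpus[i], gpus[j]))
    return pairs


pairs = link_pairs(TOPO)
for link, members in sorted(pairs.items()):
    print(link, members)
```

For the matrix in this post, the only pair that is not NODE is GPU3 and GPU4.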
What I've Tried:
- Moved the GPU from slot 6 to slot 5:
  - The system wouldn't POST and kept restarting.
- Moved the GPU from slot 6 to slot 7:
  - The system POSTs successfully, but GPU4 still uses PHB to connect with GPU2.
Is it possible to reconfigure or adjust the setup so that GPU4 connects via NODE instead of PHB, thereby maximizing performance? If so, how can I achieve this?
Thanks in advance for your insights!