Patrick

How-to Guide: How to find which NUMA node a GPU is attached to

Since many STH'ers run servers with multiple NUMA nodes, whether dual Xeon, a single AMD EPYC, or Threadripper, knowing which NUMA node a GPU is attached to can be important.

For example, in a dual AMD EPYC system, how can one find which of the 8 NUMA nodes a GPU is attached to?

A quick way is to install the hwloc utilities (on Debian / Ubuntu):
Code:
sudo apt install hwloc
From here you can use lstopo or hwloc-ls. Be careful to set an output format: without one, lstopo may try to open a graphical window when a display is available. Here is an example of a GPU on NUMA node 2 in the eight NUMA node server:
Code:
$ lstopo --of console
Machine (252GB total)
  Package L#0
    NUMANode L#0 (P#0 31GB)
      L3 L#0 (8192KB)
        L2 L#0 (512KB) + L1d L#0 (32KB) + L1i L#0 (64KB) + Core L#0
          PU L#0 (P#0)
          PU L#1 (P#64)
        L2 L#1 (512KB) + L1d L#1 (32KB) + L1i L#1 (64KB) + Core L#1
          PU L#2 (P#1)
          PU L#3 (P#65)
        L2 L#2 (512KB) + L1d L#2 (32KB) + L1i L#2 (64KB) + Core L#2
          PU L#4 (P#2)
          PU L#5 (P#66)
        L2 L#3 (512KB) + L1d L#3 (32KB) + L1i L#3 (64KB) + Core L#3
          PU L#6 (P#3)
          PU L#7 (P#67)
      L3 L#1 (8192KB)
        L2 L#4 (512KB) + L1d L#4 (32KB) + L1i L#4 (64KB) + Core L#4
          PU L#8 (P#4)
          PU L#9 (P#68)
        L2 L#5 (512KB) + L1d L#5 (32KB) + L1i L#5 (64KB) + Core L#5
          PU L#10 (P#5)
          PU L#11 (P#69)
        L2 L#6 (512KB) + L1d L#6 (32KB) + L1i L#6 (64KB) + Core L#6
          PU L#12 (P#6)
          PU L#13 (P#70)
        L2 L#7 (512KB) + L1d L#7 (32KB) + L1i L#7 (64KB) + Core L#7
          PU L#14 (P#7)
          PU L#15 (P#71)
      HostBridge L#0
        PCIBridge
          PCIBridge
            PCI 1a03:2000
              GPU L#0 "card0"
              GPU L#1 "controlD64"
        PCIBridge
          PCI 8086:0953
    NUMANode L#1 (P#1 31GB)
      L3 L#2 (8192KB)
        L2 L#8 (512KB) + L1d L#8 (32KB) + L1i L#8 (64KB) + Core L#8
          PU L#16 (P#8)
          PU L#17 (P#72)
        L2 L#9 (512KB) + L1d L#9 (32KB) + L1i L#9 (64KB) + Core L#9
          PU L#18 (P#9)
          PU L#19 (P#73)
        L2 L#10 (512KB) + L1d L#10 (32KB) + L1i L#10 (64KB) + Core L#10
          PU L#20 (P#10)
          PU L#21 (P#74)
        L2 L#11 (512KB) + L1d L#11 (32KB) + L1i L#11 (64KB) + Core L#11
          PU L#22 (P#11)
          PU L#23 (P#75)
      L3 L#3 (8192KB)
        L2 L#12 (512KB) + L1d L#12 (32KB) + L1i L#12 (64KB) + Core L#12
          PU L#24 (P#12)
          PU L#25 (P#76)
        L2 L#13 (512KB) + L1d L#13 (32KB) + L1i L#13 (64KB) + Core L#13
          PU L#26 (P#13)
          PU L#27 (P#77)
        L2 L#14 (512KB) + L1d L#14 (32KB) + L1i L#14 (64KB) + Core L#14
          PU L#28 (P#14)
          PU L#29 (P#78)
        L2 L#15 (512KB) + L1d L#15 (32KB) + L1i L#15 (64KB) + Core L#15
          PU L#30 (P#15)
          PU L#31 (P#79)
      HostBridge L#4
        PCIBridge
          PCI 8086:1521
            Net L#2 "eno1"
          PCI 8086:1521
            Net L#3 "eno2"
          PCI 8086:1521
            Net L#4 "eno3"
          PCI 8086:1521
            Net L#5 "eno4"
        PCIBridge
          PCI 1022:7901
    NUMANode L#2 (P#2 31GB)
      L3 L#4 (8192KB)
        L2 L#16 (512KB) + L1d L#16 (32KB) + L1i L#16 (64KB) + Core L#16
          PU L#32 (P#16)
          PU L#33 (P#80)
        L2 L#17 (512KB) + L1d L#17 (32KB) + L1i L#17 (64KB) + Core L#17
          PU L#34 (P#17)
          PU L#35 (P#81)
        L2 L#18 (512KB) + L1d L#18 (32KB) + L1i L#18 (64KB) + Core L#18
          PU L#36 (P#18)
          PU L#37 (P#82)
        L2 L#19 (512KB) + L1d L#19 (32KB) + L1i L#19 (64KB) + Core L#19
          PU L#38 (P#19)
          PU L#39 (P#83)
      L3 L#5 (8192KB)
        L2 L#20 (512KB) + L1d L#20 (32KB) + L1i L#20 (64KB) + Core L#20
          PU L#40 (P#20)
          PU L#41 (P#84)
        L2 L#21 (512KB) + L1d L#21 (32KB) + L1i L#21 (64KB) + Core L#21
          PU L#42 (P#21)
          PU L#43 (P#85)
        L2 L#22 (512KB) + L1d L#22 (32KB) + L1i L#22 (64KB) + Core L#22
          PU L#44 (P#22)
          PU L#45 (P#86)
        L2 L#23 (512KB) + L1d L#23 (32KB) + L1i L#23 (64KB) + Core L#23
          PU L#46 (P#23)
          PU L#47 (P#87)
      HostBridge L#7
        PCIBridge
          PCI 10de:1b82
            GPU L#6 "card1"
            GPU L#7 "renderD128"
    NUMANode L#3 (P#3 31GB)
      L3 L#6 (8192KB)
        L2 L#24 (512KB) + L1d L#24 (32KB) + L1i L#24 (64KB) + Core L#24
          PU L#48 (P#24)
          PU L#49 (P#88)
        L2 L#25 (512KB) + L1d L#25 (32KB) + L1i L#25 (64KB) + Core L#25
          PU L#50 (P#25)
          PU L#51 (P#89)
        L2 L#26 (512KB) + L1d L#26 (32KB) + L1i L#26 (64KB) + Core L#26
          PU L#52 (P#26)
          PU L#53 (P#90)
        L2 L#27 (512KB) + L1d L#27 (32KB) + L1i L#27 (64KB) + Core L#27
          PU L#54 (P#27)
          PU L#55 (P#91)
      L3 L#7 (8192KB)
        L2 L#28 (512KB) + L1d L#28 (32KB) + L1i L#28 (64KB) + Core L#28
          PU L#56 (P#28)
          PU L#57 (P#92)
        L2 L#29 (512KB) + L1d L#29 (32KB) + L1i L#29 (64KB) + Core L#29
          PU L#58 (P#29)
          PU L#59 (P#93)
        L2 L#30 (512KB) + L1d L#30 (32KB) + L1i L#30 (64KB) + Core L#30
          PU L#60 (P#30)
          PU L#61 (P#94)
        L2 L#31 (512KB) + L1d L#31 (32KB) + L1i L#31 (64KB) + Core L#31
          PU L#62 (P#31)
          PU L#63 (P#95)
  Package L#1
    NUMANode L#4 (P#4 31GB)
      L3 L#8 (8192KB)
        L2 L#32 (512KB) + L1d L#32 (32KB) + L1i L#32 (64KB) + Core L#32
          PU L#64 (P#32)
          PU L#65 (P#96)
        L2 L#33 (512KB) + L1d L#33 (32KB) + L1i L#33 (64KB) + Core L#33
          PU L#66 (P#33)
          PU L#67 (P#97)
        L2 L#34 (512KB) + L1d L#34 (32KB) + L1i L#34 (64KB) + Core L#34
          PU L#68 (P#34)
          PU L#69 (P#98)
        L2 L#35 (512KB) + L1d L#35 (32KB) + L1i L#35 (64KB) + Core L#35
          PU L#70 (P#35)
          PU L#71 (P#99)
      L3 L#9 (8192KB)
        L2 L#36 (512KB) + L1d L#36 (32KB) + L1i L#36 (64KB) + Core L#36
          PU L#72 (P#36)
          PU L#73 (P#100)
        L2 L#37 (512KB) + L1d L#37 (32KB) + L1i L#37 (64KB) + Core L#37
          PU L#74 (P#37)
          PU L#75 (P#101)
        L2 L#38 (512KB) + L1d L#38 (32KB) + L1i L#38 (64KB) + Core L#38
          PU L#76 (P#38)
          PU L#77 (P#102)
        L2 L#39 (512KB) + L1d L#39 (32KB) + L1i L#39 (64KB) + Core L#39
          PU L#78 (P#39)
          PU L#79 (P#103)
      HostBridge L#9
        PCIBridge
          PCI 1022:7901
            Block(Disk) L#8 "sda"
    NUMANode L#5 (P#5 31GB)
      L3 L#10 (8192KB)
        L2 L#40 (512KB) + L1d L#40 (32KB) + L1i L#40 (64KB) + Core L#40
          PU L#80 (P#40)
          PU L#81 (P#104)
        L2 L#41 (512KB) + L1d L#41 (32KB) + L1i L#41 (64KB) + Core L#41
          PU L#82 (P#41)
          PU L#83 (P#105)
        L2 L#42 (512KB) + L1d L#42 (32KB) + L1i L#42 (64KB) + Core L#42
          PU L#84 (P#42)
          PU L#85 (P#106)
        L2 L#43 (512KB) + L1d L#43 (32KB) + L1i L#43 (64KB) + Core L#43
          PU L#86 (P#43)
          PU L#87 (P#107)
      L3 L#11 (8192KB)
        L2 L#44 (512KB) + L1d L#44 (32KB) + L1i L#44 (64KB) + Core L#44
          PU L#88 (P#44)
          PU L#89 (P#108)
        L2 L#45 (512KB) + L1d L#45 (32KB) + L1i L#45 (64KB) + Core L#45
          PU L#90 (P#45)
          PU L#91 (P#109)
        L2 L#46 (512KB) + L1d L#46 (32KB) + L1i L#46 (64KB) + Core L#46
          PU L#92 (P#46)
          PU L#93 (P#110)
        L2 L#47 (512KB) + L1d L#47 (32KB) + L1i L#47 (64KB) + Core L#47
          PU L#94 (P#47)
          PU L#95 (P#111)
    NUMANode L#6 (P#6 31GB)
      L3 L#12 (8192KB)
        L2 L#48 (512KB) + L1d L#48 (32KB) + L1i L#48 (64KB) + Core L#48
          PU L#96 (P#48)
          PU L#97 (P#112)
        L2 L#49 (512KB) + L1d L#49 (32KB) + L1i L#49 (64KB) + Core L#49
          PU L#98 (P#49)
          PU L#99 (P#113)
        L2 L#50 (512KB) + L1d L#50 (32KB) + L1i L#50 (64KB) + Core L#50
          PU L#100 (P#50)
          PU L#101 (P#114)
        L2 L#51 (512KB) + L1d L#51 (32KB) + L1i L#51 (64KB) + Core L#51
          PU L#102 (P#51)
          PU L#103 (P#115)
      L3 L#13 (8192KB)
        L2 L#52 (512KB) + L1d L#52 (32KB) + L1i L#52 (64KB) + Core L#52
          PU L#104 (P#52)
          PU L#105 (P#116)
        L2 L#53 (512KB) + L1d L#53 (32KB) + L1i L#53 (64KB) + Core L#53
          PU L#106 (P#53)
          PU L#107 (P#117)
        L2 L#54 (512KB) + L1d L#54 (32KB) + L1i L#54 (64KB) + Core L#54
          PU L#108 (P#54)
          PU L#109 (P#118)
        L2 L#55 (512KB) + L1d L#55 (32KB) + L1i L#55 (64KB) + Core L#55
          PU L#110 (P#55)
          PU L#111 (P#119)
    NUMANode L#7 (P#7 31GB)
      L3 L#14 (8192KB)
        L2 L#56 (512KB) + L1d L#56 (32KB) + L1i L#56 (64KB) + Core L#56
          PU L#112 (P#56)
          PU L#113 (P#120)
        L2 L#57 (512KB) + L1d L#57 (32KB) + L1i L#57 (64KB) + Core L#57
          PU L#114 (P#57)
          PU L#115 (P#121)
        L2 L#58 (512KB) + L1d L#58 (32KB) + L1i L#58 (64KB) + Core L#58
          PU L#116 (P#58)
          PU L#117 (P#122)
        L2 L#59 (512KB) + L1d L#59 (32KB) + L1i L#59 (64KB) + Core L#59
          PU L#118 (P#59)
          PU L#119 (P#123)
      L3 L#15 (8192KB)
        L2 L#60 (512KB) + L1d L#60 (32KB) + L1i L#60 (64KB) + Core L#60
          PU L#120 (P#60)
          PU L#121 (P#124)
        L2 L#61 (512KB) + L1d L#61 (32KB) + L1i L#61 (64KB) + Core L#61
          PU L#122 (P#61)
          PU L#123 (P#125)
        L2 L#62 (512KB) + L1d L#62 (32KB) + L1i L#62 (64KB) + Core L#62
          PU L#124 (P#62)
          PU L#125 (P#126)
        L2 L#63 (512KB) + L1d L#63 (32KB) + L1i L#63 (64KB) + Core L#63
          PU L#126 (P#63)
          PU L#127 (P#127)
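As you can see, the PCIe GPU (PCI 10de:1b82; 10de is the NVIDIA vendor ID) sits under HostBridge L#7, which is nested under NUMANode L#2. The card0 entry under NUMA node 0 (PCI 1a03:2000) is the onboard ASPEED BMC graphics, not the add-in GPU.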
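If you do not want to scroll through the whole tree, the same lookup can be scripted. The following is only a rough sketch: gpu_numa is a made-up helper name, and it assumes the console layout shown in this post, where each device is nested under its NUMANode and the field positions match the lines above. Other hwloc versions may print PCI and GPU lines differently, so treat the parsing as an assumption to verify against your own output.

```shell
#!/bin/sh
# gpu_numa (hypothetical helper): read `lstopo --of console` text on stdin
# and print one line per GPU entry: <name> <PCI vendor:device> <NUMA node P#>.
# Assumes the layout shown in this post (devices nested under NUMANode).
gpu_numa() {
    awk '
        # Remember the physical index (P#) of the most recent NUMANode line.
        $1 == "NUMANode" {
            node = $3
            sub(/^\(P#/, "", node)
        }
        # Remember the most recent PCI device ID (vendor:device).
        $1 == "PCI" { pci = $2 }
        # A GPU line belongs to the PCI device and NUMA node seen last.
        $1 == "GPU" {
            name = $3
            gsub(/"/, "", name)
            print name, pci, node
        }
    '
}
```

Usage would be: lstopo --of console | gpu_numa. On the output above, the card1 line should report 10de:1b82 on node 2.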

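Another benefit of this output is that it clearly shows how caches relate to CPU cores and their hardware threads. The P# values are the OS indexes of each PU, and the two PUs listed under a core are its SMT siblings (for example, Core L#16 on NUMA node 2 has P#16 and P#80). That helps you quickly target physical cores.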
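To turn that into a CPU list you can pin to, here is a small sketch that picks one hardware thread per physical core on a given NUMA node. node_cores is a hypothetical helper name, and it again assumes the console layout shown in this post (each core line ends in "Core L#n" and is followed by its PU lines).

```shell
#!/bin/sh
# node_cores (hypothetical helper): read `lstopo --of console` text on stdin
# and print the OS index (P#) of the first PU under each core of the NUMA
# node given as $1, comma-separated. Assumes the layout shown in this post.
node_cores() {
    awk -v want="$1" '
        # Track the physical index (P#) of the current NUMANode.
        $1 == "NUMANode" {
            node = $3
            sub(/^\(P#/, "", node)
        }
        # A core line ends with "Core L#<n>"; its first PU follows next.
        NF >= 2 && $(NF-1) == "Core" { first = 1; next }
        # Keep only the first PU of each core on the requested node.
        $1 == "PU" && first && node == want {
            pu = $3
            gsub(/[^0-9]/, "", pu)
            list = (list == "" ? pu : list "," pu)
            first = 0
        }
        END { print list }
    '
}
```

Usage would be: lstopo --of console | node_cores 2. On the server above that should give 16,17,18,19,20,21,22,23, which you could then pass to something like taskset -c, or you could bind to the whole node with numactl --cpunodebind=2 --membind=2 followed by your own command.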