Patrick

How-to Guide How to find which NUMA node a GPU is attached to

Important for multiple NUMA node servers

  1. Patrick
    Many STH'ers run servers with multiple NUMA nodes, whether dual Xeon, a single AMD EPYC, or Threadripper, so knowing which NUMA node a GPU is attached to can be important.

    For example, in a dual AMD EPYC system, how can one find which of the 8 NUMA nodes a GPU is attached to?

    A quick way to find out is to install the hwloc utilities. On Debian / Ubuntu:
    Code:
    sudo apt install hwloc
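    From here you can use lstopo or hwloc-ls. Set an output format explicitly: without one, lstopo may try to open a graphical window when a display is available, whereas --of console prints the topology as text. Here is an example of a GPU on NUMA node 2 in the eight NUMA node server: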
    Code:
    $ lstopo --of console
    Machine (252GB total)
      Package L#0
        NUMANode L#0 (P#0 31GB)
          L3 L#0 (8192KB)
            L2 L#0 (512KB) + L1d L#0 (32KB) + L1i L#0 (64KB) + Core L#0
              PU L#0 (P#0)
              PU L#1 (P#64)
            L2 L#1 (512KB) + L1d L#1 (32KB) + L1i L#1 (64KB) + Core L#1
              PU L#2 (P#1)
              PU L#3 (P#65)
            L2 L#2 (512KB) + L1d L#2 (32KB) + L1i L#2 (64KB) + Core L#2
              PU L#4 (P#2)
              PU L#5 (P#66)
            L2 L#3 (512KB) + L1d L#3 (32KB) + L1i L#3 (64KB) + Core L#3
              PU L#6 (P#3)
              PU L#7 (P#67)
          L3 L#1 (8192KB)
            L2 L#4 (512KB) + L1d L#4 (32KB) + L1i L#4 (64KB) + Core L#4
              PU L#8 (P#4)
              PU L#9 (P#68)
            L2 L#5 (512KB) + L1d L#5 (32KB) + L1i L#5 (64KB) + Core L#5
              PU L#10 (P#5)
              PU L#11 (P#69)
            L2 L#6 (512KB) + L1d L#6 (32KB) + L1i L#6 (64KB) + Core L#6
              PU L#12 (P#6)
              PU L#13 (P#70)
            L2 L#7 (512KB) + L1d L#7 (32KB) + L1i L#7 (64KB) + Core L#7
              PU L#14 (P#7)
              PU L#15 (P#71)
          HostBridge L#0
            PCIBridge
              PCIBridge
                PCI 1a03:2000
                  GPU L#0 "card0"
                  GPU L#1 "controlD64"
            PCIBridge
              PCI 8086:0953
        NUMANode L#1 (P#1 31GB)
          L3 L#2 (8192KB)
            L2 L#8 (512KB) + L1d L#8 (32KB) + L1i L#8 (64KB) + Core L#8
              PU L#16 (P#8)
              PU L#17 (P#72)
            L2 L#9 (512KB) + L1d L#9 (32KB) + L1i L#9 (64KB) + Core L#9
              PU L#18 (P#9)
              PU L#19 (P#73)
            L2 L#10 (512KB) + L1d L#10 (32KB) + L1i L#10 (64KB) + Core L#10
              PU L#20 (P#10)
              PU L#21 (P#74)
            L2 L#11 (512KB) + L1d L#11 (32KB) + L1i L#11 (64KB) + Core L#11
              PU L#22 (P#11)
              PU L#23 (P#75)
          L3 L#3 (8192KB)
            L2 L#12 (512KB) + L1d L#12 (32KB) + L1i L#12 (64KB) + Core L#12
              PU L#24 (P#12)
              PU L#25 (P#76)
            L2 L#13 (512KB) + L1d L#13 (32KB) + L1i L#13 (64KB) + Core L#13
              PU L#26 (P#13)
              PU L#27 (P#77)
            L2 L#14 (512KB) + L1d L#14 (32KB) + L1i L#14 (64KB) + Core L#14
              PU L#28 (P#14)
              PU L#29 (P#78)
            L2 L#15 (512KB) + L1d L#15 (32KB) + L1i L#15 (64KB) + Core L#15
              PU L#30 (P#15)
              PU L#31 (P#79)
          HostBridge L#4
            PCIBridge
              PCI 8086:1521
                Net L#2 "eno1"
              PCI 8086:1521
                Net L#3 "eno2"
              PCI 8086:1521
                Net L#4 "eno3"
              PCI 8086:1521
                Net L#5 "eno4"
            PCIBridge
              PCI 1022:7901
        NUMANode L#2 (P#2 31GB)
          L3 L#4 (8192KB)
            L2 L#16 (512KB) + L1d L#16 (32KB) + L1i L#16 (64KB) + Core L#16
              PU L#32 (P#16)
              PU L#33 (P#80)
            L2 L#17 (512KB) + L1d L#17 (32KB) + L1i L#17 (64KB) + Core L#17
              PU L#34 (P#17)
              PU L#35 (P#81)
            L2 L#18 (512KB) + L1d L#18 (32KB) + L1i L#18 (64KB) + Core L#18
              PU L#36 (P#18)
              PU L#37 (P#82)
            L2 L#19 (512KB) + L1d L#19 (32KB) + L1i L#19 (64KB) + Core L#19
              PU L#38 (P#19)
              PU L#39 (P#83)
          L3 L#5 (8192KB)
            L2 L#20 (512KB) + L1d L#20 (32KB) + L1i L#20 (64KB) + Core L#20
              PU L#40 (P#20)
              PU L#41 (P#84)
            L2 L#21 (512KB) + L1d L#21 (32KB) + L1i L#21 (64KB) + Core L#21
              PU L#42 (P#21)
              PU L#43 (P#85)
            L2 L#22 (512KB) + L1d L#22 (32KB) + L1i L#22 (64KB) + Core L#22
              PU L#44 (P#22)
              PU L#45 (P#86)
            L2 L#23 (512KB) + L1d L#23 (32KB) + L1i L#23 (64KB) + Core L#23
              PU L#46 (P#23)
              PU L#47 (P#87)
          HostBridge L#7
            PCIBridge
              PCI 10de:1b82
                GPU L#6 "card1"
                GPU L#7 "renderD128"
        NUMANode L#3 (P#3 31GB)
          L3 L#6 (8192KB)
            L2 L#24 (512KB) + L1d L#24 (32KB) + L1i L#24 (64KB) + Core L#24
              PU L#48 (P#24)
              PU L#49 (P#88)
            L2 L#25 (512KB) + L1d L#25 (32KB) + L1i L#25 (64KB) + Core L#25
              PU L#50 (P#25)
              PU L#51 (P#89)
            L2 L#26 (512KB) + L1d L#26 (32KB) + L1i L#26 (64KB) + Core L#26
              PU L#52 (P#26)
              PU L#53 (P#90)
            L2 L#27 (512KB) + L1d L#27 (32KB) + L1i L#27 (64KB) + Core L#27
              PU L#54 (P#27)
              PU L#55 (P#91)
          L3 L#7 (8192KB)
            L2 L#28 (512KB) + L1d L#28 (32KB) + L1i L#28 (64KB) + Core L#28
              PU L#56 (P#28)
              PU L#57 (P#92)
            L2 L#29 (512KB) + L1d L#29 (32KB) + L1i L#29 (64KB) + Core L#29
              PU L#58 (P#29)
              PU L#59 (P#93)
            L2 L#30 (512KB) + L1d L#30 (32KB) + L1i L#30 (64KB) + Core L#30
              PU L#60 (P#30)
              PU L#61 (P#94)
            L2 L#31 (512KB) + L1d L#31 (32KB) + L1i L#31 (64KB) + Core L#31
              PU L#62 (P#31)
              PU L#63 (P#95)
      Package L#1
        NUMANode L#4 (P#4 31GB)
          L3 L#8 (8192KB)
            L2 L#32 (512KB) + L1d L#32 (32KB) + L1i L#32 (64KB) + Core L#32
              PU L#64 (P#32)
              PU L#65 (P#96)
            L2 L#33 (512KB) + L1d L#33 (32KB) + L1i L#33 (64KB) + Core L#33
              PU L#66 (P#33)
              PU L#67 (P#97)
            L2 L#34 (512KB) + L1d L#34 (32KB) + L1i L#34 (64KB) + Core L#34
              PU L#68 (P#34)
              PU L#69 (P#98)
            L2 L#35 (512KB) + L1d L#35 (32KB) + L1i L#35 (64KB) + Core L#35
              PU L#70 (P#35)
              PU L#71 (P#99)
          L3 L#9 (8192KB)
            L2 L#36 (512KB) + L1d L#36 (32KB) + L1i L#36 (64KB) + Core L#36
              PU L#72 (P#36)
              PU L#73 (P#100)
            L2 L#37 (512KB) + L1d L#37 (32KB) + L1i L#37 (64KB) + Core L#37
              PU L#74 (P#37)
              PU L#75 (P#101)
            L2 L#38 (512KB) + L1d L#38 (32KB) + L1i L#38 (64KB) + Core L#38
              PU L#76 (P#38)
              PU L#77 (P#102)
            L2 L#39 (512KB) + L1d L#39 (32KB) + L1i L#39 (64KB) + Core L#39
              PU L#78 (P#39)
              PU L#79 (P#103)
          HostBridge L#9
            PCIBridge
              PCI 1022:7901
                Block(Disk) L#8 "sda"
        NUMANode L#5 (P#5 31GB)
          L3 L#10 (8192KB)
            L2 L#40 (512KB) + L1d L#40 (32KB) + L1i L#40 (64KB) + Core L#40
              PU L#80 (P#40)
              PU L#81 (P#104)
            L2 L#41 (512KB) + L1d L#41 (32KB) + L1i L#41 (64KB) + Core L#41
              PU L#82 (P#41)
              PU L#83 (P#105)
            L2 L#42 (512KB) + L1d L#42 (32KB) + L1i L#42 (64KB) + Core L#42
              PU L#84 (P#42)
              PU L#85 (P#106)
            L2 L#43 (512KB) + L1d L#43 (32KB) + L1i L#43 (64KB) + Core L#43
              PU L#86 (P#43)
              PU L#87 (P#107)
          L3 L#11 (8192KB)
            L2 L#44 (512KB) + L1d L#44 (32KB) + L1i L#44 (64KB) + Core L#44
              PU L#88 (P#44)
              PU L#89 (P#108)
            L2 L#45 (512KB) + L1d L#45 (32KB) + L1i L#45 (64KB) + Core L#45
              PU L#90 (P#45)
              PU L#91 (P#109)
            L2 L#46 (512KB) + L1d L#46 (32KB) + L1i L#46 (64KB) + Core L#46
              PU L#92 (P#46)
              PU L#93 (P#110)
            L2 L#47 (512KB) + L1d L#47 (32KB) + L1i L#47 (64KB) + Core L#47
              PU L#94 (P#47)
              PU L#95 (P#111)
        NUMANode L#6 (P#6 31GB)
          L3 L#12 (8192KB)
            L2 L#48 (512KB) + L1d L#48 (32KB) + L1i L#48 (64KB) + Core L#48
              PU L#96 (P#48)
              PU L#97 (P#112)
            L2 L#49 (512KB) + L1d L#49 (32KB) + L1i L#49 (64KB) + Core L#49
              PU L#98 (P#49)
              PU L#99 (P#113)
            L2 L#50 (512KB) + L1d L#50 (32KB) + L1i L#50 (64KB) + Core L#50
              PU L#100 (P#50)
              PU L#101 (P#114)
            L2 L#51 (512KB) + L1d L#51 (32KB) + L1i L#51 (64KB) + Core L#51
              PU L#102 (P#51)
              PU L#103 (P#115)
          L3 L#13 (8192KB)
            L2 L#52 (512KB) + L1d L#52 (32KB) + L1i L#52 (64KB) + Core L#52
              PU L#104 (P#52)
              PU L#105 (P#116)
            L2 L#53 (512KB) + L1d L#53 (32KB) + L1i L#53 (64KB) + Core L#53
              PU L#106 (P#53)
              PU L#107 (P#117)
            L2 L#54 (512KB) + L1d L#54 (32KB) + L1i L#54 (64KB) + Core L#54
              PU L#108 (P#54)
              PU L#109 (P#118)
            L2 L#55 (512KB) + L1d L#55 (32KB) + L1i L#55 (64KB) + Core L#55
              PU L#110 (P#55)
              PU L#111 (P#119)
        NUMANode L#7 (P#7 31GB)
          L3 L#14 (8192KB)
            L2 L#56 (512KB) + L1d L#56 (32KB) + L1i L#56 (64KB) + Core L#56
              PU L#112 (P#56)
              PU L#113 (P#120)
            L2 L#57 (512KB) + L1d L#57 (32KB) + L1i L#57 (64KB) + Core L#57
              PU L#114 (P#57)
              PU L#115 (P#121)
            L2 L#58 (512KB) + L1d L#58 (32KB) + L1i L#58 (64KB) + Core L#58
              PU L#116 (P#58)
              PU L#117 (P#122)
            L2 L#59 (512KB) + L1d L#59 (32KB) + L1i L#59 (64KB) + Core L#59
              PU L#118 (P#59)
              PU L#119 (P#123)
          L3 L#15 (8192KB)
            L2 L#60 (512KB) + L1d L#60 (32KB) + L1i L#60 (64KB) + Core L#60
              PU L#120 (P#60)
              PU L#121 (P#124)
            L2 L#61 (512KB) + L1d L#61 (32KB) + L1i L#61 (64KB) + Core L#61
              PU L#122 (P#61)
              PU L#123 (P#125)
            L2 L#62 (512KB) + L1d L#62 (32KB) + L1i L#62 (64KB) + Core L#62
              PU L#124 (P#62)
              PU L#125 (P#126)
            L2 L#63 (512KB) + L1d L#63 (32KB) + L1i L#63 (64KB) + Core L#63
              PU L#126 (P#63)
              PU L#127 (P#127)
    
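    As you can see, the PCIe GPU (PCI 10de:1b82, where 10de is NVIDIA's PCI vendor ID) sits under HostBridge L#7, which is nested inside NUMANode L#2 (P#2). The card0 entry under HostBridge L#0 is PCI 1a03:2000, an ASPEED vendor ID, which is typically the BMC's onboard video rather than an add-in GPU.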
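
On a large box, scrolling through the full listing gets tedious, so here is a minimal sketch of a filter that pulls out just the GPU lines together with the NUMA node they are nested under. The function name gpu_numa_from_lstopo is made up for this example, and the approach is only a heuristic: it assumes the console layout shown above, where PCI devices are printed beneath the NUMANode line they belong to. Other hwloc versions or platforms may arrange the tree differently.

```shell
# Hypothetical helper (not part of hwloc): read `lstopo --of console` text on
# stdin and print "<gpu name> <NUMA node OS index>" for every GPU line.
# Assumes GPUs are listed below the NUMANode line they are attached to.
gpu_numa_from_lstopo() {
    awk '
        BEGIN { node = "?" }
        # NUMANode lines look like: NUMANode L#2 (P#2 31GB)
        # Keep only the digits of the third field, i.e. the OS index (P#).
        $1 == "NUMANode" { node = $3; gsub(/[^0-9]/, "", node) }
        # GPU lines look like: GPU L#6 "card1"
        $1 == "GPU" { name = $3; gsub(/"/, "", name); print name, node }
    '
}
```

Usage would be lstopo --of console | gpu_numa_from_lstopo. Fed the listing above, it should report card0 and controlD64 on node 0 and card1 and renderD128 on node 2. As a cross-check, the kernel exposes its own view in sysfs, for example cat /sys/class/drm/card1/device/numa_node, which reads -1 when the firmware reports no affinity.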

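    Another benefit is that you can clearly see how caches relate to CPU cores. Each Core line lists its L2 and L1 caches, cores are grouped under the L3 they share, and the P# on each PU line is the OS processor number. The two PUs under a core are its SMT threads (for example P#0 and P#64 on Core L#0), which helps you quickly target physical cores.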
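
To make the "target physical cores" part concrete, here is a rough sketch that collects one OS processor number per core on a given NUMA node from the same console output. The name first_pus_on_node is hypothetical, and the parsing again assumes the layout shown above (a line ending in "Core L#n" followed by that core's PU lines).

```shell
# Hypothetical helper (not part of hwloc): from `lstopo --of console` text on
# stdin, print the OS index (P#) of the first PU of every core under NUMA
# node $1 as a comma-separated list.
first_pus_on_node() {
    awk -v want="$1" '
        BEGIN { node = "" }
        # NUMANode lines look like: NUMANode L#2 (P#2 31GB)
        $1 == "NUMANode" { node = $3; gsub(/[^0-9]/, "", node) }
        # A line containing "Core L#n" starts a new core.
        / Core L#[0-9]+/ { new_core = 1 }
        # Take only the first PU after each Core line on the wanted node.
        $1 == "PU" && new_core && node == want {
            pu = $3; gsub(/[^0-9]/, "", pu)
            list = (list == "" ? pu : list "," pu)
        }
        $1 == "PU" { new_core = 0 }
        END { print list }
    '
}
```

With the listing above, lstopo --of console | first_pus_on_node 2 should give 16,17,18,19,20,21,22,23. That list can then be handed to something like numactl --physcpubind=16,17,18,19,20,21,22,23 --membind=2 ./my_app (where ./my_app is a placeholder) to keep a job on one thread per core next to the GPU, with its memory on the same node.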