Linux command – Stressful Application Test (NUMA)

I previously covered Stressful Application Test (stressapptest) at https://benjr.tw/96740 ; this article looks specifically at its NUMA memory-test options.

Test environment: Ubuntu 16.04 64-bit

NUMA (Non-uniform memory access)

NUMA (Non-uniform Memory Access) divides CPUs and memory into separate nodes, so each group of CPUs has its own local memory; the CPU nodes communicate with each other over QPI (Intel QuickPath Interconnect).
For more on NUMA, see https://benjr.tw/96788

NUMA-related options

  • --local_numa :
    Choose memory regions associated with each CPU to be tested by that CPU, i.e. each CPU tests memory on its own node.
  • --remote_numa :
    Choose memory regions not associated with each CPU to be tested by that CPU, i.e. each CPU tests memory on the other node.

With numastat we can see that this system has two nodes (node0 and node1):

root@ubuntu:~# numastat
                           node0           node1
numa_hit                  684516          565302
numa_miss                      0               0
numa_foreign                   0               0
interleave_hit              4908           15701
local_node                682033          546987
other_node                  2483           18315

The fields above mean:

  • numa_hit
    Memory successfully allocated on this node as intended.
  • numa_miss
    Memory allocated on this node despite the process preferring some different node. Each numa_miss has a corresponding numa_foreign on another node.
  • numa_foreign
    Memory intended for this node, but actually allocated on some different node. Each numa_foreign has a corresponding numa_miss on another node.
  • interleave_hit
    Interleaved memory successfully allocated on this node as intended, i.e. interleave-policy allocations that targeted this node and succeeded here.
  • local_node
    Memory allocated on this node while a process was running on it.
  • other_node
    Memory allocated on this node while a process was running on some other node.
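These counters are related: every allocation on a node is counted once as numa_hit or numa_miss, and once as local_node or other_node, so when numa_miss is 0 we expect numa_hit = local_node + other_node. The node0 figures above can be checked with shell arithmetic:

```shell
# With numa_miss = 0, numa_hit should equal local_node + other_node.
# Using node0's values from the numastat output above:
echo $((682033 + 2483))   # should match node0's numa_hit of 684516
```
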

Testing --local_numa

root@ubuntu:~# stressapptest -s 30 -M 500 --local_numa
Log: Commandline - stressapptest -s 30 -M 500 --local_numa
Stats: SAT revision 1.0.6_autoconf, 64 bit binary
Log: buildd @ kapok on Wed Jan 21 17:09:35 UTC 2015 from open source release
Log: 1 nodes, 16 cpus.
Log: Defaulting to 16 copy threads
Log: Prefer plain malloc memory allocation.
Log: Using memaligned allocation at 0x7fc80ed5d000.
Stats: Starting SAT, 500M, 30 seconds
Log: Region mask: 0x1
Log: Seconds remaining: 20
Log: Seconds remaining: 10
Stats: Found 0 hardware incidents
Stats: Completed: 409174.00M in 30.00s 13637.74MB/s, with 0 hardware incidents, 0 errors
Stats: Memory Copy: 409174.00M at 13638.94MB/s
Stats: File Copy: 0.00M at 0.00MB/s
Stats: Net Copy: 0.00M at 0.00MB/s
Stats: Data Check: 0.00M at 0.00MB/s
Stats: Invert Data: 0.00M at 0.00MB/s
Stats: Disk: 0.00M at 0.00MB/s

Status: PASS - please verify no corrected errors

local_node increased noticeably: node0 went from 682033 (before) to 701210 (after), and node1 from 546987 (before) to 571896 (after); other_node did not increase.
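The increase is easiest to see by subtracting the two numastat snapshots (the counters are per-page allocation events):

```shell
# local_node delta across the --local_numa run, from the tables above
echo "node0: $((701210 - 682033))"   # pages allocated locally on node0
echo "node1: $((571896 - 546987))"   # pages allocated locally on node1
```
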

root@ubuntu:~# numastat
                           node0           node1
numa_hit                  703693          590211
numa_miss                      0               0
numa_foreign                   0               0
interleave_hit              4908           15701
local_node                701210          571896
other_node                  2483           18315

Testing --remote_numa

root@ubuntu:~# stressapptest -s 30 -M 500 --remote_numa
Log: Commandline - stressapptest -s 30 -M 500 --remote_numa
Stats: SAT revision 1.0.6_autoconf, 64 bit binary
Log: buildd @ kapok on Wed Jan 21 17:09:35 UTC 2015 from open source release
Log: 1 nodes, 16 cpus.
Log: Defaulting to 16 copy threads
Log: Prefer plain malloc memory allocation.
Log: Using memaligned allocation at 0x7fe0490a9000.
Stats: Starting SAT, 500M, 30 seconds
Log: Region mask: 0x1
Log: Seconds remaining: 20
Log: Seconds remaining: 10
Stats: Found 0 hardware incidents
Stats: Completed: 419376.00M in 30.01s 13976.86MB/s, with 0 hardware incidents, 0 errors
Stats: Memory Copy: 419376.00M at 13977.69MB/s
Stats: File Copy: 0.00M at 0.00MB/s
Stats: Net Copy: 0.00M at 0.00MB/s
Stats: Data Check: 0.00M at 0.00MB/s
Stats: Invert Data: 0.00M at 0.00MB/s
Stats: Disk: 0.00M at 0.00MB/s

Status: PASS - please verify no corrected errors

The remote-NUMA result (Memory Copy: 419376.00M at 13977.69MB/s) shows no significant difference from the local-NUMA result (Memory Copy: 409174.00M at 13638.94MB/s).
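Quantifying the gap between the two Memory Copy rates confirms how small it is:

```shell
# Percentage difference between the remote and local Memory Copy rates
awk 'BEGIN { printf "%.1f%%\n", (13977.69 - 13638.94) / 13638.94 * 100 }'
```

A real remote-memory workload would normally show a clearly larger penalty than this, which is another hint that the traffic was not actually remote.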

Again local_node increased: node0 went from 701210 (before) to 728102 (after), and node1 from 571896 (before) to 598015 (after). Strangely, other_node still did not increase.

root@ubuntu:~# numastat
                           node0           node1
numa_hit                  730585          616330
numa_miss                      0               0
numa_foreign                   0               0
interleave_hit              4908           15701
local_node                728102          598015
other_node                  2483           18315

Observing overall NUMA memory usage (MemTotal, MemFree, MemUsed) with numastat -m shows that, whether testing with --local_numa or --remote_numa, memory is allocated evenly across both nodes. I am not sure whether this is a program issue or a system limitation, but stressapptest's --local_numa / --remote_numa do not behave as expected. (Note that the logs above report "1 nodes, 16 cpus", so stressapptest may not be detecting the NUMA topology at all.)
For NUMA testing, it is better to use the numactl command to pin the test to specific CPUs and a specific memory node.

root@ubuntu:~# numactl --interleave=0,1 ./stressapptest -s 180 -M 32000

For numactl usage, see https://benjr.tw/96788
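As a sketch of the explicit-binding approach (assuming a two-node system with numactl installed; these particular bindings are my suggestion, not from the original test run), you can force purely local or purely remote traffic like this:

```shell
# Run on node 0's CPUs using only node 0's memory (all accesses local)
numactl --cpunodebind=0 --membind=0 stressapptest -s 30 -M 500

# Run on node 0's CPUs using only node 1's memory (all accesses remote, over QPI)
numactl --cpunodebind=0 --membind=1 stressapptest -s 30 -M 500
```

Comparing the Memory Copy throughput of these two runs should show the local/remote difference that --local_numa / --remote_numa failed to produce above.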
