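[Question] Does mlockall not completely prevent page faults?
I'm trying to evaluate whether locking memory with mlockall improves latency on the RPi.
Using the well-known cyclictest (v0.92) together with perf, I got the following results: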
sudo perf stat ./cyclictest -p 90 -m -c 0 -i 3000 -n -h 250 -q -l 10000
# Total: 000009985
# Min Latencies: 00038
# Avg Latencies: 00082
# Max Latencies: 00386
# Histogram Overflows: 00015
 Performance counter stats for
'./cyclictest -p 90 -m -c 0 -i 3000 -n -h 250 -q -l 10000':
        818.925000      task-clock (msec)         #    0.027 CPUs utilized
            13,362      context-switches          #    0.016 M/sec
                 0      cpu-migrations            #    0.000 K/sec
                56      page-faults               #    0.068 K/sec
       471,078,551      cycles                    #    0.575 GHz                      (50.34%)
       282,495,112      stalled-cycles-frontend   #   59.97% frontend cycles idle     (51.67%)
        13,419,172      stalled-cycles-backend    #    2.85% backend  cycles idle     (52.93%)
        68,489,877      instructions              #    0.15  insns per cycle
                                                  #    4.12  stalled cycles per insn  (38.41%)
         7,553,254      branches                  #    9.223 M/sec                    (30.02%)
         1,627,813      branch-misses             #   21.55% of all branches          (34.01%)
      30.232651000 seconds time elapsed
Without the -m option (no mlockall):
sudo perf stat ./cyclictest -p 90 -c 0 -i 3000 -n -h 250 -q -l 10000
# Total: 000009988
# Min Latencies: 00038
# Avg Latencies: 00080
# Max Latencies: 00407
# Histogram Overflows: 00012
 Performance counter stats for
'./cyclictest -p 90 -c 0 -i 3000 -n -h 250 -q -l 10000':
        772.978000      task-clock (msec)         #    0.026 CPUs utilized
            13,363      context-switches          #    0.017 M/sec
                 0      cpu-migrations            #    0.000 K/sec
                66      page-faults               #    0.085 K/sec
       444,135,743      cycles                    #    0.575 GHz                      (41.26%)
       271,762,254      stalled-cycles-frontend   #   61.19% frontend cycles idle     (48.87%)
         8,522,179      stalled-cycles-backend    #    1.92% backend  cycles idle     (56.53%)
        65,640,536      instructions              #    0.15  insns per cycle
                                                  #    4.14  stalled cycles per insn  (37.62%)
         7,453,674      branches                  #    9.643 M/sec                    (34.44%)
         1,584,489      branch-misses             #   21.26% of all branches          (25.24%)
      30.197211000 seconds time elapsed
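It looks like the max latency gets slightly smaller with -m (386 vs. 407).
My question is that page-faults only drops slightly with -m (56 vs. 66); it does not go away entirely.
Is this normal? I thought there would be no page faults at all once mlockall had locked the memory.
Thanks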
--
※ Posted from: 批踢踢實業坊 (ptt.cc), from: 90.41.67.118
※ Article URL: https://www.ptt.cc/bbs/LinuxDev/M.1446054583.A.EB6.html
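I ran an experiment: increase the loop count tenfold and see whether the number of page faults goes up.
With mlockall: page-faults stays at 55-56 and does not increase.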
Performance counter stats for
'./cyclictest -p 90 -m -c 0 -i 3000 -n -h 250 -q -l 100000':
       7202.248000      task-clock (msec)         #    0.024 CPUs utilized
           130,818      context-switches          #    0.018 M/sec
                 0      cpu-migrations            #    0.000 K/sec
                55      page-faults               #    0.008 K/sec
     4,079,431,733      cycles                    #    0.566 GHz                      (48.12%)
     2,569,771,515      stalled-cycles-frontend   #   62.99% frontend cycles idle     (49.99%)
        69,883,756      stalled-cycles-backend    #    1.71% backend  cycles idle     (51.78%)
       643,633,565      instructions              #    0.16  insns per cycle
                                                  #    3.99  stalled cycles per insn  (34.40%)
        72,253,517      branches                  #   10.032 M/sec                    (32.91%)
        15,166,468      branch-misses             #   20.99% of all branches          (31.47%)
     300.240982143 seconds time elapsed
Without mlockall: page-faults stays at 66-67.
Performance counter stats for
'./cyclictest -p 90 -c 0 -i 3000 -n -h 250 -q -l 100000':
       7181.634000      task-clock (msec)         #    0.024 CPUs utilized
           130,892      context-switches          #    0.018 M/sec
                 0      cpu-migrations            #    0.000 K/sec
                67      page-faults               #    0.009 K/sec
     4,072,629,665      cycles                    #    0.567 GHz                      (49.76%)
     2,537,027,318      stalled-cycles-frontend   #   62.29% frontend cycles idle     (49.79%)
        70,191,503      stalled-cycles-backend    #    1.72% backend  cycles idle     (50.05%)
       627,997,620      instructions              #    0.15  insns per cycle
                                                  #    4.04  stalled cycles per insn  (34.31%)
        71,914,012      branches                  #   10.014 M/sec                    (33.07%)
        15,190,645      branch-misses             #   21.12% of all branches          (33.44%)
     300.195795144 seconds time elapsed
So increasing the loop count does not increase page-faults, with or without mlockall.
※ Edited: wtchen (90.41.214.241), 10/29/2015 19:46:17
※ Edited: wtchen (90.41.214.241), 10/29/2015 19:51:13