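If you have not used Fio before, read the introduction first: https://benjr.tw/34632
The test environment is CentOS 7 x86_64 (virtual machine).
The performance test requirements are:
- Hard drive – /dev/sdb
- iodepth – 4
- Block size – 4k, 32k
- RW – 100% read, 100% write
A Fio job file (configuration file) is the better fit for a complete performance test, because every parameter combination to be tested can be written down in advance. The job file has two kinds of sections: [global] holds the settings shared by all jobs, and each [job] section holds the settings specific to that job.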
[root@localhost ~]# vi fio.cfg
[global]
filename=/dev/sdb
direct=1
ioengine=libaio
time_based
runtime=10
iodepth=4
refill_buffers
group_reporting
wait_for_previous
ramp_time=5

[JOB1]
bs=4k
rw=read

[JOB2]
bs=32k
rw=read

[JOB3]
bs=4k
rw=write

[JOB4]
bs=32k
rw=write
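When the test matrix grows beyond a few block sizes, writing each [JOB] section by hand gets tedious. The following is a minimal sketch that generates the same job file from two lists. The helper name build_job_file and the constant GLOBAL_OPTIONS are made up for this illustration and are not part of Fio; the option values are the ones used in fio.cfg above.

```python
from itertools import product

# Shared settings, copied from the [global] section of fio.cfg.
GLOBAL_OPTIONS = [
    "filename=/dev/sdb",
    "direct=1",
    "ioengine=libaio",
    "time_based",
    "runtime=10",
    "iodepth=4",
    "refill_buffers",
    "group_reporting",
    "wait_for_previous",
    "ramp_time=5",
]


def build_job_file(rw_modes, block_sizes):
    """Return job file text with one [JOBn] section per (rw, bs) pair."""
    lines = ["[global]", *GLOBAL_OPTIONS]
    # rw is the outer loop, so all reads come before all writes.
    for n, (rw, bs) in enumerate(product(rw_modes, block_sizes), start=1):
        lines += ["", f"[JOB{n}]", f"bs={bs}", f"rw={rw}"]
    return "\n".join(lines) + "\n"


job_file = build_job_file(["read", "write"], ["4k", "32k"])
print(job_file)
```

Redirect the printed text into a file (for example fio.cfg) and review it before running it; the write jobs overwrite whatever is on the device named in filename.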
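Parameter notes:
- filename
The disk to test.
- direct
Defaults to 0; it must be set to 1 to measure real non-buffered I/O.
- ioengine
Defines how the I/O is issued. libaio is Linux native asynchronous I/O.
Other engines include sync, psync, vsync, posixaio, mmap, splice, syslet-rw, sg, null, net, netsplice, cpuio, guasi and external.
- time_based
Run for the full runtime even if the whole file or device has already been read or written; Fio repeats the workload until the time is up. (kb_base is an unrelated option: it only selects whether a kilobyte means 1024 or 1000 bytes.)
- runtime
How long the test runs, in seconds.
- iodepth
The number of I/Os kept in flight at the same time (4 in this job file). A larger value does not necessarily make the storage device perform better; it usually needs to be larger for RAID.
- refill_buffers
Refill the I/O buffers on every submit. This is not the default: without it, Fio fills the buffers once at init time and reuses that data.
- group_reporting
When numjobs is set, report results per group instead of the default per job (which lists the result of every individual numjobs clone).
- wait_for_previous
By default all jobs run at the same time; wait_for_previous makes a job wait for the previous one to finish, so the jobs run one after another.
- ramp_time
Run the workload for this many seconds before logging any performance numbers, so that start-up effects such as caching do not skew the result.
- bs
bs (or blocksize) is the size of each I/O unit; the default is 4k. The right value depends on what the storage is used for: a file server, a web server and a database each call for a different setting.
- rw
The I/O pattern. Performance runs usually use plain read and write; the available values are:
- read : Sequential reads.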
- write : Sequential writes.
- trim : Sequential trim.
- randread : Random reads.
- randwrite : Random writes.
- randtrim : Random trim.
- rw : Mixed sequential reads and writes.
- readwrite : Same as rw.
- randrw : Mixed random reads and writes.
- trimwrite : Trim and write mix, trims preceding writes.
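Other useful parameters
- cpumask=int
Fio can be told which CPUs (logical processors) a job may run on. The value is given as a bit mask:
CPU 0 (0001) -> 1 (cpumask=1) <- handled by the first core
CPU 1 (0010) -> 2 (cpumask=2) <- handled by the second core
CPU 0+1 (0011) -> 3 (cpumask=3) <- handled by the first and second cores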
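The mask arithmetic is easy to get wrong once more than two cores are involved, so here is a small sketch that builds the value from a list of logical CPU indices. The function name cpumask_for is hypothetical and only serves this example.

```python
def cpumask_for(cpus):
    """Return the integer mask with one bit set per logical CPU index."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError(f"CPU index must not be negative: {cpu}")
        mask |= 1 << cpu  # CPU n corresponds to bit n
    return mask


for cpus in ([0], [1], [0, 1]):
    mask = cpumask_for(cpus)
    print(f"CPUs {cpus}: {mask:04b} -> cpumask={mask}")
```

The three calls reproduce the rows of the table above; the same function gives the value for any other set of cores.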
Run the test.
[root@localhost ~]# fio fio.cfg
JOB1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=4
JOB2: (g=1): rw=read, bs=32K-32K/32K-32K/32K-32K, ioengine=libaio, iodepth=4
JOB3: (g=2): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=4
JOB4: (g=3): rw=write, bs=32K-32K/32K-32K/32K-32K, ioengine=libaio, iodepth=4
fio-2.2.8
Starting 4 processes
Jobs: 1 (f=1): [_(3),W(1)] [67.0% done] [0KB/36812KB/0KB /s] [0/1150/0 iops] [eta 00m:30s]
JOB1: bs=4k, rw=read
JOB1: (groupid=0, jobs=1): err= 0: pid=2429: Tue Mar 6 01:16:16 2018
  read : io=444704KB, bw=44466KB/s, iops=11116, runt= 10001msec
    slat (usec): min=48, max=3387, avg=80.71, stdev=34.02
    clat (usec): min=145, max=11311, avg=274.97, stdev=139.44
     lat (usec): min=227, max=11457, avg=357.59, stdev=148.98
    clat percentiles (usec):
     | 1.00th=[ 203], 5.00th=[ 209], 10.00th=[ 215], 20.00th=[ 217],
     | 30.00th=[ 231], 40.00th=[ 239], 50.00th=[ 247], 60.00th=[ 262],
     | 70.00th=[ 282], 80.00th=[ 306], 90.00th=[ 346], 95.00th=[ 402],
     | 99.00th=[ 604], 99.50th=[ 844], 99.90th=[ 2096], 99.95th=[ 2832],
     | 99.99th=[ 4128]
    bw (KB /s): min= 0, max=47480, per=95.76%, avg=42580.40, stdev=10136.57
    lat (usec) : 250=51.62%, 500=46.49%, 750=1.28%, 1000=0.24%
    lat (msec) : 2=0.26%, 4=0.10%, 10=0.01%, 20=0.01%
  cpu : usr=8.91%, sys=137.32%, ctx=1645, majf=0, minf=37
  IO depths : 1=0.1%, 2=0.1%, 4=148.6%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued : total=r=111173/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency : target=0, window=0, percentile=100.00%, depth=4
JOB2: bs=32k, rw=read
JOB2: (groupid=1, jobs=1): err= 0: pid=2440: Tue Mar 6 01:16:16 2018
  read : io=3260.4MB, bw=333794KB/s, iops=10430, runt= 10001msec
    slat (usec): min=46, max=1894, avg=92.46, stdev=44.82
    clat (usec): min=1, max=4643, avg=294.65, stdev=145.48
     lat (usec): min=62, max=4713, avg=381.22, stdev=173.96
    clat percentiles (usec):
     | 1.00th=[ 185], 5.00th=[ 185], 10.00th=[ 185], 20.00th=[ 187],
     | 30.00th=[ 187], 40.00th=[ 195], 50.00th=[ 209], 60.00th=[ 262],
     | 70.00th=[ 378], 80.00th=[ 430], 90.00th=[ 486], 95.00th=[ 532],
     | 99.00th=[ 660], 99.50th=[ 764], 99.90th=[ 1256], 99.95th=[ 1592],
     | 99.99th=[ 2672]
    bw (KB /s): min= 6, max=490368, per=92.76%, avg=309641.90, stdev=144782.12
    lat (usec) : 2=0.01%, 4=0.01%, 100=0.01%, 250=59.48%, 500=32.17%
    lat (usec) : 750=7.81%, 1000=0.34%
    lat (msec) : 2=0.16%, 4=0.02%, 10=0.01%
  cpu : usr=9.39%, sys=130.26%, ctx=6507, majf=0, minf=65
  IO depths : 1=0.1%, 2=0.1%, 4=134.4%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued : total=r=104318/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency : target=0, window=0, percentile=100.00%, depth=4
JOB3: bs=4k, rw=write
JOB3: (groupid=2, jobs=1): err= 0: pid=2441: Tue Mar 6 01:16:16 2018
  write: io=392044KB, bw=39200KB/s, iops=9799, runt= 10001msec
    slat (usec): min=48, max=1586, avg=87.01, stdev=38.36
    clat (usec): min=95, max=153225, avg=320.77, stdev=1946.09
     lat (usec): min=204, max=153283, avg=408.86, stdev=1946.08
    clat percentiles (usec):
     | 1.00th=[ 229], 5.00th=[ 231], 10.00th=[ 235], 20.00th=[ 249],
     | 30.00th=[ 262], 40.00th=[ 266], 50.00th=[ 274], 60.00th=[ 282],
     | 70.00th=[ 290], 80.00th=[ 302], 90.00th=[ 330], 95.00th=[ 362],
     | 99.00th=[ 764], 99.50th=[ 844], 99.90th=[ 1080], 99.95th=[ 1752],
     | 99.99th=[140288]
    bw (KB /s): min= 0, max=45512, per=95.01%, avg=37242.40, stdev=12902.47
    lat (usec) : 100=0.01%, 250=20.68%, 500=77.56%, 750=0.71%, 1000=0.93%
    lat (msec) : 2=0.09%, 50=0.01%, 100=0.01%, 250=0.02%
  cpu : usr=9.48%, sys=127.43%, ctx=606, majf=0, minf=30
  IO depths : 1=0.1%, 2=0.1%, 4=148.2%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued : total=r=0/w=98008/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency : target=0, window=0, percentile=100.00%, depth=4
JOB4: bs=32k, rw=write
JOB4: (groupid=3, jobs=1): err= 0: pid=2444: Tue Mar 6 01:16:16 2018
  write: io=262304KB, bw=26035KB/s, iops=813, runt= 10075msec
    slat (usec): min=48, max=2753, avg=111.01, stdev=80.11
    clat (usec): min=206, max=246160, avg=4803.33, stdev=19448.08
     lat (usec): min=345, max=246720, avg=4932.55, stdev=19452.24
    clat percentiles (usec):
     | 1.00th=[ 474], 5.00th=[ 644], 10.00th=[ 716], 20.00th=[ 804],
     | 30.00th=[ 868], 40.00th=[ 932], 50.00th=[ 1004], 60.00th=[ 1096],
     | 70.00th=[ 1240], 80.00th=[ 1448], 90.00th=[ 2160], 95.00th=[ 3376],
     | 99.00th=[83456], 99.50th=[162816], 99.90th=[164864], 99.95th=[242688],
     | 99.99th=[246784]
    bw (KB /s): min= 6, max=37211, per=95.61%, avg=24892.42, stdev=11341.37
    lat (usec) : 250=0.01%, 500=1.29%, 750=12.05%, 1000=36.12%
    lat (msec) : 2=38.98%, 4=7.24%, 10=0.52%, 100=3.22%, 250=0.60%
  cpu : usr=4.50%, sys=32.14%, ctx=3705, majf=0, minf=32
  IO depths : 1=0.1%, 2=0.1%, 4=338.5%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued : total=r=0/w=8194/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency : target=0, window=0, percentile=100.00%, depth=4
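A quick way to sanity-check the iops figure in each report is to recompute it from the issued line and the runt value. The sketch below does that with the numbers copied from the four reports above. Truncating to a whole number is an assumption that happens to match this particular run, not a documented rule, and the function name iops_from_counts is made up for the example.

```python
# (job, issued I/Os, runtime in msec, iops printed by fio), copied from the reports.
JOBS = [
    ("JOB1", 111173, 10001, 11116),
    ("JOB2", 104318, 10001, 10430),
    ("JOB3", 98008, 10001, 9799),
    ("JOB4", 8194, 10075, 813),
]


def iops_from_counts(issued, runtime_msec):
    """I/Os per second, truncated to a whole number (assumed rounding)."""
    return issued * 1000 // runtime_msec


for name, issued, runtime_msec, reported in JOBS:
    computed = iops_from_counts(issued, runtime_msec)
    print(f"{name}: computed={computed} reported={reported}")
```

JOB4 stands out: at the same iodepth, 32k writes complete far fewer I/Os per second than the other three jobs, which matches its much higher clat values.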
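Some tests focus specifically on latency.
- lat (latency)
Total latency: the time from when Fio creates the I/O unit until it completes.
- slat (submission latency)
Submission latency: the time it takes to submit the I/O.
- clat (completion latency)
Completion latency: the time from submission until the I/O completes.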
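The three latency figures are related, and the averages in the reports above can be used to see how. The sketch below checks two things with those averages: that slat plus clat comes out close to lat, and that IOPS multiplied by the average total latency comes out close to the configured iodepth of 4 (Little's law: requests in flight equal throughput times latency). Both are approximate relationships, and the function names are made up for this example.

```python
# (job, iodepth, iops, avg slat, avg clat, avg lat), latencies in usec, from the reports.
JOBS = [
    ("JOB1", 4, 11116, 80.71, 274.97, 357.59),
    ("JOB2", 4, 10430, 92.46, 294.65, 381.22),
    ("JOB3", 4, 9799, 87.01, 320.77, 408.86),
    ("JOB4", 4, 813, 111.01, 4803.33, 4932.55),
]


def latency_gap(slat, clat, lat):
    """Relative difference between slat + clat and the reported total latency."""
    return abs(slat + clat - lat) / lat


def in_flight(iops, lat_usec):
    """Average number of I/Os in flight: throughput times latency."""
    return iops * lat_usec / 1_000_000


for name, iodepth, iops, slat, clat, lat in JOBS:
    print(
        f"{name}: slat+clat={slat + clat:.2f} lat={lat:.2f} "
        f"gap={latency_gap(slat, clat, lat):.2%} "
        f"in_flight={in_flight(iops, lat):.2f} (iodepth={iodepth})"
    )
```

This is also why raising iodepth on a device that is already saturated mostly raises latency rather than IOPS.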
To get lower latency, Fio suggests setting the ionice value, or using the prioclass and nice parameters.
- nice=int
Run job with given nice value.
- prio=int
Set I/O priority value of this job between 0 (highest) and 7 (lowest).
- prioclass=int
Set I/O priority class.
Overall results
Run status group 0 (all jobs):
   READ: io=444704KB, aggrb=44465KB/s, minb=44465KB/s, maxb=44465KB/s, mint=10001msec, maxt=10001msec
Run status group 1 (all jobs):
   READ: io=3260.4MB, aggrb=333793KB/s, minb=333793KB/s, maxb=333793KB/s, mint=10001msec, maxt=10001msec
Run status group 2 (all jobs):
  WRITE: io=392044KB, aggrb=39200KB/s, minb=39200KB/s, maxb=39200KB/s, mint=10001msec, maxt=10001msec
Run status group 3 (all jobs):
  WRITE: io=262304KB, aggrb=26035KB/s, minb=26035KB/s, maxb=26035KB/s, mint=10075msec, maxt=10075msec
Disk stats (read/write):
  sdb: ios=305472/172840, merge=0/0, ticks=64102/84028, in_queue=147775, util=94.56%
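The last section is the performance summary. Its fields mean:
- io: Number of megabytes I/O performed.
The total amount of data transferred during the run (a volume, not a count of I/O operations).
To raise IOPS, Fio suggests the parameters disable_lat=1, disable_clat=1 and disable_slat=1. Latency is one of the statistics Fio collects while a test runs; with these set, Fio no longer collects it.
- lat (latency)
Total latency statistics. If slat or clat measurement is disabled, this one has to be disabled as well.
- slat (submission latency)
Submission latency statistics.
- clat (completion latency)
Completion latency statistics.
- aggrb: Aggregate bandwidth of threads in the group.
The combined bandwidth of all threads in the group.
- minb: Minimum average bandwidth a thread saw.
The lowest average bandwidth achieved by any single thread in the group.
- maxb: Maximum average bandwidth a thread saw.
The highest average bandwidth achieved by any single thread in the group.
- mint: Shortest runtime of threads in the group.
The runtime of the thread that finished soonest.
- maxt: Longest runtime of threads in the group.
The runtime of the thread that took longest.
Each group in this test contains a single job, which is why aggrb, minb and maxb are identical and mint equals maxt.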
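To show how these fields fit together, the sketch below parses the summary lines from this run and recomputes the bandwidth as io divided by runtime. It only covers the three groups whose io value is printed in KB; group 1 prints io in MB with rounding, so it is left out. The regular expression is written for this fio-2.2.8 output and is an assumption about the format, and the truncating division is what matches this run rather than a documented rule.

```python
import re

# Summary lines copied from the run above (groups 0, 2 and 3).
SUMMARY = """\
READ: io=444704KB, aggrb=44465KB/s, minb=44465KB/s, maxb=44465KB/s, mint=10001msec, maxt=10001msec
WRITE: io=392044KB, aggrb=39200KB/s, minb=39200KB/s, maxb=39200KB/s, mint=10001msec, maxt=10001msec
WRITE: io=262304KB, aggrb=26035KB/s, minb=26035KB/s, maxb=26035KB/s, mint=10075msec, maxt=10075msec
"""

# key=<integer><unit>, where the unit is KB/s, KB or msec.
FIELD = re.compile(r"(\w+)=(\d+)(?:KB/s|KB|msec)")


def parse_summary_line(line):
    """Split 'READ: io=...' into the direction and a dict of integer fields."""
    direction, _, rest = line.partition(":")
    fields = {key: int(value) for key, value in FIELD.findall(rest)}
    return direction.strip(), fields


for line in SUMMARY.splitlines():
    direction, f = parse_summary_line(line)
    estimate = f["io"] * 1000 // f["maxt"]  # KB transferred per second
    print(f"{direction}: aggrb={f['aggrb']}KB/s io/maxt={estimate}KB/s")
```

With several jobs in one group, the same parsing still works, but aggrb, minb and maxb would then differ from each other.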