block/genhd: use seq_put_decimal_ull for diskstats decimal values
seq_printf is costly. For each block device, /proc/diskstats emits 19
decimal values through seq_printf. On a system with 16 logical block
devices, profiling open/read/close sequences shows that seq_printf
accounts for ~76% of the samples in diskstats_show:
    diskstats_show(92.626% 2269372/2450040)
        seq_printf(76.026% 1725313/2269372)
            vsnprintf(99.163% 1710866/1725313)
                format_decode(26.597% 455040/1710866)
                number(19.554% 334542/1710866)
                memcpy_orig(4.183% 71570/1710866)
                ...
            srso_return_thunk(0.009% 148/1725313)
        part_stat_read_all(8.030% 182236/2269372)
One million rounds of open/read/close on /proc/diskstats take:
real 0m37.687s
user 0m0.264s
sys 0m32.911s
On average, each sequence takes ~0.032ms.
With this patch, most decimal values are emitted via seq_put_decimal_ull,
and performance improves significantly:
real 0m20.792s
user 0m0.316s
sys 0m20.463s
On average, each sequence takes ~0.020ms, a ~37.5% improvement.
Signed-off-by: David Wang <00107082@163.com>
Link: https://lore.kernel.org/r/20241108054500.4251-1-00107082@163.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>