docs.daveops.net

Snippets for yer computer needs

Systems Performance

“Systems Performance by Brendan Gregg”

60 second Linux perf troubleshooting

uptime
dmesg | tail
vmstat 1
mpstat -P ALL 1
pidstat 1
iostat -xz 1
free -m
sar -n DEV 1
sar -n TCP,ETCP 1
top
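
Several of the commands above (vmstat 1, mpstat -P ALL 1, and so on) keep printing until interrupted. A minimal sketch of a wrapper that bounds each one so the whole checklist can run unattended; the run and checklist helper names, the SAMPLES count, and the batch-mode top invocation are assumptions, not part of the original list:

```shell
# Sample count for the interval-based tools; 5 is an arbitrary choice.
SAMPLES=5

# run CMD [ARGS...]: print a header, run the command, skip it if not installed.
# (mpstat, pidstat, iostat and sar come from the sysstat package.)
run() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '\n### %s\n' "$*"
    "$@" || printf '(exit status %s)\n' "$?"
  else
    printf '\n### %s: not installed, skipping\n' "$1"
  fi
}

# The checklist, in the same order as above, with a bounded sample count.
checklist() {
  run uptime
  run sh -c 'dmesg | tail'
  run vmstat 1 "$SAMPLES"
  run mpstat -P ALL 1 "$SAMPLES"
  run pidstat 1 "$SAMPLES"
  run iostat -xz 1 "$SAMPLES"
  run free -m
  run sar -n DEV 1 "$SAMPLES"
  run sar -n TCP,ETCP 1 "$SAMPLES"
  run top -b -n 1    # batch mode, one snapshot, instead of interactive top
}
```

Call checklist to run it; each interval tool then prints 5 one-second samples and exits on its own.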

Performance tuning

Most to least effective

sysbench

benchmark CPU

sysbench --test=cpu --cpu-max-prime=20000 run

benchmark I/O

sysbench --test=fileio --file-total-size=10G prepare
sysbench --test=fileio --file-total-size=10G --file-test-mode=rndrw --init-rng=on --max-time=300 --max-requests=0 run
sysbench --test=fileio --file-total-size=10G cleanup

ftrace

trace-cmd

# list available plugins/events
trace-cmd list

perf

https://perf.wiki.kernel.org/index.php/Main_Page

benchmarking

Be aware of subtle floating-point rounding differences that can appear when a change alters the code path (e.g. a value staying in a CPU register vs. being written to main memory)
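
The register vs. memory case is hard to reproduce on demand, so here is a sketch of a related effect that is easy to see: the same three numbers summed in two different orders. It assumes awk does its arithmetic in IEEE double precision, which is the usual case. The variable names left and right are just for illustration.

```shell
# Same three numbers, two evaluation orders. Each intermediate sum is
# rounded separately, so the order of operations can change the last digits.
left=$(awk 'BEGIN { printf "%.17g", (0.1 + 0.2) + 0.3 }')
right=$(awk 'BEGIN { printf "%.17g", 0.1 + (0.2 + 0.3) }')

echo "(0.1 + 0.2) + 0.3 = $left"
echo "0.1 + (0.2 + 0.3) = $right"
```

If a benchmark compares results bit for bit, a difference like this can look like a regression when it is only a different rounding path.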

eBPF

bpftrace

# list all syscall tracepoints
bpftrace -l 'tracepoint:syscalls:*'

# run a bpftrace program
bpftrace -e 'tracepoint:syscalls:sys_enter_openat {printf "%s\n", comm}'

# get BPF instructions
bpftrace -v program.bt

program syntax:

probe /filter/ { action }

builtins:

var        desc
pid        process id
tid        thread id
uid        user id
username   username
comm       process or command name
curtask    current task_struct as a u64
nsecs      current time in nanoseconds
elapsed    time in nanoseconds since bpftrace start
kstack     kernel stack trace
ustack     user-level stack trace
arg0…argn  function arguments
args       tracepoint arguments
retval     function return value
func       function name
probe      full probe name

types:

var         desc
@name       global
@name[key]  hash (map)
@name[tid]  thread-local
$name       scratch
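
To tie the probe /filter/ { action } syntax, a few builtins, and the variable types together, here is a rough sketch of a read-latency histogram. The program itself is a hypothetical example (the map names @start and @read_us and the 5 second cutoff are assumptions), it needs root, and the delete() form shown may print a deprecation warning on newer bpftrace versions.

```shell
# Hypothetical example program: histogram of vfs_read() latency in microseconds.
prog='
kprobe:vfs_read { @start[tid] = nsecs; }   // thread-local: map keyed by tid
kretprobe:vfs_read /@start[tid]/ {         // filter: only threads seen entering
  $us = (nsecs - @start[tid]) / 1000;      // scratch variable
  @read_us[comm] = hist($us);              // map keyed by process name
  delete(@start[tid]);
}
interval:s:5 { exit(); }                   // stop after 5 seconds
'

# Only run when bpftrace is installed and we are root; otherwise just say so.
if command -v bpftrace >/dev/null 2>&1 && [ "$(id -u)" -eq 0 ]; then
  bpftrace -e "$prog" || echo "bpftrace exited with status $?"
else
  echo "bpftrace not available or not root; program not run"
fi
```

On exit bpftrace prints the remaining maps, so @read_us shows one histogram per process name.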

bpftool

# show loaded bpf programs
bpftool prog show

# dump BPF instructions of a program (here 123)
bpftool prog dump xlated id 123

Don’t make changes until you’ve profiled

Assuming code performance follows a power law, a small percentage of the lines of code accounts for most of the program’s overall runtime. If you change code without profiling first, you have only a small chance of touching the part that actually affects runtime performance.
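
A back-of-the-envelope sketch of why this matters, using Amdahl's law (my choice of model, not something the note above names). The percentages are made-up numbers and overall_speedup is a hypothetical helper:

```shell
# overall_speedup FRACTION FACTOR
#   FRACTION: share of total runtime spent in the code being optimized (0..1)
#   FACTOR:   how many times faster that code becomes
# Amdahl's law: speedup = 1 / ((1 - FRACTION) + FRACTION / FACTOR)
overall_speedup() {
  awk -v p="$1" -v s="$2" 'BEGIN { printf "%.2f\n", 1 / ((1 - p) + p / s) }'
}

# Made-up numbers: a hot path taking 90% of the runtime vs. a cold path
# taking 5%, each made 2x faster.
hot=$(overall_speedup 0.90 2)
cold=$(overall_speedup 0.05 2)

echo "hot path 2x faster:  ${hot}x overall"
echo "cold path 2x faster: ${cold}x overall"
```

Under these assumptions the same amount of optimization work gives about 1.82x on the hot path and about 1.03x on the cold one, which is why the profile should come first.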

Using time

field  desc
sys    time spent in kernel
user   time spent in userland
real   stopwatch time

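Note that sys and user combined don’t necessarily equal real: the sum can be lower when the process waits (on I/O, or because the CPU is busy with other processes), and higher when the process runs threads on several cores at once.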
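
A quick way to see real drift away from user + sys is to time something that only waits. This sketch assumes bash is installed, since the time keyword and the TIMEFORMAT variable are bash features; the timing variable name is just for illustration.

```shell
# bash-specific: `time` is a keyword and TIMEFORMAT sets its output format.
# Run through `bash -c` so it behaves the same whatever shell calls it.
# sleep waits without using the CPU, so real should be about 1 second
# while user and sys stay close to zero.
timing=$(bash -c 'TIMEFORMAT="real=%R user=%U sys=%S"; time sleep 1' 2>&1)
echo "$timing"
```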