mpstat

One of the most basic variables to monitor on a system is CPU usage. We usually watch it interactively with top, or with vmstat when we want to script the process.

But there’s a problem on multi-CPU systems, whether the CPUs are physical or virtual: vmstat shows the average usage across all CPUs. Depending on the software or hardware architecture, a single CPU at 100% can become a bottleneck and block processes even if the other CPUs are completely idle. On a two-CPU system, for instance, vmstat would show just 50% CPU usage in that case. This is quite typical of single-CPU systems with HyperThreading technology.

An alternative to vmstat for these situations is mpstat, from the sysstat package, which shows per-CPU usage rates. It is very useful when writing a script that feeds graphs in Cacti or raises alarms in Nagios.

# mpstat -P ALL
Linux 2.6.9-023stab046.2-enterprise (domain.com) 	25/09/08
20:39:02     CPU   %user   %nice %system %iowait    %irq   %soft   %idle    intr/s
20:39:02     all    0,79    0,00    0,17    5,74    0,00    0,00   93,30      0,00
20:39:02       0    0,87    0,01    0,19    6,85    0,00    0,00   92,09      0,00
20:39:02       1    0,87    0,00    0,18    5,94    0,00    0,00   93,00      0,00
20:39:02       2    0,74    0,00    0,16    5,14    0,00    0,00   93,96      0,00
20:39:02       3    0,68    0,00    0,16    5,02    0,00    0,00   94,15      0,00
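As a rough sketch of the alarm idea, the function below reads mpstat output on stdin and prints every CPU whose usage is above a threshold. The name busy_cpus and the default threshold of 90 are our own assumptions, not part of sysstat. It finds the %idle column by its header name, since the column layout differs between sysstat versions, and it accepts decimal commas like the ones in the output above.

```shell
# Hypothetical helper (not part of sysstat): list CPUs busier than a threshold.
# Reads "mpstat -P ALL" output on stdin and prints "<cpu> <busy%>" per match.
busy_cpus() {
    awk -v limit="${1:-90}" '
        # Skip the summary block that mpstat prints when given an interval.
        /^Average/ { next }
        # Header line: remember which columns hold the CPU id and %idle.
        /%idle/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU")   cpu_col  = i
                if ($i == "%idle") idle_col = i
            }
            next
        }
        # Data lines for individual CPUs (the "all" average is ignored).
        idle_col && $cpu_col ~ /^[0-9]+$/ {
            idle = $idle_col
            sub(",", ".", idle)      # tolerate decimal commas
            busy = 100 - idle
            if (busy > limit) printf "%s %.2f\n", $cpu_col, busy
        }
    '
}
```

It could be fed with something like `LC_ALL=C mpstat -P ALL 1 1 | busy_cpus 90`. Giving mpstat an interval makes it sample current activity; without one it reports averages since boot.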

iostat & iotop: I/O debugging

A couple of months ago, we had an interesting issue at a customer site: an application wasn’t performing well, but the system had more than 20% CPU idle and wasn’t swapping memory, so it didn’t look like a lack of resources. After a deeper look at vmstat, we saw a constant 30% of CPU time spent in I/O wait. We had some kind of I/O bottleneck.

To discover the root of the issue we used two programs:

  • iostat (comes with the sysstat package): similar to vmstat or ifstat, but shows I/O operations per device and partition, updating its output every X seconds.
  • iotop: like the classic top, sorting the processes according to their I/O rate.

By using these two utilities it’s quite easy to discover which process is creating the I/O bottleneck, and on which particular device.

In our case, the problem was a RAID controller with terrible write performance, coupled with a process that was doing around 15 small, random-access writes per second.
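For scripting the device side of this hunt, here is a minimal sketch that reads extended iostat output on stdin and prints the devices whose %util is above a threshold. The name busy_devices and the default of 80 are our own assumptions. The %util column is located by its header name, because the other columns vary between sysstat versions.

```shell
# Hypothetical helper (not part of sysstat): list devices busier than a threshold.
# Reads "iostat -dx" output on stdin and prints "<device> <%util>" per match.
busy_devices() {
    awk -v limit="${1:-80}" '
        # Header line: find the position of the %util column.
        /^Device/ {
            for (i = 1; i <= NF; i++) if ($i == "%util") util_col = i
            next
        }
        # Device lines: compare %util with the limit.
        util_col && NF >= util_col {
            util = $util_col
            sub(",", ".", util)      # tolerate decimal commas
            if (util + 0 > limit) print $1, util
        }
    '
}
```

A possible invocation is `LC_ALL=C iostat -dx 5 | busy_devices 80`. On the process side, `iotop -o` limits the display to processes that are actually doing I/O.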

Shell-script: timestamp

A one-liner function that, when another command’s output is piped into it, prefixes each line of that output with a timestamp.

Very useful with commands that print a series of lines periodically but without a timestamp (like vmstat). Without it, you can’t just send their output to a file and go back to it later, because you would have no idea when each line was written.

$ function timestamp { while IFS= read -r l; do printf '%s %s\n' "$(date +%H:%M:%S)" "$l"; done; }
$ vmstat 1 | timestamp
12:17:03 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
12:17:03  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
12:17:03  1  0      4  16740  42816 395164    0    0     6     5  174   11  3  1 96  0
12:17:04  0  0      4  16656  42816 395184    0    0     0     0  393  510  1  1 98  0
12:17:05  0  0      4  16656  42816 395184    0    0     0     0  391  781  2  1 98  0
12:17:06  1  0      4  16656  42824 395176    0    0     0    84  462  976  3  1 95  0
12:17:07  0  0      4  16656  42824 395184    0    0     0     0  433 1545 11  3 86  0
12:17:08  0  0      4  16656  42824 395184    0    0     0     0  356  807  1  2 97  0
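For logs that run longer than a day, a time-only stamp becomes ambiguous. A possible variant, sketched here under the assumption that we want the date format as an optional first argument (the name timestamp_fmt and the default format are ours):

```shell
# Hypothetical variant of timestamp: the date(1) format is the first argument.
timestamp_fmt() {
    # Default to a full date and time when no format is given.
    local fmt="${1:-%Y-%m-%d %H:%M:%S}"
    # IFS= and -r keep leading whitespace and backslashes in each line intact.
    while IFS= read -r line; do
        printf '%s %s\n' "$(date "+$fmt")" "$line"
    done
}
```

For example, `vmstat 1 | timestamp_fmt >> vmstat.log` keeps a dated log that can be reviewed later.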