Scheduling jitter measurement for UNIX


2009-03-12 latency2.txt README --

This program measures scheduling latency/jitter for a CPU-bound process on a UNIX system. It is intended to help determine how suitable various virtualization products are for hosting real-time processes such as game servers. To compile it, use g++ (make sure it is properly installed):

$ g++ -o latency2 latency2.cpp

When run, it times the scheduling jitter of a CPU-bound process, and prints one line of summary statistics to standard output about once a minute.
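The source of latency2.cpp is not reproduced in this README, so the following is only a sketch of one common way to take this kind of measurement, not the program's actual code. It assumes the jitter sample is the gap between two successive reads of a monotonic clock in a busy loop. The names JitterStats and measure_for are hypothetical; the fields simply mirror the columns the program prints.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>

// Hypothetical running statistics; field names mirror the printed columns.
struct JitterStats {
    double diff = 0.0;       // most recent gap, in seconds
    double biggest = 0.0;    // largest gap seen so far
    double smallest = 0.0;   // smallest gap seen so far
    double sum = 0.0;        // sum of gaps, used for the average
    double sumsq = 0.0;      // sum of squared gaps (an assumption)
    std::uint64_t count = 0; // number of gaps recorded

    void add(double gap) {
        diff = gap;
        biggest = (count == 0) ? gap : std::max(biggest, gap);
        smallest = (count == 0) ? gap : std::min(smallest, gap);
        sum += gap;
        sumsq += gap * gap;
        ++count;
    }

    double avg() const {
        return count ? sum / static_cast<double>(count) : 0.0;
    }
};

// Spin on the monotonic clock for `seconds`, recording the gap between
// successive clock reads. A CPU-bound loop normally sees tiny gaps; a
// large gap means the process was descheduled for that long.
JitterStats measure_for(double seconds) {
    using clock = std::chrono::steady_clock;
    JitterStats stats;
    const auto start = clock::now();
    auto prev = start;
    for (;;) {
        const auto now = clock::now();
        stats.add(std::chrono::duration<double>(now - prev).count());
        prev = now;
        if (std::chrono::duration<double>(now - start).count() >= seconds) {
            break;
        }
    }
    return stats;
}
```

Calling measure_for(60.0) and printing the fields would give one line comparable in spirit to the program's output, under the assumptions stated above.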

To use it, start one copy per CPU on the machine. Wait five minutes, and then look at the output of each copy. Most important is the "biggest" number -- if it's higher than 30 milliseconds, chances are that the virtualization technology gets in the way. On a "raw metal" machine, you typically see jitter of up to 3 milliseconds. You can also calculate the standard deviation from the data provided, using the "sumsq" and "count" values. A large standard deviation means that the jitter is noisy, which makes the machine a poor host for real-time processes.

For example, on a dual-core machine:

$ ./latency2 > core1.txt &
$ ./latency2 > core2.txt &

# wait 5 minutes

$ killall latency2

Now, check the performance of each copy by looking at the data in the generated txt files. All values are in seconds, except "count", which is a number of measurements:

diff 0.000000 avg 0.000000 biggest 0.000000 smallest 0.000000 sumsq 0.000000 count 1.000000, runtime 0.000001
diff 0.000001 avg 0.000000 biggest 0.000206 smallest 0.000000 sumsq 0.000026 count 396915411.000000, runtime 60.000002

The first example line is always printed for the first measurement, to show that the program has started; it is not a useful measurement. The following lines are printed once a minute. Because the unit is seconds, you can see that the biggest jitter on this machine over the duration of a minute was 0.21 milliseconds -- a good measurement, showing that the machine is quiescent. "diff" is the last measurement at the time of logging; "avg" is the average jitter time; "sumsq" and "count" can be used to calculate standard deviation; and "runtime" is the number of seconds for which the process has run so far.
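To make the standard deviation calculation concrete, here is a minimal sketch that parses one summary line in the format shown above and derives a standard deviation from it. It assumes that "sumsq" is the sum of the squared measurements, which this README does not state explicitly, and it uses the usual identity variance = sumsq/count - avg*avg. The names Summary, parse_summary and stddev are hypothetical helpers, not part of latency2.

```cpp
#include <cmath>
#include <cstdio>

// One parsed summary line; every field is printed as a floating-point value.
struct Summary {
    double diff, avg, biggest, smallest, sumsq, count, runtime;
};

// Parse a line in the format shown in the sample output.
// Returns false if the line does not match that format.
bool parse_summary(const char* line, Summary& out) {
    const int fields = std::sscanf(
        line,
        "diff %lf avg %lf biggest %lf smallest %lf sumsq %lf count %lf, runtime %lf",
        &out.diff, &out.avg, &out.biggest, &out.smallest,
        &out.sumsq, &out.count, &out.runtime);
    return fields == 7;
}

// Population standard deviation in seconds, ASSUMING sumsq is the sum of
// squared samples. Rounding in the printed values can push the variance
// slightly below zero, so it is clamped.
double stddev(const Summary& s) {
    if (s.count <= 0.0) {
        return 0.0;
    }
    const double variance = s.sumsq / s.count - s.avg * s.avg;
    return variance > 0.0 ? std::sqrt(variance) : 0.0;
}
```

The printed values are rounded to six decimal places, so a result computed this way is only approximate. In the sample output "avg" prints as 0.000000, which makes the computed figure an upper estimate. For the second sample line it comes out at roughly a quarter of a microsecond.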

This program is released into the public domain by Jon Watte, Forterra Systems.
Do with it what you wish, but we make no guarantees about its merchantability or fitness for a particular purpose -- all warranties, express or implied, are specifically disclaimed.

latency2.zip (2.07 KB)