Measuring scheduling latency on virtualized versus bare hardware


The scheduling latency is the longest time that a "ready" process may have to wait before it actually gets to run.
On a system with a single CPU/core, and many ready processes, this latency will be very long.
On a system that is idle, and has many free CPUs/cores, it should theoretically be zero.
However, it will actually be some number greater than zero, because of interactions between the software and the system (interrupts, etc.), as well as interactions between the system and the platform it runs on.
This is especially true with systems that run on virtual hardware, such as virtual private servers, Amazon Elastic Compute Cloud, and similar pay-as-you-go hosting environments.

Scheduling latency doesn't matter much for a traditional web server. However, for real-time systems, such as VoIP services, game servers, or similar, a scheduling latency as small as 30 milliseconds (about 1/30th of a second) may render the service unacceptably degraded.

I've seen some hosts on EC2 have over a second of scheduling latency! I may be getting "one CPU's worth" of compute time, but certainly not at a scheduling interval that's useful to a real-time server.

To measure the actual scheduling latency of a piece of hardware, you can compile and run this simple program (assuming your OS is some form of Linux). It will measure how long it takes a process to go between two successive readings of the system clock. If that time is noticeably larger than just the function or system call overhead, then what happened was that the program got interrupted in the middle, and then brought back into running, thus effectively measuring the scheduling latency of the system for a running process.

/*
  calclat - calculate scheduling latency (on a Linux host)
  By Jon Watte -- released into the Public Domain
  build with
  g++ -o calclat calclat.cpp -lrt
  run it, and see what your worst case scheduling latency is
*/
#include <time.h>
#include <stdio.h>

double read_clock()
{
    struct timespec tv;
    clock_gettime(CLOCK_REALTIME, &tv);
    return tv.tv_sec + 1e-9 * tv.tv_nsec;
}

int main()
{
    double worst = 0;
    time_t start, stop;
    time(&start);
    stop = start;
    fprintf(stderr, "this will take 10 seconds\n");
    while (stop - start < 10)
    {
        for (int i = 0; i < 1000; ++i)
        {
            double a = read_clock();
            double b = read_clock();
            if (b - a > worst)
            {
                worst = b - a;
            }
        }
        time(&stop);
    }
    fprintf(stdout, "worst latency: %.3f ms\n", worst * 1000.0);
    return 0;
}