• Bug#1095028: xvfb: Race condition in xvfb-run

    From Santiago Vila@21:1/5 to All on Sun May 25 13:40:02 2025
    XPost: linux.debian.maint.x

    severity 1093686 grave
    affects 1093686 src:rhythmbox src:merkaartor src:libktorrent src:maliit-keyboard src:kf6-kconfig
    thanks

    Hello Simon. I was going to comment on #1093686 against src:rhythmbox,
    but then I realized this is very likely a manifestation of this bug in
    xvfb-run, which is probably the reason why several other packages
    FTBFS randomly as well.

    In fact, all the packages below use xvfb-run in their tests, and I get random failures
    with the following failure rates:

    0.260 rhythmbox (26/100)         reported as #1093686
    0.270 merkaartor (27/100)        (not reported yet)
    0.340 libktorrent (34/100)       (not reported yet)
    0.400 maliit-keyboard (40/100)   (not reported yet)
    0.411 kf6-kconfig (37/90)        (not reported yet)

    Build logs available here:

    https://people.debian.org/~sanvila/build-logs/xvfb/

    According to the general guidelines given by Paul in Bug #1057562,
    I think all the above issues should be considered as RC.

    (Note: In the above, I'm using a threshold of 1/4, which is a little
    bit more permissive than the 1/6 figure suggested by Paul).

    Thanks.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Simon McVittie@21:1/5 to Ole Streicher on Mon Jun 2 13:30:01 2025
    XPost: linux.debian.maint.x

    On Sun, 02 Feb 2025 at 20:05:41 +0100, Ole Streicher wrote:
    > the build of the "giza" package currently fails in some environments
    > due to an issue with xvfb-run [1], which is used from debian/rules.
    >
    > The problem is that the xvfb-run script only checks that Xvfb is
    > running (by signalling with signal 0), but not whether it is actually
    > active (and accepts X client calls):
    >
    >     wait || :
    >     if kill -0 $XVFBPID 2>/dev/null; then
    >         break

    Note that xvfb-run does have logic that is intended to wait for Xvfb to
    be ready: it sets SIGUSR1 to be handled (with a trivial handler), and
    then the wait(1) builtin in the quoted section has (is meant to have?)
    two terminating conditions:

    1. the xvfb-run script receives SIGUSR1, terminating wait(1)
    unsuccessfully, after which Xvfb should already have its listening
    socket ready to receive requests, `kill -0 $XVFBPID` should still
    succeed, and then whatever tests we are running should also succeed;

    2. or the Xvfb process terminates early due to an error, after which
    wait(1) exits successfully, but then `kill -0 $XVFBPID` fails
    (and then xvfb-run also fails)
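
    To make those two cases concrete, here is a minimal sketch in shell.
    It is not the xvfb-run code: start_and_wait is a hypothetical helper,
    and a stand-in child (a small sh -c command) plays the role of Xvfb by
    either signalling its parent with SIGUSR1 or exiting early.

```shell
#!/bin/sh
# Sketch only: a stand-in child replaces Xvfb, and start_and_wait is a
# hypothetical helper, not part of xvfb-run.

# start_and_wait MODE
#   ready: the child sends SIGUSR1 to its parent after a delay, then keeps running
#   crash: the child exits with an error without ever signalling
# Prints "ready" or "failed" on stdout.
start_and_wait() {
    trap : USR1    # trivial handler: a trapped signal interrupts wait

    if [ "$1" = ready ]; then
        # $PPID inside the new sh is the shell that started it, much like
        # an X server signalling its parent process.
        sh -c 'sleep 1; kill -USR1 "$PPID"; exec sleep 30' &
    else
        sh -c 'sleep 1; exit 1' &
    fi
    child=$!

    # Case 1: SIGUSR1 arrives, wait returns a status above 128.
    # Case 2: the child exits, wait returns 0 because no children remain.
    wait || :

    if kill -0 "$child" 2>/dev/null; then
        echo ready
        kill "$child" 2>/dev/null
        wait "$child" 2>/dev/null || :
    else
        echo failed
    fi
    trap - USR1
}
```

    Note that the sketch only behaves as described if the signal arrives
    while the shell is already blocked in wait; the one-second delay in
    the stand-in child is there to make that very likely, which a real
    server does not promise.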

    But perhaps that logic is wrong? This is not a straightforward thing to
    do correctly in shell script, and perhaps using a Perl or C helper would
    be more reliable.

    Xserver(1) documents two ways to wait for the X server to be ready. One
    is the SIGUSR1 mechanism used by xvfb-run. The other is to run it with
    option `-displayfd FD`, which makes it choose and output a display
    number on the given fd, similar to `dbus-daemon --print-address=FD` (see
    also test/simple-xinit.c and libxkbcommon_1.7.0-2/test/xvfb-wrapper.c).
    This is not compatible with `xvfb-run --server-num` because it doesn't
    allow the caller to influence the display number to use, but perhaps
    `xvfb-run -a` could use -displayfd?
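
    A rough sketch of how a shell wrapper might consume -displayfd, using
    a FIFO as the descriptor. start_with_displayfd, SERVER_PID and
    DISPLAY_NUM are names invented for this illustration, and the sketch
    assumes only what Xserver(1) documents: the server writes the chosen
    display number to the given fd as a newline-terminated string.

```shell
#!/bin/sh
# Sketch only: start_with_displayfd is a hypothetical helper, not part of
# xvfb-run. It assumes the server command accepts "-displayfd FD" and
# writes a newline-terminated display number to that descriptor.

# start_with_displayfd COMMAND [ARGS...]
# On success sets SERVER_PID and DISPLAY_NUM and returns 0.
start_with_displayfd() {
    fifo_dir=$(mktemp -d) || return 1
    mkfifo "$fifo_dir/displayfd" || { rm -rf "$fifo_dir"; return 1; }

    # The server inherits the write end of the FIFO as descriptor 3.
    "$@" -displayfd 3 3>"$fifo_dir/displayfd" &
    SERVER_PID=$!

    # Blocks until the server reports a display number, or hits
    # end-of-file if the server exits without writing one.
    DISPLAY_NUM=
    read -r DISPLAY_NUM <"$fifo_dir/displayfd" || :
    rm -rf "$fifo_dir"

    # Succeed only if a display number was actually received.
    [ -n "$DISPLAY_NUM" ]
}
```

    With a real server this might be called as
    `start_with_displayfd Xvfb -screen 0 1280x1024x24`, after which
    clients would run with DISPLAY=":$DISPLAY_NUM". Authentication setup
    (xauth, -auth) is left out of the sketch.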

    smcv
