• timeout in shutdown, mutt killed by SIGKILL

    From Vincent Lefevre@21:1/5 to All on Wed May 29 03:00:01 2024
    During the latest shutdown:

    May 29 01:55:05 qaa systemd[1]: Stopping session-2.scope - Session 2 of User vinc17...
    [...]
    May 29 01:55:26 qaa systemd[1]: session-2.scope: Stopping timed out. Killing. May 29 01:55:26 qaa systemd[1]: session-2.scope: Killing process 2990 (mutt) with signal SIGKILL.
    May 29 01:55:26 qaa systemd[1]: session-2.scope: Failed with result 'timeout'. May 29 01:55:26 qaa systemd[1]: Stopped session-2.scope - Session 2 of User vinc17.
    May 29 01:55:26 qaa systemd[1]: session-2.scope: Consumed 8h 17min 54.832s CPU time.
    May 29 01:55:26 qaa systemd[1]: Stopping systemd-logind.service - User Login Management...
    May 29 01:55:26 qaa systemd[1]: Stopping user@1000.service - User Manager for UID 1000...
    [...]

    Note: I had set DefaultTimeoutStopSec=20s because then is normally
    sufficient.

    The mutt process was running in GNU Screen. The current mailbox
    was just a local one. Mutt does no network connections (I do not
    use IMAP). I general, I quit it before logging out, but this time,
    it seems that I forgot.

    But I don't understand why there was a timeout. Does this mean that
    mutt didn't react to SIGTERM? Any reason?

    --
    Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
    100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
    Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Curt@21:1/5 to Vincent Lefevre on Wed May 29 18:20:01 2024
    On 2024-05-29, Vincent Lefevre <vincent@vinc17.net> wrote:
    During the latest shutdown:

    May 29 01:55:05 qaa systemd[1]: Stopping session-2.scope - Session 2 of User vinc17...
    [...]
    May 29 01:55:26 qaa systemd[1]: session-2.scope: Stopping timed out. Killing. May 29 01:55:26 qaa systemd[1]: session-2.scope: Killing process 2990 (mutt) with signal SIGKILL.
    May 29 01:55:26 qaa systemd[1]: session-2.scope: Failed with result 'timeout'.
    May 29 01:55:26 qaa systemd[1]: Stopped session-2.scope - Session 2 of User vinc17.
    May 29 01:55:26 qaa systemd[1]: session-2.scope: Consumed 8h 17min 54.832s CPU time.
    May 29 01:55:26 qaa systemd[1]: Stopping systemd-logind.service - User Login Management...
    May 29 01:55:26 qaa systemd[1]: Stopping user@1000.service - User Manager for UID 1000...
    [...]

    Note: I had set DefaultTimeoutStopSec=20s because then is normally sufficient.

    The mutt process was running in GNU Screen. The current mailbox
    was just a local one. Mutt does no network connections (I do not
    use IMAP). I general, I quit it before logging out, but this time,
    it seems that I forgot.

    But I don't understand why there was a timeout. Does this mean that
    mutt didn't react to SIGTERM? Any reason?


    sudo journalctl -b-1?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Vincent Lefevre@21:1/5 to Curt on Thu May 30 18:10:02 2024
    On 2024-05-29 16:13:05 -0000, Curt wrote:
    On 2024-05-29, Vincent Lefevre <vincent@vinc17.net> wrote:
    During the latest shutdown:

    May 29 01:55:05 qaa systemd[1]: Stopping session-2.scope - Session 2 of User vinc17...
    [...]
    May 29 01:55:26 qaa systemd[1]: session-2.scope: Stopping timed out. Killing.
    May 29 01:55:26 qaa systemd[1]: session-2.scope: Killing process 2990 (mutt) with signal SIGKILL.
    May 29 01:55:26 qaa systemd[1]: session-2.scope: Failed with result 'timeout'.
    May 29 01:55:26 qaa systemd[1]: Stopped session-2.scope - Session 2 of User vinc17.
    May 29 01:55:26 qaa systemd[1]: session-2.scope: Consumed 8h 17min 54.832s CPU time.
    May 29 01:55:26 qaa systemd[1]: Stopping systemd-logind.service - User Login Management...
    May 29 01:55:26 qaa systemd[1]: Stopping user@1000.service - User Manager for UID 1000...
    [...]

    Note: I had set DefaultTimeoutStopSec=20s because then is normally sufficient.

    The mutt process was running in GNU Screen. The current mailbox
    was just a local one. Mutt does no network connections (I do not
    use IMAP). I general, I quit it before logging out, but this time,
    it seems that I forgot.

    But I don't understand why there was a timeout. Does this mean that
    mutt didn't react to SIGTERM? Any reason?


    sudo journalctl -b-1?

    See above. There's nothing else about mutt in the journal.

    --
    Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
    100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
    Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Vincent Lefevre@21:1/5 to Max Nikulin on Thu May 30 18:20:01 2024
    On 2024-05-30 00:19:30 +0700, Max Nikulin wrote:
    On 29/05/2024 07:44, Vincent Lefevre wrote:
    But I don't understand why there was a timeout. Does this mean that
    mutt didn't react to SIGTERM? Any reason?

    Have you tried to send SIGTERM to mutt?

    I didn't. AFAIK, systemd sends a SIGTERM to all the processes of the
    session: that's the

    May 29 01:55:05 qaa systemd[1]: Stopping session-2.scope - Session 2 of User vinc17...

    but this timed out, apparently due to the mutt process, which was
    still running, so that systemd sent a SIGKILL (which succeeded).

    May 29 01:55:26 qaa systemd[1]: session-2.scope: Stopping timed out. Killing. May 29 01:55:26 qaa systemd[1]: session-2.scope: Killing process 2990 (mutt) with signal SIGKILL.

    If it ignores this signal or the reaction is some prompt then you
    need to find another way to stop mutt and to configure systemd user
    session to do it on logout (shutdown).

    A SIGTERM normally kills mutt. In signal.c, signals are blocked
    "while doing critical ops", namely for compressing/decompressing
    (which I don't use) and when locking a mailbox (but I don't see
    why this would have happened). Otherwise the signal handler just
    does that:

    static void exit_handler (int sig)
    {
    curs_set (1);
    endwin (); /* just to be safe */

    exit_print_string (Caught_Signal_L10N ? Caught_Signal_L10N : "Caught signal ");
    #if SYS_SIGLIST_DECLARED
    exit_print_string (sys_siglist[sig]);
    #else
    #if (__sun__ && __svr4__)
    exit_print_string (_sys_siglist[sig]);
    #else
    #if (__alpha && __osf__)
    exit_print_string (__sys_siglist[sig]);
    #else
    exit_print_int (sig);
    #endif
    #endif
    #endif
    exit_print_string (Exiting_L10N ? Exiting_L10N : "... Exiting.\n");
    exit (0);
    }

    --
    Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
    100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
    Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Vincent Lefevre@21:1/5 to Max Nikulin on Fri May 31 10:20:01 2024
    On 2024-05-31 10:02:57 +0700, Max Nikulin wrote:
    Do you mean that mutt properly exits unless it receives SIGTERM in the
    course of shutdown process?

    I think that this was not the first time I did a shutdown while Mutt
    was still running. But this was the first time it did not exit.

    I would try to enable debug log in mutt. There is a chance that
    networking is already disabled (or some other system unit important
    for mutt is already stopped) when systemd tries to kill mutt first
    time.

    No, according to the logs, the network was stopped after SIGKILL
    was sent to Mutt:

    May 29 01:55:26 qaa systemd[1]: session-2.scope: Stopping timed out. Killing. May 29 01:55:26 qaa systemd[1]: session-2.scope: Killing process 2990 (mutt) with signal SIGKILL.
    May 29 01:55:26 qaa systemd[1]: session-2.scope: Failed with result 'timeout'. May 29 01:55:26 qaa systemd[1]: Stopped session-2.scope - Session 2 of User vinc17.
    May 29 01:55:26 qaa systemd[1]: session-2.scope: Consumed 8h 17min 54.832s CPU time.
    May 29 01:55:26 qaa systemd[1]: Stopping systemd-logind.service - User Login Management...
    [...]
    May 29 01:55:26 qaa systemd[1]: Stopping NetworkManager.service - Network Manager...
    May 29 01:55:26 qaa NetworkManager[1238]: <info> [1716940526.2966] caught SIGTERM, shutting down normally.
    May 29 01:55:26 qaa NetworkManager[1238]: <info> [1716940526.2975] dhcp4 (eth0): canceled DHCP transaction

    Note that mutt is not the only case of timeout at shutdown. There's
    also

    May 25 22:53:32 qaa systemd[2230]: xdg-document-portal.service: State 'stop-sigterm' timed out. Killing.
    May 25 22:53:32 qaa systemd[2230]: xdg-document-portal.service: Killing process 116031 (xdg-document-po) with signal SIGKILL.
    May 25 22:53:32 qaa systemd[2230]: xdg-document-portal.service: Killing process 116037 (fusermount3) with signal SIGKILL.
    May 25 22:53:32 qaa systemd[2230]: xdg-document-portal.service: Main process exited, code=killed, status=9/KILL
    May 25 22:53:32 qaa systemd[2230]: xdg-document-portal.service: Failed with result 'timeout'.

    This one is a known issue:

    https://github.com/flatpak/xdg-desktop-portal/issues/999

    I'm wondering whether there could be a same cause.

    --
    Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
    100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
    Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Vincent Lefevre@21:1/5 to Vincent Lefevre on Fri May 31 11:20:01 2024
    On 2024-05-31 10:10:32 +0200, Vincent Lefevre wrote:
    On 2024-05-31 10:02:57 +0700, Max Nikulin wrote:
    Do you mean that mutt properly exits unless it receives SIGTERM in the course of shutdown process?

    I think that this was not the first time I did a shutdown while Mutt
    was still running. But this was the first time it did not exit.

    I've tried a shutdown with Mutt still running in GNU Screen
    a couple of times, and I got no timeout, i.e. Mutt exited as
    expected.

    --
    Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
    100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
    Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Vincent Lefevre@21:1/5 to Max Nikulin on Tue Jun 4 02:40:01 2024
    On 2024-05-31 19:05:45 +0700, Max Nikulin wrote:
    Do you see an attempt to send SIGTERM to mutt before timeout and SIGKILL?

    Unfortunately, there was no information from systemd. Some daemons
    log a received SIGTERM, but mutt isn't a daemon.

    What other processes survived first step? Are there something suspicious before SIGKILL stage?

    According to the journalctl output, only mutt survived (the only
    process to receive a SIGKILL). And nothing suspicious.

    --
    Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
    100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
    Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)