• Bug#1070027: in package dpkg marked as pending

    From Vincent Lefevre@1:229/2 to Guillem Jover on Thu Jul 25 11:10:04 2024
    XPost: linux.debian.bugs.dist
    From: vincent@vinc17.net

    Control: reopen -1
    Control: severity -1 important

    On 2024-07-11 00:49:20 +0200, Guillem Jover wrote:
    libdpkg: Try to print the executable name of the lock contending process

    Just printing the PID is not very useful to try to track down the
    contending process as its presence might be momentary and might no
    longer be present when the user tries to look for that specific PID.

    Try to get the executable name to give a better hint to what might be
    going wrong.

    Closes: #1070027

    Since the problem occurred again, this time on the dpkg upgrade itself:

    Unpacking dpkg (1.22.9) over (1.22.7) ...
    Setting up dpkg (1.22.9) ...
    dpkg: error: dpkg frontend lock was locked by /usr/bin/apt-get process with pid 326336

    But this is rather useless information. Perhaps it should also
    write the full "ps -ef" output (something like that) to a file,
    though this may be too late.

    According to the journalctl output:

    Jul 25 10:30:36 qaa systemd[1]: Reload requested from client PID 326242 ('systemctl') (unit session-2.scope)...
    Jul 25 10:30:36 qaa systemd[1]: Reloading...
    Jul 25 10:30:37 qaa systemd[1]: Reloading finished in 153 ms.
    Jul 25 10:30:37 qaa systemd[1]: Starting apt-daily.service - Daily apt download activities...
    Jul 25 10:30:37 qaa systemd[1]: apt-daily.service: Deactivated successfully. Jul 25 10:30:37 qaa systemd[1]: Finished apt-daily.service - Daily apt download activities.

    but again, I did *not* do a systemctl. So either systemd is completely
    broken, or perhaps the systemctl was done by dpkg itself. Note that in
    this latter case (I would not be surprised, because when this happens,
    this is *always* during an upgrade), even if aptitude had some fix of
    frontend locking, there would still be a conflict between aptitude and
    dpkg, both leading to a request a lock.

    The PPID (with the process name) of 326242 would have been useful,
    but systemd does not give it.

    --
    Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
    100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
    Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

    --- SoupGate-Win32 v1.05
    * Origin: you cannot sedate... all the things you hate (1:229/2)
  • From Vincent Lefevre@1:229/2 to Vincent Lefevre on Thu Jul 25 12:00:01 2024
    XPost: linux.debian.bugs.dist
    From: vincent@vinc17.net

    Control: retitle -1 dpkg failure at "Setting up" due to dpkg frontend lock by apt-get via apt-daily.service

    On 2024-07-25 11:03:50 +0200, Vincent Lefevre wrote:
    According to the journalctl output:

    Jul 25 10:30:36 qaa systemd[1]: Reload requested from client PID 326242 ('systemctl') (unit session-2.scope)...
    Jul 25 10:30:36 qaa systemd[1]: Reloading...
    Jul 25 10:30:37 qaa systemd[1]: Reloading finished in 153 ms.
    Jul 25 10:30:37 qaa systemd[1]: Starting apt-daily.service - Daily apt download activities...
    Jul 25 10:30:37 qaa systemd[1]: apt-daily.service: Deactivated successfully. Jul 25 10:30:37 qaa systemd[1]: Finished apt-daily.service - Daily apt download activities.

    but again, I did *not* do a systemctl. So either systemd is completely broken, or perhaps the systemctl was done by dpkg itself.

    In /var/lib/dpkg/info/dpkg.postinst I can see

    systemctl --system daemon-reload >/dev/null || true

    I'm wondering whether this is the cause.

    This line is there due to

    # Due to #932360 in debhelper we need to explicitly tell systemd to reload.

    Note that in this latter case (I would not be surprised, because
    when this happens, this is *always* during an upgrade), even if
    aptitude had some fix of frontend locking, there would still be a
    conflict between aptitude and dpkg, both leading to a request a
    lock.

    The reload is triggered with just

    dpkg -i /var/cache/apt/archives/dpkg_1.22.9_amd64.deb

    so aptitude and even apt aren't even involved in this bug:

    Jul 25 11:47:56 qaa systemd[1]: Reload requested from client PID 333083 ('systemctl') (unit session-2.scope)...
    Jul 25 11:47:56 qaa systemd[1]: Reloading...
    Jul 25 11:47:56 qaa systemd[1]: Reloading finished in 139 ms.

    --
    Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
    100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
    Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

    --- SoupGate-Win32 v1.05
    * Origin: you cannot sedate... all the things you hate (1:229/2)
  • From Vincent Lefevre@1:229/2 to Vincent Lefevre on Thu Jul 25 13:40:02 2024
    XPost: linux.debian.bugs.dist
    From: vincent@vinc17.net

    On 2024-07-25 11:49:42 +0200, Vincent Lefevre wrote:
    The reload is triggered with just

    dpkg -i /var/cache/apt/archives/dpkg_1.22.9_amd64.deb

    so aptitude and even apt aren't even involved in this bug:

    Jul 25 11:47:56 qaa systemd[1]: Reload requested from client PID 333083 ('systemctl') (unit session-2.scope)...
    Jul 25 11:47:56 qaa systemd[1]: Reloading...
    Jul 25 11:47:56 qaa systemd[1]: Reloading finished in 139 ms.

    And if the timer time has been reached, the apt-daily service is run.

    To try to reproduce the issue, I've copied

    /usr/lib/systemd/system/apt-daily.service
    /usr/lib/systemd/system/apt-daily.timer

    into /etc/systemd/system and edited them.

    For /etc/systemd/system/apt-daily.service, I've changed the ExecStart
    script pathname and edited the copy of the apt.systemd.daily script to
    set VERBOSE=1 and ducplicate the following lines

    # check if we can lock the cache and if the cache is clean
    if command -v apt-get >/dev/null && ! eval apt-get check $XAPTOPT $XSTDERR ; then
    debug_echo "error encountered in cron job with \"apt-get check\"."
    exit 0
    fi

    6 times (as "apt-get check" attempts to lock the cache, thus
    giving more chance of failure).

    For /etc/systemd/system/apt-daily.timer, I've changed the following
    lines to get

    OnCalendar=*-*-* *:*:00
    RandomizedDelaySec=1m

    thus reaching the timer time more often.

    Now, whether I use

    aptitude reinstall dpkg dpkg-dev
    or
    apt install --reinstall dpkg dpkg-dev
    or
    dpkg -i /var/cache/apt/archives/dpkg_1.22.9_amd64.deb \
    /var/cache/apt/archives/dpkg-dev_1.22.9_all.deb

    I can get a failure in the apt.systemd.daily script:

    Jul 25 12:59:17 qaa systemd[1]: Starting apt-daily.service - Daily apt download activities...
    Jul 25 12:59:17 qaa apt.systemd.daily[370789]: error encountered in cron job with "apt-get check".
    Jul 25 12:59:17 qaa systemd[1]: apt-daily.service: Deactivated successfully. Jul 25 12:59:17 qaa systemd[1]: Finished apt-daily.service - Daily apt download activities.

    With VERBOSE=2, I could get more details about this failure:

    Jul 25 13:22:16 qaa systemd[1]: Starting apt-daily.service - Daily apt download activities...
    Jul 25 13:22:16 qaa apt.systemd.daily[378644]: E: Could not get lock /var/lib/dpkg/lock-frontend. It is held by process 378454 (apt)
    Jul 25 13:22:16 qaa apt.systemd.daily[378644]: E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), is another process using it?
    Jul 25 13:22:16 qaa apt.systemd.daily[378637]: error encountered in cron job with "apt-get check".
    Jul 25 13:22:16 qaa systemd[1]: apt-daily.service: Deactivated successfully. Jul 25 13:22:16 qaa systemd[1]: Finished apt-daily.service - Daily apt download activities.

    Here, the failure occurs in the apt.systemd.daily, because the
    process used for the upgrade got the lock first. But it could
    be the other way round, as this could be seen with aptitude.

    --
    Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
    100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
    Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

    --- SoupGate-Win32 v1.05
    * Origin: you cannot sedate... all the things you hate (1:229/2)
  • From Guillem Jover@1:229/2 to Vincent Lefevre on Wed Aug 7 11:10:01 2024
    XPost: linux.debian.bugs.dist
    From: guillem@debian.org

    Control: reassign -1 apt
    Control: retitle -1 apt: daily systemd service handles dpkg frontend locking incorrectly?

    Hi!

    [ Please see the entire bug report for more details, otherwise I'm
    sure Vincent can fill in any missing pieces. ]

    On Thu, 2024-07-25 at 11:03:50 +0200, Vincent Lefevre wrote:
    Control: reopen -1
    Control: severity -1 important

    On 2024-07-11 00:49:20 +0200, Guillem Jover wrote:
    libdpkg: Try to print the executable name of the lock contending process

    Just printing the PID is not very useful to try to track down the contending process as its presence might be momentary and might no
    longer be present when the user tries to look for that specific PID.

    Try to get the executable name to give a better hint to what might be
    going wrong.

    Closes: #1070027

    I added this mostly as a debugging aid, even though I had already kind
    of tracked the surrounding area for this, I guess I was expecting a
    clone and reassign or a new bug report filed on apt, but probably I
    should have done that myself, or not closed the report for the
    improved debugging output (just adding a reference).

    Since the problem occurred again, this time on the dpkg upgrade itself:

    Unpacking dpkg (1.22.9) over (1.22.7) ...
    Setting up dpkg (1.22.9) ...
    dpkg: error: dpkg frontend lock was locked by /usr/bin/apt-get process with pid 326336

    But this is rather useless information. Perhaps it should also
    write the full "ps -ef" output (something like that) to a file,
    though this may be too late.

    Printing more stuff could certainly be helpful, but would be annoying.
    I'd expect something like this bug not being common, so I'm not sure
    (at least right now) it's worth it to implement further output, or
    relying on ps or pstree or similar (which would seem rather meh).

    According to the journalctl output:

    Jul 25 10:30:36 qaa systemd[1]: Reload requested from client PID 326242 ('systemctl') (unit session-2.scope)...
    Jul 25 10:30:36 qaa systemd[1]: Reloading...
    Jul 25 10:30:37 qaa systemd[1]: Reloading finished in 153 ms.
    Jul 25 10:30:37 qaa systemd[1]: Starting apt-daily.service - Daily apt download activities...
    Jul 25 10:30:37 qaa systemd[1]: apt-daily.service: Deactivated successfully. Jul 25 10:30:37 qaa systemd[1]: Finished apt-daily.service - Daily apt download activities.

    but again, I did *not* do a systemctl. So either systemd is completely broken, or perhaps the systemctl was done by dpkg itself. Note that in
    this latter case (I would not be surprised, because when this happens,
    this is *always* during an upgrade), even if aptitude had some fix of frontend locking, there would still be a conflict between aptitude and
    dpkg, both leading to a request a lock.

    A reload is the common operation expected when installing any systemd
    service file, so if that would cause a dpkg lock issue, then that
    would be a general problem. This is specific to the interaction of the apt-daily.service and how it ends up locking things.

    On Thu, 2024-07-25 at 13:27:52 +0200, Vincent Lefevre wrote:
    On 2024-07-25 11:49:42 +0200, Vincent Lefevre wrote:
    The reload is triggered with just

    dpkg -i /var/cache/apt/archives/dpkg_1.22.9_amd64.deb

    so aptitude and even apt aren't even involved in this bug:

    Jul 25 11:47:56 qaa systemd[1]: Reload requested from client PID 333083 ('systemctl') (unit session-2.scope)...
    Jul 25 11:47:56 qaa systemd[1]: Reloading...
    Jul 25 11:47:56 qaa systemd[1]: Reloading finished in 139 ms.

    And if the timer time has been reached, the apt-daily service is run.

    To try to reproduce the issue, I've copied

    /usr/lib/systemd/system/apt-daily.service
    /usr/lib/systemd/system/apt-daily.timer

    into /etc/systemd/system and edited them.

    For /etc/systemd/system/apt-daily.service, I've changed the ExecStart
    script pathname and edited the copy of the apt.systemd.daily script to
    set VERBOSE=1 and ducplicate the following lines

    # check if we can lock the cache and if the cache is clean
    if command -v apt-get >/dev/null && ! eval apt-get check $XAPTOPT $XSTDERR ; then
    debug_echo "error encountered in cron job with \"apt-get check\"."
    exit 0
    fi

    6 times (as "apt-get check" attempts to lock the cache, thus
    giving more chance of failure).

    For /etc/systemd/system/apt-daily.timer, I've changed the following
    lines to get

    OnCalendar=*-*-* *:*:00
    RandomizedDelaySec=1m

    thus reaching the timer time more often.

    Now, whether I use

    aptitude reinstall dpkg dpkg-dev
    or
    apt install --reinstall dpkg dpkg-dev
    or
    dpkg -i /var/cache/apt/archives/dpkg_1.22.9_amd64.deb \
    /var/cache/apt/archives/dpkg-dev_1.22.9_all.deb

    I can get a failure in the apt.systemd.daily script:

    Jul 25 12:59:17 qaa systemd[1]: Starting apt-daily.service - Daily apt download activities...
    Jul 25 12:59:17 qaa apt.systemd.daily[370789]: error encountered in cron job with "apt-get check".
    Jul 25 12:59:17 qaa systemd[1]: apt-daily.service: Deactivated successfully. Jul 25 12:59:17 qaa systemd[1]: Finished apt-daily.service - Daily apt download activities.

    With VERBOSE=2, I could get more details about this failure:

    Jul 25 13:22:16 qaa systemd[1]: Starting apt-daily.service - Daily apt download activities...
    Jul 25 13:22:16 qaa apt.systemd.daily[378644]: E: Could not get lock /var/lib/dpkg/lock-frontend. It is held by process 378454 (apt)
    Jul 25 13:22:16 qaa apt.systemd.daily[378644]: E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), is another process using it?
    Jul 25 13:22:16 qaa apt.systemd.daily[378637]: error encountered in cron job with "apt-get check".
    Jul 25 13:22:16 qaa systemd[1]: apt-daily.service: Deactivated successfully.

    [continued in next message]

    --- SoupGate-Win32 v1.05
    * Origin: you cannot sedate... all the things you hate (1:229/2)