• Bug#1104582: needrestart breaks lxc networking by restarting nftables.s

    From Daniel Gröber@21:1/5 to All on Fri May 2 11:40:01 2025
    Package: needrestart
    Version: 3.6-4+deb12u3
    Severity: serious
    Justification: Breaks unrelated software
    X-Debbugs-Cc: elbrus@debian.org, dxld@darkboxed.org

    Hi Patrick,

    I investigated a curious networking problem in Debian's autopkgtest infrastructure along with Paul. We found that a recent (innocent) nftables update caused needrestart to trigger an nftables.service restart, which
    flushed volatile firewall rules installed into the kernel by
    lxc, specifically by lxc-net.service (see /usr/libexec/lxc/lxc-net).

    I think we should add an exception for nftables to $nrconf{override_rc} to avoid this problem since there doesn't seem to be any point in restarting
    it for security purposes.
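
    A minimal sketch of what such an exception could look like, assuming the default needrestart configuration still reads snippets from /etc/needrestart/conf.d. The function name and the target file name are hypothetical, and the regex is only a guess at what should match:

```shell
#!/bin/sh
# Sketch only: print a needrestart config snippet that excludes nftables
# from service restarts. Nothing is written to /etc by this function.
nftables_override_snippet() {
    cat <<'EOF'
# 0 = do not select units whose name matches the regex for restart.
$nrconf{override_rc}{qr(^nftables)} = 0;
EOF
}
```

    As root, the output could then be redirected to a file such as /etc/needrestart/conf.d/nftables.conf (hypothetical name).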

    Thanks,
    --Daniel

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Liske@21:1/5 to All on Fri May 2 12:00:02 2025
    Hi Daniel,

    On 02.05.25 11:37, Daniel Gröber wrote:
    > I investigated a curious networking problem in Debian's autopkgtest infrastructure along with Paul. We found that a recent (innocent) nftables update caused needrestart to trigger an nftables.service restart, which

    I wonder why needrestart selects this service at all. Could you provide
    the output of `needrestart -v` for this?


    > flushed volatile firewall rules installed into the kernel by
    > lxc, specifically by lxc-net.service (see /usr/libexec/lxc/lxc-net).
    >
    > I think we should add an exception for nftables to $nrconf{override_rc} to avoid this problem since there doesn't seem to be any point in restarting
    > it for security purposes.

    ACK, IMHO it should be completely ignored and one should consider the
    same for iptables. But I still wonder why the service gets selected at all…


    Cheers,
    Thomas

    (upstream)


    > Thanks,
    > --Daniel

  • From Daniel Gröber@21:1/5 to Thomas Liske on Fri May 2 15:00:01 2025
    Hi Thomas, Hi Chris,

    On Fri, May 02, 2025 at 11:47:24AM +0200, Thomas Liske wrote:
    > I wonder why needrestart selects this service at all. Could you provide the output of `needrestart -v` for this?

    Unfortunately we already restarted all the affected nodes. Do you want me
    to try and recreate the problem in debvm?

    >> I think we should add an exception for nftables to $nrconf{override_rc} to avoid this problem since there doesn't seem to be any point in restarting it for security purposes.
    >
    > ACK, IMHO it should be completely ignored and one should consider the same for iptables. But I still wonder why the service gets selected at all…

    My assumption: because the executable changed when 1.1.2-1 migrated
    to testing on 04-26. We saw that the nftables service was restarted on affected nodes on 04-27 at about 6am, i.e. almost certainly by unattended-upgrades.

    Why do you think it should be ignored already?

    On Fri, May 02, 2025 at 11:50:20AM +0200, Chris Hofstaedtler wrote:
    > On Fri, May 02, 2025 at 11:37:04AM +0200, Daniel Gröber wrote:
    >> Justification: Breaks unrelated software
    > (IMO needrestart is not "unrelated" here.)

    I do think the severity is justified: nftables is the "unrelated" software needrestart is breaking. See the first section of
    https://release.debian.org/trixie/rc_policy.txt:

    * makes unrelated software on the system (or the whole system) break

    > Isn't this really a bug in nftables and maybe lxc? If restarting a
    > service wipes its configuration, maybe it should be fixed there.

    I did consider that. Unfortunately, to my knowledge and personal dismay, too much networking software already does things this way. Think: docker, libvirt etc.

    I think nftables should support a .d directory to fix this "properly".

    However, since most upstream software treats the firewall as runtime state,
    there's going to be an impedance mismatch if we just stick this in /etc/nftables.d.

    So really we'd have to also support /run/nftables.d to allow representing
    the intended semantics and I'm personally just not convinced that's the
    right thing to do yet since I haven't thought of this runtime-state consideration before.

    For now, adding an exception for nftables fixes the immediate issue and
    doesn't have any downsides I can see. I'll carry the .d thing forward in
    any case -- now with a good example issue for why it's needed :-)

    Thanks,
    --Daniel

  • From Daniel Gröber@21:1/5 to All on Fri May 2 15:30:02 2025
    Control: severity -1 normal

    On Fri, May 02, 2025 at 02:48:15PM +0200, Daniel Gröber wrote:
    > On Fri, May 02, 2025 at 11:47:24AM +0200, Thomas Liske wrote:
    >> I wonder why needrestart selects this service at all. Could you provide the output of `needrestart -v` for this?
    >
    > Unfortunately we already restarted all the affected nodes. Do you want me
    > to try and recreate the problem in debvm?

    I recreated the situation in a trixie debvm. I do indeed not see needrestart trying to restart nftables.service, but only lxc-net.service, due to dnsmasq being linked against libnftables1, which changed SONAME and consequently
    moved.

    So my assumption that nftables was being (directly) restarted by
    needrestart is probably invalid. See below for -v output.
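
    For reference, the kind of check that flags dnsmasq here can be approximated by looking for deleted files among a process's memory mappings. This is a rough sketch under that assumption, not needrestart's actual implementation; the function name is hypothetical and paths containing spaces are not handled:

```shell
#!/bin/sh
# Sketch only: list files that are still mapped by a process but have been
# deleted on disk, reading /proc/<pid>/maps-formatted text from stdin.
deleted_mappings() {
    # A deleted file shows up as "<path> (deleted)" in the last two fields.
    awk '$NF == "(deleted)" && $(NF-1) ~ /^\// { print $(NF-1) }' | sort -u
}
```

    For the dnsmasq process in the log below this would be run as `deleted_mappings < /proc/7541/maps`.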

    I reviewed lxc-net and it only flushes its own table(s), not the whole
    ruleset. It runs:

    flush table ip6 lxc
    flush table inet lxc

    and doesn't otherwise seem to trigger an nftables.service restart.
    Odd.

    I'm downgrading the severity then and will try to find another explanation
    for what happened here.

    --Daniel

    Log:

    root@testvm:~# /usr/sbin/needrestart -v
    [main] eval /etc/needrestart/needrestart.conf
    [main] needrestart v3.11
    [main] running in root mode
    [Core] Using UI 'NeedRestart::UI::stdio'...
    [main] systemd detected
    [main] vm detected
    [main] #7541 is /usr/sbin/dnsmasq
    [main] #7541 uses deleted /usr/lib/x86_64-linux-gnu/libnftables.so.1.1.0
    [main] #7541 is not a child
    [main] #7541 exe => /usr/sbin/dnsmasq
    [main] #7541 is lxc-net.service
    [main] inside container or vm, skipping microcode checks
    [Kernel] Linux: kernel release 6.12.22-cloud-amd64, kernel version #1 SMP PREEMPT_DYNAMIC Debian 6.12.22-1 (2025-04-10)
    Failed to load NeedRestart::Kernel::kFreeBSD: [Kernel/kFreeBSD] Not running on GNU/kFreeBSD!
    [Kernel/Linux] /boot/vmlinuz-6.12.22-cloud-amd64 => 6.12.22-cloud-amd64 (debian-kernel@lists.debian.org) #1 SMP PREEMPT_DYNAMIC Debian 6.12.22-1 (2025-04-10) [6.12.22-cloud-amd64]*
    [Kernel/Linux] Expected linux version: 6.12.22-cloud-amd64

    Running kernel seems to be up-to-date.

    Restarting services...
    Services to be restarted:
    Restart «lxc-net.service»? [yNas?]
    Service restarts being deferred:
    systemctl restart lxc-net.service

    No containers need to be restarted.

    No user sessions are running outdated binaries.

    No VM guests are running outdated hypervisor (qemu) binaries on this host.
