tag 1086028 + patch
tag 1087809 + patch
tag 1093200 + patch
thanks
Hi!
I've finally managed to reproduce this EFAULT in QEMU (using an
Erlang-based script which is shipped in the wings3d source package):
1) I've installed Debian bookworm for mips64el in a qemu-system-mips64el
virtual machine (version from unstable) and upgraded it to the
current unstable (machine is loongson3-virt, cpu is Loongson-3A4000).
2) I have to enable SMP in QEMU and use -rtc clock=rt. Otherwise the
virtual machine won't boot at all; with clock=rt it sometimes boots and
sometimes hangs. The full QEMU command line is:
qemu-system-mips64el -machine loongson3-virt -m 4g -cpu Loongson-3A4000 \
-smp 2,sockets=2,cores=1,threads=1,maxcpus=2 \
-kernel vmlinuz-loongson-3 \
-rtc clock=rt \
-initrd initrd.img-loongson-3 -drive if=none,file=hda1.bin,id=hd,format=raw \
-net nic -net tap,ifname=tap0,script=/bin/true \
-device virtio-blk-pci,drive=hd -append "root=/dev/vda1 console=ttyS0" \
-nographic
Here the kernel and initrd can be either the stock 6.1.123-1 version or
6.1.123-1 with the attached patch. Unfortunately, I can't boot the
newest 6.12.12-1 kernel in QEMU (it complains that it can't
uncompress the initrd; I don't know why).
4) I've installed the build dependencies of wings3d (basically, only erlang-base is necessary)
5) I've extracted the wings3d source package (from stable: https://packages.debian.org/source/stable/wings3d)
6) I've added the following line as the second line to wings3d-2.2.9/intl_tools/gen_char_hrl
%%! +S 4:4 +SDcpu 4:4 +c false
(The first two options enable multiple threads; the last one is a
workaround for the monotonic clock jumping backwards, which appears to
happen in QEMU with SMP enabled.)
7) I've run this gen_char_hrl in a loop until it fails.
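Steps 6 and 7 can be sketched in shell roughly as follows. This is only an illustration: the helper names add_emu_args and run_until_fail and the log file name are made up here, the exact way gen_char_hrl is invoked (its arguments) is not shown above, and the loop assumes that the aborting emulator returns a non-zero exit status.

```shell
#!/bin/sh
# Sketch only: helper names and the log file name are hypothetical.

# Insert an emulator-argument line as the second line of an escript.
# $1: script file, $2: line to insert (e.g. '%%! +S 4:4 +SDcpu 4:4 +c false')
add_emu_args() {
    tmp=$(mktemp) || return 1
    # Print every line; after the first one, also print the new line.
    awk -v line="$2" '{ print } NR == 1 { print line }' "$1" >"$tmp" &&
        cat "$tmp" >"$1"    # overwrite in place, keeping the file mode
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Run a command repeatedly until it exits non-zero, then report how many
# runs succeeded. Output of the most recent run is kept in $RUN_LOG.
run_until_fail() {
    log=${RUN_LOG:-last_run.log}
    count=0
    while "$@" >"$log" 2>&1; do
        count=$((count + 1))
    done
    echo "failed after $count successful runs"
}

# Assumed usage (the script's real arguments are not given here):
#   add_emu_args wings3d-2.2.9/intl_tools/gen_char_hrl '%%! +S 4:4 +SDcpu 4:4 +c false'
#   run_until_fail escript wings3d-2.2.9/intl_tools/gen_char_hrl
```

After the loop stops, the log file holds the output of the failing run, which is where the efault message would be expected to show up.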
The result is that with the stock 6.1.123-1 kernel, in approximately 1%
of runs the script aborts with the message:
signal-dispatcher thread got unexpected error: efault (14)
which is exactly the error that prevents Erlang (and many Erlang-based packages) from building on mips64el.
On the other hand, with the patched kernel the script loop has been
running for more than 24 hours (a few thousand runs) without
aborting. So I'm now fairly confident that the patch fixes the bug.
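As a rough sanity check on that confidence: if the failure rate on the patched kernel were still about 1% per run, and the runs were independent, the chance of seeing no failure at all in N runs would be 0.99^N. The sketch below evaluates this with awk. The function name p_no_failure is made up, and 3000 is only an assumed stand-in for "a few thousand runs".

```shell
#!/bin/sh
# Probability of zero failures in $2 independent runs when each run
# fails with probability $1 (hypothetical helper, illustration only).
p_no_failure() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.2e\n", exp(n * log(1 - p)) }'
}

# Assumed numbers: 1% failure rate, 3000 runs.
p_no_failure 0.01 3000
```

The printed value is vanishingly small (on the order of 1e-13), so a clean run of that length would be extremely unlikely if the bug were still present at the same rate.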
I'm not sure whether the patch has adverse effects, so it would be
better to try it on real hardware as well.
The patch is derived from the thread [1]. It reverses commit [2] with
an additional change, which is necessary because of changes in
expand_stack() introduced in commit [3].
[1] https://lore.kernel.org/all/mvmplxraqmd.fsf@suse.de/T/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4bce37a68ff884e821a02a731897a8119e0c37b7
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d7071af890768438c14db6172cc8f9f4d04e184
Hi,
On Thu, Feb 13, 2025 at 01:35:13PM +0300, Sergei Golovan wrote:
> The patch is derived from the thread [1]. It reverses commit [2] with
> an additional change, which is necessary because of changes in
> expand_stack() introduced in commit [3].
>
> [1] https://lore.kernel.org/all/mvmplxraqmd.fsf@suse.de/T/
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4bce37a68ff884e821a02a731897a8119e0c37b7
> [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d7071af890768438c14db6172cc8f9f4d04e184
Just one observation, as we talked about this issue in our weekly
Kernel team meeting: Reverting
4bce37a68ff884e821a02a731897a8119e0c37b7 might not be an option, as
it is part of the upstream fixes addressing CVE-2023-3269.
Some information about the CVE:
https://www.openwall.com/lists/oss-security/2023/07/05/1
https://github.com/lrh2000/StackRot
https://www.openwall.com/lists/oss-security/2023/07/28/1
This means that this needs to be addressed (upstream, for 6.1.y) in a
way that does not break the CVE fix but unbreaks the mips64el
situation.
Ben aims to look into it.