• Re: Code density (was: Why I've Dropped In)

    From Scott Lurndal@21:1/5 to Anton Ertl on Tue Jun 17 15:11:27 2025
    anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
    antispam@fricas.org (Waldek Hebisch) writes:

    IIUC 64-bit
    ARM dropped most of space saving features of Thumb2

    ARM A64 is a completely different ISA than ARM A32/T32. In
    particular, all instructions are encoded in 32 bits, so there is no
    trace of anything that one might see as coming from T32.

    I suspect that Waldek was referring specifically to ARMv8,
    which allowed an implementation to optionally support AArch32
    instructions and most of the T32 instruction set (but not T16 or
    Jazelle) at the least privileged exception (privilege) level.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to Waldek Hebisch on Tue Jun 17 14:17:42 2025
    antispam@fricas.org (Waldek Hebisch) writes:
    Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    However, apparently code size is important enough in some markets that
    ARM introduced not just Thumb, but Thumb2 (aka ARM T32), and MIPS
    followed with MIPS16 (repeating the Thumb mistake) and later MicroMIPS
    (which allows mixing 16-bit and 32-bit encodings); Power specified VLE
    (are there any implementations of that?); and RISC-V specified the C
    extension, which is implemented widely AFAICT.

    AFAICS main target of those are small embedded microcontrollers
    running code mostly from flash.

    Yes, that seems to be the original market.

    Some Linux distributions took advantage of smaller instructions
    and compiled a lot of programs to save space. But I doubt that
    the possibility of such a saving would be enough to motivate
    development of a more space-efficient encoding.

    Maybe not, but RV64G is obviously not an instruction set for small
    embedded microcontrollers, and yet most implementations of RV64G
    include the C extension (compressed instructions). Maybe the idea is
    that making the I-cache larger to achieve the same miss rate would
    cost more area than implementing the C extension.

    IIUC 64-bit
    ARM dropped most of space saving features of Thumb2

    ARM A64 is a completely different ISA than ARM A32/T32. In
    particular, all instructions are encoded in 32 bits, so there is no
    trace of anything that one might see as coming from T32.

    ARM A64 takes a different approach to space saving: It has more
    instructions than A32 (or, say, RISC-V), and these make it possible
    to encode the same functionality with fewer instructions.

    As an example, the load-pair and store-pair instructions are
    relatively common. They are used not only around function calls and
    in the prologue and epilogue of functions; I have also seen compilers
    generate them when you access two adjacent fields of a struct or two
    adjacent elements of an array. And the cool thing is that this saves
    not only the instruction space for a second instruction: a load-pair
    or store-pair instruction also results in only one memory access
    unless a cache-line boundary is straddled.
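
    A minimal sketch of the kind of source code this is about. The
    struct and function names are made up for illustration, and whether
    a given compiler actually emits ldp/stp for these functions depends
    on the compiler, its version and the optimization level, so check
    the generated assembly rather than taking it on faith.

```c
#include <stdint.h>

/* Hypothetical struct for illustration: two adjacent 64-bit fields. */
struct pair {
    int64_t first;
    int64_t second;
};

/* Reads both adjacent fields. An AArch64 compiler may combine the two
   loads into a single load-pair (ldp) instruction. */
int64_t sum_pair(const struct pair *p)
{
    return p->first + p->second;
}

/* Writes both adjacent fields. An AArch64 compiler may combine the two
   stores into a single store-pair (stp) instruction. */
void set_pair(struct pair *p, int64_t a, int64_t b)
{
    p->first = a;
    p->second = b;
}
```

    Compiling this for an AArch64 target with optimization enabled and
    looking at the assembly output is the easiest way to see whether the
    pair instructions show up.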

    so
    apparently they did not consider them important enough for
    bigger machines.

    Looking at the text size measurements I made (see below), ARM A64 is
    the densest instruction set with a fixed-size encoding, and is in the
    same ballpark as AMD64, so they were not oblivious of code density
    concerns. My guess is that they wanted a fixed-size encoding because
    that makes decoding many instructions per cycle cheaper, and it avoids
    the need for a uop cache: IIRC ARM used a uop cache for their
    higher-end cores as long as they still supported T32, but eliminated
    that once they went for pure A64 cores.
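
    To illustrate the decoding argument, here is a rough software
    sketch, not a description of any real decoder. It uses the RISC-V C
    extension rule that a 16-bit parcel whose two low bits are 11 starts
    a 32-bit instruction and anything else is a 16-bit instruction
    (longer encodings are ignored here). The function names are
    hypothetical. The point is that with mixed sizes the start of each
    instruction depends on the lengths of all earlier ones, while with a
    fixed 32-bit encoding the start of instruction k is known up front.

```c
#include <stddef.h>
#include <stdint.h>

/* Length in bytes of a RISC-V instruction, given its first 16-bit parcel.
   Assumption: only 16-bit and 32-bit encodings occur. */
size_t rvc_insn_length(uint16_t first_parcel)
{
    return (first_parcel & 0x3u) == 0x3u ? 4 : 2;
}

/* Counts the instructions in a buffer of 16-bit parcels. Each iteration
   needs the length of the previous instruction before it knows where the
   next one starts, so the walk is inherently sequential. A 32-bit
   instruction truncated at the end of the buffer is still counted. */
size_t count_rvc_insns(const uint16_t *parcels, size_t n_parcels)
{
    size_t count = 0;
    size_t i = 0;
    while (i < n_parcels) {
        i += rvc_insn_length(parcels[i]) / 2;
        count++;
    }
    return count;
}

/* With a fixed 32-bit encoding, the byte offset of instruction k can be
   computed without looking at any instruction bytes. */
size_t fixed_insn_offset(size_t k)
{
    return 4 * k;
}
```

    Hardware can find the boundaries of mixed-size instructions in
    parallel with extra logic, but that logic costs something, which is
    the trade-off I am guessing at above.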

    Here are the text size numbers:

    Debian numbers from <2024Jan4.101941@mips.complang.tuwien.ac.at>:

       bash    grep   gzip
     595204  107636  46744  armhf
     599832  101102  46898  riscv64
     796501  144926  57729  amd64
     829776  134784  56868  arm64
     853892  152068  61124  i386
     891128  158544  68500  armel
     892688  168816  64664  s390x
    1020720  170736  71088  mips64el
    1168104  194900  83332  ppc64el
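
    For comparing entries, a size relative to a baseline is handy. A
    tiny sketch (the helper name is made up for illustration):

```c
/* Returns size divided by baseline, e.g. a text size relative to the
   amd64 text size of the same program. */
double relative_size(long size, long baseline)
{
    return (double)size / (double)baseline;
}
```

    With the bash numbers above, relative_size(829776, 796501) for arm64
    versus amd64 comes out at about 1.04, and relative_size(599832,
    796501) for riscv64 versus amd64 at about 0.75.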


    NetBSD numbers from <2025Mar4.093916@mips.complang.tuwien.ac.at>:

       libc     ksh     pax     ed
    1102054  124726   66218  26226  riscv-riscv32
    1077192  127050   62748  26550  riscv-riscv64
    1030288  150686   79852  31492  mvme68k
     779393  155764   75795  31813  vax
    1302254  171505   83249  35085  amd64
    1229032  178332   89180  36876  evbarm-aarch64
    1539052  179055   82280  34717  amd64-daily
    1374961  184458   96971  37218  i386
    1247476  185792   96728  42028  evbarm-earmv7hf
    1333952  187452   96328  39472  sparc
    1586608  204032  106896  45408  evbppc
    1536144  204320  106768  43232  hppa
    1397024  216832  109792  48512  sparc64
    1538536  222336  107776  44912  evbmips-mips64eb
    1623952  243008  122096  50640  evbmips-mipseb
    1689920  251376  120672  51168  alpha
    2324752 2259984 1378000 ia64

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>
