• binary files /usr/bin/[ and /usr/bin/test differ

    From Yassine Chaouche@21:1/5 to All on Mon Nov 18 11:50:01 2024
    Dear debian and linux enthusiasts,

    Have you ever stopped and wondered:
    Are `/usr/bin/[` and `/usr/bin/test` truly unique across all unices?

    # diff /usr/bin/\[ /usr/bin/test
    Binary files /usr/bin/[ and /usr/bin/test differ
    # ls /usr/bin/\[
    -rwxr-xr-x 1 root root 67K Sep 20 2022 '/usr/bin/['
    # ls /usr/bin/test
    -rwxr-xr-x 1 root root 59K Sep 20 2022 /usr/bin/test
    # sys.distro
    Debian GNU/Linux 12 \n \l

    #
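
The comparison in the transcript above can also be scripted. A minimal sketch, assuming a POSIX shell with `cmp` available; the helper name `compare_files` is made up for illustration and is not part of any package:

```sh
# Hypothetical helper: report whether two files are byte-for-byte identical.
# cmp -s prints nothing and signals the result through its exit status.
compare_files() {
    if cmp -s "$1" "$2"; then
        echo identical
    else
        echo different
    fi
}

# Example call; these are the paths from the transcript above and may live
# elsewhere (or be a single multi-call binary) on other systems.
# compare_files '/usr/bin/[' /usr/bin/test
```

Unlike `diff`, which prints a message for differing binary files, `cmp -s` stays silent, so the result is easy to use in a condition.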

    I can think of one single reason to keep them separate:
    /usr/bin/[ needs to ensure there’s a closing ].
    test doesn't.

    If that's the only reason,
    then the two commands must be sharing a huge chunk of the same DNA!
    So, why keep them separate?
    Is this about some old Unix tradition?
    An optimization somewhere, somehow?

    I’m throwing the question to you,
    brilliant minds of the honorable debian users list,
    to demystify this sibling rivalry.
    Thoughts, insights, and conspiracy theories welcome!

    Best,

    --
    yassine -- sysadm
    https://about.me/ychaouche

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Arno Lehmann@21:1/5 to All on Mon Nov 18 12:00:01 2024
    On 18.11.2024 at 11:45, Yassine Chaouche wrote:
    > Dear debian and linux enthusiasts,
    >
    > Have you ever stopped and wondered:
    > Are `/usr/bin/[` and `/usr/bin/test` truly unique across all unices?

    Interesting question (and observation below). I can't say I ever really
    cared, and I'm not even sure now. But:

    $ LANG=C /usr/bin/\[ --help | head -n 2
    Usage: test EXPRESSION
    or: test

    is also nice to find.


    Unfortunately, I'm not brilliant at all.

    But I'm eager to see if Greg has something to educate us ;-)

    Cheers,

    Arno

    --
    Arno Lehmann

    IT-Service Lehmann
    Sandstr. 6, 49080 Osnabrück

  • From tomas@tuxteam.de@21:1/5 to Yassine Chaouche on Mon Nov 18 12:00:01 2024
    On Mon, Nov 18, 2024 at 11:45:53AM +0100, Yassine Chaouche wrote:
    > I can think of one single reason to keep them separate:
    > /usr/bin/[ needs to ensure there’s a closing ].
    > test doesn't.
    >
    > If that's the only reason,
    > then the two commands must be sharing a huge chunk of the same DNA!
    > So, why keep them separate?
    > Is this about some old Unix tradition?
    > an optimization somewhere somehow?

    Help yourself :)

    https://sources.debian.org/src/coreutils/

    (Of course, apt-get source coreutils would do the same).

    Cheers
    --
    t

  • From tomas@tuxteam.de@21:1/5 to Yassine Chaouche on Mon Nov 18 12:40:02 2024
    On Mon, Nov 18, 2024 at 12:30:03PM +0100, Yassine Chaouche wrote:
    > On 11/18/24 at 11:50, tomas@tuxteam.de wrote:
    >
    > > Help yourself :)
    > >
    > > https://sources.debian.org/src/coreutils/
    > >
    > > (Of course, apt-get source coreutils would do the same).
    > >
    > > Cheers
    >
    > Thank you tomas,
    >
    > [...]

    Thank *you* for reporting back :-)

    Thing is, I'm currently at $DAYJOB with few cycles to spare. I wanted
    to raise awareness of sources.debian.org (it has a full-text search,
    yay) and of the incredible resource it is to have full access to the
    source exactly as it is when your binary was built.

    Thank you again for having gone the full way :-)

    Cheers
    --
    t

  • From Yassine Chaouche@21:1/5 to All on Mon Nov 18 12:40:02 2024
    On 11/18/24 at 11:50, tomas@tuxteam.de wrote:

    > Help yourself :)
    >
    > https://sources.debian.org/src/coreutils/
    >
    > (Of course, apt-get source coreutils would do the same).
    >
    > Cheers


    Thank you tomas,

    After a second reading of
    https://sources.debian.org/src/coreutils/9.5-1/src/test.c/,
    it seems that the [ binary, the [ shell builtin and the test command
    all share the same C source code (test.c),
    which has different #ifdef branches to handle all three outputs.


    #define TEST_STANDALONE 1

    This ensures that the code for the standalone test binary is used;
    otherwise it produces the test shell builtin (somehow?).


    #ifndef LBRACKET
    # define LBRACKET 0
    #endif

    /* The official name of this program (e.g., no 'g' prefix). */
    #if LBRACKET
    # define PROGRAM_NAME "["
    #else
    # define PROGRAM_NAME "test"
    #endif


    That creates the appropriate program name,
    depending on whether we want [ or not.


    if (LBRACKET)
      {
        /* Recognize --help or --version, but only when invoked in the
           "[" form, when the last argument is not "]".  Use direct
           parsing, rather than parse_long_options, to avoid accepting
           abbreviations.  POSIX allows "[ --help" and "[ --version" to
           have the usual GNU behavior, but it requires "test --help"
           and "test --version" to exit silently with status 0.  */
        if (margc == 2)
          {
            if (STREQ (margv[1], "--help"))
              usage (EXIT_SUCCESS);

            if (STREQ (margv[1], "--version"))
              {
                version_etc (stdout, PROGRAM_NAME, PACKAGE_NAME, Version, AUTHORS,
                             (char *) nullptr);
                test_main_return (EXIT_SUCCESS);
              }
          }
        if (margc < 2 || !STREQ (margv[margc - 1], "]"))
          test_syntax_error (_("missing %s"), quote ("]"));

        --margc;
      }


    That seems to be the code that looks for the closing bracket,
    if test is invoked as [


    So it seems [ is created with a -DLBRACKET option.
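
The effect of that closing-bracket check can be observed from a shell without reading the source. A rough sketch, assuming external `[` and `test` programs are reachable through PATH (hence `env`, which bypasses the shell builtins); `status_of` is a made-up helper name:

```sh
# Hypothetical helper: run a command, discard its output, print its exit status.
status_of() {
    rc=0
    "$@" >/dev/null 2>&1 || rc=$?
    echo "$rc"
}

status_of env [ 1 -eq 1 ]    # closing bracket present: normal evaluation
status_of env [ 1 -eq 1      # closing bracket missing: reported as an error
status_of env test --help    # a single non-empty string, so simply true
```

POSIX reserves exit statuses greater than 1 for errors, so the second call should print a value above 1, while the other two should print 0.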

    Best,


    --
    yassine -- sysadm
    http://about.me/ychaouche

  • From Greg Wooledge@21:1/5 to Arno Lehmann on Mon Nov 18 13:40:01 2024
    On Mon, Nov 18, 2024 at 11:57:05 +0100, Arno Lehmann wrote:
    > On 18.11.2024 at 11:45, Yassine Chaouche wrote:
    > > Dear debian and linux enthusiasts,
    > >
    > > Have you ever stopped and wondered:
    > > Are `/usr/bin/[` and `/usr/bin/test` truly unique across all unices?
    >
    > Interesting question (and observation below). I can't say I ever really
    > cared, and I'm not even sure now. But:
    >
    > $ LANG=C /usr/bin/\[ --help | head -n 2
    > Usage: test EXPRESSION
    > or: test
    >
    > is also nice to find.
    >
    > Unfortunately, I'm not brilliant at all.
    >
    > But I'm eager to see if Greg has something to educate us ;-)

    POSIX doesn't care whether you ship separate binary files or a single
    binary file to implement commands. It's the implementor's choice;
    in this case, the implementor is GNU coreutils.

    If you go back far enough in time, you'll probably find that the two
    programs shared a single binary file, and either used a hard link or
    a symbolic link from one to the other. That was fashionable in the
    past, primarily as a means of reducing disk space usage.

    In recent years, GNU has tried to do away with the whole "this program
    may be invoked by any of the following names, and it changes its behavior
    based on which name it finds in argv[0]" approach. From
    <https://www.gnu.org/prep/standards/standards.html#Standards-for-Interfaces-Generally>:

    4.5 Standards for Interfaces Generally
    Please don’t make the behavior of a utility depend on the name used
    to invoke it. It is useful sometimes to make a link to a utility with
    a different name, and that should not change what it does. Thus,
    if you make foo a link to ls, the program should behave the same
    regardless of which of those names is used to invoke it.

    Instead, use a run time option or a compilation switch or both to
    select among the alternate behaviors. You can also build two versions
    of the program, with different default behaviors, and install them
    under two different names.

    If you look around a bit, you'll find that commands which *used* to share
    a common file (grep/egrep/fgrep, gzip/gunzip/zcat) have started shipping
    a single compiled binary file and a set of wrapper shell scripts for
    the alternative names. E.g.:

    -rwxr-xr-x 1 root root 41 Jan 24 2023 /usr/bin/egrep
    -rwxr-xr-x 1 root root 41 Jan 24 2023 /usr/bin/fgrep
    -rwxr-xr-x 1 root root 203152 Jan 24 2023 /usr/bin/grep

    A wrapper script can't be used to differentiate test and [, however,
    because of the way the final argument is treated.
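
For illustration, such a wrapper can be as small as one line that re-runs the real program with an extra option. A sketch written as shell functions so it can be tried without installing anything; the names `my_egrep` and `my_fgrep` are invented here, and the actual Debian scripts may differ in detail:

```sh
# Hypothetical stand-ins for the wrapper scripts: each one forwards all of
# its arguments, untouched, to grep with one extra option.
my_egrep() {
    grep -E "$@"
}

my_fgrep() {
    grep -F "$@"
}

# Example: the same pattern as an extended regex and as a literal string.
printf 'ab\ncd\na|c\n' | my_egrep 'a|c'
printf 'ab\ncd\na|c\n' | my_fgrep 'a|c'
```

The first call treats `|` as alternation and matches all three lines; the second matches only the line containing the literal text `a|c`. A standalone script would typically use `exec grep -E "$@"` so that no extra shell process stays around.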

  • From Arno Lehmann@21:1/5 to All on Mon Nov 18 14:10:01 2024
    Thanks Greg!

    That was... well, as expected ;-)

    On 18.11.2024 at 13:35, Greg Wooledge wrote:
    > On Mon, Nov 18, 2024 at 11:57:05 +0100, Arno Lehmann wrote:
    > ...
    > > Unfortunately, I'm not brilliant at all.
    > >
    > > But I'm eager to see if Greg has something to educate us ;-)
    >
    > POSIX doesn't care whether you ship separate binary files or a single
    > binary file

    So far, I'm aware of that...

    > ... find that the two
    > programs shared a single binary file, and either used a hard link or
    > a symbolic link from one to the other. That was fashionable in the
    > past, primarily as a means of reducing disk space usage.

    Indeed, that is kind of what I expected.

    > In recent years, GNU has tried to do away with the whole "this program
    > may be invoked by any of the following names, and it changes its
    > behavior based on which name it finds in argv[0]" approach. From
    > <https://www.gnu.org/prep/standards/standards.html#Standards-for-Interfaces-Generally>:

    And that is something I wasn't aware of.

    > ...
    > If you look around a bit, you'll find that commands which *used* to share
    > a common file (grep/egrep/fgrep, gzip/gunzip/zcat) have started shipping
    > a single compiled binary file and a set of wrapper shell scripts for
    > the alternative names. E.g.:

    As well as that and your examples. Well, I could have known. As it
    happens, I've been working a bit with unexpected system states on
    storage that uses LVM, and thus might have been slightly biased the
    other way:

    $ ls -l /sbin/ | grep -- '-> lvm' | wc -l
    43
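
The same kind of count can be taken without parsing `ls` output. A sketch assuming `readlink` is installed (it is on Debian); `count_links_to` is a name chosen here for illustration:

```sh
# Hypothetical helper: count symbolic links in directory $1 whose link
# target is exactly $2.
count_links_to() {
    dir=$1
    target=$2
    count=0
    for entry in "$dir"/*; do
        if test -L "$entry" && test "$(readlink "$entry")" = "$target"; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Example call mirroring the pipeline above:
# count_links_to /sbin lvm
```

Note that this compares the link target exactly, whereas the `grep` pipeline also counts any target that merely starts with `lvm`, so the two numbers can differ.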

    > A wrapper script can't be used to differentiate test and [, however,
    > because of the way the final argument is treated.

    Actually, I'm kind of sure it would be possible, but it would also be
    pretty stupid to do so, as such a "wrapper" script would probably be
    more complex than the binary it wraps... also, I've been provoking you
    already ;-)
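
What such a wrapper might look like, as a sketch only: it checks for the trailing `]`, drops it, and hands the rest to `test`. The function name `bracket` and the exact error text are inventions for this example. Dropping the last argument in plain POSIX sh takes a small loop, which rather supports the point about complexity:

```sh
# Hypothetical wrapper giving "test" the "[" calling convention.
bracket() {
    # Walk the arguments to find the last one.
    last=
    for last in "$@"; do :; done
    if test "$#" -eq 0 || test "$last" != "]"; then
        echo "bracket: missing ']'" >&2
        return 2
    fi
    # Rotate the positional parameters once, re-appending all but the last.
    remaining=$#
    for arg in "$@"; do
        shift
        remaining=$((remaining - 1))
        if test "$remaining" -gt 0; then
            set -- "$@" "$arg"
        fi
    done
    test "$@"
}

bracket 1 -eq 1 ] && echo "equal"
bracket -n "" ] || echo "empty string"
```

Both example calls print their message: the first expression is true, and the second is false because the string is empty.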

    Thanks,

    Arno


    --
    Arno Lehmann

    IT-Service Lehmann
    Sandstr. 6, 49080 Osnabrück

  • From Michael Stone@21:1/5 to Yassine Chaouche on Mon Nov 18 15:10:01 2024
    On Mon, Nov 18, 2024 at 11:45:53AM +0100, Yassine Chaouche wrote:
    > So, why keep them separate?
    > Is this about some old Unix tradition?
    > an optimization somewhere somehow?

    Because GNU policy is that command behavior should not depend on the
    name of the binary. Historically, GNU utilities were often co-installed
    with vendor utilities, using a g- prefix (e.g., gmake), and having the
    syntax change as a side effect of a name change was/is undesirable.
    There's no good reason to combine the utilities, as the savings are only
    a few kilobytes on a modern system measured in gigabytes if not terabytes.

    In practice, neither bin/test nor bin/[ is used much on a modern
    system--your shell almost certainly has a built-in version which takes
    precedence in most situations.
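
To see which implementation a given shell actually uses, `type` reports how a name is resolved, and `env` forces the external program. A small sketch; the output wording varies between shells, and `how_resolved` is a made-up name:

```sh
# Hypothetical helper: show how the current shell resolves each given name.
how_resolved() {
    for name in "$@"; do
        type "$name"
    done
}

how_resolved test '['

# Bypass the builtin: env looks the name up in PATH and runs that program.
env test -d / && echo "external test ran"
```

In bash and dash, both names are expected to be reported as shell builtins, which is why the files in /usr/bin are rarely executed by scripts.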
