• [gentoo-dev] [PATCH v2 1/1] esed.eclass: new eclass

    From Ionen Wolkens@21:1/5 to All on Fri Jun 3 13:40:01 2022
    Signed-off-by: Ionen Wolkens <ionen@gentoo.org>
    ---
    eclass/esed.eclass | 201 +++++++++++++++++++++++++++++++++++++++++++++
    1 file changed, 201 insertions(+)
    create mode 100644 eclass/esed.eclass

    diff --git a/eclass/esed.eclass b/eclass/esed.eclass
    new file mode 100644
    index 00000000000..f327c3bbdf4
    --- /dev/null
    +++ b/eclass/esed.eclass
    @@ -0,0 +1,201 @@
    +# Copyright 2022 Gentoo Authors
    +# Distributed under the terms of the GNU General Public License v2
    +
    +# @ECLASS: esed.eclass
    +# @MAINTAINER:
    +# Ionen Wolkens <ionen@gentoo.org>
    +# @AUTHOR:
    +# Ionen Wolkens <ionen@gentoo.org>
    +# @SUPPORTED_EAPIS: 8
    +# @BLURB: sed(1) wrappers that die if expressions did not modify any files
    +# @EXAMPLE:
    +#
    +# @CODE
    +# esed 's/a/b/' src/file.c # -i is default, dies if 'a' does not become 'b'
    +#
    +# enewsed 's/a/b/' project.pc.in "${T}"/project.pc # stdin/out not supported
    +#
    +# esedfind . -type f -name '*.c' -esed 's/a/b/' # dies if zero files changed
    +#
    +# local esedexps=(
    +# # dies if /any/ of these did nothing, -e 's/a/b/' -e 's/c/d/
  • From Oskari Pirhonen@21:1/5 to Ionen Wolkens on Sat Jun 4 02:20:01 2022
    On Fri, Jun 03, 2022 at 07:36:46AM -0400, Ionen Wolkens wrote:
    ... snip ...

    +    # Roughly attempt to find files in arguments by checking if it's a
    +    # readable file (aka s/// is not a file) and does not start with -
    +    # (unless after --), then store contents for comparing after sed.
    +    local contents=() endopts files=()
    +    for ((i=1; i<=${#}; i++)); do
    +        if [[ ${!i} == -- && ! -v endopts ]]; then
    +            endopts=1
    +        elif [[ ${!i} =~ ^(-i|--in-place)$ && ! -v endopts ]]; then
    +            # detect rushed sed -i -> esed -i, -i also silently breaks enewsed
    +            die "passing ${!i} to ${FUNCNAME[0]} is invalid"
    +        elif [[ ${!i} =~ ^(-f|--file)$ && ! -v endopts ]]; then
    +            i+=1 # ignore script files
    +        elif [[ ( ${!i} != -* || -v endopts ) && -f ${!i} && -r ${!i} ]]; then
    +            files+=( "${!i}" )
    +
    +            # 2>/dev/null to silence null byte warnings if sed binary files
    +            { contents+=( "$(<"${!i}")" ); } 2>/dev/null \
    +                || die "failed to read: ${!i}"
    +        fi
    +    done
    +    (( ${#files[@]} )) || die "no readable files found from '${*}' arguments"
    +
    +    local verbose
    +    [[ ${ESED_VERBOSE} ]] && type diff &>/dev/null && verbose=1
    +
    +    local changed newcontents
    +    if [[ -v _esed_output ]]; then
    +        [[ -v verbose ]] &&
    +            einfo "${FUNCNAME[0]}: sed ${*} > ${_esed_output} ..."
    +
    +        sed "${@}" > "${_esed_output}" \
    +            || die "failed to run: sed ${*} > ${_esed_output}"
    +
    +        { newcontents=$(<"${_esed_output}"); } 2>/dev/null \
    +            || die "failed to read: ${_esed_output}"
    +
    +        local IFS=$'\n' # sed concats with newline even if none at EOF
    +        contents=${contents[*]}
    +        unset IFS
    +
    +        if [[ ${contents} != "${newcontents}" ]]; then
    +            changed=1
    +
    +            [[ -v verbose ]] &&
    +                diff -u --color --label="${files[*]}" --label="${_esed_output}" \
    +                    <(echo "${contents}") <(echo "${newcontents}")
    +        fi

    ... snip ...

    I'm not 100% convinced that it will give you anything meaningful. The
    warning about ignoring NULL is not so much noise as it is bash warning
    you that you're probably not doing something correctly. In this case,
    you're not pulling _all_ the contents of the file:

    [ /tmp ]
    oskari@dj3ntoo λ printf "ab\0cd" >test.dat
    [ /tmp ]
    oskari@dj3ntoo λ hd test.dat
    00000000 61 62 00 63 64 |ab.cd|
    00000005
    [ /tmp ]
    oskari@dj3ntoo λ var=$(< test.dat)
    bash: warning: command substitution: ignored null byte in input
    [ /tmp ]
    oskari@dj3ntoo λ printf "$var" | hd
    00000000 61 62 63 64 |abcd|
    00000004

    If it's a binary file, there's a decent chance the NULL's are
    significant. Now, consider the following hypothetical example where we
    want to remove the NULL's:

    [ /tmp ]
    oskari@dj3ntoo λ printf "ab\0cd" | sed -e 's/\x00//' | hd
    00000000 61 62 63 64 |abcd|
    00000004

    Testing for (in)equality between pre- and post-sed contents is
    reasonable enough in most cases. This time, though, it would fail to
    detect anything has changed since the pre-sed contents have their
    NULL's unintentionally stripped, whereas the post-sed contents have
    them intentionally stripped.

    While I personally don't think that running sed on binary files is a
    good idea in the first place, it's still relevant since the end result
    would be an incorrect answer to the question of "Did sed actually do anything?".
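
    The false negative described above can be reproduced with a short
    script. This is only a sketch: the temporary paths and variable names
    are made up for illustration, and it assumes GNU sed (for \x00) and
    cmp(1) are available. It compares the same pair of files once through
    bash variables, as the patch does, and once byte for byte.

```shell
#!/usr/bin/env bash
# Illustrative only: names and paths below are not from esed.eclass.
tmpdir=$(mktemp -d)
trap 'rm -rf "${tmpdir}"' EXIT

printf 'ab\0cd' > "${tmpdir}/before.dat"
# GNU sed understands \x00; this strips the NUL byte
sed -e 's/\x00//' "${tmpdir}/before.dat" > "${tmpdir}/after.dat"

# Variable comparison: bash drops NUL bytes in command substitution,
# so both variables end up holding "abcd"
{ old=$(<"${tmpdir}/before.dat"); } 2>/dev/null
{ new=$(<"${tmpdir}/after.dat"); } 2>/dev/null
if [[ ${old} == "${new}" ]]; then
    string_result=unchanged
else
    string_result=changed
fi

# Byte comparison: cmp(1) works on the files and sees the missing NUL
if cmp -s "${tmpdir}/before.dat" "${tmpdir}/after.dat"; then
    byte_result=unchanged
else
    byte_result=changed
fi

echo "string comparison: ${string_result}"
echo "byte comparison: ${byte_result}"
```

    The variable comparison reports "unchanged" even though sed removed a
    byte, while the byte comparison reports "changed".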

    On the other hand, saving a set of pre- and post-sed hashes like Ulrich suggested would give the expected result.
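
    A minimal sketch of the hash idea follows. The helper name
    sed_changed is hypothetical and not part of the proposed eclass, and
    the choice of sha256sum (coreutils) and GNU sed -i is an assumption;
    the thread does not say which hash tool was suggested.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of esed.eclass: run sed in place on one
# file and return 0 if its bytes changed, 1 if not, 2 on failure.
sed_changed() {
    local file=${1} before after
    shift

    # Hash from stdin so the file name is not part of the output and
    # only the digest is compared
    before=$(sha256sum < "${file}") || return 2
    sed -i "${@}" -- "${file}" || return 2
    after=$(sha256sum < "${file}") || return 2

    [[ ${before} != "${after}" ]]
}

demo=$(mktemp)
printf 'ab\0cd' > "${demo}"

# First run removes the NUL, second run finds nothing left to remove
if sed_changed "${demo}" -e 's/\x00//'; then first=changed; else first=unchanged; fi
if sed_changed "${demo}" -e 's/\x00//'; then second=changed; else second=unchanged; fi
echo "first run: ${first}, second run: ${second}"

rm -f "${demo}"
```

    Because the digest is computed over the file's bytes and never passes
    through a bash variable as raw content, the stripped NUL shows up as a
    change on the first run.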

    - Oskari


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ionen Wolkens@21:1/5 to Oskari Pirhonen on Sat Jun 4 09:10:02 2022
    On Fri, Jun 03, 2022 at 07:15:17PM -0500, Oskari Pirhonen wrote:
    [snip]
    > Testing for (in)equality between pre- and post-sed contents is
    > reasonable enough in most cases. This time, though, it would fail to
    > detect anything has changed since the pre-sed contents have their
    > NULL's unintentionally stripped, whereas the post-sed contents have
    > them intentionally stripped.
    >
    > While I personally don't think that running sed on binary files is a
    > good idea in the first place, it's still relevant since the end result
    > would be an incorrect answer to the question of "Did sed actually do
    > anything?".

    Yeah, one of the primary motivations for silencing this was elsewhere.
    I don't think the silencing matters for esed, and if binary files are
    being modified one may as well use sed(1) directly instead (that is
    something that'd need to be verified more intensively on bumps than
    with esed anyway).

    Such use is also very uncommon, although sed is still "handy" when
    changing just one instruction and wanting to avoid an extra dependency
    on a proper binary patching method.

    > On the other hand, saving a set of pre- and post-sed hashes like
    > Ulrich suggested would give the expected result.

    If we really wanted to solve this, yes, although it may make sense to
    say this eclass is not for binary files. The talk of adding a bash-only
    "erepl" makes it rather difficult to preserve nulls (mapfile silently
    strips \0 without even a warning). mapfile -d '' could allow restoring
    them, but ideally we want to iterate on lines to do per-line pattern
    matches. Is it possible to hack something together to preserve them?
    Probably, but it's going to make this messy and I'm not sure it's
    worth it.

    erepl is also worse off because the nulls would be lost in the output
    and not just during comparison.
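
    To illustrate the mapfile -d '' route mentioned above, here is a rough
    sketch. It is not erepl (whose interface is not shown in this thread);
    the per-chunk substitution is only a stand-in for a bash-only
    replacement step, and the file names are invented. It assumes bash 4.4
    or later for mapfile -d.

```shell
#!/usr/bin/env bash
# Sketch only: split input on NUL so the byte positions survive as array
# boundaries, then re-join with NUL when writing the result.
tmpdir=$(mktemp -d)
trap 'rm -rf "${tmpdir}"' EXIT
printf 'ab\0cd\0ef' > "${tmpdir}/data"

# With an empty delimiter, mapfile ends each element at a NUL byte
mapfile -d '' chunks < "${tmpdir}/data"
echo "chunks: ${#chunks[@]}"

# Stand-in for a bash-only replacement, applied to every chunk
chunks=( "${chunks[@]//c/X}" )

# Re-join: NUL between chunks, none after the last. A file that ended
# in a NUL would lose that final byte here.
{
    printf '%s' "${chunks[0]}"
    (( ${#chunks[@]} > 1 )) && printf '\0%s' "${chunks[@]:1}"
} > "${tmpdir}/data.new"
```

    The chunks here are NUL-delimited rather than lines, so per-line
    pattern matching would need a further split inside each chunk, and a
    trailing NUL needs separate bookkeeping. That extra handling is the
    kind of messiness the paragraph above is wary of.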

    --
    ionen
