This is a bit of a chicken-and-egg thing. If you want to embed a
checksum in a code module to report the checksum, is there a way of
doing this? It's a bit like being your own grandfather, I think.
I'm not thinking of anything too fancy, like a CRC, but rather a simple
modulo N addition, maybe N being 2^16.
I keep thinking of using a placeholder, but that doesn't seem to work
out in any useful way. Even if you try to anticipate the impact of
adding the checksum, that only gives you a different checksum, which you
then need to anticipate further... ad infinitum.
One fine day Rick C typed:
[snip]
I'm probably not understanding what you mean, but normally the checksum
is stored in a memory section which is not subject to the checksum
calculation itself.
The actual implementation depends on the tools you are using. Many linkers support this directly: you specify the memory section(s) subjected to checksum calculation, the type of checksum (CRC16, CRC32, etc.), and the
memory section that will store the checksum.
Here is a technical note for IAR: https://www.iar.com/knowledge/support/technical-notes/general/checksum-calculation-with-xlink/
A "poor man's" solution is to do it manually:
- In the source code, declare your checksum, initializing it to a known, fixed value (e.g. 0xDEADBEEF).
- Run the program with a debugger; set a breakpoint where it calculates the checksum (and fails), and write down the correct checksum.
- Using a binary editor, find the fixed value in the executable binary
and replace it with the correct value.
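In Python, that last manual step might look like the sketch below. The placeholder value and the 4-byte little-endian layout are assumptions for illustration; use whatever constant and width your source code actually declares.

```python
# Hypothetical placeholder, matching a declaration like:
#     const uint32_t checksum = 0xDEADBEEF;
PLACEHOLDER = (0xDEADBEEF).to_bytes(4, "little")

def patch_checksum(image: bytes, correct: int) -> bytes:
    # Replace the known placeholder with the checksum the debugger reported.
    if image.count(PLACEHOLDER) != 1:
        raise ValueError("placeholder must occur exactly once in the image")
    return image.replace(PLACEHOLDER, correct.to_bytes(4, "little"))
```

The uniqueness check matters: if the placeholder bytes happen to occur elsewhere in the binary, a blind replace would corrupt code.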
On 2023-04-20 5:06, Rick C wrote:
[snip]
Some decades ago I was involved with a project for an 8052-based device, which was required to perform a code-checksum check at boot.
We decided to use a byte-per-byte XOR checksum and to make the correct checksum be zero. We had a code module (possibly in assembler, I don't remember) that defined a one-byte "adjustment" constant in code memory.
For each new version of the code, we first set the adjustment constant
to zero, then ran the program; it usually reported an error at boot because the checksum was not zero. We then changed the adjustment
constant to the actual reported checksum, C say, and that zeroed the checksum because C xor C = 0. Bingo. You can use this method to make
the checksum anything you like, for example hex 55.
With a more advanced order-sensitive checksum such as a CRC you could
use the same method, if you also ensure (by linker commands) that the adjustment value is always the last value that enters into the computed checksum (assuming that the linking order of the other code modules is
not incidentally changed when the value of the adjustment constant is changed).
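A sketch of that scheme. Placing the adjustment byte at the end of the image is an assumption for illustration; in the original it was just a constant the linker placed somewhere in code memory (position doesn't matter for XOR).

```python
def xor_checksum(image: bytes) -> int:
    """Byte-per-byte XOR over the whole image."""
    c = 0
    for b in image:
        c ^= b
    return c

image = bytearray(b"\x12\x34\x56\x00")  # last byte = adjustment, initially 0
adj = xor_checksum(image)               # the value reported at "boot"
image[-1] = adj                         # C xor C = 0
assert xor_checksum(image) == 0
```

To make the checksum come out to some nonzero value V instead, set the adjustment byte to `adj ^ V`.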
I'm not thinking of any special checksum generator that excludes the
checksum data. That would be too messy.
I keep thinking there is a different way of looking at this to
achieve the result I want...
Maybe I can prove it is impossible. Assume the file checksums to X
when the checksum data is zero. The goal would then be to include
the checksum data value Y in the file, that would change X to Y.
Given the properties of the modulo N checksum, this would appear to
be impossible for the general case, unless... we add another data
value, call it a checksum normalizer. This data value checksums with
the original checksum to give the result zero. Then, when the
checksum is also added, the resulting checksum is, in fact, the
checksum. Another way of looking at this is to add a value that
combines with the added checksum to be zero, leaving the original
checksum intact.
This might be inordinately hard for a CRC, but a simple checksum
would not be an issue, I think. At least, this could work in
software, where data can be included in an image file as itself. In
a device like an FPGA, it might not be included in the bit stream
file so directly... but that might depend on where in the device it
is inserted. Memory might have data that is stored as itself. I'll
need to look into that.
On Wed, 19 Apr 2023 19:06:33 -0700 (PDT), Rick C
<gnuarm.del...@gmail.com> wrote:
[snip]
Take a look at the old xmodem/ymodem CRC. It was designed such that
when the CRC was sent immediately following the data, a receiver
computing the CRC over the whole incoming packet (data and CRC both) would
get a result of zero.
But AFAIK it doesn't work with the CCITT equation(s) - you have to use
xmodem/ymodem.
Sorry, I don't know a way to do it with a modular checksum.
YMMV, but I think 16-bit CRC is pretty simple.
George
On Thursday, April 20, 2023 at 11:33:28 AM UTC-4, George Neuner wrote:
[snip]
CRC is not complicated, but I would not know how to calculate an
inserted value to force the resulting CRC to zero. How do you do
that?
Even so, I'm not trying to validate the file. I'm trying to come up
with a substitute for a time stamp or version number. I don't want
to have to rely on my consistency in handling the version number
correctly. This would be a backup in case there was more than one
version released, even only within the "lab", that were different. A
checksum that could be read by the controlling software would do the
job.
I have run into this before, where the version number was not a 100%
indication of the uniqueness of an executable. The checksum would be
a second indicator.
I should mention that I'm not looking for a solution that relies on
any specific details of the tools.
The method to check for a proper constant value after the whole
block and CRC are received and put through the generator works
with the CRC-CCITT (actually ITU-T). The proper final value
depends on the initial CRC and whether the CRC is inverted before
sending. The limitation is that the CRC has to be sent least
significant octet first.
For a reference, see RFC1662, Appendix C.
On 4/19/23 10:06 PM, Rick C wrote:
[snip]
IF I understand you correctly, what you want is for the file to compute
to some "checksum" that comes from the basic contents of the file, and
then you want to add the "checksum" into the file so the program itself
can print its checksum.
One fact to remember is that "cryptographic hashes" were invented
because it was too easy to create a faked file that matches a non-cryptographic hash/checksum, so a checksum couldn't be a key to making sure
you really had the right file in the presence of a determined enemy, but
checksums were good enough to catch "random" errors.
This means that you can add the checksum into the file, along with some
additional bytes (likely at the end), and by knowing the properties of the checksum algorithm, compute a value for those extra bytes such that they "undo" the changes caused by adding the checksum bytes to the file.
I'm not sure exactly how to compute these, but the key is that you add something at the end of the file to get the checksum back to what the original file had before you added the checksum into the file.
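For a 16-bit check, those "undo" bytes can even be found by brute force, no algebra needed; there are only 2^16 candidates. A sketch, using CRC-16/XMODEM purely as a concrete example (for wider CRCs you would solve for the patch algebraically instead):

```python
def crc16_xmodem(data: bytes, crc: int = 0) -> int:
    # CRC-16/XMODEM: polynomial 0x1021, init 0, MSB first, no final XOR.
    for b in data:
        crc ^= b << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def patch_bytes(data: bytes, target: int) -> bytes:
    # Find two trailing bytes that force the CRC of data+patch to `target`.
    # The CRC is an affine function of the appended 16 bits, so a solution
    # always exists and the search always terminates.
    for p in range(1 << 16):
        patch = p.to_bytes(2, "big")
        if crc16_xmodem(data + patch) == target:
            return patch
    raise AssertionError("unreachable for a 16-bit CRC")

original = b"some module contents"
X = crc16_xmodem(original)             # checksum of the file before embedding
embedded = original + X.to_bytes(2, "big")  # file with the checksum added
fix = patch_bytes(embedded, X)         # the "undo" bytes described above
assert crc16_xmodem(embedded + fix) == X
```

The patched file both contains its own checksum and computes to it, which is exactly the chicken-and-egg property the thread is after.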
You have some block of data:   |....data....|
You compute CRC on the data block and then append the resulting value
to the end of the block. xmodem CRC is 16-bit, so it adds 2 bytes to
the data.
So now you have a new extended block |....data....|crc|
Now if you compute a new CRC on the extended block, the resulting
value /should/ come out to zero. If it doesn't, either your data or
the original CRC value appended to it has been changed/corrupted.
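A sketch of that property with CRC-16/XMODEM (polynomial 0x1021, initial value 0, no final XOR, CRC appended MSB first):

```python
def crc16_xmodem(data: bytes, crc: int = 0) -> int:
    # CRC-16/XMODEM: polynomial 0x1021, init 0, MSB first, no final XOR.
    for b in data:
        crc ^= b << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

data = b"123456789"
crc = crc16_xmodem(data)                  # 0x31C3, the standard check value
extended = data + crc.to_bytes(2, "big")  # |....data....|crc|
assert crc16_xmodem(extended) == 0        # residual is zero
```

The zero residual follows from the math: the CRC is the remainder of the message polynomial times x^16 divided by the generator, so appending that remainder makes the extended message divide evenly.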
On Thursday, April 20, 2023 at 12:06:36 PM UTC+10, Rick C wrote:
[snip]
Rick, what is the purpose of this? Is it (1) to be able to externally
identify a binary, as one might a ROM image, by computing a checksum?
Is it (2) for a runnable binary to be able to check itself? This would
of course only be able to detect corruption, not tampering. Is it (3)
for the loader (whatever that might be) to be able to say 'this binary
has the correct checksum' and only jump to it if it does? Again this
would only be able to detect corruption, not tampering.
On 4/20/2023 1:44 PM, George Neuner wrote:
You have some block of data:   |....data....|
You compute CRC on the data block and then append the resulting value
to the end of the block. xmodem CRC is 16-bit, so it adds 2 bytes to
the data.
Exactly. You *don't* drag the "extra bits" into the initial
CRC calculation but *do* into the CRC *verification*. Easy
peasy (since forever).
[Think about it: you're performing a division operation
and the residual is the "remainder".]
Note that you want to choose a polynomial that doesn't
give you a "win" result for "obviously" corrupt data.
E.g., if data is all zeros or all 0xFF (as these sorts of
conditions can happen with hardware failures) you probably
wouldn't want a "success" indication!
You can also "salt" the calculation so that the residual
is deliberately nonzero. So, for example, "success" is
indicated by a residual of 0x474E. :>
So now you have a new extended block:   |....data....|crc|
Now if you compute a new CRC on the extended block, the resulting
value /should/ come out to zero. If it doesn't, either your data or
the original CRC value appended to it has been changed/corrupted.
As there is usually a lack of originality in the algorithms
chosen, you have to consider if you are also hoping to use
this to safeguard the *integrity* of your image (i.e.,
against intentional modification).
Note that you want to choose a polynomial that doesn't
give you a "win" result for "obviously" corrupt data.
E.g., if data is all zeros or all 0xFF (as these sorts of
conditions can happen with hardware failures) you probably
wouldn't want a "success" indication!
No, that is pointless for something like a code image. It just adds needless
complexity to your CRC algorithm.
You should already have checks that would eliminate an all-zero image or other
"obviously corrupt" data. You'll be checking the image for a key or "magic number" that identifies the image as "program image for board X, project Y". You'll be checking version numbers. You'll be reading the length of the image
so you know the range for your CRC function, and where to find the appended CRC
check. You might not have all of these in a given system, but you'll have some
kind of check which would fail on an all-zero image.
You can also "salt" the calculation so that the residual
is deliberately nonzero. So, for example, "success" is
indicated by a residual of 0x474E. :>
Again, pointless.
Salt is important for security-related hashes (like password hashes), not for integrity checks.
So now you have a new extended block:   |....data....|crc|
Now if you compute a new CRC on the extended block, the resulting
value /should/ come out to zero. If it doesn't, either your data or
the original CRC value appended to it has been changed/corrupted.
As there is usually a lack of originality in the algorithms
chosen, you have to consider if you are also hoping to use
this to safeguard the *integrity* of your image (i.e.,
against intentional modification).
"Integrity" has nothing to do with the motivation for change. /Security/ is concerned with intentional modifications that deliberately attempt to defeat /integrity/ checks. Integrity is about detecting any changes.
If you are concerned about the possibility of intentional malicious changes,
CRC's alone are useless. All the attacker needs to do after modifying the image is calculate the CRC themselves, and replace the original checksum with their own.
Using non-standard algorithms for security is a simple way to get things completely wrong. "Security by obscurity" is very rarely the right answer. In
reality, good security algorithms, and good implementations, are difficult and
specialised tasks, best left to people who know what they are doing.
To make something secure, you have to ensure that the check algorithms depend on a key that you know, but that the attacker does not have. That's the basis of digital signatures (though you use a secure hash algorithm rather than a simple CRC).
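A minimal sketch of that idea, using Python's standard-library HMAC in place of a CRC (the key and image contents here are placeholders; in practice the key must be provisioned and stored where the attacker cannot read it):

```python
import hashlib
import hmac

KEY = b"device-specific secret"  # assumption: provisioned out of band

def sign(image: bytes) -> bytes:
    # Keyed integrity tag: without KEY an attacker cannot recompute it
    # after modifying the image.
    return hmac.new(KEY, image, hashlib.sha256).digest()

def verify(image: bytes, tag: bytes) -> bool:
    # Constant-time comparison avoids leaking the tag byte by byte.
    return hmac.compare_digest(sign(image), tag)

image = b"\x00\x01\x02 firmware"
tag = sign(image)
assert verify(image, tag)
assert not verify(image + b"\x00", tag)
```

This is the contrast with a bare CRC: recomputing the tag requires the key, not just knowledge of the algorithm.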
This is simply to be able to say this version is unique, regardless
of what the version number says. Version numbers are set manually
and not always done correctly. I'm looking for something as a backup
so that if the checksums are different, I can be sure the versions
are not the same.
The less work involved, the better.
On 4/21/2023 3:43 AM, David Brown wrote:
Note that you want to choose a polynomial that doesn't
give you a "win" result for "obviously" corrupt data.
E.g., if data is all zeros or all 0xFF (as these sorts of
conditions can happen with hardware failures) you probably
wouldn't want a "success" indication!
No, that is pointless for something like a code image. It just adds
needless complexity to your CRC algorithm.
Perhaps you've forgotten that you don't just use CRCs (secure hashes, etc.) on "code images"?
You should already have checks that would eliminate an all-zero image
or other "obviously corrupt" data. You'll be checking the image for a
key or "magic number" that identifies the image as "program image for
board X, project Y". You'll be checking version numbers. You'll be
reading the length of the image so you know the range for your CRC
function, and where to find the appended CRC check. You might not
have all of these in a given system, but you'll have some kind of
check which would fail on an all-zero image.
See above.
You can also "salt" the calculation so that the residual
is deliberately nonzero. So, for example, "success" is
indicated by a residual of 0x474E. :>
Again, pointless.
Salt is important for security-related hashes (like password hashes),
not for integrity checks.
You've missed the point. The correct "sum" can be anything.
Why is "0" more special than any other value? As the value is
typically meaningless to anything other than the code that verifies
it, you couldn't look at an image (or the output of the verifier)
and gain anything from seeing that obscure value.
OTOH, if the CRC yields something familiar -- or useful -- then
it can tell you something about the image. E.g., salt the algorithm
with the product code, version number, your initials, 0xDEADBEEF, etc.
So now you have a new extended block:   |....data....|crc|
Now if you compute a new CRC on the extended block, the resulting
value /should/ come out to zero. If it doesn't, either your data or
the original CRC value appended to it has been changed/corrupted.
As there is usually a lack of originality in the algorithms
chosen, you have to consider if you are also hoping to use
this to safeguard the *integrity* of your image (i.e.,
against intentional modification).
"Integrity" has nothing to do with the motivation for change.
/Security/ is concerned with intentional modifications that
deliberately attempt to defeat /integrity/ checks. Integrity is about
detecting any changes.
If you are concerned about the possibility of intentional malicious
changes,
Changes don't have to be malicious. I altered the test procedure for a
piece of military gear we were building simply to skip some lengthy
tests that I *knew* would pass (I didn't want to inject an extra 20
minutes of wait time just to get through a lengthy test I already knew
worked before I could get to the test of interest to me).
I failed to undo the change before the official signoff on the device.
The only evidence of this was the fact that I had also patched the
startup message to say "Go for coffee..." -- which remained on the
screen for the duration of the lengthy (even with the long test
elided) procedure...
...which alerted folks to the fact that this *probably* wasn't the
original image. (The computer running the test suite on the DUT had
no problem accepting my patched binary.)
CRC's alone are useless. All the attacker needs to do after modifying
the image is calculate the CRC themselves, and replace the original
checksum with their own.
That assumes the "alterer" knows how to replace the checksum, how it
is computed, where it is embedded in the image, etc. I modified the Compaq portable mentioned without ever knowing where the checksum was stored,
or *if* it was explicitly stored. I had no desire to disassemble the
BIOS ROMs (though I could obviously have done so, as there was no "proprietary hardware" limiting access to their contents, and the instruction set of
the processor is well known!).
Instead, I did this by *guessing* how they would implement such a check
in a bit of kit from that era (EPROMs aren't easily modified by malware,
so it wasn't likely that they would go to great lengths to "protect" the image). And, if my guess had been incorrect, I could always reinstall
the original EPROMs -- nothing lost, nothing gained.
Had much experience with folks counterfeiting your products and making "simple" changes to the binaries? Like changing the copyright notice
or splash screen?
Then, bringing the (accused) counterfeit of YOUR product into a courtroom
and revealing the *hidden* checksum that the counterfeiter wasn't aware of?
"Gee, why does YOUR (alleged) device have *my* name in it -- in addition
to behaving exactly like mine??"
[I guess obscurity has its place!]
Use a non-secret approach and you invite folks to alter it, as well.
Using non-standard algorithms for security is a simple way to get
things completely wrong. "Security by obscurity" is very rarely the
right answer. In reality, good security algorithms, and good
implementations, are difficult and specialised tasks, best left to
people who know what they are doing.
To make something secure, you have to ensure that the check algorithms
depend on a key that you know, but that the attacker does not have.
That's the basis of digital signatures (though you use a secure hash
algorithm rather than a simple CRC).
If you can remove the check, then what value is the key's secrecy? By your criteria, the adversary KNOWS how you are implementing your security,
so he knows exactly what to remove to bypass your checks and allow his altered image to operate in its place.
Ever notice how manufacturers don't PUBLICLY disclose their security
hooks (without an NDA)? If "security by obscurity" was not important,
they would publish these details INVITING challenges (instead of
trying to limit the knowledge to people with whom they've officially contracted).
On Thu, 20 Apr 2023 09:45:59 -0700 (PDT), Rick C wrote:
CRC is not complicated, but I would not know how to calculate an
inserted value to force the resulting CRC to zero. How do you do
that?
It's implicit in the equation they chose. I don't know how it works -
just that it does. Computing the CRC over just the data:
result is 0xBA3C
Computing it again over the data with that CRC appended:
result is 0x0000
On Friday, April 21, 2023 at 4:53:18 AM UTC-4, Brian Cockburn wrote:
On Thursday, April 20, 2023 at 12:06:36 PM UTC+10, Rick C wrote:
This is a bit of the chicken-and-egg thing. If you want to embed a checksum in a code module so it can report its own checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
I'm not thinking of anything too fancy, like a CRC, but rather a simple modulo-N addition, maybe with N being 2^16.
I keep thinking of using a placeholder, but that doesn't seem to work out in any useful way. Even if you try to anticipate the impact of adding the checksum, that only gives you a different checksum, which you then need to anticipate further... ad infinitum.
I'm not thinking of any special checksum generator that excludes the checksum data. That would be too messy.
I keep thinking there is a different way of looking at this to achieve the result I want...
Maybe I can prove it is impossible. Assume the file checksums to X when the checksum data is zero. The goal would then be to include the checksum data value Y in the file, that would change X to Y. Given the properties of the modulo-N checksum, this would appear to be impossible for the general case, unless... Add another data value, call it a checksum normalizer. This data value checksums with the original checksum to give the result zero. Then, when the checksum is also added, the resulting checksum is unchanged.
This might be inordinately hard for a CRC, but a simple checksum would not be an issue, I think. At least, this could work in software, where data can be included in an image file as itself. In a device like an FPGA, it might not be included in the bit stream file so directly... but that might depend on where in the device it is inserted. Memory might have data that is stored as itself. I'll need to look into that.
--
Rick C.
- Get 1,000 miles of free Supercharging
- Tesla referral code - https://ts.la/richard11209
Rick, What is the purpose of this? Is it (1) to be able to externally identify a binary, as one might a ROM image by computing a checksum? Is it (2) for a run-able binary to be able to check itself? This would of course only be able to detect corruption, not tampering. Is it (3) for the loader (whatever that might be) to be able to say 'this binary has the correct checksum' and only jump to it if it does? Again this would only be able to detect corruption, not tampering. Are you hoping for...
This is simply to be able to say this version is unique, regardless of what the version number says. Version numbers are set manually and not always done correctly. I'm looking for something as a backup so that if the checksums are different, I can be sure the versions are not the same.
The less work involved, the better.
--
Rick C.
++ Get 1,000 miles of free Supercharging
++ Tesla referral code - https://ts.la/richard11209
On Thursday, April 20, 2023 at 10:09:35 PM UTC-4, Richard Damon wrote:
On 4/19/23 10:06 PM, Rick C wrote:
This is a bit of the chicken-and-egg thing. If you want to embed a checksum in a code module so it can report its own checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
I'm not thinking of anything too fancy, like a CRC, but rather a simple modulo-N addition, maybe with N being 2^16.
I keep thinking of using a placeholder, but that doesn't seem to work out in any useful way. Even if you try to anticipate the impact of adding the checksum, that only gives you a different checksum, which you then need to anticipate further... ad infinitum.
I'm not thinking of any special checksum generator that excludes the checksum data. That would be too messy.
I keep thinking there is a different way of looking at this to achieve the result I want...
Maybe I can prove it is impossible. Assume the file checksums to X when the checksum data is zero. The goal would then be to include the checksum data value Y in the file, that would change X to Y. Given the properties of the modulo-N checksum, this would appear to be impossible for the general case, unless... Add another data value, call it a checksum normalizer. This data value checksums with the original checksum to give the result zero. Then, when the checksum is also added, the resulting checksum is unchanged: all the carries would cascade out of the upper 16 bits from adding the inserted checksum and its 2's complement.
This might be inordinately hard for a CRC, but a simple checksum would not be an issue, I think. At least, this could work in software, where data can be included in an image file as itself. In a device like an FPGA, it might not be included in the bit stream file so directly... but that might depend on where in the device it is inserted. Memory might have data that is stored as itself. I'll need to look into that.
IF I understand you correctly, what you want is for the file to compute
to some "checksum" that comes from the basic contents of the file, and
then you want to add the "checksum" into the file so the program itself
can print its checksum.
One fact to remember is that "cryptographic hashes" were invented
because it was too easy to create a faked file that matches a
non-cryptographic hash/checksum, so those couldn't be relied on to make sure
you really had the right file in the presence of a determined enemy, but
the checksums were good enough to catch "random" errors.
This means that you can add the checksum into the file, plus some
additional bytes (likely at the end), and by knowing the properties of the
checksum algorithm, compute a value for those extra bytes such that they
"undo" the changes caused by adding the checksum bytes to the file.
I'm not sure exactly how to compute these, but the key is that you add
something at the end of the file to get the checksum back to what the
original file had before you added the checksum into the file.
Yeah, for a simple checksum, I think that would be easy, at least if "checksum" means a bitwise XOR operation. If the checksum and extra bytes are both 16 bits, this would also work for an arithmetic checksum where each 16-bit word is added into the checksum.
I don't even want to think about using a CRC to try to do this.
On 21/04/2023 14:12, Rick C wrote:
This is simply to be able to say this version is unique, regardless
of what the version number says. Version numbers are set manually
and not always done correctly. I'm looking for something as a backup
so that if the checksums are different, I can be sure the versions
are not the same.
The less work involved, the better.
Run a simple 32-bit crc over the image. The result is a hash of the
image. Any change in the image will show up as a change in the crc.
On 21/04/2023 13:39, Don Y wrote:
On 4/21/2023 3:43 AM, David Brown wrote:
Note that you want to choose a polynomial that doesn't
give you a "win" result for "obviously" corrupt data.
E.g., if data is all zeros or all 0xFF (as these sorts of
conditions can happen with hardware failures) you probably
wouldn't want a "success" indication!
No, that is pointless for something like a code image. It just adds
needless complexity to your CRC algorithm.
Perhaps you've forgotten that you don't just use CRCs (secure hashes, etc.) on "code images"?
No - but "code images" is the topic here.
However, in almost every case where CRC's might be useful, you have additional
checks of the sanity of the data, and an all-zero or all-one data block would be rejected. For example, Ethernet packets use CRC for integrity checking, but
an attempt to send a packet type 0 from MAC address 00:00:00:00:00:00 to address 00:00:00:00:00:00, of length 0, would be rejected anyway.
I can't think of any use-cases where you would be passing around a block of "pure" data that could reasonably take absolutely any value, without any type of "envelope" information, and where you would think a CRC check is appropriate.
You can also "salt" the calculation so that the residual
is deliberately nonzero. So, for example, "success" is
indicated by a residual of 0x474E. :>
Again, pointless.
Salt is important for security-related hashes (like password hashes), not for integrity checks.
You've missed the point. The correct "sum" can be anything.
Why is "0" more special than any other value? As the value is
typically meaningless to anything other than the code that verifies
it, you couldn't look at an image (or the output of the verifier)
and gain anything from seeing that obscure value.
Do you actually know what is meant by "salt" in the context of hashes, and why
it is useful in some circumstances? Do you understand that "salt" is added (usually prepended, or occasionally mixed in in some other way) to the data /before/ the hash is calculated?
I have not given the slightest indication to suggest that "0" is a special value. I fully agree that the value you get from the checking algorithm does
not have to be 0 - I already suggested it could be compared to the stored value. I.e., you build your image file as "data ++ crc(data)", and check it by
re-calculating "crc(data)" on the received image and comparing the result to the received crc. There is no necessity or benefit in having a crc calculated over the received data plus the received crc being 0.
"Salt" is used in cases where the original data must be kept secret, and only the hashes are transmitted or accessible - by adding salt to the original data
before hashing it, you avoid a direct correspondence between the hash and the original data. The prime use-case is to stop people being able to figure out a
password by looking up the hash in a list of pre-computed hashes of common passwords.
OTOH, if the CRC yields something familiar -- or useful -- then
it can tell you something about the image. E.g., salt the algorithm
with the product code, version number, your initials, 0xDEADBEEF, etc.
You are making no sense at all. Are you suggesting that it would be a good idea to add some value to the start of the image so that the resulting crc calculation gives a nice recognisable product code? This "salt" would be different for each program image, and calculated by trial and error. If you want a product code, version number, etc., in the program image (and it's a good idea), just put these in the program image!
So now you have a new extended block: |....data....|crc|
Now if you compute a new CRC on the extended block, the resulting
value /should/ come out to zero. If it doesn't, either your data or
the original CRC value appended to it has been changed/corrupted.
As there is usually a lack of originality in the algorithms
chosen, you have to consider if you are also hoping to use
this to safeguard the *integrity* of your image (i.e.,
against intentional modification).
"Integrity" has nothing to do with the motivation for change. /Security/ is concerned with intentional modifications that deliberately attempt to defeat
/integrity/ checks. Integrity is about detecting any changes.
If you are concerned about the possibility of intentional malicious changes,
Changes don't have to be malicious.
Accidental changes (such as human error, noise during data transfer, memory cell errors, etc.) do not pass integrity tests unnoticed.
To be more accurate,
the chances of them passing unnoticed are of the order of 1 in 2^n, for a good
n-bit check such as a CRC check. Certain types of error are always detectable,
such as single and double bit errors. That is the point of using a checksum or
hash for integrity checking.
/Intentional/ changes are a different matter. If a hacker changes the program
image, they can change the transmitted hash to their own calculated hash. Or
for a small CRC, they could change a different part of the image until the original checksum matched - for a 16-bit CRC, that only takes 65,535 attempts in the worst case.
That is why you need to distinguish between the two possibilities. If you don't have to worry about malicious attacks, a 32-bit CRC takes a dozen lines of C code and a 1 KB table, all running extremely efficiently. If security is
an issue, you need digital signatures - an RSA-based signature system is orders
of magnitude more effort in both development time and in run time.
I altered the test procedure for a piece of military gear we were building
simply to skip some lengthy tests that I *knew* would pass (I don't want
to inject an extra 20 minutes of wait time just to get through a lengthy
test I already know works before I can get to the test of interest to me, now).
I failed to undo the change before the official signoff on the device.
The only evidence of this was the fact that I had also patched the
startup message to say "Go for coffee..." -- which remained on the
screen for the duration of the lengthy (even with the long test
elided) procedure...
..which alerted folks to the fact that this *probably* wasn't the
original image. (The computer running the test suite on the DUT had
no problem accepting my patched binary)
And what, exactly, do you think that anecdote tells us about CRC checks for image files? It reminds us that we are all fallible, but does no more than that.
CRCs alone are useless. All the attacker needs to do after modifying the image is calculate the CRC themselves, and replace the original checksum with their own.
That assumes the "alterer" knows how to replace the checksum, how it
is computed, where it is embedded in the image, etc. I modified the Compaq
portable mentioned without ever knowing where the checksum was stored
or *if* it was explicitly stored. I had no desire to disassemble the
BIOS ROMs (though could obviously do so as there was no "proprietary
hardware" limiting access to their contents and the instruction set of
the processor is well known!).
Instead, I did this by *guessing* how they would implement such a check
in a bit of kit from that era (EPROMs aren't easily modified by malware
so it wasn't likely that they would go to great lengths to "protect" the
image). And, if my guess had been incorrect, I could always reinstall
the original EPROMs -- nothing lost, nothing gained.
Had much experience with folks counterfeiting your products and making
"simple" changes to the binaries? Like changing the copyright notice
or splash screen?
Then, bringing the (accused) counterfeit of YOUR product into a courtroom
and revealing the *hidden* checksum that the counterfeiter wasn't aware of?
"Gee, why does YOUR (alleged) device have *my* name in it -- in addition
to behaving exactly like mine??"
[I guess obscurity has its place!]
Security by obscurity is not security. Having a hidden signature or other mark
can be useful for proving ownership (making an intentional mistake is another common tactic - such as commercial maps having a few subtle spelling errors). But that is not security.
Use a non-secret approach and you invite folks to alter it, as well.
Using non-standard algorithms for security is a simple way to get things
completely wrong. "Security by obscurity" is very rarely the right answer.
In reality, good security algorithms, and good implementations, are
difficult and specialised tasks, best left to people who know what they are
doing.
To make something secure, you have to ensure that the check algorithms
depend on a key that you know, but that the attacker does not have. That's
the basis of digital signatures (though you use a secure hash algorithm
rather than a simple CRC).
If you can remove the check, then what value the key's secrecy? By your
criteria, the adversary KNOWS how you are implementing your security
so he knows exactly what to remove to bypass your checks and allow his
altered image to operate in its place.
Ever notice how manufacturers don't PUBLICLY disclose their security
hooks (without an NDA)? If "security by obscurity" was not important,
they would publish these details INVITING challenges (instead of
trying to limit the knowledge to people with whom they've officially
contracted).
Any serious manufacturer /does/ invite challenges to their security.
There are multiple reasons why a manufacturer (such as a semiconductor manufacturer) might be guarded about the details of their security systems. They can be avoiding giving hints to competitors. Maybe they know their systems aren't really very secure, because their keys are too short or they can
be read out in some way.
But I think the main reasons are often:
They want to be able to change the details, and that's far easier if there are
only a few people who have read the information.
They don't want endless support questions from amateurs.
They are limited by idiotic government export restrictions made by ignorant politicians who don't understand cryptography.
Some things benefit from being kept hidden, or under restricted access. The details of the CRC algorithm you use to catch accidental errors in your image file is /not/ one of them. If you think hiding it has the remotest hint of a
benefit, you are doing things wrong - you need a /security/ check, not a simple
/integrity/ check.
And then once you have switched to a security check - a digital signature - there's no need to keep that choice hidden either, because it is the /key/ that
is important, not the type of lock.
On 21/04/2023 14:12, Rick C wrote:
This is simply to be able to say this version is unique, regardless
of what the version number says. Version numbers are set manually
and not always done correctly. I'm looking for something as a backup
so that if the checksums are different, I can be sure the versions
are not the same.
The less work involved, the better.
Run a simple 32-bit crc over the image. The result is a hash of the
image. Any change in the image will show up as a change in the crc.
On Friday, April 21, 2023 at 10:12:49 PM UTC+10, Rick C wrote:
On Friday, April 21, 2023 at 4:53:18 AM UTC-4, Brian Cockburn wrote:
On Thursday, April 20, 2023 at 12:06:36 PM UTC+10, Rick C wrote:
This is a bit of the chicken-and-egg thing. If you want to embed a checksum in a code module so it can report its own checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
I'm not thinking of anything too fancy, like a CRC, but rather a simple modulo-N addition, maybe with N being 2^16.
I keep thinking of using a placeholder, but that doesn't seem to work out in any useful way. Even if you try to anticipate the impact of adding the checksum, that only gives you a different checksum, which you then need to anticipate further... ad infinitum.
I'm not thinking of any special checksum generator that excludes the checksum data. That would be too messy.
I keep thinking there is a different way of looking at this to achieve the result I want...
Maybe I can prove it is impossible. Assume the file checksums to X when the checksum data is zero. The goal would then be to include the checksum data value Y in the file, that would change X to Y. Given the properties of the modulo-N checksum, this would appear to be impossible for the general case, unless... Add another data value, call it a checksum normalizer. This data value checksums with the original checksum to give the result zero. Then, when the checksum is also added, the resulting checksum is unchanged.
This might be inordinately hard for a CRC, but a simple checksum would not be an issue, I think. At least, this could work in software, where data can be included in an image file as itself. In a device like an FPGA, it might not be included in the bit stream file so directly... but that might depend on where in the device it is inserted. Memory might have data that is stored as itself. I'll need to look into that.
--
Rick C.
- Get 1,000 miles of free Supercharging
- Tesla referral code - https://ts.la/richard11209
Rick, What is the purpose of this? Is it (1) to be able to externally identify a binary, as one might a ROM image by computing a checksum? Is it (2) for a run-able binary to be able to check itself? This would of course only be able to detect corruption, not tampering. Is it (3) for the loader (whatever that might be) to be able to say 'this binary has the correct checksum' and only jump to it if it does? Again this would only be able to detect corruption, not tampering. Are you hoping for...
This is simply to be able to say this version is unique, regardless of what the version number says. Version numbers are set manually and not always done correctly. I'm looking for something as a backup so that if the checksums are different, I can be sure the versions are not the same.
The less work involved, the better.
--
Rick C.
++ Get 1,000 miles of free Supercharging
++ Tesla referral code - https://ts.la/richard11209
Rick, so you want the executable to, as part of its execution, print on the console the 'checksum' of itself? Or do you want to be able to inspect the executable with some other tool to calculate its 'checksum'? For the latter there are lots of tools to do that (your OS or PROM programmer for instance), for the former you need to embed the calculation code into the executable (along with the length over which to calculate) and run this when asked. Neither of these involve embedding the 'checksum'.
And just to be sure I understand what you wrote in a somewhat convoluted way: when you have two binary executables that report the same version number, you want to be able to distinguish them with a 'checksum', right?
Yes, I want the checksum to be readable while operating. Calculation code??? Not going to happen. That's why I want to embed the checksum.
Yes, two compiled files which ended up with the same version number by error. We are using an 8 bit version number, so two hex digits. Negative numbers are lab versions, positive numbers are releases, so 64 of each.
... sometimes, in the lab, the rev number is not bumped when it should be.
So far, it looks like a simple checksum is the way to go. Include the checksum and the 2's complement of the checksum (in locations that were zeros), and the checksum will not change.
Rick,
So far, it looks like a simple checksum is the way to go. Include the checksum and the 2's complement of the checksum (in locations that were zeros), and the checksum will not change.
How will the checksum 'not change'? It will be different for every build, won't it?
Cheers, Brian.
On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown wrote:
On 21/04/2023 14:12, Rick C wrote:
This is simply to be able to say this version is unique,
regardless of what the version number says. Version numbers are
set manually and not always done correctly. I'm looking for
something as a backup so that if the checksums are different, I
can be sure the versions are not the same.
The less work involved, the better.
Run a simple 32-bit crc over the image. The result is a hash of the
image. Any change in the image will show up as a change in the crc.
No one is trying to detect changes in the image. I'm trying to label
the image in a way that can be read in operation. I'm using the
checksum simply because that is easy to generate. I've had problems
with version numbering in the past. It will be used, but I want it supplemented with a number that will change every time the design
changes, at least with a high probability, such as 1 in 64k.
On Saturday, April 22, 2023 at 1:02:28 AM UTC+10, David Brown wrote:
On 21/04/2023 14:12, Rick C wrote:
This is simply to be able to say this version is unique,
regardless of what the version number says. Version numbers are
set manually and not always done correctly. I'm looking for
something as a backup so that if the checksums are different, I
can be sure the versions are not the same.
The less work involved, the better.
Run a simple 32-bit crc over the image. The result is a hash of the
image. Any change in the image will show up as a change in the crc.
David, a hash and a CRC are not the same thing. They both produce a
reasonably unique result though. Any change would show in either
(unless as a result of intentional tampering).
On 4/21/2023 7:50 AM, David Brown wrote:
On 21/04/2023 13:39, Don Y wrote:
On 4/21/2023 3:43 AM, David Brown wrote:
Note that you want to choose a polynomial that doesn't
give you a "win" result for "obviously" corrupt data.
E.g., if data is all zeros or all 0xFF (as these sorts of
conditions can happen with hardware failures) you probably
wouldn't want a "success" indication!
No, that is pointless for something like a code image. It just adds
needless complexity to your CRC algorithm.
Perhaps you've forgotten that you don't just use CRCs (secure hashes,
etc.)
on "code images"?
No - but "code images" is the topic here.
So, anything unrelated to CRCs as applied to code images is off limits... per order of the "Internet Police"?
If *all* you use CRCs for is checking *a* code image at POST, you're
wasting a valuable resource.
Do you not think data/parameters need to be safeguarded? Program images? Communication protocols?
Or, do you develop yet another technique for *each* of those?
However, in almost every case where CRC's might be useful, you have
additional checks of the sanity of the data, and an all-zero or
all-one data block would be rejected. For example, Ethernet packets
use CRC for integrity checking, but an attempt to send a packet type 0
from MAC address 00:00:00:00:00:00 to address 00:00:00:00:00:00, of
length 0, would be rejected anyway.
Why look at "data" -- which may be suspect -- and *then* check its CRC?
Run the CRC first. If it fails, decide how you are going to proceed
or recover.
["Data" can be code or parameters]
I treat blocks of "data" (carefully arranged) with individual CRCs,
based on their relative importance to the operation. If the CRC is
corrupt, I have no idea *where* the error lies -- as it could
be anything in the checked block. So, one has to (typically)
restore some defaults (or, invoke a reconfigure operation) which
recreates *a* valid dataset.
This is particularly useful when power to a device can be
removed at arbitrary points in time (or, some other abrupt
crash). Before altering anything in a block, take deliberate
steps to invalidate the CRC, make your changes, then "fix"
the CRC. So, an interrupted process causes the CRC to fail
and remedial action taken.
Note that replacing a FLASH image (mostly code) falls under
such a mechanism.
I can't think of any use-cases where you would be passing around a
block of "pure" data that could reasonably take absolutely any value,
without any type of "envelope" information, and where you would think
a CRC check is appropriate.
I append a *version specific* CRC to each packet of marshalled data
in my RMIs. If the data is corrupted in transit *or* if the
wrong version API ends up targeted, the operation will abend
because we know the data "isn't right".
I *could* put a header saying "this is version 4.2". And, that
tells me nothing about the integrity of the rest of the data.
OTOH, ensuring the CRC reflects "4.2" does -- if the recipient
expects it to be so.
You can also "salt" the calculation so that the residual
is deliberately nonzero. So, for example, "success" is
indicated by a residual of 0x474E. :>
Again, pointless.
Salt is important for security-related hashes (like password
hashes), not for integrity checks.
You've missed the point. The correct "sum" can be anything.
Why is "0" more special than any other value? As the value is
typically meaningless to anything other than the code that verifies
it, you couldn't look at an image (or the output of the verifier)
and gain anything from seeing that obscure value.
Do you actually know what is meant by "salt" in the context of hashes,
and why it is useful in some circumstances? Do you understand that
"salt" is added (usually prepended, or occasionally mixed in in some
other way) to the data /before/ the hash is calculated?
What term would you have me use to indicate a "bias" applied to a CRC algorithm?
I have not given the slightest indication to suggest that "0" is a
special value. I fully agree that the value you get from the checking
algorithm does not have to be 0 - I already suggested it could be
compared to the stored value. I.e., you build your image file as
"data ++ crc(data)", and check it by re-calculating "crc(data)" on the
received image and comparing the result to the received crc. There is
no necessity or benefit in having a crc calculated over the
received data plus the received crc being 0.
"Salt" is used in cases where the original data must be kept secret,
and only the hashes are transmitted or accessible - by adding salt to
the original data before hashing it, you avoid a direct correspondence
between the hash and the original data. The prime use-case is to stop
people being able to figure out a password by looking up the hash in a
list of pre-computed hashes of common passwords.
See above.
OTOH, if the CRC yields something familiar -- or useful -- then
it can tell you something about the image. E.g., salt the algorithm
with the product code, version number, your initials, 0xDEADBEEF, etc.
You are making no sense at all. Are you suggesting that it would be a
good idea to add some value to the start of the image so that the
resulting crc calculation gives a nice recognisable product code?
This "salt" would be different for each program image, and calculated
by trial and error. If you want a product code, version number, etc.,
in the program image (and it's a good idea), just put these in the
program image!
Again, that tells you nothing about the rest of the image!
See the RMI description.
[Note that the OP is expecting the checksum to help *him*
identify versions: "Just put these in the program image!" Eh?]
So now you have a new extended block: |....data....|crc|
Now if you compute a new CRC on the extended block, the resulting
value /should/ come out to zero. If it doesn't, either your data or the original CRC value appended to it has been changed/corrupted.
As there is usually a lack of originality in the algorithms
chosen, you have to consider if you are also hoping to use
this to safeguard the *integrity* of your image (i.e.,
against intentional modification).
"Integrity" has nothing to do with the motivation for change.
/Security/ is concerned with intentional modifications that
deliberately attempt to defeat /integrity/ checks. Integrity is
about detecting any changes.
If you are concerned about the possibility of intentional malicious
changes,
Changes don't have to be malicious.
Accidental changes (such as human error, noise during data transfer,
memory cell errors, etc.) do not pass integrity tests unnoticed.
That's not true. The role of the *test* is to notice these. If the test is blind to the types of errors that are likely to occur, then it CAN'T notice them.
A CRC (hash, etc.) reduces a large block of data to a small bit of
data. So, by definition, there are multiple DIFFERENT sets of data that
map to the same CRC/hash/etc. (2^(data_size - crc_size) of them)
E.g., simply summing the values in a block of memory will yield "0"
for ANY condition that results in the block having identical values
for ALL members, if the block size is a power of 2. So, a block
of 0xFF, 0x00, 0xFE, 0x27, 0x88, etc. will all yield the same sum.
Clearly a bad choice of test!
OTOH, "salting" the calculation so that it is expected to yield
a value of 0x13 means *those* situations will be flagged as errors
(and a different set of situations will sneak by, undetected).
The trick (engineering) is to figure out which types of
failures/faults/errors are most likely to occur and guard
against them.
To be more accurate, the chances of them passing unnoticed are of the
order of 1 in 2^n, for a good n-bit check such as a CRC check.
Certain types of error are always detectable, such as single and
double bit errors. That is the point of using a checksum or hash for
integrity checking.
/Intentional/ changes are a different matter. If a hacker changes the
program image, they can change the transmitted hash to their own
calculated hash. Or for a small CRC, they could change a different
part of the image until the original checksum matched - for a 16-bit
CRC, that only takes 65,535 attempts in the worst case.
If the approach used is "typical", then you need far fewer attempts to produce a correct image -- without EVER knowing where the CRC is stored.
That is why you need to distinguish between the two possibilities. If
you don't have to worry about malicious attacks, a 32-bit CRC takes a
dozen lines of C code and a 1 KB table, all running extremely
efficiently. If security is an issue, you need digital signatures -
an RSA-based signature system is orders of magnitude more effort in
both development time and in run time.
It's considerably more expensive AND not fool-proof -- esp if the
attacker knows you are signing binaries. "OK, now I need to find
WHERE the signature is verified and just patch that "CALL" out
of the code".
I altered the test procedure for a
piece of military gear we were building simply to skip some lengthy
tests that I *knew* would pass. (I don't want to inject an extra 20
minutes of wait time just to get through a lengthy test I already know
works before I can get to the test of interest to me, now.)
I failed to undo the change before the official signoff on the device.
The only evidence of this was the fact that I had also patched the
startup message to say "Go for coffee..." -- which remained on the
screen for the duration of the lengthy (even with the long test
elided) procedure...
..which alerted folks to the fact that this *probably* wasn't the
original image. (The computer running the test suite on the DUT had
no problem accepting my patched binary)
And what, exactly, do you think that anecdote tells us about CRC
checks for image files? It reminds us that we are all fallible, but
does no more than that.
That *was* the point. Because the folks who designed the test computer relied on common techniques to safeguard the image.
The counterfeiting example I cited indicates how "obscurity/secrecy"
is far more effective (yet you dismiss it out-of-hand).
CRC's alone are useless. All the attacker needs to do after
modifying the image is calculate the CRC themselves, and replace the
original checksum with their own.
That assumes the "alterer" knows how to replace the checksum, how it
is computed, where it is embedded in the image, etc. I modified the
Compaq portable mentioned without ever knowing where the checksum was stored
or *if* it was explicitly stored. I had no desire to disassemble the
BIOS ROMs (though could obviously do so as there was no "proprietary
hardware" limiting access to their contents and the instruction set of
the processor is well known!).
Instead, I did this by *guessing* how they would implement such a check
in a bit of kit from that era (EPROMs aren't easily modified by malware
so it wasn't likely that they would go to great lengths to "protect" the
image). And, if my guess had been incorrect, I could always reinstall
the original EPROMs -- nothing lost, nothing gained.
Had much experience with folks counterfeiting your products and making
"simple" changes to the binaries? Like changing the copyright notice
or splash screen?
Then, bringing the (accused) counterfeit of YOUR product into a
courtroom
and revealing the *hidden* checksum that the counterfeiter wasn't
aware of?
"Gee, why does YOUR (alleged) device have *my* name in it -- in addition
to behaving exactly like mine??"
[I guess obscurity has its place!]
Security by obscurity is not security. Having a hidden signature or
other mark can be useful for proving ownership (making an intentional
mistake is another common tactic - such as commercial maps having a
few subtle spelling errors). But that is not security.
Of course it is! If *you* check the "hidden signature" at runtime
and then alter "your" operation such that an altered copy fails
to perform properly, then you have secured it.
Would you want to use a check-writing program if the account
balances it maintains were subtly (but not consistently)
incorrect?
OTOH, if the (altered) program threw up a splash screen and
said "Unlicensed copy detected" and refused to operate, the
"program" is still "secured" -- but, now you've provided an
easy indicator of whether or not the security has been
defeated.
We started doing this in the heyday of video (arcade) gaming;
a counterfeiter would have a clone of YOUR game on the market
(at substantially reduced prices) in a matter of *weeks*.
As Operators have no foreknowledge of which games will be
moneymakers and which will be "90 day wonders" (literally,
no longer played after 90 days of exposure!), what incentive
to pay for a genuine article?
If all a counterfeiter had to do was alter the copyright
notice (even if it was stored in some coded form), or alter
some graphics (name of game, colors/shapes of characters)
that's *no* impediment -- given how often and quickly
it could be done.
Games would not just look at their images during POST
but, also, verify that routineX() had some particular
side-effect that could be tested, etc. Counterfeiters
would go to lengths to ensure even THESE tests would pass.
Because the game would *complain*, otherwise! (so, keep
looking for more tests until the game stops throwing an
alarm).
OTOH, if you *hide* the checks in the runtime and alter
the game's performance subtly by folding expected values
into key calculations such that values derived from
altered code differ, you can annoy the player: "why did
my guy just turn blue and run off the edge of the screen?"
An annoyed player stops putting money into a game.
A game that doesn't earn money -- regardless of how
inexpensive it was to purchase -- quickly teaches the
Owner not to invest in such "buggy" games.
This is much better than taking the counterfeiter to court and
proving the code is a copy of yours! (and, "FlyByNight
Games Counterfeiters" simply closes up shop and opens up,
next door)
And, because there is no "drop dead" point in the code or
the games behavior, the counterfeiter never knows when
he's found all the protection mechanisms.
Checking signatures, CRCs, licensing schemes, etc. all are used
in a "drop dead" fashion so considerably easier to defeat.
Witness the number of "products" available as warez...
Use a non-secret approach and you invite folks to alter it, as well.
Using non-standard algorithms for security is a simple way to get
things completely wrong. "Security by obscurity" is very rarely the
right answer. In reality, good security algorithms, and good
implementations, are difficult and specialised tasks, best left to
people who know what they are doing.
To make something secure, you have to ensure that the check
algorithms depend on a key that you know, but that the attacker does
not have. That's the basis of digital signatures (though you use a
secure hash algorithm rather than a simple CRC).
If you can remove the check, then what value the key's secrecy? By your
criteria, the adversary KNOWS how you are implementing your security
so he knows exactly what to remove to bypass your checks and allow his
altered image to operate in its place.
Ever notice how manufacturers don't PUBLICLY disclose their security
hooks (without an NDA)? If "security by obscurity" was not important,
they would publish these details INVITING challenges (instead of
trying to limit the knowledge to people with whom they've officially
contracted).
Any serious manufacturer /does/ invite challenges to their security.
There are multiple reasons why a manufacturer (such as a semiconductor
manufacturer) might be guarded about the details of their security
systems. They can be avoiding giving hints to competitors. Maybe they
know their systems aren't really very secure, because their keys are
too short or they can be read out in some way.
But I think the main reasons are often:
They want to be able to change the details, and that's far easier if
there are only a few people who have read the information.
So, a legitimate customer is subjected to arbitrary changes in
the product's implementation?
They don't want endless support questions from amateurs.
Only answer with a support contract.
They are limited by idiotic government export restrictions made by
ignorant politicians who don't understand cryptography.
Protections don't always have to be cryptographic.
The
"Fortress" payphone is remarkably well hardened to direct
physical (brute force) attacks -- money is involved.
Ditto many slot machines (again, CASH money). Yet, all
have vulnerabilities. "Expose this portion of the die
to ultraviolet light to reset the memory protection bits"
Etc.
Some things benefit from being kept hidden, or under restricted
access. The details of the CRC algorithm you use to catch accidental
errors in your image file is /not/ one of them. If you think hiding
it has the remotest hint of a benefit, you are doing things wrong -
you need a /security/ check, not a simple /integrity/ check.
And then once you have switched to a security check - a digital
signature - there's no need to keep that choice hidden either, because
it is the /key/ that is important, not the type of lock.
Again, meaningless if the attacker can interfere with the *enforcement*
of that check. Using something "well known" just means he already knows what to look for in your code. Or, how to interfere with your
intended implementation in ways that you may have not anticipated
(confident that your "security" can't be MATHEMATICALLY broken).
I had a discussion with a friend who knew just enough about "computers"
to THINK he understood that world. I mentioned my NOT using ecommerce.
He laughed at me as "naive": "There's 40 bit encryption on those connections! No one is going to eavesdrop on your financial data!"
[Really, Jerry? You think, as an OLD accountant, you know more
than I do as a young engineer practicing in that field? Ok...]
"Yeah, and are you 100% sure something isn't already *on* your computer looking at your keystrokes BEFORE they head down that encrypted tunnel?"
Guess he hadn't really thought out the problem to that level of detail
as his confidence quickly melted away to one of worry ("I wonder if
I've already been hacked??")
People implementing security almost always focus on the wrong
aspects of the problem and walk away THINKING they can rest easy.
Vulnerabilities are often so blatantly obvious, after the fact,
as to be embarrassing: "You're not supposed to do that!"
"Then, why did your product LET ME?"
I use *many* layers of security in my current design and STILL
expect them (at least the ones that are accessible) to all
be subverted. So, ultimately rely on controlling *what*
the devices can do so that, even compromised, they can't
cause undetectable failures or information leaks.
"Here's my source code. Here are my schematics. Here's the
name of the guy who oversees production (bribe him to gain
access to the keys stored in the TPM). Now, what are you
gonna *do* with all that?"
On 22/04/2023 05:14, Rick C wrote:
On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown wrote:
On 21/04/2023 14:12, Rick C wrote:
Run a simple 32-bit crc over the image. The result is a hash of
This is simply to be able to say this version is unique,
regardless of what the version number says. Version numbers are
set manually and not always done correctly. I'm looking for
something as a backup so that if the checksums are different, I
can be sure the versions are not the same.
The less work involved, the better.
the image. Any change in the image will show up as a change in the
crc.
No one is trying to detect changes in the image. I'm trying to label
the image in a way that can be read in operation. I'm using the
checksum simply because that is easy to generate. I've had problems
with version numbering in the past. It will be used, but I want it supplemented with a number that will change every time the design
changes, at least with a high probability, such as 1 in 64k.
Again - use a CRC. It will give you what you want.
You might want to go for 32-bit CRC rather than a 16-bit CRC, depending
on the kind of program, how often you build it, and what consequences a
hash collision could have. With a 16-bit CRC, you have a 5% chance of a collision after 82 builds. If collisions only matter for releases, and
you only release a couple of updates, fine - but if they matter during development builds, you are getting a more significant risk. Since a
32-bit CRC is quick and easy, it's worth using.
Rick, so you want the executable to, as part of its execution, print on the console the 'checksum' of itself? Or do you want to be able to inspect the executable with some other tool to calculate its 'checksum'? For the latter there are lots of tools to do that (your OS or PROM programmer for instance); for the former you need to embed the calculation code into the executable (along with the length over which to calculate) and run this when asked. Neither of these involve embedding the 'checksum' itself.
And just to be sure I understand what you wrote in a somewhat convoluted way. When you have two binary executables that report the same version number you want to be able to distinguish them with a 'checksum', right?
Yes, I want the checksum to be readable while operating. Calculation code??? Not going to happen. That's why I want to embed the checksum.
Can you expand on what you mean or expect by 'readable while operating' please? Are you planning to use some sort of tool to inspect the executing binary to 'read' this thing, or provoke output to the console in some way like:
$ run my-binary-thing --checksum
10FD
$
This would be as distinct from:
$ run my-binary-thing --version
-52
$
No, thank you, can you display the contents of registers 26 and 27 in hex please?
That would be X0FE38
Thank you.
Yes, two compiled files which ended up with the same version number by error. We are using an 8 bit version number, so two hex digits. Negative numbers are lab versions, positive numbers are releases, so 64 of each.
Signed 8-bit numbers range from -128 to +127 (0x80 to 0x7F), so probably a few more than 64.
... sometimes, in the lab, the rev number is not bumped when it should be.
This may be an indicator that better procedures are needed for code review-for-release. And an independent pair of eyes should be doing the review against an agreed check list.
So far, it looks like a simple checksum is the way to go. Include the checksum and the 2's complement of the checksum (in locations that were zeros), and the checksum will not change.
How will the checksum 'not change'? It will be different for every build won't it?
On Saturday, April 22, 2023 at 11:13:32 AM UTC-4, David Brown wrote:
On 22/04/2023 05:14, Rick C wrote:
On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown wrote:
Again - use a CRC. It will give you what you want.
On 21/04/2023 14:12, Rick C wrote:
Run a simple 32-bit crc over the image. The result is a hash of
This is simply to be able to say this version is unique,
regardless of what the version number says. Version numbers are
set manually and not always done correctly. I'm looking for
something as a backup so that if the checksums are different, I
can be sure the versions are not the same.
The less work involved, the better.
the image. Any change in the image will show up as a change in the
crc.
No one is trying to detect changes in the image. I'm trying to label
the image in a way that can be read in operation. I'm using the
checksum simply because that is easy to generate. I've had problems
with version numbering in the past. It will be used, but I want it
supplemented with a number that will change every time the design
changes, at least with a high probability, such as 1 in 64k.
Again - as will a simple addition checksum.
You might want to go for 32-bit CRC rather than a 16-bit CRC, depending
on the kind of program, how often you build it, and what consequences a
hash collision could have. With a 16-bit CRC, you have a 5% chance of a
collision after 82 builds. If collisions only matter for releases, and
you only release a couple of updates, fine - but if they matter during
development builds, you are getting a more significant risk. Since a
32-bit CRC is quick and easy, it's worth using.
Or, I might want to go with a simple checksum.
Thanks for your comments.
A simple addition checksum might be okay much of the time, but it
doesn't have the resolving power of a CRC. If the source code changes
"a = 1; b = 2;" to "a = 2; b = 1;", the addition checksum is likely to
be exactly the same despite the change in the source. In general, you
will have much higher chance of collisions, though I think it would be
very hard to quantify that.
On 22/04/2023 18:56, Rick C wrote:
On Saturday, April 22, 2023 at 11:13:32 AM UTC-4, David Brown wrote:
On 22/04/2023 05:14, Rick C wrote:
On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown wrote:
Again - use a CRC. It will give you what you want.
On 21/04/2023 14:12, Rick C wrote:
Run a simple 32-bit crc over the image. The result is a hash of
This is simply to be able to say this version is unique,
regardless of what the version number says. Version numbers are
set manually and not always done correctly. I'm looking for
something as a backup so that if the checksums are different, I
can be sure the versions are not the same.
The less work involved, the better.
the image. Any change in the image will show up as a change in the
crc.
No one is trying to detect changes in the image. I'm trying to label
the image in a way that can be read in operation. I'm using the
checksum simply because that is easy to generate. I've had problems
with version numbering in the past. It will be used, but I want it
supplemented with a number that will change every time the design
changes, at least with a high probability, such as 1 in 64k.
Again - as will a simple addition checksum.
A simple addition checksum might be okay much of the time, but it
doesn't have the resolving power of a CRC. If the source code changes
"a = 1; b = 2;" to "a = 2; b = 1;", the addition checksum is likely to
be exactly the same despite the change in the source. In general, you
will have much higher chance of collisions, though I think it would be
very hard to quantify that.
Maybe it will be good enough for you. Simple checksums were popular
once, and can still make sense if you are very short on program space.
But there are good reasons why they fell out of favour in many uses.
You might want to go for 32-bit CRC rather than a 16-bit CRC, depending
on the kind of program, how often you build it, and what consequences a
hash collision could have. With a 16-bit CRC, you have a 5% chance of a
collision after 82 builds. If collisions only matter for releases, and
you only release a couple of updates, fine - but if they matter during
development builds, you are getting a more significant risk. Since a
32-bit CRC is quick and easy, it's worth using.
Or, I might want to go with a simple checksum.
Thanks for your comments.
It's your choice (obviously). I only point out the weaknesses in case
anyone else is listening in to the thread.
If you like, I can post code for a 32-bit CRC. It's a table, and a few
lines of C code.
On 2023-04-22, David Brown <david.brown@hesbynett.no> wrote:
A simple addition checksum might be okay much of the time, but it
doesn't have the resolving power of a CRC. If the source code changes
"a = 1; b = 2;" to "a = 2; b = 1;", the addition checksum is likely to
be exactly the same despite the change in the source. In general, you
will have much higher chance of collisions, though I think it would be
very hard to quantify that.
I remember a long discussion about this a few decades ago. An N bit
additive checksum maps the source data into the same hash space
as a N-bit crc.
Therefore, for two randomly chosen sets of input bits, they both have
a 1 in 2^N chance of a collision. I think that means that for random
changes to an input set of unspecified properties, they would both
have the same chance that the hash is unchanged.
However... IIRC, somebody (probably at somewhere like Bell labs)
noticed that errors in data transmitted over media like phone lines
and microwave links are _not_ random. Errors tend to be "bursty" and
can be statistically characterized. And it was shown that for the
common error modes for _those_ media, CRCs were better at detecting
real-world failures than additive checksums. And (this is also
important) a CRC is far, far simpler to implement in hardware than an
additive checksum. For the same reasons, CRCs tend to get used for
things like Ethernet frames, disc sectors, etc.
Later people seem to have adopted CRCs for detecting failures in other
very dissimilar media (e.g. EPROMs) where implementing a CRC is _more_
work than an additive checksum. If the failure modes for EPROM are
similar to those studied at <wherever> when CRCs were chosen, then
CRCs are probably also a good choice for EPROMs despite the additional overhead. If the failure modes for EPROMs are significantly different,
then CRCs might be both sub-optimal and unnecessarily expensive.
I have no hard data either way, but it was never obvious to me that
the arguments people use in favor of CRCs (better at detecting burst
errors on transmission media) necessarily applied to EPROMs.
That said, I do use CRCs rather than additive checksums for things
like EPROM and flash.
On 22/04/2023 18:56, Rick C wrote:
On Saturday, April 22, 2023 at 11:13:32 AM UTC-4, David Brown wrote:
On 22/04/2023 05:14, Rick C wrote:
On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown wrote:
On 21/04/2023 14:12, Rick C wrote:
Again - use a CRC. It will give you what you want.
Run a simple 32-bit crc over the image. The result is a hash of
This is simply to be able to say this version is unique,
regardless of what the version number says. Version numbers are
set manually and not always done correctly. I'm looking for
something as a backup so that if the checksums are different, I
can be sure the versions are not the same.
The less work involved, the better.
the image. Any change in the image will show up as a change in the
crc.
No one is trying to detect changes in the image. I'm trying to label
the image in a way that can be read in operation. I'm using the
checksum simply because that is easy to generate. I've had problems
with version numbering in the past. It will be used, but I want it
supplemented with a number that will change every time the design
changes, at least with a high probability, such as 1 in 64k.
Again - as will a simple addition checksum.
A simple addition checksum might be okay much of the time, but it
doesn't have the resolving power of a CRC. If the source code changes
"a = 1; b = 2;" to "a = 2; b = 1;", the addition checksum is likely to
be exactly the same despite the change in the source. In general, you
will have much higher chance of collisions, though I think it would be
very hard to quantify that.
Maybe it will be good enough for you. Simple checksums were popular
once, and can still make sense if you are very short on program space.
But there are good reasons why they fell out of favour in many uses.
You might want to go for 32-bit CRC rather than a 16-bit CRC, depending
on the kind of program, how often you build it, and what consequences a
hash collision could have. With a 16-bit CRC, you have a 5% chance of a
collision after 82 builds. If collisions only matter for releases, and
you only release a couple of updates, fine - but if they matter during
development builds, you are getting a more significant risk. Since a
32-bit CRC is quick and easy, it's worth using.
Or, I might want to go with a simple checksum.
Thanks for your comments.
It's your choice (obviously). I only point out the weaknesses in case
anyone else is listening in to the thread.
If you like, I can post code for a 32-bit CRC. It's a table, and a few
lines of C code.
On Saturday, April 22, 2023 at 1:55:01 PM UTC-4, David Brown wrote:
On 22/04/2023 18:56, Rick C wrote:
On Saturday, April 22, 2023 at 11:13:32 AM UTC-4, David BrownA simple addition checksum might be okay much of the time, but it
wrote:
On 22/04/2023 05:14, Rick C wrote:
On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown wrote:
Again - use a CRC. It will give you what you want.
wrote:
On 21/04/2023 14:12, Rick C wrote:
Run a simple 32-bit crc over the image. The result is a
This is simply to be able to say this version is unique,
regardless of what the version number says. Version
numbers are set manually and not always done correctly.
I'm looking for something as a backup so that if the
checksums are different, I can be sure the versions are
not the same.
The less work involved, the better.
hash of the image. Any change in the image will show up as
a change in the crc.
No one is trying to detect changes in the image. I'm trying
to label the image in a way that can be read in operation.
I'm using the checksum simply because that is easy to
generate. I've had problems with version numbering in the
past. It will be used, but I want it supplemented with a
number that will change every time the design changes, at
least with a high probability, such as 1 in 64k.
Again - as will a simple addition checksum.
doesn't have the resolving power of a CRC. If the source code
changes "a = 1; b = 2;" to "a = 2; b = 1;", the addition checksum
is likely to be exactly the same despite the change in the source.
In general, you will have much higher chance of collisions, though
I think it would be very hard to quantify that.
Maybe it will be good enough for you. Simple checksums were
popular once, and can still make sense if you are very short on
program space. But there are good reasons why they fell out of
favour in many uses.
It's your choice (obviously). I only point out the weaknesses in
You might want to go for 32-bit CRC rather than a 16-bit CRC,
depending on the kind of program, how often you build it, and
what consequences a hash collision could have. With a 16-bit
CRC, you have a 5% chance of a collision after 82 builds. If
collisions only matter for releases, and you only release a
couple of updates, fine - but if they matter during development
builds, you are getting a more significant risk. Since a 32-bit
CRC is quick and easy, it's worth using.
Or, I might want to go with a simple checksum.
Thanks for your comments.
case anyone else is listening in to the thread.
If you like, I can post code for a 32-bit CRC. It's a table, and a
few lines of C code.
You know nothing of the project I am working on or those that I
typically work on. But thanks for the advice.
On 2023-04-23, David Brown <david.brown@hesbynett.no> wrote:
Another thing you can look at is the distribution of checksum outputs,
for random inputs. For an additive checksum, you can consider your
input as N independent 0-255 random values, added together. The result
will be a normal distribution of the checksum. If you have, say, a 100
byte data block and a 16-bit checksum, it's clear that you will never
get a checksum value greater than 25500, and that you are much more
likely to get a value close to 12750.
It never occurred to me that for an N-bit checksum, you would sum
something other than N-bit "words" of the input data.
On 23/04/2023 19:37, Grant Edwards wrote:
On 2023-04-23, David Brown <david.brown@hesbynett.no> wrote:
Another thing you can look at is the distribution of checksum outputs,
for random inputs. For an additive checksum, you can consider your
input as N independent 0-255 random values, added together. The result
will be a normal distribution of the checksum. If you have, say, a 100
byte data block and a 16-bit checksum, it's clear that you will never
get a checksum value greater than 25500, and that you are much more
likely to get a value close to 12750.
It never occurred to me that for an N-bit checksum, you would sum
something other than N-bit "words" of the input data.
Usually - in my experience - you sum bytes, using an unsigned integer
8-bit or 16-bit wide. Simple additive checksums are often used on small 8-bit microcontrollers where CRC's are seen (rightly or wrongly) as too demanding. Perhaps other people have different experiences.
You could certainly sum 16-bit words to get your 16-bit additive
checksum, and that would give a different kind of clustering - maybe
better, maybe not.
On 23/04/2023 19:34, Rick C wrote:
On Saturday, April 22, 2023 at 1:55:01 PM UTC-4, David Brown wrote:
On 22/04/2023 18:56, Rick C wrote:
On Saturday, April 22, 2023 at 11:13:32 AM UTC-4, David BrownA simple addition checksum might be okay much of the time, but it
wrote:
On 22/04/2023 05:14, Rick C wrote:
On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown wrote:
Again - use a CRC. It will give you what you want.
wrote:
On 21/04/2023 14:12, Rick C wrote:
Run a simple 32-bit crc over the image. The result is a
This is simply to be able to say this version is unique,
regardless of what the version number says. Version
numbers are set manually and not always done correctly.
I'm looking for something as a backup so that if the
checksums are different, I can be sure the versions are
not the same.
The less work involved, the better.
hash of the image. Any change in the image will show up as
a change in the crc.
No one is trying to detect changes in the image. I'm trying
to label the image in a way that can be read in operation.
I'm using the checksum simply because that is easy to
generate. I've had problems with version numbering in the
past. It will be used, but I want it supplemented with a
number that will change every time the design changes, at
least with a high probability, such as 1 in 64k.
Again - as will a simple addition checksum.
doesn't have the resolving power of a CRC. If the source code
changes "a = 1; b = 2;" to "a = 2; b = 1;", the addition checksum
is likely to be exactly the same despite the change in the source.
In general, you will have much higher chance of collisions, though
I think it would be very hard to quantify that.
Maybe it will be good enough for you. Simple checksums were
popular once, and can still make sense if you are very short on
program space. But there are good reasons why they fell out of
favour in many uses.
It's your choice (obviously). I only point out the weaknesses in
You might want to go for 32-bit CRC rather than a 16-bit CRC,
depending on the kind of program, how often you build it, and
what consequences a hash collision could have. With a 16-bit
CRC, you have a 5% chance of a collision after 82 builds. If
collisions only matter for releases, and you only release a
couple of updates, fine - but if they matter during development
builds, you are getting a more significant risk. Since a 32-bit
CRC is quick and easy, it's worth using.
Or, I might want to go with a simple checksum.
Thanks for your comments.
case anyone else is listening in to the thread.
If you like, I can post code for a 32-bit CRC. It's a table, and a
few lines of C code.
You know nothing of the project I am working on or those that I
typically work on. But thanks for the advice.
You haven't given much to go on. It is still not really clear (to me,
at least) if you are asking about checksums or how to manipulate binary images as part of a build process, or what you are really asking.
When someone wants a checksum on an image file, the appropriate choice
in most cases is a CRC.
If security is an issue, then a secure hash is
needed. For a very limited system, additive checksums might be the
only realistic choice.
But more often, the reason people pick additive checksums rather than
CRCs is because they don't realise that CRCs are actually very simple
and efficient to implement.
People unfamiliar with them might have read
a little, and think they need to do calculations for each bit (which is possible but /slow/), or that they would have to understand the theory
of binary polynomial division rings (they don't). They think CRC's are complicated and advanced, and shy away from them.
There are a number of people who read this group - maybe some of them
have learned a little from this thread.
On 4/23/23 5:45 PM, David Brown wrote:
On 23/04/2023 19:37, Grant Edwards wrote:
On 2023-04-23, David Brown <david.brown@hesbynett.no> wrote:
Another thing you can look at is the distribution of checksum outputs,
for random inputs. For an additive checksum, you can consider your
input as N independent 0-255 random values, added together. The result
will be a normal distribution of the checksum. If you have, say, a 100
byte data block and a 16-bit checksum, it's clear that you will never
get a checksum value greater than 25500, and that you are much more
likely to get a value close to 12750.
It never occurred to me that for an N-bit checksum, you would sum
something other than N-bit "words" of the input data.
Usually - in my experience - you sum bytes, using an unsigned integer
8-bit or 16-bit wide. Simple additive checksums are often used on
small 8-bit microcontrollers where CRC's are seen (rightly or wrongly)
as too demanding. Perhaps other people have different experiences.
You could certainly sum 16-bit words to get your 16-bit additive
checksum, and that would give a different kind of clustering - maybe
better, maybe not.
I have seen 16-bit checksums done both ways. Summing 16-bit units does
eliminate the issue of clustering, and makes adjacent byte swaps
detectable.
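As a concrete sketch of the two variants discussed above (function names are mine; the word-wise version assumes an even byte count and little-endian pairing):

```c
#include <stdint.h>
#include <stddef.h>

/* Byte-wise: sum every byte into a 16-bit accumulator, modulo 2^16. */
uint16_t checksum16_bytes(const uint8_t *p, size_t n)
{
    uint16_t s = 0;
    while (n--)
        s = (uint16_t)(s + *p++);
    return s;
}

/* Word-wise: sum little-endian 16-bit words, modulo 2^16.
   Assumes n is even; pad the image if it is not. */
uint16_t checksum16_words(const uint8_t *p, size_t n)
{
    uint16_t s = 0;
    for (size_t i = 0; i + 1 < n; i += 2)
        s = (uint16_t)(s + (uint16_t)(p[i] | (p[i + 1] << 8)));
    return s;
}
```

The byte-wise sum is unchanged by any reordering of the bytes; the word-wise sum does change when two adjacent (unequal) bytes are swapped.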
However, in almost every case where CRC's might be useful, you have
additional checks of the sanity of the data, and an all-zero or all-one
data block would be rejected. For example, Ethernet packets use CRC for
integrity checking, but an attempt to send a packet type 0 from MAC
address 00:00:00:00:00:00 to address 00:00:00:00:00:00, of length 0,
would be rejected anyway.
Why look at "data" -- which may be suspect -- and *then* check its CRC?
Run the CRC first. If it fails, decide how you are going to proceed
or recover.
That is usually the order, yes. Sometimes you want "fail fast", such as dropping a packet that was not addressed to you (it doesn't matter if it was received correctly but for someone else, or it was addressed to you but the receiver address was corrupted - you are dropping the packet either way). But
usually you will run the CRC then look at the data.
But the order doesn't matter - either way, you are still checking for valid data, and if the data is invalid, it does not matter if the CRC only passed by
luck or by all zeros.
I can't think of any use-cases where you would be passing around a block
of "pure" data that could reasonably take absolutely any value, without
any type of "envelope" information, and where you would think a CRC
check is appropriate.
I append a *version specific* CRC to each packet of marshalled data
in my RMIs. If the data is corrupted in transit *or* if the
wrong version API ends up targeted, the operation will abend
because we know the data "isn't right".
Using a version-specific CRC sounds silly. Put the version information in the
packet.
I *could* put a header saying "this is version 4.2". And, that
tells me nothing about the integrity of the rest of the data.
OTOH, ensuring the CRC reflects "4.2" does -- if the recipient
expects it to be so.
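One way to read the scheme being described (this sketch is my own construction, not Don's actual code): seed the CRC register with a version word, so the check fails either when the data is corrupted or when the two ends disagree about the version:

```c
#include <stdint.h>
#include <stddef.h>

/* CRC-16 (polynomial 0x1021) seeded with a caller-supplied initial
   value. Seeding with a version word means the check only passes when
   sender and receiver agree on the version AND the data arrived intact. */
uint16_t crc16_seeded(uint16_t seed, const uint8_t *p, size_t n)
{
    uint16_t crc = seed;
    while (n--) {
        crc ^= (uint16_t)(*p++) << 8;
        for (int i = 0; i < 8; i++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}
```

Because the CRC state update is invertible, two different seeds always produce different results over the same data, so a wrong-version packet is guaranteed to fail the check.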
Now you don't know if the data is corrupted, or for the wrong version - or occasionally, corrupted /and/ the wrong version but passing the CRC anyway.
Unless you are absolutely desperate to save every bit you can, your system will
be simpler, clearer, and more reliable if you separate your purposes.
You can also "salt" the calculation so that the residual
is deliberately nonzero. So, for example, "success" is
indicated by a residual of 0x474E. :>
Again, pointless.
Salt is important for security-related hashes (like password hashes),
not for integrity checks.
You've missed the point. The correct "sum" can be anything.
Why is "0" more special than any other value? As the value is
typically meaningless to anything other than the code that verifies
it, you couldn't look at an image (or the output of the verifier)
and gain anything from seeing that obscure value.
Do you actually know what is meant by "salt" in the context of hashes,
and why it is useful in some circumstances? Do you understand that
"salt" is added (usually prepended, or occasionally mixed in in some
other way) to the data /before/ the hash is calculated?
What term would you have me use to indicate a "bias" applied to a CRC
algorithm?
Well, first I'd note that any kind of modification to the basic CRC algorithm is pointless from the viewpoint of its use as an integrity check. (There have
been, mostly historically, some justifications in terms of implementation efficiency. For example, bit and byte re-ordering could be done to suit hardware bit-wise implementations.)
Otherwise I'd say you are picking a specific initial value if that is what you
are doing, or modifying the final value (inverting it or xor'ing it with a fixed value). There is, AFAIK, no specific terms for these - and I don't see
any benefit in having one. Misusing the term "salt" from cryptography is certainly not helpful.
See the RMI description.
I'm sorry, I have no idea what "RMI" is or where it is described. You've mentioned that abbreviation twice, but I can't figure it out.
OTOH, "salting" the calculation so that it is expected to yield
a value of 0x13 means *those* situations will be flagged as errors
(and a different set of situations will sneak by, undetected).
And that gives you exactly /zero/ benefit.
You run your hash algorithm, and check for the single value that
indicates no errors. It does not matter if that number is 0, 0x13, or -
often more conveniently - the number attached at the end of the image as
the expected result of the hash of the rest of the data.
To be more accurate, the chances of them passing unnoticed are of the
order of 1 in 2^n, for a good n-bit check such as a CRC check. Certain
types of error are always detectable, such as single and double bit
errors. That is the point of using a checksum or hash for integrity
checking.
/Intentional/ changes are a different matter. If a hacker changes the
program image, they can change the transmitted hash to their own
calculated hash. Or for a small CRC, they could change a different part
of the image until the original checksum matched - for a 16-bit CRC,
that only takes 65,535 attempts in the worst case.
If the approach used is "typical", then you need far fewer attempts to
produce a correct image -- without EVER knowing where the CRC is stored.
It is difficult to know what you are trying to say here, but if you believe that different initial values in a CRC algorithm makes it harder to modify an image to make it pass the integrity test, you are simply wrong.
That is why you need to distinguish between the two possibilities. If
you don't have to worry about malicious attacks, a 32-bit CRC takes a
dozen lines of C code and a 1 KB table, all running extremely
efficiently. If security is an issue, you need digital signatures - an
RSA-based signature system is orders of magnitude more effort in both
development time and in run time.
It's considerably more expensive AND not fool-proof -- esp if the
attacker knows you are signing binaries. "OK, now I need to find
WHERE the signature is verified and just patch that "CALL" out
of the code".
I'm not sure if that is a straw-man argument, or just showing your ignorance of
the topic. Do you really think security checks are done by the program you are
trying to send securely? That would be like trying to have building security
where people entering the building look at their own security cards.
I altered the test procedure for a piece of military gear we were
building simply to skip some lengthy tests that I *knew* would pass
(I don't want to inject an extra 20 minutes of wait time just to get
through a lengthy test I already know works before I can get to the
test of interest to me, now).
I failed to undo the change before the official signoff on the device.
The only evidence of this was the fact that I had also patched the
startup message to say "Go for coffee..." -- which remained on the
screen for the duration of the lengthy (even with the long test
elided) procedure...
..which alerted folks to the fact that this *probably* wasn't the
original image. (The computer running the test suite on the DUT had
no problem accepting my patched binary)
And what, exactly, do you think that anecdote tells us about CRC checks
for image files? It reminds us that we are all fallible, but does no
more than that.
That *was* the point. Because the folks who designed the test computer
relied on common techniques to safeguard the image.
There was a human error - procedures were not good enough, or were not
followed. It happens, and you learn from it and make better procedures.
The fault was in what people did, not in an automated integrity check.
It is completely unrelated.
The counterfeiting example I cited indicates how "obscurity/secrecy"
is far more effective (yet you dismiss it out-of-hand).
No, it does nothing of the sort. There is no connection at all.
CRC's alone are useless. All the attacker needs to do after modifying
the image is calculate the CRC themselves, and replace the original
checksum with their own.
That assumes the "alterer" knows how to replace the checksum, how it
is computed, where it is embedded in the image, etc. I modified the
Compaq portable mentioned without ever knowing where the checksum was
stored, or *if* it was explicitly stored. I had no desire to
disassemble the BIOS ROMs (though could obviously do so, as there was
no "proprietary hardware" limiting access to their contents, and the
instruction set of the processor is well known!).
Instead, I did this by *guessing* how they would implement such a check
in a bit of kit from that era (EPROMs aren't easily modified by malware,
so it wasn't likely that they would go to great lengths to "protect"
the image). And, if my guess had been incorrect, I could always
reinstall the original EPROMs -- nothing lost, nothing gained.
Had much experience with folks counterfeiting your products and making
"simple" changes to the binaries? Like changing the copyright notice
or splash screen?
Then, bringing the (accused) counterfeit of YOUR product into a
courtroom and revealing the *hidden* checksum that the counterfeiter
wasn't aware of?
"Gee, why does YOUR (alleged) device have *my* name in it -- in
addition to behaving exactly like mine??"
[I guess obscurity has its place!]
Security by obscurity is not security. Having a hidden signature or
other mark can be useful for proving ownership (making an intentional
mistake is another common tactic - such as commercial maps having a few
subtle spelling errors). But that is not security.
Of course it is! If *you* check the "hidden signature" at runtime
and then alter "your" operation such that an altered copy fails
to perform properly, then you have secured it.
That is not security. "Security" means that the program that starts the updated program checks the /entire/ image according to its digital signature, and rejects it /entirely/ if it does not match.
What you are talking about here is the sort of cat-and-mouse nonsense computer
games producers did with intentional disk errors to stop copying. It annoys legitimate users and does almost nothing to hinder the bad guys.
Would you want to use a check-writing program if the account
balances it maintains were subtly (but not consistently)
incorrect?
Again, you make no sense. What has this got to do with integrity checks or security?
Checking signatures, CRCs, licensing schemes, etc. all are used
in a "drop dead" fashion so considerably easier to defeat.
Witness the number of "products" available as warez...
Look, it is all /really/ simple. And the year is 2023, not 1973.
If you want to check the integrity of a file against accidental changes, a CRC
is usually fine.
If you want security, and to protect against malicious changes, use a digital signature. This must be checked by the program that /starts/ the updated code,
or that downloaded and stored it - not by the program itself!
Only answer with a support contract.
Oh, sure - the amateurs who have some of the information but not enough details, skill or knowledge to get things working will /never/ fill forums with
questions, complaints or bad reviews that bother your support staff or scare away real sales.
They are limited by idiotic government export restrictions made by
ignorant politicians who don't understand cryptography.
Protections don't always have to be cryptographic.
Correct, but - as with a lot of what you write - completely irrelevant to the subject at hand.
Why can't companies give out information about the security systems used in their microcontrollers (for example) ? Because some geriatric ignoramuses think banning "export" of such information to certain countries will stop those
countries knowing about security and cryptography.
Some things benefit from being kept hidden, or under restricted access.
The details of the CRC algorithm you use to catch accidental errors in
your image file is /not/ one of them. If you think hiding it has the
remotest hint of a benefit, you are doing things wrong - you need a
/security/ check, not a simple /integrity/ check.
And then once you have switched to a security check - a digital
signature - there's no need to keep that choice hidden either, because
it is the /key/ that is important, not the type of lock.
Again, meaningless if the attacker can interfere with the *enforcement*
of that check. Using something "well known" just means he already knows
what to look for in your code. Or, how to interfere with your
intended implementation in ways that you may have not anticipated
(confident that your "security" can't be MATHEMATICALLY broken).
If the attacker can interfere with the enforcement of the check, then it doesn't matter what checks you have. Keeping the design of a building's locks
secret does not help you if the bad guys have bribed the security guard /inside/ the building!
"Here's my source code. Here are my schematics. Here's the
name of the guy who oversees production (bribe him to gain
access to the keys stored in the TPM). Now, what are you
gonna *do* with all that?"
The first two should be fine - if people can break your security after looking
at your source code or schematics, your security is /bad/. As for the third one, if they can break your security by going through the production guy, your
production procedures are bad.
On Sunday, April 23, 2023 at 5:58:51 PM UTC-4, David Brown wrote:
When someone wants a checksum on an image file, the appropriate
choice in most cases is a CRC.
Why? What makes a CRC an "appropriate" choice? Normally, when I
design something, I establish the requirements. What requirements
are you assuming, that would make the CRC more desirable than a
simple checksum?
If security is an issue, then a secure hash is needed. For a very
limited system, additive checksums might then be the only realistic
choice.
What have I said that makes you think security is an issue??? I
don't recall ever mentioning anything about security. Do you recall
what I did say?
If you think a discussion of CRC calculations would be useful, why
don't you open a thread and discuss them, instead of insisting they
are the right solution to my problem, when you don't even know what
the problem requirements are? It's all here in the thread. You only
need to read, without projecting your opinions on the problem
statement.
On 24/04/2023 00:24, Rick C wrote:
On Sunday, April 23, 2023 at 5:58:51 PM UTC-4, David Brown wrote:
When someone wants a checksum on an image file, the appropriate
choice in most cases is a CRC.
Why? What makes a CRC an "appropriate" choice? Normally, when I
design something, I establish the requirements. What requirements
are you assuming, that would make the CRC more desirable than a
simple checksum?
I've already explained this in quite a lot of detail in this thread (as
have others). If you don't like my explanation, or didn't read it,
that's okay. You are under no obligation to learn about CRCs. Or if
you prefer to look it up in other sources, that's obviously also an option.
If security is an issue, then a secure hash is needed. For a very
limited system, additive checksums might then be the only realistic
choice.
What have I said that makes you think security is an issue??? I
don't recall ever mentioning anything about security. Do you recall
what I did say?
If you think a discussion of CRC calculations would be useful, why
don't you open a thread and discuss them, instead of insisting they
are the right solution to my problem, when you don't even know what
the problem requirements are? It's all here in the thread. You only
need to read, without projecting your opinions on the problem
statement.
I've asked you this before - are you /sure/ you understand how Usenet works?
On 4/22/2023 7:57 AM, David Brown wrote:
However, in almost every case where CRC's might be useful, you have
additional checks of the sanity of the data, and an all-zero or
all-one data block would be rejected. For example, Ethernet packets
use CRC for integrity checking, but an attempt to send a packet type
0 from MAC address 00:00:00:00:00:00 to address 00:00:00:00:00:00,
of length 0, would be rejected anyway.
Why look at "data" -- which may be suspect -- and *then* check its CRC?
Run the CRC first. If it fails, decide how you are going to proceed
or recover.
That is usually the order, yes. Sometimes you want "fail fast", such
as dropping a packet that was not addressed to you (it doesn't matter
if it was received correctly but for someone else, or it was addressed
to you but the receiver address was corrupted - you are dropping the
packet either way). But usually you will run the CRC then look at the
data.
But the order doesn't matter - either way, you are still checking for
valid data, and if the data is invalid, it does not matter if the CRC
only passed by luck or by all zeros.
You're assuming the CRC is supposed to *vouch* for the data.
The CRC can be there simply to vouch for the *transport* of a
datagram.
So, use a version-specific CRC on the packet. If it fails, then
either the data in the packet has been corrupted (which could just
as easily have involved an embedded "interface version" parameter);
or the packet was formed with the wrong CRC.
If the CRC is correct FOR THAT VERSION OF THE PROTOCOL, then
why bother looking at a "protocol version" parameter? Would
you ALSO want to verify all the rest of the parameters?
What term would you have me use to indicate a "bias" applied to a CRC
algorithm?
Well, first I'd note that any kind of modification to the basic CRC
algorithm is pointless from the viewpoint of its use as an integrity
check. (There have been, mostly historically, some justifications in
terms of implementation efficiency. For example, bit and byte
re-ordering could be done to suit hardware bit-wise implementations.)
Otherwise I'd say you are picking a specific initial value if that is
what you are doing, or modifying the final value (inverting it or
xor'ing it with a fixed value). There is, AFAIK, no specific terms
for these - and I don't see any benefit in having one. Misusing the
term "salt" from cryptography is certainly not helpful.
Salt just ensures that you can differentiate between functionally identical values. I.e., in a CRC, it differentiates the "0x0000" that CRC-1 generates from the "0x0000" that CRC-2 generates.
You don't see the parallel to ensuring that *my* use of "Passw0rd" is
encoded in a different manner than *your* use of "Passw0rd"?
See the RMI desciption.
I'm sorry, I have no idea what "RMI" is or where it is described.
You've mentioned that abbreviation twice, but I can't figure it out.
<https://en.wikipedia.org/wiki/RMI>
<https://en.wikipedia.org/wiki/OCL>
Nothing magical with either term.
OTOH, "salting" the calculation so that it is expected to yield
a value of 0x13 means *those* situations will be flagged as errors
(and a different set of situations will sneak by, undetected).
And that gives you exactly /zero/ benefit.
See above.
You run your hash algorithm, and check for the single value that
indicates no errors. It does not matter if that number is 0, 0x13, or
- often more conveniently - the number attached at the end of the image
as the expected result of the hash of the rest of the data.
As you've admitted, it doesn't matter. So, why wouldn't I opt to have
an algorithm for THIS interface give me a result that is EXPECTED
for this protocol? What value picking "0"?
That is why you need to distinguish between the two possibilities.
If you don't have to worry about malicious attacks, a 32-bit CRC
takes a dozen lines of C code and a 1 KB table, all running
extremely efficiently. If security is an issue, you need digital
signatures - an RSA-based signature system is orders of magnitude
more effort in both development time and in run time.
It's considerably more expensive AND not fool-proof -- esp if the
attacker knows you are signing binaries. "OK, now I need to find
WHERE the signature is verified and just patch that "CALL" out
of the code".
I'm not sure if that is a straw-man argument, or just showing your
ignorance of the topic. Do you really think security checks are done
by the program you are trying to send securely? That would be like
trying to have building security where people entering the building
look at their own security cards.
Do YOU really think we all design applications that run in PCs where some CLOSED OS performs these tests in a manner that can't be subverted?
*WE* (tend to) write ALL the code in the products developed, here.
So, whether it's the POST WE wrote that is performing the test or
the loader WE wrote, it's still *our* program.
Yes, we ARE looking at our own security cards!
Manufacturers *try* to hide ("obscurity") details of these mechanisms
in an attempt to improve effective security. But, there's nothing
that makes these guarantees.
Give me the sources for Windows (Linux, *BSD, etc.) and I can
subvert all the state-of-the-art digital signing used to ensure
binaries aren't altered. Nothing *outside* the box is involved
so, by definition, everything I need has to reside *in* the box.
This is a bit of the chicken and egg thing. If you want to embed a checksum in a code module to report the checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
I'm not thinking anything too fancy, like a CRC, but rather a simple modulo N addition, maybe N being 2^16.
I keep thinking of using a placeholder, but that doesn't seem to work out in any useful way. Even if you try to anticipate the impact of adding the checksum, that only gives you a different checksum, that you then need to anticipate further... ad infinitum.
I'm not thinking of any special checksum generator that excludes the checksum data. That would be too messy.
Maybe I can prove it is impossible. Assume the file checksums to X when the checksum data is zero. The goal would then be to include the checksum data value Y in the file, that would change X to Y. Given the properties of the modulo N checksum, this would appear to be impossible for the general case, unless... Add another data value, called, checksum normalizer. This data value checksums with the original checksum to give the result zero. Then, when the checksum is also added, the resulting checksum equals the embedded checksum value.
I keep thinking there is a different way of looking at this to achieve the result I want...
This might be inordinately hard for a CRC, but a simple checksum would not be an issue, I think. At least, this could work in software, where data can be included in an image file as itself. In a device like an FPGA, it might not be included in the bit stream file so directly... but that might depend on where in the device it is inserted. Memory might have data that is stored as itself. I'll need to look into that.
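For a modulo-2^16 additive checksum, the "normalizer" idea does work out. If the image sums to X with both reserved words zeroed, store the checksum word C = X and a normalizer word Z = -X mod 2^16; the whole-image sum is then X + Z + C = C, so the embedded word equals the checksum of the complete image, itself included. A sketch (treating the image as 16-bit words, with the two reserved words at the end - the layout is my assumption):

```c
#include <stdint.h>
#include <stddef.h>

/* Sum 16-bit words modulo 2^16. */
uint16_t sum16(const uint16_t *w, size_t n)
{
    uint16_t s = 0;
    while (n--)
        s = (uint16_t)(s + *w++);
    return s;
}

/* Fill in the last two reserved words so the embedded checksum
   equals the checksum of the entire image, including itself. */
void embed_checksum(uint16_t *image, size_t nwords)
{
    image[nwords - 2] = 0;                  /* normalizer placeholder */
    image[nwords - 1] = 0;                  /* checksum placeholder */
    uint16_t x = sum16(image, nwords);      /* X: sum of code/data only */
    image[nwords - 2] = (uint16_t)(0u - x); /* Z = -X cancels X */
    image[nwords - 1] = x;                  /* C = X, the reported value */
}
```

A verifier can then sum the whole image and compare the result against the word stored at the end; the two agree by construction.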
On 20/04/2023 18:45, Rick C wrote:
On Thursday, April 20, 2023 at 11:33:28 AM UTC-4, George Neuner
wrote:
On Wed, 19 Apr 2023 19:06:33 -0700 (PDT), Rick C
<gnuarm.del...@gmail.com> wrote:
This is a bit of the chicken and egg thing. If you want to embed
a checksum in a code module to report the checksum, is there a
way of doing this? It's a bit like being your own grandfather, I
think.
Take a look at the old xmodem/ymodem CRC. It was designed such
that when the CRC was sent immediately following the data, a
receiver computing CRC over the whole incoming packet (data and CRC
both) would get a result of zero.
But AFAIK it doesn't work with CCITT equation(s) - you have to use
xmodem/ymodem.
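The zero-remainder property George describes follows from the xmodem parameters (polynomial 0x1021, initial value 0, no final XOR) when the CRC is appended high byte first. A sketch (function names are mine):

```c
#include <stdint.h>
#include <stddef.h>

/* Bitwise CRC-16/XMODEM: polynomial 0x1021, init 0x0000, no reflection,
   no final XOR. */
uint16_t crc16_xmodem(const uint8_t *p, size_t n)
{
    uint16_t crc = 0;
    while (n--) {
        crc ^= (uint16_t)(*p++) << 8;
        for (int i = 0; i < 8; i++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Append the CRC MSB-first; a receiver's CRC over data-plus-CRC is
   then zero. The buffer must have room for n + 2 bytes. */
void append_crc(uint8_t *buf, size_t n)
{
    uint16_t crc = crc16_xmodem(buf, n);
    buf[n]     = (uint8_t)(crc >> 8);
    buf[n + 1] = (uint8_t)(crc & 0xFF);
}
```

The check value for "123456789" with these parameters is 0x31C3, and running the CRC over the nine data bytes plus the two appended CRC bytes gives zero, as George says.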
I'm not thinking anything too fancy, like a CRC, but rather a
simple modulo N addition, maybe N being 2^16.
Sorry, I don't know a way to do it with a modular checksum. YMMV,
but I think 16-bit CRC is pretty simple.
George
CRC is not complicated, but I would not know how to calculate an
inserted value to force the resulting CRC to zero. How do you do
that?
You "insert" the value at the end. Anything else is insane.
CRC's are quite good hashes, for suitable sized data. There are perhaps some special cases, but basically you'd be doing trial-and-error
searches to find an inserted value that gives you a zero CRC overall.
2^16 is not an overwhelming search space, but the whole idea is pointless.
Even so, I'm not trying to validate the file. I'm trying to come up
with a substitute for a time stamp or version number. I don't want
to have to rely on my consistency in handling the version number
correctly. This would be a backup in case there was more than one
version released, even only within the "lab", that were different. A
checksum that could be read by the controlling software would do the
job.
A CRC is fine for that.
I have run into this before, where the version number was not a 100%
indication of the uniqueness of an executable. The checksum would be
a second indicator.
I should mention that I'm not looking for a solution that relies on
any specific details of the tools.
A table-based CRC is easy, runs quickly, and can be quickly ported to
pretty much any language (the C and Python code, for example, is almost
the same).
On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown wrote:
On 21/04/2023 14:12, Rick C wrote:
This is simply to be able to say this version is unique, regardless
of what the version number says. Version numbers are set manually
and not always done correctly. I'm looking for something as a backup
so that if the checksums are different, I can be sure the versions
are not the same.
The less work involved, the better.
Run a simple 32-bit crc over the image. The result is a hash of the
image. Any change in the image will show up as a change in the crc.
No one is trying to detect changes in the image. I'm trying to label the image in a way that can be read in operation. I'm using the checksum simply because that is easy to generate. I've had problems with version numbering in the past. It will be used, but I want it supplemented with a number that will change every time the design changes, at least with a high probability, such as 1 in 64k.
Den 2023-04-20 kl. 04:06, skrev Rick C:
This is a bit of the chicken and egg thing. If you want to embed a checksum in a code module to report the checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
The proper way to do this is to have a directive in the linker.
This reserves space for the CRC and defines the area where the CRC is calculated.
I am not aware of any other linker which supports this.
Two months ago, I added the DIGEST directive to binutils aka the GNU
linker. It was committed, but then people realized that I had not signed
an agreement with Free Software Foundation.
Since part of the code I pushed was from a third party which released
their code under MIT, the licensing has not been resolved yet
but the patch is in binutils git, but reverted.
You would write (IIRC):
DIGEST "CRC64-ECMA", (from, to)
and the linker would reserve 8 bytes which is filled with the CRC in the final link stage.
On Thursday, April 27, 2023 at 12:26:47 PM UTC-4, Ulf Samuelsson wrote:
Den 2023-04-20 kl. 04:06, skrev Rick C:
This is a bit of the chicken and egg thing. If you want to embed a checksum in a code module to report the checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
The proper way to do this is to have a directive in the linker.
This reserves space for the CRC and defines the area where the CRC is
calculated.
That assumes there is a linker.
How does the application access this information?
You are making a lot of assumptions about the tools. I'm pretty sure
they don't apply to my case. I'm not at all clear how this is
workable, anyway. Adding the checksum to the file, changes the
checksum, which is where this conversation started... unless I'm
missing something significant.
On 2023-04-27 20:09, Rick C wrote:
On Thursday, April 27, 2023 at 12:26:47 PM UTC-4, Ulf Samuelsson wrote: >>> Den 2023-04-20 kl. 04:06, skrev Rick C:
This is a bit of the chicken and egg thing. If you want a embed aThe proper way to do this is to have a directive in the linker.
checksum in a code module to report the checksum, is there a way of
doing this? It's a bit like being your own grandfather, I think.
This reserves space for the CRC and defines the area where the CRC is
calculated.
That assumes there is a linker.
Almost all toolchains have a linker.
How does the application access this information?
In Ulf's suggestion, it seems the DIGEST directive emits 8 bytes of
checksum at the current point (usually the linker "." symbol). I assume
one can give that point in the image a linkage symbol, perhaps like
 _checksum DIGEST "CRC64-ECMA", (from, to)
or like
 _checksum EQU .
            DIGEST "CRC64-ECMA", (from, to)
(This is schematic linker code, not necessarily proper syntax.)
One can then from the application code access the "checksum" location as
an externally defined object, say:
  extern uint8_t checksum[8];
The linker will connect that C identifier to the actual address of the
DIGEST checksum. Here I assumed that the C compiler mangles C
identifiers into linkage symbols by prefixing an underscore; YMMV.
No, you reserve room for the checksum, but that needs to be outside the checked area.
You are making a lot of assumptions about the tools. I'm pretty sure
they don't apply to my case. I'm not at all clear how this is
workable, anyway. Adding the checksum to the file, changes the
checksum, which is where this conversation started... unless I'm
missing something significant.
But you have insisted that your "checksum" is for the purpose of
identifying the version of the program, not for checking the integrity
of the memory image. If so, that checksum does not have to be the
checksum of the whole memory image, as long as it is the checksum of the
part of the image that contains the actual code and constant data, and
so will change according to changes in those parts of the image.
On Thursday, April 27, 2023 at 12:26:47 PM UTC-4, Ulf Samuelsson wrote:
Den 2023-04-20 kl. 04:06, skrev Rick C:
This is a bit of the chicken and egg thing. If you want to embed a checksum in a code module to report the checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
The proper way to do this is to have a directive in the linker.
This reserves space for the CRC and defines the area where the CRC is
calculated.
That assumes there is a linker. How does the application access this information?
unless I'm missing something significant.
I am not aware of any linker that supports this.
Two months ago, I added the DIGEST directive to binutils aka the GNU
linker. It was committed, but then people realized that I had not signed
an agreement with Free Software Foundation.
Since part of the code I pushed was from a third party which released
their code under MIT, the licensing has not been resolved yet
but the patch is in binutils git, but reverted.
You would write (IIRC):
DIGEST "CRC64-ECMA", (from, to)
and the linker would reserve 8 bytes which is filled with the CRC in the
final link stage.
You are making a lot of assumptions about the tools. I'm pretty sure they don't apply to my case. I'm not at all clear how this is workable, anyway. Adding the checksum to the file, changes the checksum, which is where this conversation started...
Den 2023-04-27 kl. 19:09, skrev Rick C:
On Thursday, April 27, 2023 at 12:26:47 PM UTC-4, Ulf Samuelsson wrote:
Den 2023-04-20 kl. 04:06, skrev Rick C:
This is a bit of the chicken and egg thing. If you want to embed a checksum in a code module to report the checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
The proper way to do this is to have a directive in the linker.
This reserves space for the CRC and defines the area where the CRC is calculated.
That assumes there is a linker. How does the application access this information?
Linker command file:
       public CRC64; start, stop
       HEADER = .;
       QUAD(MAGIC);
    CRC64 = .;
       DIGEST "CRC64-ECMA", (start, stop)
       start = .;
       # Your data to be protected
       ...
       stop = .;
C source code.
extern uint64_t CRC64;
extern char* start;
extern char* stop;
uint64_t crc64;
crc64 = calc_crc64_ecma(start, stop);
if (crc64 == CRC64) {
  /* everything is OK */
}
Den 2023-04-20 kl. 22:26, skrev David Brown:
On 20/04/2023 18:45, Rick C wrote:
On Thursday, April 20, 2023 at 11:33:28 AM UTC-4, George Neuner
wrote:
On Wed, 19 Apr 2023 19:06:33 -0700 (PDT), Rick C
<gnuarm.del...@gmail.com> wrote:
This is a bit of the chicken and egg thing. If you want to embed a checksum in a code module to report the checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
Take a look at the old xmodem/ymodem CRC. It was designed such
that when the CRC was sent immediately following the data, a
receiver computing CRC over the whole incoming packet (data and CRC
both) would get a result of zero.
But AFAIK it doesn't work with CCITT equation(s) - you have to use
xmodem/ymodem.
I'm not thinking anything too fancy, like a CRC, but rather a simple modulo N addition, maybe N being 2^16.
Sorry, I don't know a way to do it with a modular checksum. YMMV, but I think 16-bit CRC is pretty simple.
George
CRC is not complicated, but I would not know how to calculate an
inserted value to force the resulting CRC to zero. How do you do
that?
You "insert" the value at the end. Anything else is insane.
In all projects I have been involved with, the application binary starts
with a header looking like this.
MAGIC WORD 1
CRC
Entry Point
Size
other info...
MAGIC WORD 2
APPLICATION_START
...
APPLICATION_END (aligned with flash sector)
The bootloader first checks the two magic words.
It then computes CRC on the header (from Entry Point) to APPLICATION_END
I ported the IAR ielftool (open source) to Linux at https://github.com/emagii/ielftool
This can insert the CRC in the ELF file, but needs tweaks to work
with an ELF file generated by the GNU tools.
/Ulf
Den 2023-04-20 kl. 04:06, skrev Rick C:
This is a bit of the chicken and egg thing. If you want to embed a checksum in a code module to report the checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
The proper way to do this is to have a directive in the linker.
This reserves space for the CRC and defines the area where the CRC is calculated.
I am not aware of any linker that supports this.
Two months ago, I added the DIGEST directive to binutils aka the GNU
linker. It was committed, but then people realized that I had not signed
an agreement with Free Software Foundation.
Since part of the code I pushed was from a third party which released
their code under MIT, the licensing has not been resolved yet
but the patch is in binutils git, but reverted.
You would write (IIRC):
  DIGEST "CRC64-ECMA", (from, to)
and the linker would reserve 8 bytes which is filled with the CRC in the final link stage.
/Ulf
Den 2023-04-22 kl. 05:14, skrev Rick C:
On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown wrote:
On 21/04/2023 14:12, Rick C wrote:
Run a simple 32-bit crc over the image. The result is a hash of the image. Any change in the image will show up as a change in the crc.
This is simply to be able to say this version is unique, regardless of what the version number says. Version numbers are set manually and not always done correctly. I'm looking for something as a backup so that if the checksums are different, I can be sure the versions are not the same.
The less work involved, the better.
No one is trying to detect changes in the image. I'm trying to label
the image in a way that can be read in operation. I'm using the
checksum simply because that is easy to generate. I've had problems
with version numbering in the past. It will be used, but I want it
supplemented with a number that will change every time the design
changes, at least with a high probability, such as 1 in 64k.
Another thing I added (and was later removed) was a timestamp directive.
A 64 bit integer with the number of seconds since 1970-01-01 00:00.
On 2023-04-27 20:09, Rick C wrote:
On Thursday, April 27, 2023 at 12:26:47 PM UTC-4, Ulf Samuelsson wrote:
Den 2023-04-20 kl. 04:06, skrev Rick C:
This is a bit of the chicken and egg thing. If you want to embed a checksum in a code module to report the checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
The proper way to do this is to have a directive in the linker.
This reserves space for the CRC and defines the area where the CRC is
calculated.
That assumes there is a linker.
Almost all toolchains have a linker.
Den 2023-04-27 kl. 20:29, skrev Niklas Holsti:
On 2023-04-27 20:09, Rick C wrote:
You are making a lot of assumptions about the tools. I'm pretty sure they don't apply to my case. I'm not at all clear how this is workable, anyway. Adding the checksum to the file, changes the checksum, which is where this conversation started... unless I'm missing something significant.
No, you reserve room for the checksum, but that needs to be outside the checked area.
The address of the checksum needs to be known to the application.
Also the limits of the checked area.
That is why the application has a header in front in my projects.
The application is started by the bootloader, which checks
a number of things before the application is started.
The application can read the header as well to allow checking
the code area at runtime.
On 27/04/2023 18:36, Ulf Samuelsson wrote:
Den 2023-04-20 kl. 22:26, skrev David Brown:
On 20/04/2023 18:45, Rick C wrote:
On Thursday, April 20, 2023 at 11:33:28 AM UTC-4, George Neuner
wrote:
On Wed, 19 Apr 2023 19:06:33 -0700 (PDT), Rick C
<gnuarm.del...@gmail.com> wrote:
This is a bit of the chicken and egg thing. If you want to embed a checksum in a code module to report the checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
Take a look at the old xmodem/ymodem CRC. It was designed such
that when the CRC was sent immediately following the data, a
receiver computing CRC over the whole incoming packet (data and CRC
both) would get a result of zero.
But AFAIK it doesn't work with CCITT equation(s) - you have to use
xmodem/ymodem.
I'm not thinking anything too fancy, like a CRC, but rather a simple modulo N addition, maybe N being 2^16.
Sorry, I don't know a way to do it with a modular checksum. YMMV, but I think 16-bit CRC is pretty simple.
George
CRC is not complicated, but I would not know how to calculate an
inserted value to force the resulting CRC to zero. How do you do
that?
You "insert" the value at the end. Anything else is insane.
In all projects I have been involved with, the application binary starts
with a header looking like this.
MAGIC WORD 1
CRC
Entry Point
Size
other info...
MAGIC WORD 2
APPLICATION_START
...
APPLICATION_END (aligned with flash sector)
The bootloader first checks the two magic words.
It then computes CRC on the header (from Entry Point) to APPLICATION_END
I ported the IAR ielftool (open source) to Linux at
https://github.com/emagii/ielftool
This can insert the CRC in the ELF file, but needs tweaks to work
with an ELF file generated by the GNU tools.
/Ulf
That can work for some microcontrollers, but is unsuitable for others -
it depends on how the flash is organised. For an msp430, for example,
it would be fine, as the interrupt vectors (including the reset vector)
are at the end of flash. But for most ARM Cortex M devices, it would
not be suitable - they expect the reset vector and initial stack pointer
at the start of the flash image. Some devices have a boot ROM, and then
you have to match their specifics for the header - or you can have your
own boot program, and make the header how ever you like.
I am absolutely a fan of having some kind of header like this (and
sometimes even a human-readable copyright notice, identifier and version information). And having it as near the beginning as possible is good.
But for many microcontrollers, having it at the start is not feasible.
And if you can't put the CRC at the start like you do, you have to put
it at the end of the image.
I've never really thought about trying to inject a CRC into an elf file.
 I use elfs (or should that be "elves" ?) for debugging, not flash programming. And usually the main concern for having a CRC at the end
of the image is when you have an online update of some kind, to check
that nothing has gone wrong during the transfer or in-field update.
On 27/04/2023 18:42, Ulf Samuelsson wrote:
Den 2023-04-22 kl. 05:14, skrev Rick C:
On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown wrote:
On 21/04/2023 14:12, Rick C wrote:
Run a simple 32-bit crc over the image. The result is a hash of the image. Any change in the image will show up as a change in the crc.
This is simply to be able to say this version is unique, regardless of what the version number says. Version numbers are set manually and not always done correctly. I'm looking for something as a backup so that if the checksums are different, I can be sure the versions are not the same.
The less work involved, the better.
No one is trying to detect changes in the image. I'm trying to label
the image in a way that can be read in operation. I'm using the
checksum simply because that is easy to generate. I've had problems
with version numbering in the past. It will be used, but I want it
supplemented with a number that will change every time the design
changes, at least with a high probability, such as 1 in 64k.
Another thing I added (and was later removed) was a timestamp directive.
A 64 bit integer with the number of seconds since 1970-01-01 00:00.
Timestamping a build in some way (as part of the "make", using __DATE__
or __TIME__ in source code, or some feature of a revision control
system) is very tempting, and can be helpful for tracking exactly what
code you have on the system.
However, IMHO having reproducible builds is much more valuable. I am
not happy with a project build until I am getting identical binaries
built on multiple hosts (Windows and Linux). That's how you can be absolutely sure of what code went into a particular binary, even years
or decades later.
A compromise that can work is to distinguish development builds and production builds, and have timestamping in development builds. That
also reduces the rate at which your minor version number or build number
goes up, and avoids endless changes to your "version.h" include file.
On 27/04/2023 22:44, Ulf Samuelsson wrote:
Den 2023-04-27 kl. 20:29, skrev Niklas Holsti:
On 2023-04-27 20:09, Rick C wrote:
You are making a lot of assumptions about the tools. I'm pretty sure they don't apply to my case. I'm not at all clear how this is workable, anyway. Adding the checksum to the file, changes the checksum, which is where this conversation started... unless I'm missing something significant.
No, you reserve room for the checksum, but that needs to be outside the checked area.
The address of the checksum needs to be known to the application.
The address here could have a symbol, and then declared "extern" in the
C code - it would not have to be a known numerical address. But if the image is checked or started from another program (such as a boot
program), you need an absolute address somewhere to chain this all
together.
Also the limits of the checked area.
That is why the application has a header in front in my projects.
The application is started by the bootloader, which checks
a number of things before the application is started.
The application can read the header as well to allow checking
the code area at runtime.
Or for my preferences, the CRC "DIGEST" would be put at the end of the
image, rather than near the start. Then the "from, to" range would
cover the entire image except for the final CRC. But I'd have a similar directive for the length of the image at a specific area near the start.
On 2023-04-27 23:36, Ulf Samuelsson wrote:
Den 2023-04-27 kl. 19:09, skrev Rick C:
On Thursday, April 27, 2023 at 12:26:47 PM UTC-4, Ulf Samuelsson wrote:
Den 2023-04-20 kl. 04:06, skrev Rick C:
This is a bit of the chicken and egg thing. If you want to embed a checksum in a code module to report the checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
The proper way to do this is to have a directive in the linker.
This reserves space for the CRC and defines the area where the CRC is calculated.
That assumes there is a linker. How does the application access this information?
Linker command file:
        public CRC64; start, stop
        HEADER = .;
        QUAD(MAGIC);
     CRC64 = .;
        DIGEST "CRC64-ECMA", (start, stop)
        start = .;
        # Your data to be protected
        ...
        stop = .;
C source code.
extern uint64_t CRC64;
extern char* start;
extern char* stop;
uint64_t crc64;
crc64 = calc_crc64_ecma(start, stop);
if (crc64 == CRC64) {
   /* everything is OK */
}
I'm nit-picking, but that C code does not look right to me. The extern declarations for "start" and "stop" claim them to be names of memory locations that contain addresses, but the linker file just places them
at the starting and one-past-end locations of the block to be protected.
So the "start" variable contains the first bytes of the "data to be protected", and the contents of the "stop" variable are not defined
because it is placed after the "data to be protected", where no code or
data is loaded (it seems).
It seems to me that the call to calc_crc64_ecma should get the addresses
of "start" and "stop" as arguments (&start, &stop), instead of their
values. But perhaps calc_crc64_ecma is not a function, but a macro that
can itself take the addresses of its parameters.
On 27/04/2023 18:26, Ulf Samuelsson wrote:
Den 2023-04-20 kl. 04:06, skrev Rick C:
This is a bit of the chicken and egg thing. If you want to embed a checksum in a code module to report the checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
The proper way to do this is to have a directive in the linker.
This reserves space for the CRC and defines the area where the CRC is
calculated.
I am not aware of any linker which support this.
Two months ago, I added the DIGEST directive to binutils aka the GNU
linker. It was committed, but then people realized that I had not signed
an agreement with Free Software Foundation.
Since part of the code I pushed was from a third party which released
their code under MIT, the licensing has not been resolved yet
but the patch is in binutils git, but reverted.
You would write (IIRC):
   DIGEST "CRC64-ECMA", (from, to)
and the linker would reserve 8 bytes which is filled with the CRC in
the final link stage.
/Ulf
I like that. Thanks for doing that work.
Is there also a way to get the length of the final link, and insert it
near the beginning of the image? I suppose that would be another kind
of DIGEST where the algorithm is simply (to - from). (I assume that
"to" and "from" may be linker symbols.)
Den 2023-04-28 kl. 09:38, skrev David Brown:
On 27/04/2023 22:44, Ulf Samuelsson wrote:
Den 2023-04-27 kl. 20:29, skrev Niklas Holsti:
On 2023-04-27 20:09, Rick C wrote:
You are making a lot of assumptions about the tools. I'm pretty sure they don't apply to my case. I'm not at all clear how this is workable, anyway. Adding the checksum to the file, changes the checksum, which is where this conversation started... unless I'm missing something significant.
No, you reserve room for the checksum, but that needs to be outside the checked area.
The address of the checksum needs to be known to the application.
The address here could have a symbol, and then declared "extern" in
the C code - it would not have to be a known numerical address. But
if the image is checked or started from another program (such as a
boot program), you need an absolute address somewhere to chain this
all together.
The header is declared as a struct.
Also the limits of the checked area.
That is why the application has a header in front in my projects.
The application is started by the bootloader, which checks
a number of things before the application is started.
The application can read the header as well to allow checking
the code area at runtime.
Or for my preferences, the CRC "DIGEST" would be put at the end of the
image, rather than near the start. Then the "from, to" range would
cover the entire image except for the final CRC. But I'd have a
similar directive for the length of the image at a specific area near
the start.
I really do not see a benefit of splitting the meta information about
the image to two separate locations.
The bootloader uses the struct for all checks.
It is a much simpler implementation once the tools support it.
You might find it easier to write a tool which adds the CRC at the end,
but that is a different issue.
Occam's Razor!
Den 2023-04-28 kl. 09:24, skrev David Brown:
On 27/04/2023 18:26, Ulf Samuelsson wrote:
Den 2023-04-20 kl. 04:06, skrev Rick C:
This is a bit of the chicken and egg thing. If you want to embed a checksum in a code module to report the checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
The proper way to do this is to have a directive in the linker.
This reserves space for the CRC and defines the area where the CRC is
calculated.
I am not aware of any linker which support this.
Two months ago, I added the DIGEST directive to binutils aka the GNU
linker. It was committed, but then people realized that I had not signed
an agreement with Free Software Foundation.
Since part of the code I pushed was from a third party which released
their code under MIT, the licensing has not been resolved yet
but the patch is in binutils git, but reverted.
You would write (IIRC):
   DIGEST "CRC64-ECMA", (from, to)
and the linker would reserve 8 bytes which is filled with the CRC in
the final link stage.
/Ulf
I like that. Thanks for doing that work.
Is there also a way to get the length of the final link, and insert it
near the beginning of the image? I suppose that would be another kind
of DIGEST where the algorithm is simply (to - from). (I assume that
"to" and "from" may be linker symbols.)
  app_size = .;
  LONG(to-from);
should work using the GNU linker.
On 28/04/2023 10:56, Ulf Samuelsson wrote:
Den 2023-04-28 kl. 09:24, skrev David Brown:
On 27/04/2023 18:26, Ulf Samuelsson wrote:
Den 2023-04-20 kl. 04:06, skrev Rick C:
This is a bit of the chicken and egg thing. If you want to embed a checksum in a code module to report the checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
The proper way to do this is to have a directive in the linker.
This reserves space for the CRC and defines the area where the CRC
is calculated.
I am not aware of any linker which support this.
Two months ago, I added the DIGEST directive to binutils aka the GNU
linker. It was committed, but then people realized that I had not
signed
an agreement with Free Software Foundation.
Since part of the code I pushed was from a third party which released their code under MIT, the licensing has not been resolved yet
but the patch is in binutils git, but reverted.
You would write (IIRC):
   DIGEST "CRC64-ECMA", (from, to)
and the linker would reserve 8 bytes which is filled with the CRC in
the final link stage.
/Ulf
I like that. Thanks for doing that work.
Is there also a way to get the length of the final link, and insert
it near the beginning of the image? I suppose that would be another
kind of DIGEST where the algorithm is simply (to - from). (I assume
that "to" and "from" may be linker symbols.)
   app_size = .;
   LONG(to-from);
should work using the GNU linker.
Will that work when placed earlier in the link than the definition of
"to" ? I had assumed - perhaps completely incorrectly - that the linker would have to have established the value of "to" before its use in such
an expression.
On 28/04/2023 10:50, Ulf Samuelsson wrote:
Den 2023-04-28 kl. 09:38, skrev David Brown:
On 27/04/2023 22:44, Ulf Samuelsson wrote:
Den 2023-04-27 kl. 20:29, skrev Niklas Holsti:
On 2023-04-27 20:09, Rick C wrote:
You are making a lot of assumptions about the tools. I'm pretty sure they don't apply to my case. I'm not at all clear how this is workable, anyway. Adding the checksum to the file, changes the checksum, which is where this conversation started... unless I'm missing something significant.
No, you reserve room for the checksum, but that needs to be outside the checked area.
The address of the checksum needs to be known to the application.
The address here could have a symbol, and then declared "extern" in
the C code - it would not have to be a known numerical address. But
if the image is checked or started from another program (such as a
boot program), you need an absolute address somewhere to chain this
all together.
The header is declared as a struct.
Also the limits of the checked area.
That is why the application has a header in front in my projects.
The application is started by the bootloader, which checks
a number of things before the application is started.
The application can read the header as well to allow checking
the code area at runtime.
Or for my preferences, the CRC "DIGEST" would be put at the end of
the image, rather than near the start. Then the "from, to" range
would cover the entire image except for the final CRC. But I'd have
a similar directive for the length of the image at a specific area
near the start.
I really do not see a benefit of splitting the meta information about
the image to two separate locations.
The bootloader uses the struct for all checks.
It is a much simpler implementation once the tools support it.
You might find it easier to write a tool which adds the CRC at the
end, but that is a different issue.
Occam's Razor!
There are different needs for different projects - and more than one way
to handle them. I find adding a CRC at the end of the image works best
for me, but I have no problem appreciating that other people have
different solutions.
Den 2023-04-28 kl. 15:04, skrev David Brown:
On 28/04/2023 10:50, Ulf Samuelsson wrote:
Den 2023-04-28 kl. 09:38, skrev David Brown:
I'd be curious to know WHY it works best for you.
Or for my preferences, the CRC "DIGEST" would be put at the end of
the image, rather than near the start. Then the "from, to" range
would cover the entire image except for the final CRC. But I'd have
a similar directive for the length of the image at a specific area
near the start.
I really do not see a benefit of splitting the meta information about
the image to two separate locations.
The bootloader uses the struct for all checks.
It is a much simpler implementation once the tools support it.
You might find it easier to write a tool which adds the CRC at the
end, but that is a different issue.
Occam's Razor!
There are different needs for different projects - and more than one
way to handle them. I find adding a CRC at the end of the image works
best for me, but I have no problem appreciating that other people have
different solutions.
/Ulf
On 24/04/2023 09:32, Don Y wrote:
On 4/22/2023 7:57 AM, David Brown wrote:
However, in almost every case where CRC's might be useful, you have additional checks of the sanity of the data, and an all-zero or all-one data block would be rejected. For example, Ethernet packets use CRC for integrity checking, but an attempt to send a packet type 0 from MAC address 00:00:00:00:00:00 to address 00:00:00:00:00:00, of length 0, would be rejected anyway.
Why look at "data" -- which may be suspect -- and *then* check its CRC? Run the CRC first. If it fails, decide how you are going to proceed or recover.
That is usually the order, yes. Sometimes you want "fail fast", such as dropping a packet that was not addressed to you (it doesn't matter if it was received correctly but for someone else, or it was addressed to you but the receiver address was corrupted - you are dropping the packet either way). But usually you will run the CRC then look at the data.
But the order doesn't matter - either way, you are still checking for valid data, and if the data is invalid, it does not matter if the CRC only passed by luck or by all zeros.
You're assuming the CRC is supposed to *vouch* for the data.
The CRC can be there simply to vouch for the *transport* of a
datagram.
I am assuming that the CRC is there to determine the integrity of the data in the face of possible unintentional errors. That's what CRC checks are for. They have nothing to do with the content of the data, or the type of the data package or image.
As an example of the use of CRC's in messaging, look at Ethernet frames:
<https://en.wikipedia.org/wiki/Ethernet_frame>
The CRC does not care about the content of the data it protects.
So, use a version-specific CRC on the packet. If it fails, then
either the data in the packet has been corrupted (which could just
as easily have involved an embedded "interface version" parameter);
or the packet was formed with the wrong CRC.
If the CRC is correct FOR THAT VERSION OF THE PROTOCOL, then
why bother looking at a "protocol version" parameter? Would
you ALSO want to verify all the rest of the parameters?
I'm sorry, I simply cannot see your point. Identifying the version of a protocol, or other protocol type information, is a totally orthogonal task to ensuring the integrity of the data. The concepts should be handled separately.
What term would you have me use to indicate a "bias" applied to a CRC
algorithm?
Well, first I'd note that any kind of modification to the basic CRC algorithm is pointless from the viewpoint of its use as an integrity check. (There have been, mostly historically, some justifications in terms of implementation efficiency. For example, bit and byte re-ordering could be done to suit hardware bit-wise implementations.)
Otherwise I'd say you are picking a specific initial value if that is what you are doing, or modifying the final value (inverting it or xor'ing it with a fixed value). There is, AFAIK, no specific terms for these - and I don't see any benefit in having one. Misusing the term "salt" from cryptography is certainly not helpful.
Salt just ensures that you can differentiate between functionally identical values. I.e., in a CRC, it differentiates between the "0x0000" that CRC-1 generates from the "0x0000" that CRC-2 generates.
Can we agree that this is called an "initial value", not "salt" ?
You don't see the parallel to ensuring that *my* use of "Passw0rd" is
encoded in a different manner than *your* use of "Passw0rd"?
No. They are different things.
An important difference is that adding "salt" to a password hash is an important security feature. Picking a different initial value for a CRC instead of having appropriate protocol versioning in the data (or a surrounding
envelope) is a misfeature.
The second difference is the purpose of the hashing. The CRC here is for data
integrity - spotting mistakes in the data during transfer or storage. The hash
in a password is for security, avoiding the password ever being transmitted or
stored in plain text.
Any coincidence in the the way these might be implemented is just that - coincidence.
See the RMI desciption.
I'm sorry, I have no idea what "RMI" is or where it is described. You've mentioned that abbreviation twice, but I can't figure it out.
<https://en.wikipedia.org/wiki/RMI>
<https://en.wikipedia.org/wiki/OCL>
Nothing magical with either term.
I looked up RMI on Wikipedia before asking, and saw nothing of relevance to CRC's or checksums.
I noticed no mention of "OCL" in your posts, and looking
---8<---8<---
I can't think of any use-cases where you would be passing around a block of
"pure" data that could reasonably take absolutely any value, without any
type of "envelope" information, and where you would think a CRC check is
appropriate.
I append a *version specific* CRC to each packet of marshalled data
in my RMIs. If the data is corrupted in transit *or* if the
wrong version API ends up targeted, the operation will abend
because we know the data "isn't right".
Using a version-specific CRC sounds silly. Put the version information in
the packet.
The packet routed to a particular interface is *supposed* to
conform to "version X" of an interface. There are different stubs
generated for different versions of EACH interface. The OCL for
the interface defines (and is used to check) the form of that
interface to that service/mechanism.
The parameters are checked on the client side -- why tie up the
transport medium with data that is inappropriate (redundant)
to THAT interface? Why tie up the server verifying that data?
The stub generator can perform all of those checks automatically
and CONSISTENTLY based on the OCL definition of that version
of that interface (because developers make mistakes).
So, at the instant you schedule the marshalled data for transmission,
you *know* the parameters are "appropriate" and compliant with
the constraints of THAT version of THAT interface.
Now, you have to ensure the packet doesn't get corrupted (altered) in transmission. If it remains intact, then there is no need to check
the parameters on the server side.
NONE OF THE PARAMETERS... including the (implied) "interface version" field!
Yet, folks make mistakes. So, you want some additional reassurance
that this is at least intended for this version of the interface,
ESPECIALLY IF THAT CAN BE MADE AVAILABLE FOR ZERO COST (i.e., check
to see if the residual is 0xDEADBEEF instead of 0xB16B00B5).
Why burden the packet with a "protocol version" parameter?
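The "zero cost" version check described above can be sketched as follows. This is an illustrative Python sketch, not the poster's actual implementation: it uses `zlib.crc32`'s starting-value parameter as the version-specific seed, and the `make_packet`/`check_packet` helpers are hypothetical names.

```python
import zlib

def make_packet(payload: bytes, version_seed: int) -> bytes:
    """Append a CRC computed with a version-specific starting value."""
    crc = zlib.crc32(payload, version_seed) & 0xFFFFFFFF
    return payload + crc.to_bytes(4, "big")

def check_packet(packet: bytes, version_seed: int) -> bool:
    """Recompute with the receiver's expected seed.  A mismatch means
    the data was corrupted *or* the packet was built for a different
    interface version -- either way, the packet is rejected."""
    payload, rx_crc = packet[:-4], int.from_bytes(packet[-4:], "big")
    return (zlib.crc32(payload, version_seed) & 0xFFFFFFFF) == rx_crc

pkt = make_packet(b"marshalled args", 1)
assert check_packet(pkt, 1)      # right version: passes
assert not check_packet(pkt, 2)  # wrong version: flagged, at no extra cost
```

Because the CRC register update is invertible, two different seeds are guaranteed to produce different CRCs over the same payload, so a wrong-version packet always fails the check without carrying an explicit version field.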
Looking "OCL" up on Wikipedia gives no clues either.
So for now, I'll assume you don't want anyone to know what you meant, and I can safely ignore anything you write in connection with the terms.
OTOH, "salting" the calculation so that it is expected to yield
a value of 0x13 means *those* situations will be flagged as errors
(and a different set of situations will sneak by, undetected).
And that gives you exactly /zero/ benefit.
See above.
I did. Zero benefit.
Actually, it is worse than useless - it makes it harder to identify the protocol, and reduces the information content of the CRC check.
You run your hash algorithm, and check for the single value that indicates no errors. It does not matter if that number is 0, 0x13, or - often more
-----------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
As you've admitted, it doesn't matter. So, why wouldn't I opt to have
an algorithm for THIS interface give me a result that is EXPECTED
for this protocol? What value picking "0"?
A /single/ result does not matter (other than needlessly complicating things).
Having multiple different valid results /does/ matter.
That is why you need to distinguish between the two possibilities. If you don't have to worry about malicious attacks, a 32-bit CRC takes a dozen lines of C code and a 1 KB table, all running extremely efficiently. If security is an issue, you need digital signatures - an RSA-based signature system is orders of magnitude more effort in both development time and in run time.
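The "dozen lines of code and a 1 KB table" refers to the classic table-driven CRC-32. A sketch in Python rather than C for brevity, but the structure (256-entry lookup table of 32-bit values = 1 KB, byte-at-a-time update) is the same; the result is checked against zlib's implementation:

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial (the one zlib/Ethernet use)

# Build the 256-entry lookup table -- 256 * 4 bytes = the "1 KB table".
TABLE = []
for n in range(256):
    c = n
    for _ in range(8):
        c = (c >> 1) ^ POLY if c & 1 else c >> 1
    TABLE.append(c)

def crc32(data: bytes) -> int:
    crc = 0xFFFFFFFF                       # initial value
    for byte in data:
        crc = (crc >> 8) ^ TABLE[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF                # final XOR

# Agrees with zlib, and with the standard check value for "123456789":
assert crc32(b"123456789") == zlib.crc32(b"123456789") == 0xCBF43926
```

The table-driven form processes one byte per table lookup instead of one bit per loop iteration, which is why it runs so efficiently even on small targets.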
It's considerably more expensive AND not fool-proof -- esp if the
attacker knows you are signing binaries. "OK, now I need to find
WHERE the signature is verified and just patch that "CALL" out
of the code".
I'm not sure if that is a straw-man argument, or just showing your ignorance of the topic. Do you really think security checks are done by the program you are trying to send securely? That would be like trying to have building security where people entering the building look at their own security cards.
Do YOU really think we all design applications that run in PCs where some
CLOSED OS performs these tests in a manner that can't be subverted?
Do you bother to read my posts at all? Or do you prefer to make up things that
you imagine I write, so that you can make nonsensical attacks on them? Certainly there is no sane reading of my posts (written and sent from an /open/
OS) where "do not rely on security by obscurity" could be taken to mean "rely on obscured and closed platforms".
*WE* (tend to) write ALL the code in the products developed, here.
So, whether it's the POST WE wrote that is performing the test or
the loader WE wrote, it's still *our* program.
Yes, we ARE looking at our own security cards!
Manufacturers *try* to hide ("obscurity") details of these mechanisms
in an attempt to improve effective security. But, there's nothing
that makes these guarantees.
Why are you trying to "persuade" me that manufacturer obscurity is a bad thing? You have been promoting obscurity of algorithms as though it were helpful for security - I have made clear that it is not. Are you getting your
own position mixed up with mine?
Give me the sources for Windows (Linux, *BSD, etc.) and I can
subvert all the state-of-the-art digital signing used to ensure
binaries aren't altered. Nothing *outside* the box is involved
so, by definition, everything I need has to reside *in* the box.
No, you can't. The sources for Linux and *BSD /are/ all freely available. The
private signing keys used by, for example, Red Hat or Debian, are /not/ freely
available. You cannot make changes to a Red Hat or Debian package that will pass the security checks - you are unable to sign the packages.
This is precisely because something /outside/ the box /is/ involved - the private half of the public/private key used for signing. The public half - and
all the details of the algorithms - is easily available to let people verify the signature, but the private half is kept secret.
(Sorry, but I've skipped and snipped the rest. I simply don't have time to go
through it in detail. If others find it useful or interesting, that's great,
but there has to be limits somewhere.)
On 4/24/2023 7:37 AM, David Brown wrote:
On 24/04/2023 09:32, Don Y wrote:
On 4/22/2023 7:57 AM, David Brown wrote:
However, in almost every case where CRC's might be useful, you
have additional checks of the sanity of the data, and an all-zero
or all-one data block would be rejected. For example, Ethernet
packets use CRC for integrity checking, but an attempt to send a
packet type 0 from MAC address 00:00:00:00:00:00 to address
00:00:00:00:00:00, of length 0, would be rejected anyway.
Why look at "data" -- which may be suspect -- and *then* check its
CRC?
Run the CRC first. If it fails, decide how you are going to proceed >>>>> or recover.
That is usually the order, yes. Sometimes you want "fail fast",
such as dropping a packet that was not addressed to you (it doesn't
matter if it was received correctly but for someone else, or it was
addressed to you but the receiver address was corrupted - you are
dropping the packet either way). But usually you will run the CRC
then look at the data.
But the order doesn't matter - either way, you are still checking
for valid data, and if the data is invalid, it does not matter if
the CRC only passed by luck or by all zeros.
You're assuming the CRC is supposed to *vouch* for the data.
The CRC can be there simply to vouch for the *transport* of a
datagram.
I am assuming that the CRC is there to determine the integrity of the
data in the face of possible unintentional errors. That's what CRC
checks are for. They have nothing to do with the content of the data,
or the type of the data package or image.
Exactly. And, a CRC on *a* protocol can use ANY ALGORITHM that the protocol defines. Not some "canned one-size fits all" approach.
As an example of the use of CRC's in messaging, look at Ethernet frames:
<https://en.wikipedia.org/wiki/Ethernet_frame>
The CRC does not care about the content of the data it protects.
AND, if the packet yielded an incorrect CRC, you can assume the
data was corrupt... OR, you are looking at a different protocol
and MISTAKING it for something that you *think* it might be.
If I produce a stream of data, can you tell me what the checksum
for THAT stream *should* be? You have to either be told what
it is (and have a way of knowing what the checksum SHOULD be)
*or* have to make some assumptions about it.
If you have assumed wrong *or* if the data has been corrupt, then
the CRC should fail. You don't care why it failed -- because you
can't do anything about it. You just know that you can't use the data
in the way you THOUGHT it could be used.
So, use a version-specific CRC on the packet. If it fails, then
either the data in the packet has been corrupted (which could just
as easily have involved an embedded "interface version" parameter);
or the packet was formed with the wrong CRC.
If the CRC is correct FOR THAT VERSION OF THE PROTOCOL, then
why bother looking at a "protocol version" parameter? Would
you ALSO want to verify all the rest of the parameters?
I'm sorry, I simply cannot see your point. Identifying the version of
a protocol, or other protocol type information, is a totally
orthogonal task to ensuring the integrity of the data. The concepts
should be handled separately.
It is. A packet using protocol XYZ is delivered to port ABC.
Port ABC *only* handles protocol XYZ. Anything else arriving there,
with a potentially different checksum, is invalid. Even if, for example, byte number 27 happens to have the correct "magic number" for that
protocol.
Because the message doesn't obey the rules defined by the protocol
FOR THAT PORT. What do I gain by insisting that byte number 27 must
be 0x5A that the CRC doesn't already tell me?
Salt just ensures that you can differentiate between functionally identical values. I.e., in a CRC, it differentiates between the "0x0000" that CRC-1 generates from the "0x0000" that CRC-2 generates.
Can we agree that this is called an "initial value", not "salt" ?
It depends on how you implement it. The point is to produce different results for the same polynomial.
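A minimal illustration of that point, assuming CRC-32 as the shared polynomial and using zlib's starting-value parameter to stand in for the "initial value":

```python
import zlib

data = b"the same message"

# Same polynomial (CRC-32), two different starting values:
crc_a = zlib.crc32(data, 0x00000000) & 0xFFFFFFFF
crc_b = zlib.crc32(data, 0x13000000) & 0xFFFFFFFF

# The results always differ, so two protocols sharing the polynomial
# but using different initial values will not accept each other's
# packets -- the property being argued over in this thread.
assert crc_a != crc_b
```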
You don't see the parallel to ensuring that *my* use of "Passw0rd" is
encoded in a different manner than *your* use of "Passw0rd"?
No. They are different things.
An important difference is that adding "salt" to a password hash is an
important security feature. Picking a different initial value for a
CRC instead of having appropriate protocol versioning in the data (or
a surrounding envelope) is a misfeature.
And you don't see that verifying that a packet of data received at
port ABC that should only see the checksum associated with protocol
XYZ as being similarly related?
See the RMI description.
I'm sorry, I have no idea what "RMI" is or where it is described.
You've mentioned that abbreviation twice, but I can't figure it out.
<https://en.wikipedia.org/wiki/RMI>
<https://en.wikipedia.org/wiki/OCL>
Nothing magical with either term.
I looked up RMI on Wikipedia before asking, and saw nothing of
relevance to CRC's or checksums.
I noticed no mention of "OCL" in your posts, and looking
You need to read more carefully.
Give me the sources for Windows (Linux, *BSD, etc.) and I can
subvert all the state-of-the-art digital signing used to ensure
binaries aren't altered. Nothing *outside* the box is involved
so, by definition, everything I need has to reside *in* the box.
No, you can't. The sources for Linux and *BSD /are/ all freely
available. The private signing keys used by, for example, Red Hat or
Debian, are /not/ freely available. You cannot make changes to a Red
Hat or Debian package that will pass the security checks - you are
unable to sign the packages.
Sure I can! If you are just signing a package to verify that it hasn't
been tampered with BUT THE CONTENTS ARE NOT ENCRYPTED, then all you have
to do is remove the signature check -- leaving the signature in the (unchecked) executable.
Programming a flash memory can flip bits in parts of the flash memory which are not being programmed.
Bit errors can also be introduced by radiation.
Some applications require better security than others.
Functional Safety may require CRC size based on code size.
Give me the sources for Windows (Linux, *BSD, etc.) and I can
subvert all the state-of-the-art digital signing used to ensure
binaries aren't altered. Nothing *outside* the box is involved
so, by definition, everything I need has to reside *in* the box.
No, you can't. The sources for Linux and *BSD /are/ all freely available. The private signing keys used by, for example, Red Hat or Debian, are /not/ freely available. You cannot make changes to a Red Hat or Debian package that will pass the security checks - you are unable to sign the packages.
Sure I can! If you are just signing a package to verify that it hasn't
been tampered with BUT THE CONTENTS ARE NOT ENCRYPTED, then all you have
to do is remove the signature check -- leaving the signature in the
(unchecked) executable.
Woah, you /really/ don't understand this stuff, do you? Here's a clue - ask yourself what is being signed, and what is doing the checking.
Perhaps also ask yourself if /all/ the people involved in security for Linux or BSD - all the companies such as Red Hat, IBM, Intel, etc. - have got it wrong, and only /you/ realise that digital signatures on open source software are useless?
/Very/ occasionally, there is a lone genius that
understands something while all the other experts are wrong - but in most cases, the loner is the one that is wrong.