What is a good hash function for pointers to use in portable ANSI C?
On 23/05/2024 23:49, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
What is a good hash function for pointers to use in portable
ANSI C?
I have a preliminary question. Do you really mean ANSI C, or
is C99 acceptable?
C89 is better.
But the pass has been sold.
On 24/05/2024 00:52, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 23/05/2024 23:49, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
What is a good hash function for pointers to use in portable
ANSI C?
I have a preliminary question. Do you really mean ANSI C, or
is C99 acceptable?
C89 is better.
But the pass has been sold.
I'm not asking which you think is better. I'm asking about
what your requirements are.
C 89.
I don't want to pull in C99 types and so on just for a hash function.
On 5/23/2024 5:32 PM, Chris M. Thomasson wrote:
On 5/23/2024 4:11 AM, Malcolm McLean wrote:
What is a good hash function for pointers to use in portable ANSI C?
The pointers are nodes of a tree, which are read only, and I want to
associate read/write data with them. So potentially a large number of
pointers, and they might be consecutively ordered if they are taken
from an array, or they might be returned from repeated calls to
malloc() with small allocations. Obviously I have no control over
pointer size or internal representation.
I kind of remember an old one for 32-bit pointers. It was something
like multiply by a prime and shift to the right by 7 or 8. Shit, I
think. It was way back on comp.programming.threads.
There was an old paper from microsoft that had a pointer hash. It was
used to index into a table of locks. I need to find it.
On 24/05/2024 00:52, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 23/05/2024 23:49, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
What is a good hash function for pointers to use in portable
ANSI C?
I have a preliminary question. Do you really mean ANSI C, or
is C99 acceptable?
C89 is better.
But the pass has been sold.
I'm not asking which you think is better. I'm asking about
what your requirements are.
C 89.
I don't want to pull in C99 types and so on just for a hash function.
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 24/05/2024 00:52, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 23/05/2024 23:49, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
What is a good hash function for pointers to use in portable
ANSI C?
I have a preliminary question. Do you really mean ANSI C, or
is C99 acceptable?
C89 is better.
But the pass has been sold.
I'm not asking which you think is better. I'm asking about
what your requirements are.
C 89.
I don't want to pull in C99 types and so on just for a hash function.
In that case I think you are stuck with using a half-baked
solution. The standard integer types available in C89 just
aren't a good fit in a 64-bit world.
On 24/05/2024 02:39, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 24/05/2024 00:52, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 23/05/2024 23:49, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
What is a good hash function for pointers to use in portable
ANSI C?
I have a preliminary question. Do you really mean ANSI C, or
is C99 acceptable?
C89 is better.
But the pass has been sold.
I'm not asking which you think is better. I'm asking about
what your requirements are.
C 89.
I don't want to pull in C99 types and so on just for a hash
function.
In that case I think you are stuck with using a half-baked
solution. The standard integer types available in C89 just
aren't a good fit in a 64-bit world.
I assume the C89 implementation is one that can target current 64
bit machines.
Then char, short, int, long long will almost certainly have widths
of 8, 16, 32 and 64 bits respectively.
(I don't know if 'long long' was part of C89, but it sounds like
Malcolm just doesn't want to be bothered with stdint.h, and any
compiler used is likely to support it.)
On 24/05/2024 14:18, Tim Rentsch wrote:
bart <bc@freeuk.com> writes:
cJSON.c is C89. And these additions to the resource compiler were
inspired by a mention of cJSON.c here.
So a C89 hash for a pointer to an unsigned int would be ideal. However
it might be impossible to write one which is both efficient in terms of
machine instructions and a good hash function in that it distributes the
hashes evenly given an uneven distribution of keys. And pointers
returned from repeated calls to malloc() are going to have an odd
distribution of values.
On 24/05/2024 02:39, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 24/05/2024 00:52, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 23/05/2024 23:49, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
What is a good hash function for pointers to use in portable
ANSI C?
I have a preliminary question. Do you really mean ANSI C, or
is C99 acceptable?
C89 is better.
But the pass has been sold.
I'm not asking which you think is better. I'm asking about
what your requirements are.
C 89.
I don't want to pull in C99 types and so on just for a hash function.
In that case I think you are stuck with using a half-baked
solution. The standard integer types available in C89 just
aren't a good fit in a 64-bit world.
I assume the C89 implementation is one that can target current 64 bit machines.
Then char, short, int, long long will almost certainly have widths of 8,
16, 32 and 64 bits respectively.
(I don't know if 'long long' was part of C89, but it sounds like Malcolm
just doesn't want to be bothered with stdint.h, and any compiler used is
likely to support it.)
So if we use uintptr_t the code might break.
And if we use unsigned long we have to assume maximum 32 bits.
However it's unlikely anyone will have an xml file with more than a
billion nodes.
On 24/05/2024 19:28, Bonita Montero wrote:
Am 23.05.2024 um 13:11 schrieb Malcolm McLean:
What is a good hash function for pointers to use in portable ANSI C?
The pointers are nodes of a tree, which are read only, and I want to
associate read/write data with them. So potentially a large number of
pointers, and they might be consecutively ordered if they are taken
from an array, or they might be returned from repeated calls to
malloc() with small allocations. Obviously I have no control over
pointer size or internal representation.
Use FNV.
Here's an attempt.
/* FNV hash of a pointer */
static unsigned int hash(void *address)
{
int i;
unsigned long answer = 2166136261;
unsigned char *byte = (unsigned char *) &address;
for (i = 0; i < sizeof(void *); i++)
{
answer *= 16777619;
answer ^= byte[i];
}
return (unsigned int) (answer & 0xFFFFFFFF);
}
Now what will compilers make of that?
On 24/05/2024 19:57, Malcolm McLean wrote:
On 24/05/2024 19:28, Bonita Montero wrote:
Am 23.05.2024 um 13:11 schrieb Malcolm McLean:
What is a good hash function for pointers to use in portable ANSI C?
The pointers are nodes of a tree, which are read only, and I want
to associate read/write data with them. So potentially a large
number of pointers, and they might be consecutively ordered if they
are taken from an array, or they might be returned from repeated
calls to malloc() with small allocations. Obviously I have no
control over pointer size or internal representation.
Use FNV.
Here's an attempt.
/* FNV hash of a pointer */
static unsigned int hash(void *address)
{
int i;
unsigned long answer = 2166136261;
unsigned char *byte = (unsigned char *) &address;
for (i = 0; i < sizeof(void *); i++)
{
answer *= 16777619;
answer ^= byte[i];
}
return (unsigned int) (answer & 0xFFFFFFFF);
}
Now what will compilers make of that?
Compiler, or performance?
I tried this function with the test program shown below. I used it to populate a hash table of 64K entries with pointers from successive
calls to malloc.
Results, in terms of clashes, for different numbers N of entries
populated out of 64K were:
10000 1100
30000 12000
50000 67000
60000 216000
65535 5500000 (largest N possible)
Results were rather variable as malloc produces different patterns of
pointers on different runs. These were simply the results from the
first run.
Was this good? I had no idea. But as a comparison, I used my own hash
function, normally used to hash identifiers, shown below the main
program as the function 'bchash'.
If this is substituted instead, the results were:
10000 230
30000 3800
50000 10300
60000 50300
65535 2700000
Hash tables need a certain amount of free capacity to stay efficient,
so 3/4 full (about 50K/64K) is about right.
Again I don't know if these figures are good, they are just better
than from your hash() function, for the inputs I used, within this
test, and for this size of table.
No doubt there are much better ones.
------------------------------------------
#include <stdio.h>
#include <stdlib.h>
static unsigned int hash(void *address)
{
int i;
unsigned long answer = 2166136261;
unsigned char *byte = (unsigned char *) &address;
for (i = 0; i < sizeof(void *); i++)
{
answer *= 16777619;
answer ^= byte[i];
}
return (unsigned int) (answer & 0xFFFFFFFF);
}
void* table[65536];
int main(void) {
void* p;
int clashes=0, wrapped;
int j;
for (int i=0; i<30000; ++i) {
p = malloc(1);
j = hash(p) & 65535;
wrapped=0;
while (table[j]) {
++clashes;
++j;
if (j>65535) {
if (wrapped) { puts("Table full"); exit(1);}
wrapped=1;
j=0;
}
}
table[j] = p;
}
printf("Clashes %d\n", clashes);
}
------------------------------------------
static unsigned int bchash(void *address)
{
int i;
unsigned long hsum = 0;
unsigned char *byte = (unsigned char *) &address;
for (i = 0; i < sizeof(void *); i++) {
hsum = (hsum<<4) - hsum + byte[i];
}
return (hsum<<5) - hsum;
}
On 24/05/2024 14:18, Tim Rentsch wrote:
bart <bc@freeuk.com> writes:
On 24/05/2024 02:39, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 24/05/2024 00:52, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 23/05/2024 23:49, Tim Rentsch wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
What is a good hash function for pointers to use in portable
ANSI C?
I have a preliminary question. Do you really mean ANSI C, or
is C99 acceptable?
C89 is better.
But the pass has been sold.
I'm not asking which you think is better. I'm asking about
what your requirements are.
C 89.
I don't want to pull in C99 types and so on just for a hash
function.
In that case I think you are stuck with using a half-baked
solution. The standard integer types available in C89 just
aren't a good fit in a 64-bit world.
I assume the C89 implementation is one that can target current 64
bit machines.
Then char, short, int, long long will almost certainly have widths
of 8, 16, 32 and 64 bits respectively.
C89 doesn't have long long.
(I don't know if 'long long' was part of C89, but it sounds like
Malcolm just doesn't want to be bothered with stdint.h, and any
compiler used is likely to support it.)
What he said was C89. He didn't mention stdint.h. I take
him at his word. If what he wants is something different,
he should say clearly what it is, and not make people guess
about it. (To be clear this recommendation is intended for
every questioner, not just Malcolm.)
cJSON.c is C89. And these additions to the resource compiler were
inspired by a mention of cJSON.c here.
So a C89 hash for a pointer to an unsigned int would be ideal. However
it might be impossible to write one which is both efficient in terms
of machine instructions and a good hash function in that it
distributes the hashes evenly given an uneven distribution of
keys. And pointers returned from repeated calls to malloc() are going
to have an odd distribution of values.
bart <bc@freeuk.com> writes:
It looks like your hash function was tuned for this testing
setup. With different choices for testing it does much
worse.
The testing is done with malloc() blocks all of the same
requested size, and that size is 1. Typical workloads
are likely to be both larger and more variable.
When adding a new entry finds a collision with an old
entry, linear probing is used to find a free slot. It's
well understood that linear probing suffers badly as the
load factor goes up. Better to take a few high bits of
the hash value -- as few as five or six is fine -- to
have the reprobe sequences have different strides.
Your hash function is expensive to compute, moreso even
than the "FNV" function shown earlier.
this one where the compares are cheap, it's better to
have a dumb-but-fast hash function that might need a
few more looks to find an open slot, because the cost
of looking is so cheap compared to computing the hash
function.
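For illustration, a probe loop along the lines Tim describes might look
like this (a sketch only, assuming a 65536-slot power-of-two table and a
32-bit hash value; the names are invented for the example):

#include <stddef.h>

#define TABLE_SIZE 65536u   /* power of two */

/* Open addressing where the reprobe stride comes from the top six bits
   of the hash, forced odd so it is coprime with the table size and
   every slot is eventually visited (assumes the table is never
   completely full). */
static unsigned long find_slot(void *table[], unsigned long h, void *key)
{
    unsigned long index  = h & (TABLE_SIZE - 1u);
    unsigned long stride = ((h >> 26) & 0x3Fu) | 1u;

    while (table[index] != NULL && table[index] != key)
        index = (index + stride) & (TABLE_SIZE - 1u);
    return index;
}

The point is only that colliding keys then walk different probe
sequences instead of piling up in one run of consecutive slots.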
Am 25.05.2024 um 11:12 schrieb Tim Rentsch:
Your hash function is expensive to compute, moreso even
than the "FNV" function shown earlier. In a case like
this one where the compares are cheap, it's better to
have a dumb-but-fast hash function that might need a
few more looks to find an open slot, because the cost
of looking is so cheap compared to computing the hash
function.
A (size_t)pointer * LARGE_PRIME can be sufficient,
ignoring the overflow.
On 25/05/2024 10:12, Tim Rentsch wrote:
bart <bc@freeuk.com> writes:
[...]
It looks like your hash function was tuned for this testing
setup. With different choices for testing it does much
worse.
It wasn't tuned at all; it was a copy of what I've long used
within my lexers.
It is designed to give a good spread when inputs are similar or
change sequentially. (Well, not exactly designed; more trial
and error.)
The testing is done with malloc() blocks all of the same
requested size, and that size is 1. Typical workloads
are likely to be both larger and more variable.
When adding a new entry finds a collision with an old
entry, linear probing is used to find a free slot. It's
well understood that linear probing suffers badly as the
load factor goes up. Better to take a few high bits of
the hash value -- as few as five or six is fine -- to
have the reprobe sequences have different strides.
Your hash function is expensive to compute, moreso even
than the "FNV" function shown earlier.
You mean Malcolm's version? That also uses a loop. Neither
need to process all 8 bytes of a 64-bit pointer; 6 will do, and
probably just 4.
And such a loop can be trivially unrolled.
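For what it's worth, a hand-unrolled variant of the FNV-style loop
posted earlier might look like this (a sketch, assuming 8-byte
pointers; with smaller pointers the later byte reads would be invalid):

/* Unrolled version of the earlier FNV-style pointer hash
   (multiply, then xor, as in the posted loop). */
static unsigned long hash_unrolled(void *address)
{
    unsigned char *b = (unsigned char *) &address;
    unsigned long h = 2166136261UL;

    h *= 16777619UL; h ^= b[0];
    h *= 16777619UL; h ^= b[1];
    h *= 16777619UL; h ^= b[2];
    h *= 16777619UL; h ^= b[3];
    h *= 16777619UL; h ^= b[4];
    h *= 16777619UL; h ^= b[5];
    h *= 16777619UL; h ^= b[6];
    h *= 16777619UL; h ^= b[7];
    return h;
}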
In a case like
this one where the compares are cheap, it's better to
have a dumb-but-fast hash function that might need a
few more looks to find an open slot, because the cost
of looking is so cheap compared to computing the hash
function.
My first thought in this thread was to just make use of the
pointer value directly. My version was written only to compare
against Malcolm's FNV code.
If I try the other suggestions, RH's and KK's which simply
shift the pointer value (similar to my idea), the results I got
were interesting: sometimes there were zero clashes (almost as
though bits 4 to 19 of the pointer were a perfect hash), but
sometimes there were millions.
But when I tried to mix up the input values, then all methods
seemed to converge to similar results.
In that case you might as well use the simplest and fastest
method.
However it really needs testing within the actual application.
On 25/05/2024 18:40, Tim Rentsch wrote:
Bonita Montero <Bonita.Montero@gmail.com> writes:
Am 25.05.2024 um 11:12 schrieb Tim Rentsch:
Your hash function is expensive to compute, moreso even
than the "FNV" function shown earlier. In a case like
this one where the compares are cheap, it's better to
have a dumb-but-fast hash function that might need a
few more looks to find an open slot, because the cost
of looking is so cheap compared to computing the hash
function.
A (size_t)pointer * LARGE_PRIME can be sufficient,
ignoring the overflow.
Plenty fast but the output quality is poor. I tested
this scheme against four other hash functions, and in
four out of five workloads it was always dead last, by
a noticeable margin.
The lower bits of a pointer are often all zeroes. And multiplying
by any integer will not set them. And that is catastrophic for a
hash.
bart <bc@freeuk.com> writes:
On 25/05/2024 10:12, Tim Rentsch wrote:
bart <bc@freeuk.com> writes:
[...]
It looks like your hash function was tuned for this testing
setup. With different choices for testing it does much
worse.
It wasn't tuned at all; it was a copy of what I've long used
within my lexers.
It is designed to give a good spread when inputs are similar or
change sequentially. (Well, not exactly designed; more trial
and error.)
The trial and error process you mention had the effect of tuning
your function for the testing setup used, whether you meant it to
or not.
The testing is done with malloc() blocks all of the same
requested size, and that size is 1. Typical workloads
are likely to be both larger and more variable.
When adding a new entry finds a collision with an old
entry, linear probing is used to find a free slot. It's
well understood that linear probing suffers badly as the
load factor goes up. Better to take a few high bits of
the hash value -- as few as five or six is fine -- to
have the reprobe sequences have different strides.
Your hash function is expensive to compute, moreso even
than the "FNV" function shown earlier.
You mean Malcolm's version? That also uses a loop. Neither
need to process all 8 bytes of a 64-bit pointer; 6 will do, and
probably just 4.
And such a loop can be trivially unrolled.
Both your function and the function posted by Malcolm are slow.
It's just that yours is slower.
It isn't hard to produce a
hash function of equal quality that runs more than twice as
fast.
Let me tell you a little secret. Hash functions are like random
number generators: good ones never do exceptionally well or
exceptionally poorly. The reason for this is simple - whenever
there are scenarios where one does well then inevitably there are
other scenarios where it does poorly. [...] The ones
that are left are all, with high probability, just fine no matter
what application they are used in. That is not to say there is
no need to test within the intended application, it's always good
to do a sanity check, but the point of the sanity check is just
to make sure the earlier process didn't miss something; it isn't
the right way to compare alternatives.
On 25/05/2024 18:40, Tim Rentsch wrote:
Bonita Montero <Bonita.Montero@gmail.com> writes:
Am 25.05.2024 um 11:12 schrieb Tim Rentsch:
Your hash function is expensive to compute, moreso even
than the "FNV" function shown earlier. In a case like
this one where the compares are cheap, it's better to
have a dumb-but-fast hash function that might need a
few more looks to find an open slot, because the cost
of looking is so cheap compared to computing the hash
function.
A (size_t)pointer * LARGE_PRIME can be sufficient,
ignoring the overflow.
Plenty fast but the output quality is poor. I tested
this scheme against four other hash functions, and in
four out of five workloads it was always dead last, by
a noticeable margin.
The lower bits of a pointer are often all zeroes. And multiplying by any integer will not set them. And that is catastrophic for a hash.
On 25/05/2024 22:13, Janis Papanagnou wrote:
I haven't read all the thread, but if I'd be going to tackle that
I'd first try to map the pointer onto a fitting integral type (if
that's possible; sizeof()), or first "fold" (XOR) the upper/lower
parts of the pointer (so that zero-sequences, if they appear, also
won't affect the outcome too badly as a side effect) to reduce the
size of the data, and finally use any established hash algorithm
on the (folded/reduced) integral type.
Shall the created hash codes also be portable across platforms?
Yes, the Baby X resource compiler should be portable to any hosted
platform with a reasonable up to date C compiler.
But we also need portability in another sense. The hash table is hashing
nodes of an XML document model created by the functions in the file
xmlparser2.c. These make repeated calls to malloc() for small
allocations of nodes. But I have no control over either pointer
representation or the platform's implementation of malloc(). So the
pointers could be at seemingly random locations in memory, or they could
be evenly spaced at increments of 64 bytes, and thus differing in only a
few bits.
Then if someone wants to help out with this project, an obvious thing
to do would be to speed up the xml parser by replacing the calls to
malloc() with a custom allocator.
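For what that replacement might look like, here is a minimal C89 sketch
of a chunked node allocator; the names (xmlpool and so on) are invented
for the example and are not part of xmlparser2.c:

#include <stdlib.h>

/* Chunk ("arena") allocator for small parser nodes: allocations are
   bumped out of large chunks and released all at once when the
   document model is destroyed. */
#define XMLPOOL_CHUNK_BYTES 65536

struct xmlpool_chunk {
    struct xmlpool_chunk *next;
    size_t used;
    unsigned char data[XMLPOOL_CHUNK_BYTES];
};

struct xmlpool {
    struct xmlpool_chunk *chunks;
};

void *xmlpool_alloc(struct xmlpool *pool, size_t size)
{
    struct xmlpool_chunk *c = pool->chunks;

    size = (size + 7) & ~(size_t) 7;      /* keep 8-byte alignment */
    if (size > XMLPOOL_CHUNK_BYTES)
        return NULL;                      /* sketch: no oversize path */
    if (c == NULL || c->used + size > XMLPOOL_CHUNK_BYTES) {
        c = malloc(sizeof *c);
        if (c == NULL)
            return NULL;
        c->next = pool->chunks;
        c->used = 0;
        pool->chunks = c;
    }
    c->used += size;
    return c->data + c->used - size;
}

void xmlpool_destroy(struct xmlpool *pool)
{
    struct xmlpool_chunk *c = pool->chunks, *next;

    while (c != NULL) {
        next = c->next;
        free(c);
        c = next;
    }
    pool->chunks = NULL;
}

Freeing everything in one call also matches the way a document model is
normally torn down.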
On 25/05/2024 19:12, Tim Rentsch wrote:
bart <bc@freeuk.com> writes:
On 25/05/2024 10:12, Tim Rentsch wrote:
bart <bc@freeuk.com> writes:
[...]
It looks like your hash function was tuned for this testing
setup. With different choices for testing it does much
worse.
It wasn't tuned at all; it was a copy of what I've long used
within my lexers.
It is designed to give a good spread when inputs are similar or
change sequentially. (Well, not exactly designed; more trial
and error.)
The trial and error process you mention had the effect of tuning
your function for the testing setup used, whether you meant it to
or not.
The testing is done with malloc() blocks all of the same
requested size, and that size is 1. Typical workloads
are likely to be both larger and more variable.
When adding a new entry finds a collision with an old
entry, linear probing is used to find a free slot. It's
well understood that linear probing suffers badly as the
load factor goes up. Better to take a few high bits of
the hash value -- as few as five or six is fine -- to
have the reprobe sequences have different strides.
Your hash function is expensive to compute, moreso even
than the "FNV" function shown earlier.
You mean Malcolm's version? That also uses a loop. Neither
need to process all 8 bytes of a 64-bit pointer; 6 will do, and
probably just 4.
And such a loop can be trivially unrolled.
Both your function and the function posted by Malcolm are slow.
It's just that yours is slower.
With unoptimised code, mine is up to 50% faster (suggesting an
inherently faster method).
Optimised via gcc -O3, it's 17% slower.
If I duplicate the tests in my language, which has a meagre optimiser,
then mine is 3 times the speed (I haven't looked at why; something
about the MM code that keeps locals out of registers I guess):
            Unopt   gcc -O3   My lang opt (uses 64 bits)
MM hash     2.7     0.34      1.9   seconds
BC hash     1.7     0.4       0.6
There are anyway any number of tweaks that can be made; no need to do
all 64 bits for a start, and the loops can be manually
unrolled. Suggesting it is slower is premature.
It isn't hard to produce a
hash function of equal quality that runs more than twice as
fast.
With or without the aid of an optimising compiler? You can't always
tell with the latter whether it's you who's clever, or the compiler.
Let me tell you a little secret. Hash functions are like random
number generators: good ones never do exceptionally well or
exceptionally poorly. The reason for this is simple - whenever
there are scenarios where one does well then inevitably there are
other scenarios where it does poorly. [...] The ones
that are left are all, with high probability, just fine no matter
what application they are used in. That is not to say there is
no need to test within the intended application, it's always good
to do a sanity check, but the point of the sanity check is just
to make sure the earlier process didn't miss something; it isn't
the right way to compare alternatives.
So what's a good one for use within the identifier lookup of a
compiler's lexer?
Am 25.05.2024 um 19:40 schrieb Tim Rentsch:
Bonita Montero <Bonita.Montero@gmail.com> writes:
Am 25.05.2024 um 11:12 schrieb Tim Rentsch:
Your hash function is expensive to compute, moreso even
than the "FNV" function shown earlier. In a case like
this one where the compares are cheap, it's better to
have a dumb-but-fast hash function that might need a
few more looks to find an open slot, because the cost
of looking is so cheap compared to computing the hash
function.
A (size_t)pointer * LARGE_PRIME can be sufficient,
ignoring the overflow.
Plenty fast but the output quality is poor. ...
If the prime is large enough there's no regular distribution.
Am 26.05.2024 um 18:24 schrieb Tim Rentsch:
Bonita Montero <Bonita.Montero@gmail.com> writes:
Am 25.05.2024 um 19:40 schrieb Tim Rentsch:
Bonita Montero <Bonita.Montero@gmail.com> writes:
Am 25.05.2024 um 11:12 schrieb Tim Rentsch:
Your hash function is expensive to compute, moreso even
than the "FNV" function shown earlier. In a case like
this one where the compares are cheap, it's better to
have a dumb-but-fast hash function that might need a
few more looks to find an open slot, because the cost
of looking is so cheap compared to computing the hash
function.
A (size_t)pointer * LARGE_PRIME can be sufficient,
ignoring the overflow.
Plenty fast but the output quality is poor. ...
If the prime is large enough there's no regular distribution.
Your conjecture is contradicted by empirical facts.
There are no empirical facts for that since two times the
taken prime is beyond the 64 bit address space.
It's not clear to me what the hashing is for on the nodes. What exactly
is the input being looked up, and what is being compared? Because
something sounds off: you don't normally look up a pointer to try and
find it in a table; you already have the pointer pointing at the data!
This shows the perfect equal distribution if I take a 32 bit integer
and multiply it by the largest 32 bit prime:
#include <iostream>
#include <vector>
#include <random>
using namespace std;
int main()
{
vector<uint8_t> buckets( 1ull << 32 );
for( size_t b = buckets.size(); b; )
if( !++buckets[(uint32_t)--b * 0xFFFFFFFBu % buckets.size()] )
return EXIT_FAILURE;
vector<size_t> loads;
for( uint8_t ld : buckets )
{
if( loads.size() <= ld )
loads.resize( ld + 1 );
++loads[ld];
}
for( size_t l = 0; size_t ld : loads )
cout << l++ << ": " << 100.0 * (ptrdiff_t)ld /
(ptrdiff_t)buckets.size() << "%" << endl;
}
Result:
0: 0%
1: 100%
Am 28.05.2024 um 12:07 schrieb Ben Bacarisse:
Bonita Montero <Bonita.Montero@gmail.com> writes:
This shows the perfect equal distribution if I take a 32 bit integer
and multiply it by the largest 32 bit prime:
There was no need to write this program. It just illustrates a specific
example of a basic theorem where the multiplier and modulus are
co-prime.
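For reference, the theorem in question can be stated as

\gcd(a, m) = 1 \;\Longrightarrow\; x \mapsto (a \cdot x) \bmod m
\ \text{is a bijection on } \{0, 1, \ldots, m - 1\}.

With a = 0xFFFFFFFB (a prime, hence coprime to m = 2^32) the map merely
permutes the 2^32 inputs, so a perfectly uniform input necessarily gives
a perfectly uniform output; it says nothing about how clustered or
structured inputs behave.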
For Tim it was necessary.
Am 26.05.2024 um 19:20 schrieb Tim Rentsch:
I say the output quality is poor because I have run tests that
show the poor output quality.
If you choose a prime whose double is beyond 64 bit there's an
equal distribution among the 64 bit modulos.
I've done that with a prime of my own choosing and also with
18446744073709551557, the value you suggested.
Show me your code.
Bonita Montero <Bonita.Montero@gmail.com> writes:
Am 26.05.2024 um 19:20 schrieb Tim Rentsch:
I say the output quality is poor because I have run tests that
show the poor output quality.
If you choose a prime whose double is beyond 64 bit there's an
equal distribution among the 64 bit modulos.
I've done that with a prime of my own choosing and also with
18446744073709551557, the value you suggested.
Show me your code.
Oh get real. It's not my job to make up for your
laziness.
On Thu, 30 May 2024 19:27:48 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Bonita Montero <Bonita.Montero@gmail.com> writes:
Am 26.05.2024 um 19:20 schrieb Tim Rentsch:
I say the output quality is poor because I have run tests that
show the poor output quality.
If you choose a prime whose double is beyond 64 bit there's an
equal distribution among the 64 bit modulos.
I've done that with a prime of my own choosing and also with
18446744073709551557, the value you suggested.
Show me your code.
Oh get real. It's not my job to make up for your
laziness.
So, what were your conclusions?
Ignoring the speed of computation, would something like
cryptographic hash scaled to bucket size be a best hash for
this type of application? Or some sort of less thorough
grinding of the bits is better?
Michael S <already5chosen@yahoo.com> writes:
On Thu, 30 May 2024 19:27:48 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Bonita Montero <Bonita.Montero@gmail.com> writes:
Am 26.05.2024 um 19:20 schrieb Tim Rentsch:
I say the output quality is poor because I have run tests that
show the poor output quality.
If you choose a prime whose double is beyond 64 bit there's an
equal distribution among the 64 bit modulos.
I've done that with a prime of my own choosing and also with
18446744073709551557, the value you suggested.
Show me your code.
Oh get real. It's not my job to make up for your
laziness.
So, what were your conclusions?
Ignoring the speed of computation, would something like
cryptographic hash scaled to bucket size be a best hash for
this type of application? Or some sort of less thorough
grinding of the bits is better?
Factors that matter:
Probably much of the above was obvious. I make no apologies for
that. Also it may seem disjointed or unorganized. If so then
sorry about that chief, it's a big topic and there's a lot of
ground to cover, and I tried to hit the most important
highlights.
Am 02.06.2024 um 09:45 schrieb Michael S:
So, what were your conclusions?
Ignoring the speed of computation, would something like
cryptographic hash scaled to bucket size be a best hash for this
type of application? Or some sort of less thorough grinding of the
bits is better?
There's no need for a crypto-hash here.
Am 03.06.2024 um 16:46 schrieb Michael S:
On Mon, 3 Jun 2024 16:34:37 +0200
Bonita Montero <Bonita.Montero@gmail.com> wrote:
Am 02.06.2024 um 09:45 schrieb Michael S:
So, what were your conclusions?
Ignoring the speed of computation, would something like
cryptographic hash scaled to bucket size be a best hash for this
type of application? Or some sort of less thorough grinding of the
bits is better?
There's no need for a crypto-hash here.
Do you think I don't know?
Crypto hash is just an example of near-ideal pseudo-random uniformity.
As I've shown for pointers you get a perfect equal distribution with multiplying by an appropriate prime.
Am 03.06.2024 um 16:46 schrieb Michael S:
On Mon, 3 Jun 2024 16:34:37 +0200
Bonita Montero <Bonita.Montero@gmail.com> wrote:
Am 02.06.2024 um 09:45 schrieb Michael S:
So, what were your conclusions?
Ignoring the speed of computation, would something like
cryptographic hash scaled to bucket size be a best hash for this
type of application? Or some sort of less thorough grinding of the
bits is better?
There's no need for a crypto-hash here.
Do you think I don't know?
Crypto hash is just an example of near-ideal pseudo-random
uniformity.
As I've shown for pointers you get a perfect equal distribution with multiplying by an appropriate prime.
On 03/06/2024 16:54, Bonita Montero wrote:
Am 03.06.2024 um 16:46 schrieb Michael S:
On Mon, 3 Jun 2024 16:34:37 +0200
Bonita Montero <Bonita.Montero@gmail.com> wrote:
Am 02.06.2024 um 09:45 schrieb Michael S:
So, what were your conclusions?
Ignoring the speed of computation, would something like
cryptographic hash scaled to bucket size be a best hash for this
type of application? Or some sort of less thorough grinding of
the bits is better?
There's no need for a crypto-hash here.
Do you think I don't know?
Crypto hash is just an example of near-ideal pseudo-random
uniformity.
As I've shown for pointers you get a perfect equal distribution with multiplying by an appropriate prime.
A pointer with 8-byte or 16-byte alignment will have the bottom 3-4
bits zero.
No matter what number you multiply them by, prime or otherwise, those
3-4 bits will always be zero.
If you mask the result to fit a table of power-of-two size, then the
resulting index can only ever refer to every 8th or every 16th
slot; there will be 8-16x as many clashes as there ought to be.
So some extra work is needed to get around that, for example
right-shifting before masking as some here have done, something you
have never mentioned.
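Concretely, the fix being referred to is just a shift ahead of the
mask, something along these lines (a sketch, assuming 16-byte malloc
alignment and a power-of-two table):

#include <stddef.h>

/* Drop the alignment bits that are always zero before masking the
   value down to a power-of-two table index. Converting a pointer to
   an integer is implementation-defined in C89. */
static size_t slot_index(void *p, size_t table_size)
{
    size_t v = (size_t) p;
    return (v >> 4) & (table_size - 1);
}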
On Mon, 3 Jun 2024 17:24:36 +0100
bart <bc@freeuk.com> wrote:
On 03/06/2024 16:54, Bonita Montero wrote:
Am 03.06.2024 um 16:46 schrieb Michael S:
On Mon, 3 Jun 2024 16:34:37 +0200
Bonita Montero <Bonita.Montero@gmail.com> wrote:
Am 02.06.2024 um 09:45 schrieb Michael S:
So, what were your conclusions?
Ignoring the speed of computation, would something like
cryptographic hash scaled to bucket size be a best hash for this
type of application? Or some sort of less thorough grinding of
the bits is better?
There's no need for a crypto-hash here.
Do you think I don't know?
Crypto hash is just an example of near-ideal pseudo-random
uniformity.
As I've shown for pointers you get a perfect equal distribution with
multiplying by an appropriate prime.
A pointer with 8-byte or 16-byte alignment will have the bottom 3-4
bits zero.
No matter what number you multiply them by, prime or otherwise, those
3-4 bits will always be zero.
If you mask the result to fit a table of power-of-two size, then the
resulting index can only ever refer to every 8th or every 16th
slot; there will be 8-16x as many clashes as there ought to be.
So some extra work is needed to get around that, for example
right-shifting before masking as some here have done, something you
have never mentioned.
According to my understanding, Bonita and Tim are discussing hash
generator which output is not used as is.
For big enough Hash_max (Bonita suggests 2**63-1), poor quality of
few LS bits of Hash(key) does not matter.
The lower bits of a pointer are often all zeroes. And multiplying by
any integer will not set them. And that is catastrophic for a hash.
They assume that index of
the slot will be calculated as (Hash(key)*bucket_size)/(Hash_max+1).
Tim might, I'm not so sure about BM.
I don't remember seeing that anywhere. But, bucket_size is the size
of the hash-table?
And Hash_max+1 is likely to be a power of two?
I prefer my hash values to be evaluated independently of table size.
They usually don't have a meaningful maximum value, other than the 64
bit of their type, and I'd rather they didn't have sequences of zero
bits especially at the bottom end.
If hash_max was 2**63-1,
say, then dividing by hash_max+1 would
probably give you zero anyway, especially if you first multiplied by
a bucket size that was a power of two, as all the interesting bits
would disappear past the top bit!
On 03/06/2024 18:16, Michael S wrote: [...]
For big enough Hash_max (Bonita suggests 2**63-1), poor quality
of few LS bits of Hash(key) does not matter.
Tim might, [...]
On Mon, 3 Jun 2024 17:24:36 +0100
bart <bc@freeuk.com> wrote:
On 03/06/2024 16:54, Bonita Montero wrote:
Am 03.06.2024 um 16:46 schrieb Michael S:
On Mon, 3 Jun 2024 16:34:37 +0200
Bonita Montero <Bonita.Montero@gmail.com> wrote:
Am 02.06.2024 um 09:45 schrieb Michael S:
So, what were your conclusions?
Ignoring the speed of computation, would something like
cryptographic hash scaled to bucket size be a best hash for this
type of application? Or some sort of less thorough grinding of
the bits is better?
There's no need for a crypto-hash here.
Do you think I don't know?
Crypto hash is just an example of near-ideal pseudo-random
uniformity.
As I've shown for pointers you get a perfect equal distribution with
multiplying by an appropriate prime.
A pointer with 8-byte or 16-byte alignment will have the bottom 3-4
bits zero.
No matter what number you multiply them by, prime or otherwise, those
3-4 bits will always be zero.
If you mask the result to fit a table of size power-of-two, then the
resulting index will can only ever refer to every 8th or every 16th
slot; there will 8-16x as many clashes as there ought to be.
So some extra work is needed to get around that, for example
right-shifting before masking as some here have done, something you
have never mentioned.
According to my understanding, Bonita and Tim are discussing hash
generator which output is not used as is. They assume that index of
the slot will be calculated as (Hash(key)*bucket_size)/(Hash_max+1).
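In code, for a 32-bit hash with Hash_max+1 = 2^32, that scaling is one
widening multiply and a shift, roughly as follows (a sketch; note that
it needs a 64-bit type, so C99 rather than C89):

/* index = (hash * nslots) / 2^32, using a 64-bit intermediate.
   Assumes h32 < 2^32. The high bits of the hash dominate the result
   and the low bits barely matter. */
static unsigned long index_from_hash(unsigned long h32, unsigned long nslots)
{
    return (unsigned long) (((unsigned long long) h32 * nslots) >> 32);
}

That is why, under this scheme, poor quality in a few low bits of the
hash does not matter.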
On Sun, 02 Jun 2024 16:02:05 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Michael S <already5chosen@yahoo.com> writes:
On Thu, 30 May 2024 19:27:48 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
So, what were your conclusions?
Ignoring the speed of computation, would something like
cryptographic hash scaled to bucket size be a best hash for
this type of application? Or some sort of less thorough
grinding of the bits is better?
Factors that matter:
<snip>
Probably much of the above was obvious. I make no apologies for
that. Also it may seem disjointed or unorganized. If so then
sorry about that chief, it's a big topic and there's a lot of
ground to cover, and I tried to hit the most important
highlights.
It sounds like you *postulate* that crypto is an ideal.
I am less interested in axioms and more interested in your experimental
findings.
Michael S <already5chosen@yahoo.com> writes:
I am less interested in axioms and more interested in your experimental
findings.
I'm not sure what you're looking for here.
Bonita Montero <Bonita.Montero@gmail.com> writes:
Am 26.05.2024 um 18:24 schrieb Tim Rentsch:
Bonita Montero <Bonita.Montero@gmail.com> writes:
Am 25.05.2024 um 19:40 schrieb Tim Rentsch:
Bonita Montero <Bonita.Montero@gmail.com> writes:
Am 25.05.2024 um 11:12 schrieb Tim Rentsch:
Your hash function is expensive to compute, moreso even
than the "FNV" function shown earlier. In a case like
this one where the compares are cheap, it's better to
have a dumb-but-fast hash function that might need a
few more looks to find an open slot, because the cost
of looking is so cheap compared to computing the hash
function.
A (size_t)pointer * LARGE_PRIME can be sufficient,
ignoring the overflow.
Plenty fast but the output quality is poor. ...
If the prime is large enough there's no regular distribution.
Your conjecture is contradicted by empirical facts.
There are no empirical facts for that since two times the
taken prime is beyond the 64 bit address space.
You don't listen very well do you?
I say the output quality is poor because I have run tests that
show the poor output quality. I've done that with a prime of my
own choosing and also with 18446744073709551557, the value you
suggested. In both cases the test runs show results that are
clearly worse than all the other hash functions tested, including
bart's and malcolm's. Furthermore the results for your suggested
calculation are worse across the board, on every variety of
dynamic workload in my test suite.
Your proposed hash function is too weak to be taken as a serious
candidate.
On Mon, 03 Jun 2024 18:02:21 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Michael S <already5chosen@yahoo.com> writes:
I am less in axioms and more interested in your experimental
findings.
I'm not sure what you're looking for here.
I'd give an example.
You said that some of the variants had 4x differences between cases.
From my perspective, if you found a hash function that performs up to 3
times better* than a "crypto-alike" hash in the majority of tests and is
1.33x worse than "crypto-alike" in a few other tests, it's something that
I'd consider a valuable option.
* - i.e. produces 3x fewer collisions at, say, an occupation ratio of 0.7
Am 04.06.2024 um 23:59 schrieb Michael S:
18446744073709551557 is indeed very bad (too close to 2**64).
It doesn't matter how close the prime is to 2^64 since the
results are chopped.
Am 05.06.2024 um 11:34 schrieb Michael S:
I have no idea what you are talking about.
18446744073709551557 == -59.
That's a very small number.
You should take the value as unsigned. Addition and subtraction work
the same signed and unsigned, but not multiplication and division.
As I've shown you get a perfect equal distribution if the input
also has an equal distribution.
On Sun, 26 May 2024 10:20:55 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
18446744073709551557 is indeed very bad (too close to 2**64).
But I tried another prime and at least in my simple tests it performs
rather well.
Maybe my tests are too simplistic; I don't pretend to understand much
about it.
[lots of C code]
Compiled as:
gcc -s -Wall -O2 -march=native ptr_hash.c hash_bonita.c hash_aes4.c
Run as:
./a 100000 75000 80 1000
Result:
ref: 75000 keys, 100000 slots, 52799 occupied 17431 collisions. [...]
uut: 75000 keys, 100000 slots, 53201 occupied 17139 collisions. [...]
I did get your own hash function out and put it into my little
test rig. There may be summary comments below.
On Wed, 05 Jun 2024 08:58:46 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
I did get your own hash function out and put it into my little
test rig. There may be summary comments below.
<snip>
As you probably already paid attention, I like bottom lines.
What is a bottom line here?
Did you encounter cases in which almost-bonita's-but-not-quite
hash function performed badly?
Michael S <already5chosen@yahoo.com> writes:
On Wed, 05 Jun 2024 08:58:46 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
I did get your own hash function out and put it into my little
test rig. There may be summary comments below.
<snip>
As you probably already paid attention, I like bottom lines.
What is a bottom line here?
The bottom line is both the multiply-by-a-large-prime hashes
should be avoided.
Did you encounter cases in which almost-bonita's-but-not-quite
hash function performed badly?
Yes. A clear example is using a directly mapped hash table that
has a power of two elements, with inputs all from malloc( 48 )
calls. All keys map to one of only 1024 entries, in a hash
table that has 65536 slots.
If we're trying to write a general hash function, it should work
well across a wide range of input key sets, and also for different
kinds of hash table management. Let's look at the output side
first. There are two independent axes for hash tables: table
size, and direct mapping versus using rehashing. For table size,
the two prominent choices are some prime in the appropriate range,
or else a power of two. Clearly any size could be chosen, but
primes have the nice property that they divide up the space of hash
values very evenly, whereas powers of two allow a cheaper way of
reducing a hash value to the range of table indices (anding versus a
modulo operation). A good hash function should work well with both.
The choice of direct mapping versus rehashing depends on expected
occupancy rates (which I am used to calling "load factor"). For a
load factor of 75 to 80 percent or less, generally rehashing is
better. Direct mapping has the advantage that it can handle load
factors above 100%, at a cost of dealing with whatever subsidiary
data structure (usually linked lists) is used when two or more keys
map to the same index in the hash table. (I expect you know all
that, I wrote it mostly just to clarify my own thinking.)
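The two reductions being contrasted look like this in code
(illustrative only; 65521 is the largest prime below 65536):

/* Reducing a hash value h to a table index. */
static unsigned long reduce_prime(unsigned long h) { return h % 65521UL; }
static unsigned long reduce_pow2(unsigned long h)  { return h & 0xFFFFUL; }

The modulo folds every bit of h into the index, which is part of why
prime-sized tables are more forgiving of weak hash functions; the mask
keeps only the low 16 bits.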
With the code I've shown I proved that the AES hash function
is totally stochastic and thereby a good hash function. But
I was still discontent with the speed of the code.
Usually you're not able to update the buckets of a hashtable with
multiple threads without any locking. But with my code the buckets
are only simple counters which could be updated atomically.
So I could reduce the code and partition the input range to the
hash function across the available number of cores. Here's the code:
#include <iostream>
On 09/06/2024 12:35, Bonita Montero wrote:
uint64_t MichaelsHash( uint64_t key )
{
__m128i xkey = _mm_set_epi64x( key, 42 );
using bar_t = pair<uint64_t, uint64_t>;
static bar_t const bars[8] =
{
{ 0xBB09BBCC90B24BF2, 0x825C622FF2792A01 },
{ 0x94F0535CB06D4060, 0x939C756246DBFD1D },
{ 0x5B835E01A7E14CA1, 0xAC2BDAFC023CDD06 },
{ 0xE0B6A4735B774AEC, 0x9CAFB43E7DDE494C },
};
for( bar_t const &bar : bars )
xkey = _mm_aesenc_si128( xkey, _mm_set_epi64x( bar.second,
bar.first ) );
return xkey.m128i_u64[0];
}
Now the code is about six times faster and I get an eight times
speedup over single-threaded processing with the same code. Of
course the results are still the same.
I have your permission to drop that in?
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 09/06/2024 12:35, Bonita Montero wrote:
uint64_t MichaelsHash( uint64_t key )
{
__m128i xkey = _mm_set_epi64x( key, 42 );
using bar_t = pair<uint64_t, uint64_t>;
static bar_t const bars[8] =
{
{ 0xBB09BBCC90B24BF2, 0x825C622FF2792A01 },
{ 0x94F0535CB06D4060, 0x939C756246DBFD1D },
{ 0x5B835E01A7E14CA1, 0xAC2BDAFC023CDD06 },
{ 0xE0B6A4735B774AEC, 0x9CAFB43E7DDE494C },
};
for( bar_t const &bar : bars )
xkey = _mm_aesenc_si128( xkey, _mm_set_epi64x( bar.second,
bar.first ) );
return xkey.m128i_u64[0];
}
Now the code is about six times faster and I get an eight times
speedup over single-threaded processing with the same code. Of
course the results are still the same.
I have your permission to drop that in?
Note that this code was cribbed from Michael S.
If you
think it's important to ask permission, I think he is
the one you should be asking.
By the way, I thought you were looking for code that works
in standard C, and acceptable under C90 rules. Have you
changed your mind about that? The code above is a far
cry from C, let alone C90.
On Sun, 09 Jun 2024 18:31:15 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 09/06/2024 12:35, Bonita Montero wrote:
uint64_t MichaelsHash( uint64_t key )
{
__m128i xkey = _mm_set_epi64x( key, 42 );
using bar_t = pair<uint64_t, uint64_t>;
static bar_t const bars[8] =
{
{ 0xBB09BBCC90B24BF2, 0x825C622FF2792A01 },
{ 0x94F0535CB06D4060, 0x939C756246DBFD1D },
{ 0x5B835E01A7E14CA1, 0xAC2BDAFC023CDD06 },
{ 0xE0B6A4735B774AEC, 0x9CAFB43E7DDE494C },
};
for( bar_t const &bar : bars )
xkey = _mm_aesenc_si128( xkey, _mm_set_epi64x( bar.second,
bar.first ) );
return xkey.m128i_u64[0];
}
Now the code is about six times faster and I get an eight times
speedup over single-threaded processing with the same code. Of
course the results are still the same.
I have your permission to drop that in?
Note that this code was cribbed from Michael S.
I don't permit the use of my name in a variant modified by Bonita.
I prefer that my name is not used even on my own variant.
This code was never intended for production use. It is intended for use
as reference, as something that is "certainly random enough and then
many billions of times more than enough". It is much, much slower than
necessary for a non-crypto hash. [...]
If you
think it's important to ask permission, I think he is
the one you should be asking.
By the way, I thought you were looking for code that works
in standard C, and acceptable under C90 rules. Have you
changed your mind about that? The code above is a far
cry from C, let alone C90.
Exactly.
I think that all methods that are both good and fast rely on 64-bit
unsigned integer arithmetic. And it seems to me that C90 does not
provide it in a portable manner.
So, under portable C90 constraints one has to choose between good and
fast.
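As an example of the 'good but not fast' end of that trade-off, a
pointer hash that stays within C89 by using only unsigned long might
look like this (a sketch, with a small finalizer so the low bits of the
result also depend on the high bits of the input; not run through the
test rig used above):

#include <stddef.h>

/* C89-portable pointer hash: FNV-style mixing over the pointer's
   bytes, then an xor-shift/multiply finalizer. */
static unsigned long ptr_hash_c89(void *address)
{
    unsigned char *byte = (unsigned char *) &address;
    unsigned long h = 2166136261UL;
    size_t i;

    for (i = 0; i < sizeof(void *); i++) {
        h ^= byte[i];
        h *= 16777619UL;
    }
    h ^= h >> 16;
    h *= 2654435761UL;   /* Knuth's multiplicative constant */
    h ^= h >> 13;
    return h;
}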
On Wed, 05 Jun 2024 08:58:46 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
If we're trying to write a general hash function, it should work
well across a wide range of input key sets, and also for different
kinds of hash table management. Let's look at the output side
first. There are two independent axes for hash tables: table
size, and direct mapping versus using rehashing. For table size,
the two prominent choices are some prime in the appropriate range,
or else a power of two. Clearly any size could be chosen, but
primes have the nice property that they divide up the space of hash
values very evenly, whereas powers of two allow a cheaper way of
reducing a hash value to the range of table indices (anding versus a
modulo operation). A good hash function should work well with both.
The choice of direct mapping versus rehashing depends on expected
occupancy rates (which I am used to calling "load factor"). For a
load factor of 75 to 80 percent or less, generally rehashing is
better. Direct mapping has the advantage that it can handle load
factors above 100%, at a cost of dealing with whatever subsidiary
data structure (usually linked lists) is used when two or more keys
map to the same index in the hash table. (I expect you know all
that, I wrote it mostly just to clarify my own thinking.)
I never had much interest in the implementation of hash methods; I
typically just took what the language or library provides and used it.
All my knowledge of the internals is 50-year-old news (TAoCP, and no, I
did not read it 50 years ago, but I did read it for the first time
slightly less than 40 years ago, and I'm not sure if I re-read this
particular topic since then).
So, I am not sure that I understood everything above.
In particular, w.r.t. rehashing, IIRC, Knuth came to the conclusion
that simple circular increment of the index is superior to more
elaborate methods. I don't know if that conclusion was changed in the
following years; I'm not even sure that I recollect it correctly.
Despite (or due to?) my relative ignorance of the topic, I'd dare
to disagree with a couple of your points:
1. I don't think that a hash function should be evaluated in
isolation from the rest of the algorithm. IMHO, they should be
co-developed. There is nothing wrong with a hash function that
works well with one particular "back end" and poorly with all
others as long as the limitations are clearly stated.
2. I don't like the "modulo" method of reduction of a hash value to a
range. I don't like it for any chosen size of the bucket, not
just for a power of two, but even for a prime. Of course, for a power
of two size my dislike of it is deeper.
On Wed, 05 Jun 2024 21:40:27 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Michael S <already5chosen@yahoo.com> writes:
On Wed, 05 Jun 2024 08:58:46 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
I did get your own hash function out and put it into my little
test rig. There may be summary comments below.
<snip>
As you probably already paid attention, I like bottom lines.
What is a bottom line here?
The bottom line is both the multiply-by-a-large-prime hashes
should be avoided.
I am not convinced.
Of course, it is no good for Malcolm, because it can't be coded
in what Malcolm erroneously calls "ANSI C".
Did you encounter cases in which almost-bonita's-but-not-quite
hash function performed badly?
Yes. A clear example is using a directly mapped hash table that
has a power of two elements, with inputs all from malloc( 48 )
calls. All keys map to one of only 1024 entries, in a hash
table that has 65536 slots.
I don't see it.
$ ./a 0x10000 0xC000 48
ref: 49152 keys, 65536 slots, 34590 occupied 11295 collisions.
Worst collision: 7
uut: 49152 keys, 65536 slots, 35791 occupied 11872 collisions.
Worst collision: 3
It sounds like in your tests you don't use 'linear scaling to
table size' for translation of hash value to table index.
IMHO, you should.
I don't know why you say that. C was an ANSI standard before it
was an ISO standard. Or is it that you think that the language
Malcolm is intent on using does not conform to C90/C89/ANSI C?
On Mon, 17 Jun 2024 00:56:40 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
I don't know why you say that. C was an ANSI standard before it
was an ISO standard. Or is it that you think that the language
Malcolm is intent on using does not conform to C90/C89/ANSI C?
All I wanted to point out by this comment is that ANSI recognizes ISO/IEC 9899:2018 as their current C Standard and probably will recognize the
next ISO C Standard pretty soon. For that reason I think that names like
C89 or C90 are preferable (to ANSI C) when we want to refer to this particular variant of the language.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Michael S <already5chosen@yahoo.com> writes:
On Mon, 17 Jun 2024 00:56:40 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
I don't know why you say that. C was an ANSI standard before it
was an ISO standard. Or is it that you think that the language
Malcolm is intent on using does not conform to C90/C89/ANSI C?
All I wanted to point out by this comment is that ANSI recognizes ISO/IEC
9899:2018 as their current C Standard and probably will recognize the
next ISO C Standard pretty soon. For that reason I think that names like
C89 or C90 are preferable (to ANSI C) when we want to refer to this
particular variant of the language.
I see. So it isn't that you think "ANSI C" is wrong, just
that it might be misleading or that C89 or C90 is preferable.
Personally I would be surprised if someone used "ANSI C" to
mean anything other than C89/C90, but certainly other people
could have a different reaction.
[...] I don't necessarily complain when someone uses the phrase
"ANSI C" to mean C89/C90, but I try to avoid it myself in favor
of "C89" or "C90".
I see. So it isn't that you think "ANSI C" is wrong, just
that it might be misleading or that C89 or C90 is preferable.
Personally I would be surprised if someone used "ANSI C" to
mean anything other than C89/C90,
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
...
I see. So it isn't that you think "ANSI C" is wrong, just
that it might be misleading or that C89 or C90 is preferable.
ANSI's documentation is quite clear about the fact that there is, at any time, only one ANSI C standard, [...]