Ironically, I probably should have leaned harder into the "daisy chained types" interpretation, rather than treating it as an undesirable implementation tradeoff.
In my compiler (BGBCC), such an internal pointer exists for arrays and structures in the local stack frame.
No separate pointer exists inside of things like structs, where, as can
be noted, the array exists at a fixed size and location.
So, eg:
void Foo()
{
    int a[100];
    ...
}
There is both the space for 100 integers reserved in the stack frame,
and a variable 'a' which exists as an implicit pointer to that location.
But, say:
void Foo()
{
    int a[8192];
    ...
}
There is no space reserved on the stack, and the array is instead
allocated dynamically (from the heap). In this case, the "a" variable
exists as a pointer to that location in memory.
Similar treatment also applies to structs.
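As a rough illustration of the two cases above, here is what the large-array lowering might look like if written out by hand in plain C. This is only a sketch: the function names are made up, the use of malloc/free stands in for whatever runtime call the compiler actually emits, and the -1 failure return is an assumption of the example.

```c
#include <stdlib.h>

/* Small case: the 100 ints live in the stack frame, and 'a' behaves like
   an implicit pointer to that space. */
int sum_small(void)
{
    int a[100];
    int i, sum = 0;
    for (i = 0; i < 100; i++) a[i] = i;
    for (i = 0; i < 100; i++) sum += a[i];
    return sum;
}

/* Large case, written the way the compiler might lower it: no stack
   space is reserved, and 'a' is a pointer to heap storage.
   (Hypothetical hand-written equivalent, not actual compiler output.) */
int sum_large_lowered(void)
{
    enum { N = 8192 };
    int *a = malloc(N * sizeof *a);   /* hidden allocation on entry */
    int i, sum = 0;
    if (!a) return -1;                /* assumption: report failure as -1 */
    for (i = 0; i < N; i++) a[i] = 1;
    for (i = 0; i < N; i++) sum += a[i];
    free(a);                          /* hidden release on exit */
    return sum;
}
```

In both functions the code that indexes 'a' is the same, which is the point of treating the array name as a pointer internally.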
On 7/7/2024 11:28 PM, James Kuyper wrote:
On 7/7/24 20:02, Kaz Kylheku wrote:
...
I see no point in having implicit pointers, but I don't believe that
they are prohibited.
They mostly exist in a "sort of simpler to implement the compiler this
way" sense.
In the implicit pointer case, the compiler just treats it as-if it were
an explicit pointer. In this case, both are basically treated as being roughly equivalent at the IR level.
And, most of the code-generation stage doesn't need separate handling
for arrays and pointers, but can use combined "ArrayOrPointer" handling
or similar.
It had all seemed "obvious enough".
Similar reasoning for passing structs by-reference in the ABI:
Pass by reference is easy to implement;
In place copying and decomposing into registers, kinda bad.
Though, this one seems to be a common point of divergence between "SysV"
and "Microsoft" ABIs. Sometimes a target will have an ABI defined, and
the MS version was almost the same, just typically differing in that it passes structs by reference and provides a spill space for register arguments.
On 09/07/2024 15:31, David Brown wrote:
On 08/07/2024 19:39, BGB wrote:
Though, this one seems to be a common point of divergence between
"SysV" and "Microsoft" ABIs. Sometimes a target will have an ABI
defined, and the MS version was almost the same, just typically
differing in that it passes structs by reference and provides a spill
space for register arguments.
I don't think it is helpful that you keep mixing /logical/ terms with
/implementation/ terms.
In C, there is no "pass by reference" or "return by reference". It is
all done by value.
Arrays are passed by reference:
void F(int a[20]) {}
int main(void) {
    int x[20];
    F(x);
}
Although the type of 'a' inside 'F' will be int* rather than int(*)[20].
So if you have these structs and declarations :
struct small { uint16_t a; uint16_t b; };
struct big { uint32_t xs[10]; };
struct small foos(struct small y);
struct big foob(struct big y);
Then compilers will typically implement "x = foos(y)" as though it were:
extern uint32_t foos(uint32_t ab);
uint32_t _1 = foos((y.a << 16) | (y.b));
struct small x = { _1 >> 16, _1 & 0xffff };
And they will typically implement "x = foob(y)" as though it were:
extern void foob(struct big * ret, const struct big * xs);
struct big x;
foob(&x, &y);
From what I've seen, structs that are not small enough to be passed in registers, are copied to a temporary, and the address of that temporary
is passed.
This seems to be the case even when the struct param is marked 'const'.
(My compiler won't create a copy when the parameter is 'const'. I
assumed that was how gcc did it; I was wrong.)
This is for Win64 ABI, however an ABI will only say they are passed by reference; it will not stipulate making a copy. That is up to the
language implementation.
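A minimal sketch of the "copy to a temporary, pass its address" scheme described above, written as ordinary C. The names foob_lowered and call_foob are hypothetical, and the callee body is invented purely so that there is something to observe; a real compiler does this behind the scenes.

```c
#include <stdint.h>

struct big { uint32_t xs[10]; };

/* Lowered callee: takes a hidden result pointer and a pointer to the
   argument. It may modify *arg freely, because the caller hands it a
   private temporary rather than the original object. */
static void foob_lowered(struct big *ret, struct big *arg)
{
    arg->xs[0] += 1;      /* changes only the temporary */
    *ret = *arg;
}

/* Lowered call site for "x = foob(y)". Stores the caller's y in *y_out
   afterwards so that it can be inspected. */
uint32_t call_foob(struct big *y_out)
{
    struct big y = { { 41 } };
    struct big x;
    struct big tmp = y;   /* caller-made copy preserves value semantics */
    foob_lowered(&x, &tmp);
    *y_out = y;
    return x.xs[0];
}
```

After the call, x.xs[0] is 42 while y.xs[0] is still 41. Skipping the copy (passing &y directly) would only be safe if the callee is known not to write through the pointer.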
On 7/9/2024 3:22 PM, James Kuyper wrote:
On 7/9/24 14:55, BGB wrote:
...
The pass by reference, in this context, was referring to the ABI, not
to C itself.
It looks from C's POV as-if it were by-value.
Which it is, depends on if one is looking at things at the language
level, ABI level, or IR level, ...
The C standard doesn't explicitly specify pass by value, or pass by
reference, or any other passing mechanism. What it does say is what
a programmer needs to know to use the passing mechanism. It says that
the value of a function parameter that is seen by the code inside that
function is a copy of the value passed as an argument to the function.
The copy can be modified without changing the original. When a C
function's declaration looks as though it takes an array as an argument,
what that declaration actually means is that it takes a pointer value as
an argument, and it is a copy of that pointer's value which is seen
inside the function, and can be modified. The memory it points at is the
same as the memory pointed at by the corresponding argument.
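The behaviour described above can be checked with a small example. The function names here are made up for illustration. The parameter is declared with array syntax, but it can be incremented and reassigned like any pointer, and writes through it land in the caller's array.

```c
#include <stddef.h>

/* 'a' is declared like an array but is really an int *.
   Incrementing it would be an error for a true array. */
int first_after_bump(int a[20])
{
    a++;             /* modifies only the local copy of the pointer */
    return a[0];     /* now refers to the caller's element 1 */
}

/* Writing through the pointer changes the caller's array;
   reassigning the pointer itself does not affect the caller. */
void touch(int a[20])
{
    a[0] = 99;
    a = NULL;
}

/* Returns x[0] * 1000 + (value seen after the bump). */
int demo(void)
{
    int x[20] = { 1, 2, 3 };
    int second = first_after_bump(x);   /* reads x[1] */
    touch(x);                           /* sets x[0] to 99 */
    return x[0] * 1000 + second;
}
```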
We can probably agree that, in C:
#include <stdio.h>

typedef struct Foo_s Foo;
struct Foo_s {
    int x, y, z, a, b, c;
};
int FooFunc(Foo obj)
{
    obj.z = obj.x + obj.y;
    return(obj.z);
}
int main()
{
    Foo obj;
    int z1;
    obj.x=3;
    obj.y=4;
    obj.z=0;
    z1=FooFunc(obj);
    printf("%d %d\n", obj.z, z1);
}
Should print "0 7" regardless of how the structure is passed in the ABI.
One possibility would be to relax the language so that both "0 7"
and "7 7" are valid results (the latter potentially allowing better performance by not needing to make a temporary local copy). Though,
AFAIK, C doesn't really allow for this.
An implementation could be clever though and only make local copies in
cases where the structure is modified by the callee, as in the example
above.
On 13/07/2024 10:39, BGB wrote:
But, as I see it, no real point in arguing this stuff (personally, I
have better stuff to be doing...).
We all do. But this group seems to be about arguing about pointless
stuff and you might come here when you want a respite from proper work.
However (here I assume you've gone back to Quake but that other
interested parties might be reading this), consider the program below.
That sets up an array and then sums its elements by calling 3 different functions to do the job:
(1) Using normal C pass-by-value
(2) Using C pass-by-value to emulate call-by-reference
(3) Using fantasy true call-by-reference as it might appear if C had the
feature
(I'd hoped C++ would run this, but it didn't even like the middle
function.)
I'm asking people to compare the first and third functions and their
calls, and to see if there's any appreciable difference between them.
There will obviously be a difference in how the A parameter is declared.
---------------------------------------------
#include <stdio.h>

typedef int T;

int sum_byvalue(T* A, int n) {
    int i, sum=0;
    for (i=0; i<n; ++i) sum += A[i];
    return sum;
}

int sum_bymanualref(T(*A)[], int n) {
    int i, sum=0;
    for (i=0; i<n; ++i) sum += (*A)[i];
    return sum;
}

int sum_bytrueref(T (&A)[], int n) {
    int i, sum=0;
    for (i=0; i<n; ++i) sum += A[i];
    return sum;
}

int main(void) {
    enum {N = 10};
    T A[N] = {10,20,30,40,50,60,70,80,90,100};
    int total=0;

    total += sum_byvalue     (A, N);
    total += sum_bymanualref (&A, N);
    total += sum_bytrueref   (A, N);

    printf("%d\n", total);     // would show 1650
}
---------------------------------------------
Find anything? I thought not.
Those findings might suggest that C doesn't need call-by-reference, not
for arrays anyway.
Except that at present you can do this:
T x=42;
sum_byvalue(&x, N);
which would not be possible with call-by-reference. Nor with
sum_bymanualref, but apparently nobody wants to be dealing with all that
extra, fiddly syntax. Better to be unsafe!
On 13/07/2024 10:37, David Brown wrote:
If you say stupid things, repeatedly,
Start applying a bit of your intelligence (you say stupid things
sometimes, but I know you are far from stupid), and you'll find the
level of conversation going up.
I made the tweak to see how hard it would be to detect value-arrays
declared in parameter list (it was very easy), and what the
consequences would be on existing code (significant).
No, the consequences are non-existent because no one uses your tool,
and no one will ever copy that change in other tools (of significance).
You are spectacularly missing the point. IT DOESN'T MATTER WHOSE TOOL IT IS.
Somebody could have done the same exercise with gcc, and come to the
same conclusion: too many programs use array parameters.
The example I posted showed a type (const char* x[]) where there was
no advantage to having that value array notation. Using 'const
char**' would be a more accurate description of the actual parameter
type.
You can write your code the way you want to write it - it will not
change the way anyone else writes their code. It really is that
simple. Why is this so difficult for you to understand?
Do you really suppose that if /you/ make "foo(char x[])" a syntax
error in /your/ compiler, it will have the slightest effect on how
other people write their C code?
What WOULD be the effect if a compiler did that? How would a particular codebase be affected?
You can just modify a compiler and try it, which is what I did. What difference does it make which compiler it is? You just have a blind, irrational hatred for anything I do.
Another way to do it is for someone to painstakingly go through every
line of a codebase by hand, expanding macros and typedefs as needed, and checking whether any parameters are declared with top-level array types.
I think if you were given that job to do, then applying my toy compiler wouldn't be so bad after all!
Or on what other C compilers do? Or on how half a century of existing
C code is written?
Personally, I don't like that C allows something that /looks/ like
arrays can be passed to functions, but doesn't work that way.
gcc could conceivably have an option that detects and warns about that.
Whoever is thinking about doing that might well do a test exactly like
mine.
I don't think I have ever written a function with an array-like
parameter - I use a pointer parameter if I mean a pointer, or have the
array wrapped in a struct if I want to pass an array by value.
So all /your/ code would still pass; great!
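For readers unfamiliar with the struct-wrapping idiom mentioned in the quoted text, a minimal sketch follows. The type and function names are hypothetical. Because the array is a member of a struct, the whole thing is copied on the call, and the callee cannot disturb the caller's object.

```c
/* An array wrapped in a struct is copied like any other struct. */
struct int4 { int v[4]; };

/* Receives its own copy; zeroing it does not affect the caller. */
int sum_and_clear(struct int4 p)
{
    int i, sum = 0;
    for (i = 0; i < 4; i++) {
        sum += p.v[i];
        p.v[i] = 0;
    }
    return sum;
}

/* Sums the caller's elements after the call to show they are intact. */
int caller_sum_after_call(void)
{
    struct int4 a = { { 1, 2, 3, 4 } };
    int i, sum = 0;
    (void)sum_and_clear(a);
    for (i = 0; i < 4; i++) sum += a.v[i];
    return sum;
}
```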
But I don't think my opinions make a difference to C, and even if I
were elected dictator of C for a day, I don't think my opinions should
count more than any one else's - including those that like the way C
works here
Half of the programming language you call "C" is defined by the way you invoke your compiler. So that already allows for myriad, slightly
different dialects.
Somebody had to think up all those options, and they would have been influenced by people's opinions. This is just one more, which I would
happily have as a default.
And I don't confuse my opinions or preferences with how C actually
works and how it is actually defined, and I certainly don't spread
such confusions and misunderstandings to others.
C does that well enough by itself. There are any number of behaviours
where: (1) saying nothing and passing; (2) warning and passing; (3)
reporting an error and failing are all perfectly valid.
You just choose which one you want, at least if using an equivocal
compiler like gcc.
This is like taking an examination and being able to choose how strictly
it should be marked! Yeah, I think I'll get a pass today...
On 7/17/2024 6:38 AM, Bart wrote:
On 13/07/2024 10:39, BGB wrote:
But, as I see it, no real point in arguing this stuff (personally, I
have better stuff to be doing...).
We all do. But this group seems to be about arguing about pointless
stuff and you might come here when you want a respite from proper work.
However (here I assume you've gone back to Quake but that other
interested parties might be reading this), consider the program below.
I got back to debugging...
Ironically, one of the big bugs I ended up finding was related to
internal struct handling "leaking through" and negatively affecting stuff.
say:
typedef struct foo_s foo_t; // don't care what it contains for now.
foo_t arr[...];
foo_t temp;
int i, j;
...
temp=arr[i];
arr[i]=arr[j];
arr[j]=temp;
Internally, it would load a reference to arr[i] into temp, but then this location would get overwritten before the third assignment happened,
causing the incorrect contents to be copied to arr[j].
For now, have ended up changing stuff such that any struct-assignment
(for structs in the "by-ref" category) to a local variable will instead
copy the contents to the memory location associated with that struct.
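The failure mode can be reproduced by hand in plain C by modelling the implicit reference as an explicit pointer. This is only a sketch of the idea, not the compiler's actual IR; foo_t's contents and both function names are invented for the example.

```c
typedef struct { int v[8]; } foo_t;   /* hypothetical contents */

/* Buggy lowering: 'temp' is just a reference to arr[i], so no bytes
   are saved before arr[i] is overwritten. */
void swap_by_ref_buggy(foo_t *arr, int i, int j)
{
    foo_t *temp = &arr[i];   /* temp = arr[i];  (reference only) */
    arr[i] = arr[j];         /* clobbers what temp refers to     */
    arr[j] = *temp;          /* copies the NEW arr[i] back       */
}

/* Conservative lowering: assigning to a local struct copies the
   contents into the local's own storage. */
void swap_by_copy(foo_t *arr, int i, int j)
{
    foo_t temp = arr[i];
    arr[i] = arr[j];
    arr[j] = temp;
}
```

With the buggy version both elements end up holding the old arr[j]; with the copying version they are exchanged as the source code intends.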
On 7/18/2024 2:46 AM, David Brown wrote:
On 17/07/2024 19:53, BGB wrote:
On 7/17/2024 6:38 AM, Bart wrote:
On 13/07/2024 10:39, BGB wrote:
But, as I see it, no real point in arguing this stuff (personally,
I have better stuff to be doing...).
We all do. But this group seems to be about arguing about pointless
stuff and you might come here when you want a respite from proper work.
However (here I assume you've gone back to Quake but that other
interested parties might be reading this), consider the program below.
I got back to debugging...
To be clear - you are talking about debugging your compiler here, yes?
My compiler and my Quake 3 port, but most of the bugs in the Quake 3
port thus far were due to bugs either in my compiler or in the runtime libraries.
Ironically, one of the big bugs I ended up finding was related to
internal struct handling "leaking through" and negatively affecting
stuff.
say:
typedef struct foo_s foo_t; // don't care what it contains for now.
foo_t arr[...];
foo_t temp;
int i, j;
...
temp=arr[i];
arr[i]=arr[j];
arr[j]=temp;
Internally, it would load a reference to arr[i] into temp, but then
this location would get overwritten before the third assignment
happened, causing the incorrect contents to be copied to arr[j].
For now, have ended up changing stuff such that any struct-assignment
(for structs in the "by-ref" category) to a local variable will
instead copy the contents to the memory location associated with that
struct.
How could it possibly mean anything else? Structs in C are objects -
contiguous blocks of bytes interpreted by a type. Assigning them will
mean copying those bytes. Pretending the language sometimes means
structs and sometimes means magical auto-dereferencing pointers to
structs is simply wrong.
The "magical auto dereferencing pointers" interpretation gives better performance, when it works. In this case, it didn't work...
Sadly, there is no good way at the moment to know whether or not it will work, for now forcing the slower and more conservative option.
If "foo_t" is 2000 bytes long, then "foo_t temp" makes a 2000 byte
space in your local variables (the stack, on virtually every platform)
and "temp = arr[i];" does a 2000 byte memcpy(). The same thing
applies if "foo_t" is 2 bytes long, or 2 megabytes long. And if there
is a stack overflow making "temp", that's the programmer's problem.
For now:
1 - 16 bytes, goes in registers, except when accessing a member where it needs to be in-memory; unless it is a SIMD type which is special and
allows accessing members with the value still in registers.
17 bytes to 15.999K: Accessed by an implicit reference, uses hidden
copying to mimic by-value semantics (not quite foolproof as of yet it
seems).
16K and beyond, quietly turned into a heap allocation (with a compiler warning). Should otherwise look the same as the prior case.
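The three size classes listed above could be expressed as a small classification function. This is a sketch only: the enum and function names are hypothetical, "16K" is assumed to mean 16384 bytes, and the SIMD special case is left out.

```c
#include <stddef.h>

enum struct_class {
    SC_REGISTER,   /* 1..16 bytes: held in registers             */
    SC_BY_REF,     /* 17 bytes up to just under 16K: implicit ref */
    SC_HEAP        /* 16K and beyond: heap allocation + warning   */
};

/* Classify a struct by size, following the thresholds in the post.
   Assumes size >= 1 and that 16K means 16 * 1024 bytes. */
enum struct_class classify_struct(size_t size)
{
    if (size <= 16)
        return SC_REGISTER;
    if (size < 16 * 1024)
        return SC_BY_REF;
    return SC_HEAP;
}
```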
And it is only when the direct translation is working properly that
you can start to think of improving user convenience. Perhaps you
could allocate large temporary objects in non-stack memory somewhere
to avoid stack overflows. I don't think that is a good idea, but it's
a possibility. Giving compiler warnings about large stack objects is
a much better solution IMHO.
It both warns and also turns it into a heap allocation.
Because warning and the code still working, is better than warning and
the program most likely crashing due to a stack overflow (and in cases
with no memory protection, probably overwriting a bunch of other stuff
in the process).
On 18/07/2024 13:41, David Brown wrote:
On 18/07/2024 12:05, BGB wrote:
The "magical auto dereferencing pointers" interpretation gives better
performance, when it works. In this case, it didn't work...
It's very easy to make high performance code if correctness doesn't
matter! Obviously, correctness is more important.
It is useful to explore ways of making some scenarios faster. Here there
is simply a bug in one of those ways. But you don't just give up
completely; you can try fixing the bug.
Sadly, there is no good way at the moment to know whether or not it
will work, for now forcing the slower and more conservative option.
I would think the simple test is that for data that is never changed
(or not changed within the function), you can use a constant reference
- otherwise you cannot. It is not by coincidence that in C++, it is
common to use pass by /const/ reference as an efficient alternative to
pass by value for big objects.
You said the other day that my C compiler was wrong to do that: to use efficient pass-by-pointer for structs marked as 'const' in the function signature; they always have to be copied no matter what.
The normal system is that local objects are data on the stack -
regardless of the size or type, scalar or aggregate. Parameter
passing is done by register for some types (for the first few
parameters), or the stack otherwise. Returns are in a register or two
for some types, or by a stack slot assigned by the caller. For struct
parameters or return values, it can be efficient for the caller to
pass hidden pointers, but that's not strictly necessary if you have a
fixed stack frame layout. (Struct parameters still get copied to the
stack to ensure value semantics - the hidden pointer points to the
stack copy.)
Trying to have special cases for different sizes,
That's exactly what common 64-bit ABIs do. In fact the SYS V ABI is so complicated that I can't understand its struct passing rules at all.
(If I ever have to use that, I'd need to write test code for each
possible size of struct, up to 100 bytes or so (past the largest machine register), and see how an existing compliant compiler handles each case.)
Here the context appears to be a custom ISA, so anything is possible.
You are trying to be too smart here, IMHO - the compiler's job is to
let the programmer be smart. It's always nice to have optimisations,
but not at the expense of correctness.
That's an odd remark from a devotee of gcc. Your usual attitude is to
let the programmer write code in the most natural manner, and let a
smart optimising compiler sort it out.
These were my original comments on the subject made to DB:
DB:
In C, there is no "pass by reference" or "return by reference". It is all done by value.
BC:
Arrays are passed by reference:
void F(int a[20]) {}
int main(void) {
int x[20];
F(x);
}
Although the type of 'a' inside 'F' will be int* rather than int(*)[20].
It was in reply to DB, who appeared to imply that arrays were passed by
value. Obviously they're not passed by value, so what then? (Please,
don't rerun the thread! This is where everyone jumped in.)
Bart <bc@freeuk.com> writes:
In general there is no reason, in a language with true call-by-reference,
why any parameter type T (which has the form U*, a pointer to anything),
cannot be passed by reference. It doesn't matter whether U is an array type
or not.
I can't unravel this. Take, as a concrete example, C++. You can't pass
a pointer to function that takes an array passed by reference. You can,
of course, pass a pointer by reference, but that is neither here nor
there.
David Brown <david.brown@hesbynett.no> writes:
On 16/08/2024 02:08, Ben Bacarisse wrote:
Bart <bc@freeuk.com> writes:
In general there is no reason, in a language with true call-by-reference,
why any parameter type T (which has the form U*, a pointer to anything),
cannot be passed by reference. It doesn't matter whether U is an array type
or not.
I can't unravel this. Take, as a concrete example, C++. You can't pass
a pointer to function that takes an array passed by reference. You can,
of course, pass a pointer by reference, but that is neither here nor
there.
In C++, you can't pass arrays as parameters at all - the language inherited
C's handling of arrays. You can, of course, pass objects of std::array<>
type by value or by reference, just like any other class types.
The best way to think about C++ (in my very non-expert opinion) is to consider references as values that are passed by, err..., value. But
you seem prepared to accept that some things can be "passed by reference"
in C++.
So if this:
#include <iostream>
void g(int &i) { std::cout << i << "\n"; }
int main(void)
{
int I{0};
g(I);
}
shows an int object, I, being passed to g, why does this
#include <iostream>
void f(int (&ar)[10]) { std::cout << sizeof ar << "\n"; }
int main(void)
{
int A[10];
f(A);
}
not show an array, A, being passed to f?
As I said, I don't think it's wise to look at it this way, but I am just borrowing your use of terms to try to tease out what you are getting at.
On 16/08/2024 09:04, David Brown wrote:
On 15/08/2024 18:08, Bart wrote:
These were my original comments on the subject made to DB:
DB:
In C, there is no "pass by reference" or "return by reference". It is all done by value.
BC:
Arrays are passed by reference:
void F(int a[20]) {}
int main(void) {
    int x[20];
    F(x);
}
Although the type of 'a' inside 'F' will be int* rather than int(*)[20].
It was in reply to DB, who appeared to imply that arrays were passed
by value. Obviously they're not passed by value, so what then?
(Please, don't rerun the thread! This is where everyone jumped in.)
I am not sure if you want an answer here or not - you asked "so what
then", but also asked to avoid a re-run of the thread.
I can give a summary - and I also hope this doesn't lead to a re-run
of the discussion. However, since you are asking the same question as
you did at the start, and the language C has not changed in the
meantime, the factual and correct answers will inevitably be the same:
1. C has no "pass by reference" - it is all "pass by value".
2. In C, you cannot pass an array as a function parameter.
3. The automatic conversion of many array expressions to pointer
expressions, along with the similar conversions of function parameter
types, gives C users a syntax that is similar - but not identical to -
what you would have if the language supported passing arrays by
reference.
So, you agree that it is similar to.
And not just the resulting syntax,
but the semantics and even the generated code can be the same (as I demonstrated but somebody complained).
Would you agree that they are effectively passed by-reference for all practical purposes?
All the other differences in detail are mostly due to the weird way that
C handles arrays anyway.
4. Adding "pass by reference" and "arrays as first class objects"
would both be very significant changes to C
Adding pass-by-reference would not be a huge change. I added that using
a cheap and cheerful approach that seems to work well enough (a parameter
marked as by-ref, would have '&' automatically applied on arguments, and
'*' automatically applied to parameter accesses in the callee**).
But what would complicate it in C is how it interacts with how arrays currently work. For example, passing array A already passes '&A[0]'; it
can't really pass '&&A[0]' if it's marked as being by-reference!
(** There were some side-effects: while you can pass a char or short to
an int parameter for example and it will promote it, if the int is by-reference, you can only pass an exact int type. And also, I wasn't
able to apply default values to optional by-reference parameters.)
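To make the "automatic '&' and '*'" scheme concrete, here is roughly what the lowering might produce, written out by hand in standard C. The by-ref source syntax shown in the comment is hypothetical (C has no such feature), and the function names are invented for the sketch.

```c
/* Source as it might be written with a hypothetical by-ref parameter:

       void incr(int &n) { n = n + 1; }
       ...
       incr(count);

   Lowered form in plain C: */
void incr_lowered(int *n)
{
    *n = *n + 1;            /* '*' applied on each parameter access */
}

int demo_incr(void)
{
    int count = 41;
    incr_lowered(&count);   /* '&' applied automatically to the argument */
    return count;
}
```

This also shows where the exact-type restriction comes from: a short variable cannot simply be passed here, since &s would be a short * rather than an int *, and promoting it would mean passing the address of a temporary instead of the original object.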
Ben Bacarisse <ben@bsb.me.uk> writes:
David Brown <david.brown@hesbynett.no> writes:
On 16/08/2024 12:00, Ben Bacarisse wrote:
David Brown <david.brown@hesbynett.no> writes:
On 16/08/2024 02:08, Ben Bacarisse wrote:
Bart <bc@freeuk.com> writes:
In general there is no reason, in a language with true
call-by-reference, why any parameter type T (which has the form
U*, a pointer to anything), cannot be passed by reference. It
doesn't matter whether U is an array type or not.
I can't unravel this. Take, as a concrete example, C++. You
can't pass a pointer to function that takes an array passed by
reference. You can, of course, pass a pointer by reference, but
that is neither here nor there.
In C++, you can't pass arrays as parameters at all - the language
inherited C's handling of arrays. You can, of course, pass
objects of std::array<> type by value or by reference, just like
any other class types.
The best way to think about C++ (in my very non-expert opinion) is
to consider references as values that are passed by, err...,
value. But you seem prepared to accept that some things can be
"passed by reference" in C++.
That seems a subtle distinction - I'll have to think about it a
little. I like your description of arguments being like local
variable initialisation - it makes sense equally well regardless of
whether the parameter is "int", "int*", or "int&". (It's probably
best not to mention the other one in this group...)
So if this:
#include <iostream>
void g(int &i) { std::cout << i << "\n"; }
int main(void)
{
int I{0};
g(I);
}
shows an int object, I, being passed to g, why does this
#include <iostream>
void f(int (&ar)[10]) { std::cout << sizeof ar << "\n"; }
int main(void)
{
int A[10];
f(A);
}
not show an array, A, being passed to f?
That's backwards compatibility with C array handling at play.
I'm not sure how this answers my question. Maybe you weren't
answering it and were just making a remark...
My guess is he didn't understand the question. The code shown
has nothing to do with backwards compatibility with C array
handling.