On 30/05/2024 18:16, Malcolm McLean wrote:
So here is a little function I just wrote as part of my work with the
XML parser.
int xmltojson(FILE *fp, XMLNODE *node, int useattributes)
{
if (is_array(node, useattributes))
{
fprintf(fp, "[\n");
xmltojson_r(fp, node->child, useattributes, 1, 0, 1);
fprintf(fp, "]\n");
}
else if (is_object(node, useattributes))
{
fprintf(fp, "{\n");
xmltojson_r(fp, node->child, useattributes, 1, 1, 1);
fprintf(fp, "}\n");
}
else if (is_field(node, useattributes))
{
fprintf(fp, "{");
xmltojson_r(fp, node->child, useattributes, 1, 1, 0);
fprintf(fp, "}\n");
}
return 0;
}
And it's completely hopeless isn't it? You can work out exactly what the first flag means. But the last three paramters? And of course I've just
used int, and bool will not help much, except parmater 3 is actually an integer (it's the depth parameter). Actually it obfuscates. Because bool
is avialable, but breaks on old compilers, it it less obvious that useattributes is a flag - someone who is a bit naive might think that
because I didn't use "bool" it must take more than two values. So it has
made code harder rather than easier to understand.
And every boolean means something. It doesn't mean "true" or "false" in
some sort of philosphical abstract, it means that attributs=es are to be
used or not to be used, and so on.
And what we want is this
int xmltojson(FILE *fp, XMLNODE *node, bool_useattributes useattributes)
{
if (is_array(node, useattributes))
{
fprintf(fp, "[\n");
xmltojson_r(fp, node->child, useattributes, 1, NO_WRITETAG, YES_NEWLINE);
fprintf(fp, "]\n");
}
else if (is_object(node, useattributes))
{
fprintf(fp, "{\n");
xmltojson_r(fp, node->child, useattributes, 1, YES_WRITETAG, YES_NEWLINE);
fprintf(fp, "}\n");
}
else if (is_field(node, useattributes))
{
fprintf(fp, "{");
xmltojson_r(fp, node->child, useattributes, 1, YES_WRITETAG, NO_NEWLINE);
fprintf(fp, "}\n");
}
return 0;
}
And now anyone can see exactly what this function is doing.
And so we want to abolish bool and replace it with a bool enum. bool
enums would be special enums constrained to have two values, and you'd
have an easy way of recognising them - prefixing by NO_ and YES_ is a tentative proposal. And the constants "true" and "false" just never
appear. They are disallowed for passing to functions.
That "argument against bool" is absolute piddle.
What you seem to have "discovered", generations after everyone else, is
that it can be hard to know what a function call is doing if it contains
many similar arguments. Your argument is no more "against bool" than it
is "against int", "against char*", and against every other common type
in every programming language.
Use "bool" when a true/false value is appropriate. So your "is_XXX"
functions should all return "bool". This is simple, clear, natural, and
the standard in most programming languages (including C, from C99
onwards, when the standards committee accepted that the current
situation was a complete dog's breakfast of home-made boolean types
mixed with "int"). Functions that return a success/fail should also
return "bool". (I have no idea what you think "xmltojson" should
return, but I assume you didn't list the full function.)
You are absolutely correct that using specific names for the parameters
is important. You are absolutely wrong to think this an issue with
"bool" or that using "int" and defining constants is the best solution.
There are several aspects to writing clear, safe code. Some of these
include:
A. Make the intention of the code clear to the reader.
B. Make it hard to write incorrect code.
We can all agree that the original code - passing "0" or "1" to multiple
int parameters in the function - fails on both of these.
There have been many suggestions over the years for ways to improve on
this. Let me try to give them in some kind of order of improvement.
(Note that for all of these, the efficiency of the generated code will
be the same, assuming a half-decent compiler.)
Attempt 1: The flag parameters can be changed from "int" to "bool", and
use "true" and "false" when calling. This is a marginal step forward -
it is now clear that these are flags and not numbers, but it is not
clear what they mean in the calling code, and it's easy to mix them up.
Attempt 2: Add an inline comment, like Lawrence suggests. This
slightly improves on A - the intention is a bit clearer. But it is
ugly, messy, inconvenient to write, and adds nothing to the code itself.
Separate comments, or end-of-line comments, would be significantly
clearer to the reader.
Attempt 3: Use "int" parameters, and constants such as "#define
NO_WRITETAG 0" or "static const int yes_newline = 1;". These fix
problem A - the intention of the calling code is clear. And if you use
other values for the constants then it is possible to do some run-time
checking in the function - it can check for the validity of the
constants and catch cases of people using 0 or 1, true or false, or
mixing up the parameters.
But there is no compile-time checking. And it is awkward if the choice
at the call site is determined by a local variable - you have to write something like "(want_newline ? yes_newline : no_newline)".
Attempt 4: Use enumerated types for the parameters and use the
enumeration constants when calling. This has all the benefits of
attempt four, but is more elegant (types and constants are strongly associated), is clearer at the function definition (as the parameter
types are their own types, not "int"), and allows some static checking
from appropriate tools. C lets you freely mix and match ints and
enumeration types and their constants - enumeration constants are just
named integer constants. Some tools can warn on mismatches, such as
"gcc -Wenum-conversion", but you have to enable such checks (for gcc,
this is not included in "-Wall").
Attempt 5: Use C's strong type-checking by making specific types for
the two parameters:
typedef struct { bool flag; } IsNewline;
typedef struct { bool flag; } IsWriteTag;
extern bool xmltojson_r(void* junk, int number, IsNewline isNewline,
IsWriteTag isWriteTag);
bool foo(void * junk, int n) {
return xmltojson_r(junk, n, (IsNewline) { true },
(IsWriteTag) { false });
}
Now, finally, we have a solution that fulfils both points. It is clear
and obvious what the calling code intends. It is equally happy with
constant values, or with variables (you can write "(IsNewline) {
wantNewline }"). And if you get it wrong, the compiler will complain -
you cannot pass anything but an IsNewline value to the "isNewline"
parameter, and similarly for the isWriteTag parameter.
--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)