|
Yet it is certainly not uncommon. I guess the most common use is to let the MSB indicate whether the remaining bits are an error code or a valid index. This is frequently used for function return values: a non-negative value is a valid function result, a negative value is an error code.
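A minimal sketch of that return-value convention, in C. The function name, the table of names and the two error codes are made up for illustration; only the "negative means error, non-negative means index" idea comes from the paragraph above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical error codes: all negative, so they can never collide with an index. */
enum { ERR_NOT_FOUND = -1, ERR_BAD_ARGUMENT = -2 };

/* Returns the index (>= 0) of name in table, or a negative error code. */
int find_name(const char *const *table, size_t count, const char *name)
{
    if (table == NULL || name == NULL)
        return ERR_BAD_ARGUMENT;

    /* The sign bit is spent on the error flag, so only half the range is left for indexes. */
    assert(count <= (size_t)INT_MAX);

    for (size_t i = 0; i < count; ++i) {
        if (strcmp(table[i], name) == 0)
            return (int)i;
    }
    return ERR_NOT_FOUND;
}
```

A caller tests the sign first (result < 0 means error) and only then treats the value as an index.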
My gut feeling is that this is more common with functions defined 10 or 20 years ago than with functions defined this year. But lots of old libraries are still used. Also, coding habits die slowly.
I think there is a dividing line between "Giving a single field/value multiple meanings and uses" and "storing multiple distinct fields in a word in order to save space". The compiler can give full support for the latter, so that the fields are addressed by distinct names, are of distinct types and have no overlap. They may be declared e.g. as a byte and as a 24-bit unsigned value. Maybe the original code designers never ever would dream of 16 million not being enough for everyone. Like those who set aside a 32 bit value to represent the number of seconds since 01-01-1970 00:00:00. (There is no difference in principle between an unplanned 24 bit value overflow and an unplanned 32 bit overflow.)
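A sketch of the "byte plus 24-bit unsigned value in one word" case using C bit-fields. The struct and function names are hypothetical, and the exact layout of bit-fields is implementation-defined, so take this as an illustration rather than a portable storage format.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ID 0xFFFFFFu   /* largest value a 24-bit field can hold: 16,777,215 */

/* Hypothetical record: an 8-bit kind and a 24-bit id sharing one word. */
struct packed_id {
    uint32_t kind : 8;
    uint32_t id   : 24;
};

/* Build a record, refusing ids that would not fit in 24 bits. */
struct packed_id make_id(uint8_t kind, uint32_t id)
{
    assert(id <= MAX_ID);
    struct packed_id p = { .kind = kind, .id = id };
    return p;
}

/* Increment the id. An unsigned bit-field wraps modulo 2^24,
   so the unplanned overflow lands on 0 without touching 'kind'. */
uint32_t next_id(struct packed_id *p)
{
    p->id = p->id + 1;
    return p->id;
}
```

The compiler keeps the two fields apart by name and width; what it does not do is warn when the 16 millionth id wraps around.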
You may argue that a programmer should always make ample headroom for all values. I have seen programmers doing that, using 64 bit values everywhere, without ever thinking. Non-thinking programmers are no good. Around Y2K there also arose a "2038 panic", and I saw programmers argue in dead earnest that now that we are approaching a 32 bit overflow, to make sure that it doesn't happen again, we should not expand the value to 64 bits but to 128 bits.
I guess that most readers are familiar with the quote from a Xerox Fortran manual, using a DATA statement to give the value of pi as 3.141592653589793, with the additional remark: "This also simplifies modifying the program, should the value of pi change". While that statement is most certainly true, the situation is not very likely to happen.
Making common sense assumptions makes sense even in programming, even if common sense fails on rare occasions. I could mention quite a few examples of lists of persons, names or IDs where I would never consider the situation that e.g. a database system with a maximum of 16 Mi tuples would have insufficient capacity.
I would be curious to know how many of the 15.9 million CP members have made one or more contributions to the forum in the last 12 months, writing anything anywhere on the site, and are still members (not counting spammers who are thrown out). I suspect that the count would suggest that a 24 bit number should be enough for everybody.
Religious freedom is the freedom to say that two plus two make five.
|
|
|
|
|
trønderen wrote: I suspect that the count would suggest that a 24-bit number should be enough for everybody.
The way that user IDs are allocated seems to indicate that they are defined as an "auto-incremented" integer. This means that user IDs are never reused. So, while there are fewer than 16M users on the site, the user ID values will grow without bound.
I expect that message IDs are allocated in a similar way (presumably with a 64-bit ID).
Freedom is the freedom to say that two plus two make four. If that is granted, all else follows.
-- 6079 Smith W.
|
|
|
|
|
Daniel Pfeffer wrote: The way that user IDs are allocated seems to indicate that they are defined as an "auto-incremented" integer. This means that user IDs are never reused.
Oh, sure. But if there never are more than, say, 200 active users, it seems a little overkill to use a 64 bit member user ID for the purpose of preventing overflow. Similar to increasing the variable holding the second count since 01-01-1970 to a 128 bit value "to be on the safe side". That is a "... should the value of pi change" kind of rationale.
To make one thing very clear: There is nothing wrong about 64 bit values - certainly not if you have got unlimited space and unlimited processing capacity available, and most certainly not for one 64 bit value. If the cost of using 64 bit values is zero or practically zero, then you may of course use 64 bit, even to store bools. Then you can simply ignore space considerations. You need to care for space requirements only if you do not have unlimited space.
Whenever a (hard or soft) limit is broken, it is time to sit down and consider what the new limit should be, whether 8-bit, 16-bit, 32-bit or something else. Don't forget that 64 bits is not 32 bits more than 32 bits, it is almost 10 orders of magnitude more! 32 bits is almost 5 orders of magnitude more than 16 bits, not 16 bits more. "A billion here, a billion there - you know, pretty soon it grows into real money" ... Whether it was actually phrased that way or not (never trust quotes to be exact!), the message is clear: Keep the magnitudes straight. When someone states something like "It was either millions or billions, I am not sure", you shake your head, even though that is only three orders of magnitude. So shouldn't raising the upper limit by five, or even ten, orders of magnitude be handled as something rather significant?
Raymond Chen's latest blog entry, posted yesterday, addresses 8 bit counters: Why does GlobalLock max out at 255 locks?[^]. Note his final paragraph: "This hasn’t caused any problems for 30 years, so I think we dodged a bullet there."
(And for those unfamiliar with "The Old New Thing": That is among the most readworthy, and enjoyable, IT blogs on the entire internet. Bookmark it!)
|
|
|
|
|
trønderen wrote: But if there never are more than, say, 200 active users, it seems a little overkill to use a 64 bit member user ID for the purpose of preventing overflow.
I agree with you regarding appropriate choice of value sizes.
The number of users is unlikely to exceed 2^32 (4 billion - half the population of the planet!), so I agree that a 32-bit autoincremented value is sufficient for that. However, the number of messages has likely already exceeded 2^32 - this would require that the average number of messages per user be only 256.
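As a rough check of that arithmetic, here is a small C sketch. The helper names are made up, and the user count is rounded to 2^24 (about 16.7 million) as an assumption to keep the numbers clean.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to represent max_value (at least 1). */
unsigned bits_needed(uint64_t max_value)
{
    unsigned bits = 1;
    while ((max_value >>= 1) != 0)
        ++bits;
    return bits;
}

/* Average number of items per user needed for the total to reach 'total'. */
uint64_t average_per_user(uint64_t total, uint64_t users)
{
    assert(users != 0);
    return total / users;
}
```

With these, bits_needed(15900000) is 24, and average_per_user(2^32, 2^24) is 256, which is where the figure of 256 messages per user comes from.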
trønderen wrote: "The Old New Thing": That is among the most readworthy, and enjoyable, IT blogs on the entire internet.
Agreed. I've learnt quite a bit, both about Windows internals and about general programming from this blog.
|
|
|
|
|
Some small part of such things definitely manifests in places where DBAs rule with iron fists.
It's simply 'worth' the tradeoff of uhm... 'repurposing' a field rather than dealing with bureaucratic sadists.
|
|
|
|
|
If the number of privileges exceeds 32, then just use a bigint!
The difficult we do right away...
...the impossible takes slightly longer.
|
|
|
|
|
Almost all embedded MCUs are little endian.
Almost all display controllers that can connect to them are big endian.
My graphics library builds pixels in big endian order on little endian machines as a consequence. It's just more efficient.
LVGL is another embedded graphics library - one I contribute to - and they removed a feature on version 9+ where you could set it to swap the bytes on a 16 bit color value. This is to compensate for the endian issues.
Not swapping during the draw operation means you need to scan and rearrange the transfer buffer before you send it to the display.
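void lcd_flush_display(lv_display_t *disp, const lv_area_t *area, uint8_t *px_map) {
    // Number of 16-bit pixels in the area (x2 and y2 are inclusive)
    size_t count = (size_t)(area->x2 - area->x1 + 1) * (size_t)(area->y2 - area->y1 + 1);
    uint16_t *p = (uint16_t *)px_map;
    for (size_t i = 0; i < count; ++i) {
        // Swap the two bytes of each pixel so the display receives big endian data
        p[i] = (uint16_t)(((p[i] << 8) & 0xFF00) | ((p[i] >> 8) & 0xFF));
    }
    esp_lcd_panel_draw_bitmap(lcd_handle, area->x1, area->y1, area->x2 + 1, area->y2 + 1, px_map);
    LV_UNUSED(disp);
}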
This is less efficient. It's also ugly. This is what's for dinner now in LVGL. This is "progress".
The worst part is I understand and even sort of agree with why they did it.
The issue is multiple displays, and the fact that some displays may not need the swap, perhaps because they're monochrome or something. Prior to version 9, LVGL simply didn't support that scenario because the swap option was a #define (LVGL is C, not C++) and it applied to all displays as a consequence.
But to remove it entirely seems like it was a decision guided by expediency more than anything. It's unfortunate.
And all of it reminds me of why I hate the fact that humans couldn't universally agree on endian order.
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
modified 1-Jun-24 13:36pm.
|
|
|
|
|
One little two little three lit....
|
|
|
|
|
My wife is little-endian...
|
|
|
|
|
So which way would you want it?
The little endian way we write email addresses and domain names? Or the big endian way we write IP addresses?
The little endian way we write snail mail addresses on an envelope? Or the big endian way we dial a phone number?
The little endian way we sign a document with our full name? Or the big endian way we are listed in the telephone book? (Iceland is an exception - the phone book is little endian!)
The big endian way we write multi-digit Arabic numerals? Or the big endian (?) way Arabs write multi-digit numerals? They read right-to-left, so to them, it is little endian.
The big endian way Americans write month/date, or the little endian way they write month/date - year?
The little endian way street addresses are written by American standards (17 Main street) or the big endian way used e.g. in Norway (Storgata 17)?
The big endian way when adding an entrance (17 Main Street, Entrance B) or stick to a consistent little endian way (Entrance B 17 Main Street)?
Four-and-twenty blackbirds baked in a pie, or Twenty four blackbirds baked in a pie?
A quarter past nine, or nine fifteen?
If you want the entire world to agree on a single endianness, you probably have to work yourself up to a position of an almighty ruler of the world. Even if you got into a position where you could turn IP addresses the other way, put the area code at the end of the phone number, reorder the phone book on first names rather than last, teach schoolkids to write the tens after the ones and then the hundreds and the thousands, make all Americans write the date before the month, and ... What would you do with IBM mainframes? With systems based on MC68, 8051 or OpenRISC, older Power and SPARC systems? Would you make them illegal?
There are some small, insignificant embedded (and other) architectures that allow memory access in either endianness, such as ARM, newer Power or SPARC, Alpha, MIPS, i860 and PA-RISC. You may consider them all to be so unimportant that you will ignore them. You may also find unimportant CPU instructions for reversing the byte order in halfwords, words or doublewords, provided by several architectures of one (or one preferred) endianness.
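For the byte-reversal part, a minimal portable sketch in C. The function names are made up; compilers typically recognise these shift patterns and emit the architecture's byte-swap instruction where one exists, but that is an optimisation, not something the code relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reverse the two bytes of a halfword. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Reverse the four bytes of a word. */
uint32_t swap32(uint32_t v)
{
    return (v << 24) | ((v << 8) & 0x00FF0000u) |
           ((v >> 8) & 0x0000FF00u) | (v >> 24);
}

/* Returns 1 if this machine stores the least significant byte first. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);   /* look at the byte at the lowest address */
    return first == 1;
}

/* Swap a buffer of 16-bit values in place, e.g. pixels on their way to a display. */
void swap16_buffer(uint16_t *px, size_t count)
{
    assert(px != NULL || count == 0);
    for (size_t i = 0; i < count; ++i)
        px[i] = swap16(px[i]);
}
```

So swap16(0x1234) gives 0x3412 and swap32(0x11223344) gives 0x44332211, whatever the endianness of the machine running it.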
I guess you will be following up here at WeirdAndWonderful/Feature/com.codeproject.www//:http
|
|
|
|
|
Frankly, little endian everything, if I had my way, but it's too late. The internet is big endian for example.
I don't know what you're talking about in terms of writing email addresses and domain names.
I'm talking about byte order of machine words. That's all.
|
|
|
|
|
honey the codewitch wrote: I don't know what you're talking about in terms of writing email addresses and domain names.
What goes first, to the left: The smallest unit, i.e. the individual recipient, or the larger unit comprising a huge number of them, the mail server name?
Which goes first: The smaller subdomain 'codeproject' or the TLD 'com'?
Are you really sure that the internet is consistently big endian? IP addresses certainly are, but is that all that there is? (Then let's keep dancing ...)
Curious memory: On the very first map I saw of the internet, the nodes were labeled with a number, mostly 3-digit. Several of the nodes had the same number. I had to have these labels explained to me: It was the IBM 360 mainframe model provided at that site as the main computing resource. IBM mainframes always were big endian. That might explain why IP addresses were selected to be big endian.
If you go for consistent little endian format: Do you consider the last character in a name to be the most significant? If memory contains the bytes, at increasing addresses, 'J', 'O', 'H', 'N', would you then consider 'J' the least significant one, e.g. if you were to sort a list of names? Would you rather choose to store 'JOHN' as 'N', 'H', 'O', 'J'?
Would you then require all names (or other strings) to be of a fixed length, with the characters to be compared for sorting at the same offset from the start of the string? Or would you construct a descriptor for each string, each with an index (offset) starting at the last character, decrementing it as the sorting progressed to less significant characters?
Or would you store numeric values as little endian, but strings as big endian?
Bottom line: It isn't as simple as 'Choose one and always use that'. You can't define yourself away from endianness problems, even if you have omnipotent power. Unless, of course, if your power is so omnipotent that you can define two plus two to make something else than four.
|
|
|
|
|
Generally, the Internet is big endian, as in any binary protocols that exist on the internet expect "Network Byte Order" which is big endian. I'm happy to be proven wrong on that score, but it applies to everything I can think of, be it 32 or 128 bit IP addresses, 16 bit port numbers, NTP, etc.
I generally sort asciibetically if I'm sorting with a machine. I don't care about text in this instance.
Like I said, I was only referring to machine word byte order.
|
|
|
|
|
Sure, we could select one tiny little speck of the endianness problem - byte ordering, say - and ignore the rest. Assuming that it was possible to select one single byte ordering and rule out any others, we could declare: 'Hooray! Now the world is free of endianness problems!'
Maybe that would be the case inside your little IDE. At least until you have to handle a date. Or an IP address. Or a decimal multi-digit number.
You may argue: But those are not endianness problems! They are something different from the ordering of bytes within a word!
I'll accept what you say, and you have the full right to say that nothing else than byte ordering falls under 'endianness'. But the issues are about ordering smallest unit to biggest or biggest to smallest. We could use a different term for this wider problem area, e.g. 'big ordering' or 'small ordering' (and 'mixed ordering'), making 'endianness' a subset of 'ordering'.
In principle, we could then throw out all IBMs, Powers, MC68s, OpenRISC and a number of others, as they are more or less bound to the forbidden endianness. That would leave the tiny 'endianness' speck of the ordering problem area 'solved' (and a few manufacturers would get rid of some nasty competitors). But the rest of the 'ordering' problem domain would remain unsolved. I understand that you don't care about the rest. That is OK with me, as long as you accept that today is 2/6, 2024.
|
|
|
|
|
trønderen wrote: That is OK with me, as long as you accept that today is 2/6, 2024.
Heh. As an American, that actually took some getting used to when I'd date my articles here. We do things differently here. Dumber. See our football. Our "cheese". etc.
|
|
|
|
|
I like this approach: Let's decide, whether the whole world is little or big endian.
Then we can apply this decision in the computer world.
By the way, what programming language was used to write the whole world? I think this was C.
|
|
|
|
|
As the Universe is still expanding, going from small to big, I think the answer is obvious. At least on the scale of the Universe.
|
|
|
|
|
Honey the codewitch wrote: humans couldn't universally agree on endian order.
Now we know why: the endianness of the whole universe is gradually changing from little to big.
|
|
|
|
|
One little, two little, three little Indians,
Six little, five little, four little endians...
|
|
|
|
|
Here's a little test for you.
Does this code compile?
public interface IStorage{
    int Save();
}
public interface IPersist{
    int Save();
}
public class SaveImplementor : IStorage, IPersist{
    public int Save(){
        return 1;
    }
}
I'm implementing two interfaces which contain the same virtual method signature.
Well, it seems a little odd to me.
Obviously, if you want the separate implementations you have to write them explicitly.
Like this:
public class SaveImplementor : IStorage, IPersist{
    int IStorage.Save(){
        return 1;
    }
    int IPersist.Save(){
        return 2;
    }
}
IMPORTANT NOTE: Notice that in the first example you HAVE to include the public modifier on the method implementation.
HOWEVER, on the second example where you explicitly implement the Interface method you CANNOT include the public modifier.
I'm filing this one under weird.
But, I guess I accept it. I have to, or else the C# compiler doesn't accept me.
Answer - The first example does indeed compile.
EDIT
Oh, and after I posted that, I went back and new'd up a SaveImplementor() and then I couldn't figure out how to call either of those explicit methods.
Hmm... It's got me thinking now.
EDIT 2
Here's the simple example that explains the explicit implementation: Explicit Interface Implementation - C# Programming Guide - C# | Microsoft Learn[^] .
modified 25-Apr-24 16:36pm.
|
|
|
|
|
raddevus wrote: Oh, and after I posted that, I went back and new'd up a SaveImplementor() and then I couldn't figure out how to call either of those explicit methods.
Hmm... It's got me thinking now.
((IPersist)saveImplementor).Save()
((IStorage)saveImplementor).Save()
If I'm remembering correctly, a plain saveImplementor.Save() won't work in the second EII example; you'd have to ALSO have a public int Save() {} . I believe this is because EII implementations are ad-hoc polymorphic, the implementation differs depending on the type-view/cast of the object - as opposed to the standard implicit ones which are parametric polymorphic, so it's the same implementation for every type-view/cast of the object.
Honestly it's been a long time since I've seen EII used because of this weird behavior. It makes it so you can have an object that seems to completely change its class/behavior with a simple cast, AND all that new behavior is completely hidden from all other casts of the SAME object.
|
|
|
|
|
For added fun, try throwing some default interface members[^] into the mix.
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
I've always understood C# to automatically "fill in" any empty method slots with methods of the same name. Contrast this with VB.NET where you must (AFAIK) always explicitly tell it which interface it implements.
|
|
|
|
|
... after a frustrating week, this could be taken as a general rant at the quality of open-source software with applications to particular pieces.
Given enough eyeballs, where are those eyeballs looking? Surely not at libpng[^] - the reference implementation for the PNG format. It is a smallish library of about 20 kLOC but the configuration file has over 200 different options. Even if every one of them were a plain on/off switch, that makes for up to 2^200 potentially different ways you could build the library. Either that or some of the options are redundant.
The code quality is atrocious. I understand that it's a project started in the '90s but that's no excuse for not cleaning it up from time to time. You cannot let one test program get to 12000 lines in a single file. And those 12000 lines are full of miracles like parameters and structure members called this! Don't you worry! At the beginning of the file there is this fragment:
#ifdef __cplusplus
# define this not_the_cpp_this
# define new not_the_cpp_new
Also, if the byzantine compile time configuration options make it impossible for the program to run would you think of throwing an error using a #error directive? NO, good quality open-source code just wraps the whole program in an #if block with the #else clause, 12000 lines below:
#else /* write or low level APIs not supported */
int main(void)
{
fprintf(stderr,
"pngvalid: no low level write support in libpng, all tests skipped\n");
return SKIP;
}
#endif
Remember: these are compile time conditions; why would you fail at run time?
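For comparison, a minimal sketch of failing at compile time instead. DEMO_WRITE_SUPPORTED is a made-up stand-in for whatever the real configuration defines, not an actual libpng macro, and static_assert assumes a C11 compiler.

```c
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical configuration macro; a real build system would set this. */
#define DEMO_WRITE_SUPPORTED 1

/* Preprocessor check: the build stops here with a readable message. */
#if !DEMO_WRITE_SUPPORTED
#  error "demo: no low level write support configured, tests cannot be built"
#endif

/* C11 alternative for conditions the preprocessor cannot evaluate, such as sizeof. */
static_assert(sizeof(long) >= 4, "long is narrower than 32 bits");

/* Only reachable when the checks above passed. */
int write_supported(void)
{
    return DEMO_WRITE_SUPPORTED;
}
```

Setting DEMO_WRITE_SUPPORTED to 0 makes the compiler stop at the #error line, so no program that can only print "all tests skipped" is ever produced.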
Have you heard of semantic versioning? Well, check this out (straight from the libpng web site):
Quote: At present, there are eight branches:
master (actively developed)
libpng16 (equivalent to master)
libpng17 (frozen, last updated on 2017-Sep-03)
libpng15 (frozen, last updated on 2017-Sep-28)
libpng14 (frozen, last updated on 2017-Sep-28)
libpng12 (frozen, last updated on 2017-Sep-28)
libpng10 (frozen, last updated on 2017-Aug-24)
libpng00 (frozen, last updated on 1998-Mar-08)
These translate into version numbers as 1.6.x, 1.7.x, 1.5.x, and so on. So, let me get this straight: version 1.7 is frozen and version 1.6 is actively developed? Have you guys run out of numbers? And, guess what, in code you find many tests like these:
#if PNG_LIBPNG_VER >= 10700
if (!for_background && image->bit_depth < 8)
image->bit_depth = image->sample_depth = 8;
#endif
???
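To read a test like the one above, it helps to unpack the number. This sketch assumes the encoding is MAJOR*10000 + MINOR*100 + RELEASE, which is what 10700 for 1.7.0 suggests; the type and function names are made up for illustration.

```c
#include <assert.h>

/* Hypothetical decoded form of a packed version number. */
typedef struct {
    int maj;
    int min;
    int rel;
} lib_version;

/* Split e.g. 10700 into 1, 7, 0 (assumed encoding, see above). */
lib_version decode_version(int ver)
{
    assert(ver >= 0);
    lib_version v = { ver / 10000, (ver / 100) % 100, ver % 100 };
    return v;
}

/* The comparison a '#if VER >= 10700' style test performs. */
int version_at_least(int ver, int maj, int min, int rel)
{
    return ver >= maj * 10000 + min * 100 + rel;
}
```

Under that assumption, any 1.6.x build has a number below 10700, so code guarded by >= 10700 is never compiled on the actively developed branch.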
I will stop here although, after a week of frustrations, I could go on and on.
There is a well-known commencement speech: Make your own bed - University of Texas at Austin 2014 Commencement Address - Admiral William H. McRaven - YouTube[^]. As a developer, and especially as an open-source maintainer, before writing a single line of new code, do everyone a favour: clean the project you are working on; make your own bed!
Mircea
modified 21-Apr-24 13:44pm.
|
|
|
|
|
Thanks for the amusing rant!
You're saying the implementation of the PNG format is only 20 KLOCs and is a hacked together PoS written in C? I'm stunned no one has rewritten it. That was something that regularly gave me joy! But let me guess. Obscure corners of the spec for which no test images exist? People who'd scream about a breaking change that forced them to revisit magic settings for 200 options? Or just, "if it ain't broke, don't fix it"--which makes sense if it's not seeing new development, but it sounds like it is. And that many developers contributing to it over a long period of time were fearful of breaking it, so most (or all, because of a "policy") enabled their new code with a new option?
|
|
|
|
|