|
The level of the library should determine who is doing most of the validation. If I am using low-level C library functions like memcpy, then all validation is on my head. But if I am using a high-level library, I expect it to do some, though not all, of the validation. After all, I bought the library so I could develop a reliable program in the least amount of time.
FYI: I have a tendency to over-validate, since program crashes and the like are against my religion.
John R. Shaw
"Trust in the code, Luke." "Yeah, right."
|
|
|
|
|
John R. Shaw wrote:
The level of the library should determine who is doing most of the validation
Exactly!
cheers,
Chris Maunder
|
|
|
|
|
I found the results of this poll rather puzzling.
I voted for the first option; the person who uses the library should use it correctly.
It seems very odd to do the error checking in two places, which is what a whole lot of you voted for. Why would you waste implementation time (your own time! You should be at the beach) and CPU time doing the same thing twice? Especially when you know that if it's done "right", it simply executes the exact same code twice.
I could understand the argument that "it's better to be safe than sorry", but this sounds kludgy; programming by approximation ("oh, so that's where it crashes. I guess _I_ should have taken extra precautions, even though I'm used to the library catching these things for me.")
There are a number of reasons why I'd rather have the user of the library make sure he feeds correct data than having the library check it every time:
- Efficiency. If the input value needs to be between 0 and 255, the library could check it, but if the user wants to iterate from 0 to 255, that's 256 wasted if-tests CPU-wise. But more importantly:
- Allocation of responsibility. A good coder should know his libraries well enough to know what their valid input parameters are. A good library should make clear for which input values it's supposed to work.
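The efficiency point above can be sketched in code. This is a minimal illustration with hypothetical names (`lookup`, `lookup_checked`, `sum_all` are invented for this example, not from any real library): a checked entry point validates on every call, while the lean variant relies on a documented contract that a well-behaved caller satisfies by construction.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Checked variant: validates the range on every single call.
inline int lookup_checked(const int* table, int index) {
    if (index < 0 || index > 255)
        throw std::out_of_range("index must be in [0, 255]");
    return table[index];
}

// Lean variant: the documented contract says index is in [0, 255].
// A caller iterating 0..255 guarantees that already, so the 256
// redundant range tests are avoided.
inline int lookup(const int* table, int index) {
    return table[index];
}

// Caller side: validation is hoisted out of the loop entirely,
// because the loop bounds themselves establish the precondition.
inline long sum_all(const int* table) {
    long sum = 0;
    for (int i = 0; i <= 255; ++i)
        sum += lookup(table, i);   // contract holds by construction
    return sum;
}
```

Whether the few nanoseconds saved matter is a separate question (raised elsewhere in this thread), but this is the shape of the argument: the caller often has context that makes the check provably redundant.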
It's important that you, the library user, know where the check is being done, and I, for one, would be much more comfortable checking my own input than having that left to the library writer. Not that I necessarily distrust him/them... but why should I trust them? The excuse "I picked a bad library, so it isn't really my fault" might fly with your geekiest friends, but the boss needs a working product.
It's been pointed out that some things are downright infeasible for the library writer to check himself, especially if it's a library for a low-level language like C or C++: how do you determine if a handle is valid? For scripting languages like PHP, I like to add a level of indirection between the handle and the user, and make sure the "Resource ID" is valid. Why? Because you can't trust a PHP programmer to do what he should, and he'll take down the webserver given the chance. Doing this with a C/C++ library (and you could, of course), you'd be walking around in quicksand.
Obviously, C and C++ themselves advocate correct usage of libraries. You should not free a pointer not returned by malloc. You should not delete[] something not returned by new[]. The STL expects you to make sure you're iterating from .begin() to .end() on the same container. Or else... all hell breaks loose.
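Those contracts, written out as code (correct usages only; the mismatched forms are undefined behavior and cannot be safely demonstrated; the function name here is invented for illustration):

```cpp
#include <cstdlib>
#include <numeric>
#include <vector>

// Each allocation primitive pairs with exactly one deallocation
// primitive; mixing them (free on new[], delete on malloc, etc.)
// is undefined behavior that no library check can catch for you.
inline int demo_pairing() {
    int* p = static_cast<int*>(std::malloc(sizeof(int)));
    if (!p) return -1;
    *p = 1;
    int v = *p;
    std::free(p);            // malloc pairs with free

    int* a = new int[4]{1, 2, 3, 4};
    int s = a[0] + a[3];
    delete[] a;              // new[] pairs with delete[]

    // STL contract: begin() and end() must come from the SAME container.
    std::vector<int> c{1, 2, 3};
    return v + s + std::accumulate(c.begin(), c.end(), 0);
}
```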
To sum up my point of view:
- Consider your users' needs.
- For a low-level library, it's more important that the correct usage of a library is documented than enforced by code. That way, everybody knows who's to blame when things go wrong.
--
O< O- O< *quack* O- O< O- O< O- O< *kvekk* O-
Lars Thomas Denstad <larsde@redloop.com>
http://www.redloop.com/larsde/
O< *pip* *pip* *pip* O- O< O- O< O- O< *quack* O- O< O-
|
|
|
|
|
Why perform error checking in both places?
In my opinion, these days, when processors are very fast, we don't (usually) have to be as concerned with extracting the last little bit of performance from our apps.
To put it another way: customers who use our products pay for both efficiency and robustness. Clearly, very sluggish or memory-piggish apps won't be appreciated, but neither would very fast apps which crash a lot.
Again, because processors are (usually) very fast these days, we can afford to spend some performance in assuring robustness. Just my opinion.
|
|
|
|
|
Poor data scrubbing in a library is an open invitation for security holes. If there is a flaw in a library that can be exploited, someone will find it and try to exploit it. Telling hackers and crackers to stop abusing a poorly written library because they should be using it correctly is not going to work.
I voted for number two. The responsibility for the integrity of a library rests solely upon the writer of the library.
|
|
|
|
|
Bingo! If you publish a library (esp. one without source code), you need to make darn sure it doesn't open a security hole. The evil s'kid'diots are fed more clever code every day. If your library is the hole they use to get into MY system, you'll get some hate mail.
|
|
|
|
|
Ok, now imagine a Win32 API without thorough error checking, because, as you say, the application should not pass invalid data.
You passed a wrong ACL and your data was corrupted? Well, take care next time.
You passed a wrong pointer and the OS blue screened? Take care next time.
I think you would rethink your theory pretty soon.
powerful binary resource reuse - another word for "no sources, you are stuck with a pain-in-the-a## COM component"
|
|
|
|
|
Also look at the "reusability" side.
The reason the application calls functions in a library is to reuse code (whether it's OOP or structured programming doesn't matter), so why should you do the same checks again and again when they can be done in one simple, safer place, namely in the function itself?
Another argument is that the function itself is responsible for acting on the input data, so it knows more about the nature of that data and can be more precise in restricting it, since it is the one performing the operation.
Surprising results, IMHO; option number two should get more votes.
|
|
|
|
|
I think each error has a 'locus' determined by the knowledge required to detect and/or remediate the error.
A library should expect to be used as a black box, and therefore should at least validate the assumptions it makes about its arguments. What's important is that the knowledge required to validate the library arguments lies within the library itself. Requiring the caller to validate arguments disperses that knowledge to multiple locations, increasing the probability that the error detection will not be consistent.
The caller does have a role to play in validation, however. It has knowledge of the context in which the library is being used. It is the caller's responsibility to validate its use of the library against that context. In this case, the library doesn't have any way of knowing the application's context, and therefore can't validate against it.
How the library or the caller handles a detected error is immaterial; it can assert, throw an exception, return an error code, or halt the program. Any of these responses is valid, depending on the situation (and personal taste).
Software Zen: delete this;
|
|
|
|
|
When it comes to security related functions, both parties should IMO verify data. At the very least, the library should.
But for applications which are unrelated to security, I like to use the programming-by-contract paradigm. It basically means that if the caller ensures the input is valid according to the function's contract, then the function will deliver according to that contract. Everything else is undefined (in debug builds, that usually means ASSERT()). Bugs are found in no time.
I put the error checks in the I/O-layer. Check the text boxes, check the file data, check the whatever-input, and issue errors as soon as possible instead of doing it 10 function calls down into the library.
Basically, this method puts more effort into checking data in one layer. Beyond that layer, everything is considered ok. It has worked for me so far.
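This boundary-validation style can be sketched as follows. All names here are hypothetical, invented for the example: `parse_age` is the I/O-layer function that scrubs raw text once, and `insurance_bracket` is an inner function that simply asserts the contract rather than re-checking it.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Boundary layer: all validation of raw input happens here, once.
// Returns std::nullopt for anything that is not an age in [0, 150].
inline std::optional<int> parse_age(const std::string& text) {
    if (text.empty() || text.size() > 3) return std::nullopt;
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return std::nullopt;
        value = value * 10 + (c - '0');
    }
    if (value > 150) return std::nullopt;
    return value;
}

// Inner layer: the contract says the age is already valid. In debug
// builds the assert documents and enforces the precondition; in a
// release build (NDEBUG defined) it compiles away and costs nothing.
inline int insurance_bracket(int age) {
    assert(age >= 0 && age <= 150);
    return age < 18 ? 0 : age < 65 ? 1 : 2;
}
```

The design choice: errors are reported where they can be meaningfully handled (at the input boundary, to the user), instead of ten calls down inside the library where only a cryptic failure is possible.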
But then again, methodologies are like religions. Pick your own path and see where it leads you.
--
Chatai. Yana ra Yakana ro futisha ta?
|
|
|
|
|
Jörgen Sigvardsson wrote:
But for applications which are unrelated to security, I like to use the programming by contract paradigm. It basically means that if the caller ensures that the input is valid according to the functions contract, then the function will deliver according to the contract. everything else is undefined (in debug builds that usually means ASSERT()). Bugs are found in no time.
Unfortunately, it's exactly this behaviour that causes security holes. The leaky functions are not the security-related ones, because those are carefully designed. No, it's the innocent-looking things like strcpy that cause trouble.
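The strcpy problem in a nutshell: its contract makes the destination size the caller's responsibility, and nothing inside the function can check it, so one missed caller-side check becomes a buffer overflow. A defensive alternative is a length-aware copy that always terminates and never writes past the buffer; the helper below is a hypothetical sketch (similar in spirit to BSD's strlcpy, but the name and exact behavior here are invented for illustration).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Bounded copy: writes at most dst_size bytes including the
// terminator, truncating if necessary, and always NUL-terminates
// (when dst_size > 0). Returns the number of characters copied.
inline std::size_t copy_bounded(char* dst, std::size_t dst_size,
                                const char* src) {
    if (dst_size == 0) return 0;
    std::size_t i = 0;
    for (; i + 1 < dst_size && src[i] != '\0'; ++i)
        dst[i] = src[i];
    dst[i] = '\0';           // guaranteed termination
    return i;
}
```

Truncation is not free of problems either (silently shortened data has its own failure modes), but it turns a memory-corruption exploit into an ordinary logic bug.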
Jörgen Sigvardsson wrote:
But then again, methodologies are like religions. Pick your own path and see where it leads you.
Nope; ask someone who may know better which path is usable, and improve it. Never believe that you are clever enough to invent something better than everyone else. Always get a few more pairs of eyes and some other brains thinking on it.
powerful binary resource reuse - another word for "no sources, you are stuck with a pain-in-the-a## COM component"
|
|
|
|
|
I know of one major phone PBX that will crash if you send a bad connection string via its OAI stream interface.
I have one rule: I don't care what the routine does with the garbage I pass in, as long as it doesn't crash.
Michael
Wonder Woman, Wonder Woman.
All the world's waiting for you,
and the power you possess.
In your satin tights,
Fighting for your rights
And the old Red, White and Blue.
|
|
|
|
|
Michael P Butler wrote:
I know of one major phone PBX that will crash if you send a bad connection string via its OAI stream interface.
Oooh, bad! On the other hand, one wouldn't have to talk to people the rest of that day.
I don't know if the vagueness of the poll was intentional, but what you have described isn't what I'd call a "library" call. The (very serious) error here is that a function that should expect any input (both because it's a vital system that should reasonably assume bad input, and because there is no way in h*ll they could reasonably assume only valid and predictable input from something as inherently unsafe as a communications port) isn't verified, and more importantly the error isn't handled in a reasonable manner.
I'd say it all boils down to the reader's definition of "library" and what the documentation of the library states.
If the library is the C lib, I would expect, e.g., memcpy(0, 4711, 42); to crash.
If the library is e.g. user32.dll, I would NOT expect it to crash from bad parameters to e.g. CreateWindow (probably mostly since it's documented to return error-values, and that it has to verify parameters since it's a kernel-call).
If the library is a C library whose documentation says "if parameters are in error, we will return an error code", I wouldn't expect it to crash. If it's not documented, or documented to accept only legal pointers, I'd expect it to crash given illegal input.
If it was see-hash I wouldn't expect it to even start...
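The two contract styles listed above can be put side by side. Both functions here are hypothetical, invented for the example: one documents "bad parameters return an error code" and so must check; the other, like memcpy, documents "only valid pointers" and does no checking at all.

```cpp
#include <cassert>
#include <cstddef>

// Contract style A: documentation promises an error code for bad
// parameters, so null is detected and reported, never dereferenced.
enum FillResult { FILL_OK = 0, FILL_BAD_ARGS = 1 };

inline FillResult safe_fill(unsigned char* dst, std::size_t n,
                            unsigned char value) {
    if (dst == nullptr && n != 0) return FILL_BAD_ARGS;
    for (std::size_t i = 0; i < n; ++i) dst[i] = value;
    return FILL_OK;
}

// Contract style B: documented to accept only valid pointers, like
// memcpy. Passing null is the caller's bug, and a crash is within
// the documented contract.
inline void fast_fill(unsigned char* dst, std::size_t n,
                      unsigned char value) {
    for (std::size_t i = 0; i < n; ++i) dst[i] = value;
}
```

Neither style is wrong in itself; what matters, as the post says, is that the documentation states which one you are getting.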
|
|
|
|
|
I would have to choose the caller, because the caller can always verify the data according to the documentation, even if the library checks the data as well.
However, if the responsibility is put on the library, it is possible that a given library does not take responsibility for the data and does not verify it. In that case, since the caller depended on the library and the library did not verify, no one has checked, and a problem can occur.
With that being said, I still think it depends, and this design issue should be settled up front, before coding starts or the choice of library is made.
Build a man a fire, and he will be warm for a day. Light a man on fire, and he will be warm for the rest of his life!
|
|
|
|
|
c'mon, no option for "Dump core and let the user figure it out"? W0t!?!
shog
nine
Ever since i heard the voice
i thought i had no choice...
|
|
|
|
|
Eh, C++ isn't any better (actually, worse). There we either just display a message box saying "Program hosed, exiting", or we try to repair ourselves and half the time leave the program in a trashed state. I have had programs that would display an exception message box in their OnPaint. Every time you tried to exit the program, you would get another exception.
Tim Smith
I'm going to patent thought. I have yet to see any prior art.
|
|
|
|
|
:: Jeremy waves his hands frantically. :: Here, here, over here!
Jeremy Falcon
Imputek
|
|
|
|
|
Sometimes I write libraries, and sometimes I use other libraries. So, we have two scenarios:
1) When I use other libraries, I want them to be "lean", and I'll take care about error-checking. STL is a great example of this.
2) When I write a library that others will use, I'll do all error checking (unless the requirements say otherwise), so that whoever uses the library cannot blame me for their incompetence.
|
|
|
|
|
I remember, barely a few years ago, hearing from C++ devs who prided themselves on writing lean and mean apps that did as little error and bounds checking as was strictly necessary. Not to say that it wasn't done: rather, it was done before a function call, and only once the data was verified would the function be called. The thought that the function they were calling would do further checking was considered by some to be not just a waste of cycles but also something only of value to lame programmers who still used night-lights.
It's interesting to see a shift in focus from performance-based programming to security- and robustness-based programming.
cheers,
Chris Maunder
|
|
|
|
|
Chris Maunder wrote:
rather, that it was done before a function call and then once the data verified the function would be called.
Which is fine, so long as you control both. But the use of 3rd party libraries requires the library to either be very well designed and documented (so validation *can* be performed ahead of time), or to be robust enough not to crap out when bad data is passed in.
shog
nine
Ever since i heard the voice
i thought i had no choice...
|
|
|
|
|
Being a C++ dev at heart, I feel I'm in a position to answer.
IMHO the poll is heavily biased towards fat-interface APIs with little abstraction, where pointers to built-in types are passed around. It is in the C++ spirit to minimize the possibility of erroneous calls, rather than merely detecting them, by means of (at least) the following two techniques:
- Data abstraction,
- the RAII (Resource Acquisition Is Initialization) idiom.
Let me give a couple of examples:
- Buffer overruns and dangling char pointers can be a nightmare only if you use them. std::string greatly reduces these risks simply because they are not allowed. Here, data encapsulation ensures some degree of integrity between related pieces of information (like a buffer and its length). The richer the semantics of a type, the fewer errors of this kind will happen.
- The semantics of CWnds often force the programmer to go through the sequence construction-creation-destruction. This is IMHO an open hole for errors, because the possibility exists of having a "void" CWnd with no attached HWND. If the MFC guys had moved to RAII, CWnds would be created as part of their constructor, eliminating the possibility of the error described above. This renders the common ASSERT(IsWindow())s unnecessary. "Void" CWnds can still be modeled, should the need arise, by having unassigned CWnd pointers.
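The RAII point can be sketched without MFC. The `Window` class below is a hypothetical stand-in (it is not CWnd, and its "handle" is simulated, not a real HWND): because the constructor performs the acquisition, an object that exists is always backed by a live resource, so there is no "void" half-constructed state for an ASSERT(IsWindow()) to guard against.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical RAII window wrapper, for illustration only.
// Construction IS creation: there is no separate Create() step,
// so no Window object can exist without a valid handle.
class Window {
public:
    explicit Window(const std::string& title) : title_(title) {
        if (title.empty())                       // acquisition can fail...
            throw std::runtime_error("window creation failed");
        handle_ = static_cast<int>(title.size()); // ...but never half-succeed
    }                                             // (simulated handle value)
    ~Window() { handle_ = 0; }                    // release in the destructor

    int handle() const { return handle_; }        // invariant: non-zero
    const std::string& title() const { return title_; }

private:
    int handle_ = 0;
    std::string title_;
};
```

A not-yet-created window, should the design need one, is expressed as a null `Window*` (or today, an empty smart pointer), which the type system makes impossible to confuse with a live window.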
Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo
|
|
|
|
|
Joaquín M López Muñoz wrote:
The semantics of CWnds often force the programmer to go through the sequence construction-creation-destruction. This is IMHO an open hole for errors, because the possibility exists of having a "void" CWnd with no attached HWND. If the MFC guys had moved to RAII, CWnds would be created as part of their constructor, eliminating the possibility of the error described above. This renders the common ASSERT(IsWindow())s unnecessary. "Void" CWnds can still be modeled, should the need arise, by having unassigned CWnd pointers.
I think you are right about RAII, but in this particular case I'm not sure what the solution should be, since allowing a CWnd with no attached HWND allows some simplification of the code and helps with RAD...
The C++ Builder solution (and Delphi's too) is that the object hides whether the underlying window has been created yet by essentially duplicating all properties (for example Text, Position, ...) so that you can alter them even before the window exists... It works relatively well. I think .NET Forms are similar in that area.
Maybe a good solution would be to hide the user interface a bit more, accessing the data through a class that knows whether an interface object is connected to it (so that the UI is updated when created)... But IMHO such a solution is probably too complicated for the benefit.
OTOH, since MFC uses DDX for data exchange, maybe it would have been possible to use RAII exclusively... but then we would have needed the ability to transfer far more properties... and also MFC updates everything at once, which would cause a lot of overhead...
Philippe Mori
|
|
|
|
|
Chris Maunder wrote:
It's interesting to see a shift in focus from performance-based programming to security- and robustness-based programming.
I think there is a middle ground for C and C++ programmers, a way to have it both ways (almost). My approach, which I've only recently adopted, is to use assertions to validate preconditions. That way, when the code is finished, those tests do not affect the performance of the final product. Exceptions come into the picture when a function cannot perform its duty after the preconditions have been verified through the assertions. In those cases, an exception is thrown.
This is not a foolproof method, because it is possible to imagine situations in which, in production code, the UI passes along invalid data from the user. The assertions are no longer there at that point, so you have trouble. However, this is the UI's responsibility, not the library's; validating user input is up to the UI. As long as the preconditions are clearly documented, the library is no longer responsible for these kinds of bugs.
I think the use of assertions for validating preconditions is a good middle ground: defensive programming without driving yourself nuts.
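The split described above might look like this in practice. The function is hypothetical, invented for the example: a precondition violation (a bad range) is a programmer error caught by assert in debug builds and compiled away under NDEBUG, while a failure that can occur even with valid arguments (an empty range, which has no average) is reported with an exception that survives into release builds.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Averages samples[first, last). Precondition: first <= last and
// last <= samples.size(); violating it is a caller bug (assert).
// An empty range is a legitimate runtime condition, so it is
// reported with an exception rather than an assertion.
inline double average(const std::vector<double>& samples,
                      std::size_t first, std::size_t last) {
    assert(first <= last && last <= samples.size());  // debug-only check
    if (first == last)
        throw std::runtime_error("empty range: no average exists");
    double sum = 0.0;
    for (std::size_t i = first; i < last; ++i) sum += samples[i];
    return sum / static_cast<double>(last - first);
}
```

The dividing line to remember: asserts are for conditions the caller could and should have prevented; exceptions are for conditions the caller could not have known about in advance.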
|
|
|
|
|
*wave arms* I'm here!
But I do make a difference between "ordinary code" and security related code.
In Debug-builds I [ATL]ASSERT() anything that moves. ASSERT() is a friend you can trust.
--
Chatai. Yana ra Yakana ro futisha ta?
|
|
|
|
|
Robustness first, performance second.
I've always lived by that rule, unless the boss forces me to do otherwise.
Most performance implementations are based on assumptions about the code. That's all well and good until you get some hotshot programmer that breaks all your assumptions and tears down that house of cards you call an application.
------- signature starts
"...the staggering layers of obscenity in your statement make it a work of art on so many levels." - Jason Jystad, 10/26/2001
Please review the Legal Disclaimer in my bio.
------- signature ends
|
|
|
|
|