The | operator is the bitwise logical or operator. It is not a non-short circuiting equivalent to ||.
You're right. Thanks for pointing this out.
I will use the terms used in section Operators of the official Oracle docs (i.e. logical OR operator and bitwise inclusive OR operator), and I'll change the text from:
Developer E also gets it wrong — not because the order of the expressions is wrong, but because she uses the non-short-circuiting or operator | instead of its short-circuiting counterpart ||, which also means that a null pointer error is thrown if email points to null:
to:
Developer E also gets it wrong, but not because the order of the operands is wrong. Instead of using the short-circuiting logical OR operator ||, she uses the bitwise inclusive OR operator |, which has the effect of a non-short-circuiting logical OR operator if both operands are of type boolean. Therefore a null pointer error is thrown if email points to null:
Note: I've already changed the original article, but it will take some time for the changes to appear here on CodeProject.
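For readers following this exchange, here is a minimal Java sketch of the difference between the two operators. The class and method names are made up for illustration and are not taken from the article.

```java
class OrOperatorDemo {
    // Logical OR (||) short-circuits: when email is null the left operand
    // is true, so email.isEmpty() is never evaluated.
    static boolean isMissingShortCircuit(String email) {
        return (email == null) || email.isEmpty();
    }

    // Bitwise inclusive OR (|) on two boolean operands always evaluates both,
    // so email.isEmpty() runs even when email is null and throws
    // a NullPointerException.
    static boolean isMissingEager(String email) {
        return (email == null) | email.isEmpty();
    }
}
```

Calling isMissingShortCircuit(null) returns true, while isMissingEager(null) throws a NullPointerException.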
Both are useful -- empty does not imply null unless that is what the caller wants.
In the particular case of reading a CSV file, [,,] is not the same as [,"",] -- the caller may want the former or both or neither to be replaced with null.
In the particular case of reading a CSV file, [,,] is not the same as [,"",]
That's an interesting case.
In idiomatic PTS, [,,] and [,"",] are the same; they are both parsed as null.
This makes client code:
- simpler, because we only have to check for null
- less error-prone, because if we forget to check for null (often a corner case), the compiler reminds us to do it (in a null-safe language)
However, this works only if [,,] and [,"",] have the same meaning.
Suppose that [,,] and [,"",] have different meanings (according to some specification), and that both values need to be handled differently in client code, although this is bad practice:
- for reasons explained in sub-section 'Argument #3' of section Can We Do It?
- according to Microsoft's Guidelines for Collections: "The general rule is that null and empty (0 item) collections or arrays should be treated the same."
Section Working with Non-PTS Libraries briefly explains how situations like this can be handled in PTS. For example, we could use type emptyable_string, provided to be used in such cases. The parser would then return null for [,,] and an emptyable_string for [,"",]. This allows client code to differentiate both cases and handle them differently.
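To make the two options concrete, here is a rough Java sketch of a hypothetical field parser. It is not PTS code: Java has no emptyable_string type, so an ordinary empty String stands in for it, and the parser is deliberately simplified (no embedded commas or escaped quotes). The names CsvFields, parseLine and emptyQuotedAsNull are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

class CsvFields {
    // Parses one line of a simplified CSV dialect.
    // An unquoted empty field ([,,]) always becomes null.
    // A quoted empty field ([,"",]) becomes null when emptyQuotedAsNull is true,
    // otherwise it is kept as "" so that client code can tell the two apart.
    static List<String> parseLine(String line, boolean emptyQuotedAsNull) {
        List<String> fields = new ArrayList<>();
        // The limit -1 keeps trailing empty fields.
        for (String raw : line.split(",", -1)) {
            if (raw.isEmpty()) {
                fields.add(null);
            } else if (raw.length() >= 2 && raw.startsWith("\"") && raw.endsWith("\"")) {
                String inner = raw.substring(1, raw.length() - 1);
                fields.add(inner.isEmpty() && emptyQuotedAsNull ? null : inner);
            } else {
                fields.add(raw);
            }
        }
        return fields;
    }
}
```

With emptyQuotedAsNull set to true, both a,,b and a,"",b yield null for the middle field. With false, the second line yields an empty string instead.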
It all boils down to specific app requirements. Any general purpose language shall provide as much flexibility as reasonably possible.
It's perfectly normal to use null for strings as well as nullable value types in C#, to signal that a value is not provided. E.g. null: no user input; empty: the user specifically/intentionally left the input blank. This is useful when you need a user to demonstrate his intention.
A dev shall never assume anything unless there is no other option. If something is objectively unknown, assume the worst. It's called defensive programming, and it pays off.
There's a reason that, after this many iterations of C# (currently C# 12, 2023), String.Empty still exists.
It's not merely for backwards compatibility... once Entity Framework was introduced, along with CodeBehind, table classes require that string field(s) be initialized with String.Empty for non-NULLABLE columns in the underlying database.
Even more compelling: when you want to ensure that your business logic does not generate NullReferenceExceptions, you use String.Empty to represent "blank" fields. I would argue that adding null checks and exception handling scaffolding makes the code harder to read, harder to maintain and more error-prone.
Best practice is to write lock-free, exception-free code to reduce the number of paths required to implement and test.
My subject matter expertise is in C/C++/C#, so I cannot speak to other languages. I would, however, posit that there are likely similar conditions/circumstances that exhibit the same, or similar, requirements.
Indeed, I should have mentioned them in the article.
In a previous comment I said:
... a whitespace-only (non-empty) string is a perfectly valid value in PTS.
A whitespace-only string could represent, for example, the indent in a line of a YAML document (important data required to parse the YAML structure correctly).
After all, null itself does not have a clear type.
Some languages, including C#, have specific nullable types like List<int>?. Others do not, such as Python (I guess).
While you could write List<int>? numbers = null;, trouble comes up as soon as you try to append an item to the list. To do so, the list must not be null after all.
Instead, the list must be overwritten by a list of one element.
numbers = new List<int>(){7};
AIUI, we'll get better benchmarks when we first allocate the memory for an empty list, then just append an element instead of multiple steps:
(1) allocate memory for numbers,
(2) write new List<int>(){7} into memory,
(3) copy contents of Step (2) into numbers,
(4) let garbage collector remove the copy.
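The null-versus-empty point can be sketched in Java (used here only as a stand-in for the C# snippet above; the names are hypothetical). The sketch shows the extra branch a nullable list forces on the code that appends. It makes no claim about benchmark results.

```java
import java.util.ArrayList;
import java.util.List;

class NullableListDemo {
    // Nullable style: numbers may be null, so it has to be replaced
    // by a real list before an element can be appended.
    static List<Integer> appendNullable(List<Integer> numbers, int value) {
        if (numbers == null) {
            numbers = new ArrayList<>();
        }
        numbers.add(value);
        return numbers;
    }

    // Empty-list style: the list always exists, so appending needs no check.
    static List<Integer> appendToExisting(List<Integer> numbers, int value) {
        numbers.add(value);
        return numbers;
    }
}
```

Passing null to appendToExisting throws a NullPointerException, which is the trouble described above.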
After all, null itself does not have a clear type.
Indeed, most languages don't have a specific type for null (as far as I know). On the other hand, PTS has a dedicated type null, as explained in section The null Type of a previous PTS article.
DGrothe-PhD wrote:
trouble comes up as soon as you try to append an item to the list. To do so, the list must not be null after all.
Lists are immutable in PTS, as explained in section Non-empty Immutable Collections. Functions use Immutable Collection Builders to create a list, and then return a non-empty, immutable list (or null if null is allowed). Because lists are immutable, we can't append elements to them.
However, if you need a list that changes over time (e.g. to implement a stack), then you can use a non-null mutable list which can be empty, as explained in section Mutable Collections That Can Be Empty.
C code will crash if you pass a null pointer to wcslen. So I wrote wcslen_s, which is similar to IsNullOrEmpty. I did this because I sometimes got an empty string and sometimes got a null pointer when I called an API function.
I appreciate the attempt to look at things differently, but I can't get behind any of this. You talk about "null safe" languages, but honestly, this is just Option/Maybe in disguise. What type is null? Originally I assumed it was just a value that any reference type could have, but no, you use it as type, as evidenced by your saying you use a union of list<string> or null. So null isn't a value (with no type) but is actually a special type, i.e. it's language sugar for Maybe/Option. Great, I can get behind that. But...
What do you do with a function like filter, which could filter out all items in the list? Oh, filter must return a union of list<string> or null, problem solved, right? Except now you've changed types and have made it extremely difficult to chain operations. OK, let's fix that by having filter and other collection operations accept list<string> or null as input as well. Now chaining operations works well again. Except... isn't this exactly the behavior you'd have if empty collections were allowed?
You really aren't solving any issues here. You're applying concepts that are already applied in other languages with new rules that are foreign, and simply more confusing.
I don't want a "null safe" language. I want a language without null, with Option/Maybe types, with empty collections and a type system strong enough to apply constraints to new types. A non-empty collection type is certainly useful in some cases, and I should be able to declare such a type, but making it the default would be bad, and not being able to have an empty collection type simply can't be considered a good thing, at least IMHO.
I think that your remarks/questions regarding null and null-safety are all addressed in articles #4 and #5 of the PTS article series (this is article #7 in the series):
#4: Union Types in the Practical Type System (PTS): explains why PTS uses union types for nullable object references (e.g. string or null)
#5: Null-Safety in the Practical Type System (PTS): explains the concepts of null and null-safety in PTS
Note: A link to all articles in the series can be found in section Links to All Articles in the Series at the end of the first PTS article.
wkempf wrote:
So null ... is actually a special type
Yes, in PTS there is a dedicated type named null, and the only valid value for this type is also named null.
A union type is used to declare a nullable object reference (e.g. supplier or null)
The above mentioned PTS articles #4 and #5 fully explain these concepts.
wkempf wrote:
So null isn't a value (with no type) but is actually a special type, i.e. it's language sugar for Maybe/Option.
Sorry, but I have to disagree on this.
Yes, both null-safety and Option/Maybe effectively eliminate null pointer errors, which is great. But they are both very different concepts with their own pros and cons. In my opinion, null-safety is the better option, for reasons explained in my article Null-Safety vs Maybe/Option - A Thorough Comparison, as well as in the above mentioned PTS articles.
wkempf wrote:
now you've changed types and have made it extremely difficult to chain operations
PTS provides special operators and statements to simplify null-handling and keep source code succinct (see examples in the PTS null-safety article)
For example, to chain functions that might return null, you can use The Safe Navigation Operator (?) (borrowed from other languages), and simply write:
const result = f1(a1)?.f2(a2)?.f3(a3)
The result will be null if any of the three chained functions returns null. You don't need to write ugly, nested if statements.
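For comparison, Java has no safe navigation operator. The closest standard-library approximation is probably Optional.map, which yields an empty Optional as soon as one step returns null. Note that this is technically the Option/Maybe style rather than PTS null-safety, so take it only as a rough analogy. All helper names below are invented for the example.

```java
import java.util.Optional;

class SafeChainDemo {
    // Each helper returns null when it has nothing meaningful to return.
    static String trimOrNull(String s) {
        String t = s.trim();
        return t.isEmpty() ? null : t;
    }

    static String domainOrNull(String email) {
        int at = email.indexOf('@');
        return (at < 0 || at == email.length() - 1) ? null : email.substring(at + 1);
    }

    static String topLevelOrNull(String domain) {
        int dot = domain.lastIndexOf('.');
        return (dot < 0 || dot == domain.length() - 1) ? null : domain.substring(dot + 1);
    }

    // The chain stops at the first step that produces null;
    // no nested if statements are needed.
    static String topLevelDomain(String input) {
        return Optional.ofNullable(input)
                .map(SafeChainDemo::trimOrNull)
                .map(SafeChainDemo::domainOrNull)
                .map(SafeChainDemo::topLevelOrNull)
                .orElse(null);
    }
}
```

Here topLevelDomain(" bob@example.com ") returns "com", while a blank string, a string without '@', or null all return null.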
wkempf wrote:
isn't this exactly the behavior you'd have if empty collections were allowed?
Yes, the behavior is the same.
The main problem with empty collections is that we often forget that they represent corner cases which must be handled differently. The advantage of using null instead of an empty collection (in a null-safe language) is this: If we forget to handle the corner case then the compiler reminds us to do it.
Why are null pointer errors so dreaded? Because they occur often (even in production). And why do they occur often? Because we programmers often forget to check for null. Hence, null-safety helps us to write more reliable code, and by using null instead of empty collections and empty strings, this advantage is extended to all "there are no elements" corner cases. There's no longer any risk of overlooking these kinds of corner cases.
wkempf wrote:
You really aren't solving any issues here.
Really? There are many considerable benefits explained in section Should We Do It?, and demonstrated by examples in the article.
wkempf wrote:
You're applying concepts that are already applied in other languages
I am not aware of any programming language that ensures null-safety (as done in PTS) and doesn't permit immutable collections to be empty, leading to the benefits explained in the article.
However, if you know such a language, then please let us know.
wkempf wrote:
I don't want a "null safe" language. I want a language without null, with Option/Maybe types
That might be a good choice because Option/Maybe types also eliminate null pointer errors, and they have other advantages which might be important in your context.
wkempf wrote:
... with empty collections
Then you can't benefit from the advantages demonstrated in this article. As explained in section Languages That Use an Optional/Maybe Type, the idea of non-empty strings and collections can also be applied in these languages, thereby reaping benefits akin to those demonstrated in this article.
wkempf wrote:
... and a type system strong enough to apply constraints to new types
Yes, that's a very important type system feature, very helpful to write reliable code.
wkempf wrote:
not being able to have an empty collection type simply can't be considered a good thing
I find several of these articles to be either naive, or worse, disingenuous. The article comparing to Option/Maybe in particular.
Quote:
Haskell uses the Maybe type with a generic type parameter. From the Haskell doc.: "A value of type Maybe a either contains a value of type a (represented as Just a), or it is empty (represented as Nothing). The Maybe type is also a monad."
On the other hand, PPL uses union types to state that a value is either a specific type, or null.
Maybe is a union type. Everything this article discusses is purely about language syntax differences, not about any differences in the type system. I honestly see NO differences in the type systems. int or null is entirely identical from a type system perspective as Maybe<int>.
Quote:
Compile-time null-safety - used in some modern programming languages.
You've said this several times, but what other languages support what you call "compile-time null-safety"? If there are other languages, why are these articles of interest and why does PPL exist?
That entire blog post goes out of its way to show "failings" of the Haskell language compared to PPL (in quotes, because every single one is a questionable point), NOT to show differences in the type systems.
Indeed, a collection can be null, empty or non-empty; and we can argue about the usefulness of the empty case.
In contrast, a string can even be whitespace-only.
And what about a non-empty collection whose elements are all null or logically-empty?
From a logical standpoint, whitespace-only strings are often valuable as much as empty strings, especially when originating from user input.
Note that in .NET, along with IsNullOrEmpty, we also have IsNullOrWhiteSpace (that checks for null, empty or whitespace only cases).
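As a rough Java analogue of those two .NET checks (a sketch only; the class and method names are hypothetical, and String.isBlank requires Java 11 or later):

```java
class BlankChecks {
    // True for null and for the empty string "".
    static boolean isNullOrEmpty(String s) {
        return s == null || s.isEmpty();
    }

    // True for null, for "", and for strings made up only of whitespace.
    static boolean isNullOrBlank(String s) {
        return s == null || s.isBlank();
    }
}
```

A string of two spaces is reported as not empty by the first check but as blank by the second. The exact set of characters treated as whitespace may differ between the two platforms.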
Furthermore, nullable behavior and empty concept are extendible to any class or struct type.
Why should we embrace the non-empty string and non-empty collection paradigm, while ignoring it for other types?
E.g.: should we enforce null rectangles (Rectangle? or Nullable<Rectangle>) in place of empty rectangles? When would this be helpful? When would this be harmful?
As a side note, I prefer to use null to represent missing or unavailable information and empty values when the information is available and explicitly not populated.
a string can even be whitespace-only ... whitespace-only strings are often valuable
Yes, and a whitespace-only (non-empty) string is a perfectly valid value in PTS.
A whitespace-only string could represent, for example, the indent in a line of a YAML document (important data required to parse the YAML structure correctly).
Daniele Rota Nodari wrote:
And what about a non-empty collection whose elements are all null or logically-empty?
That's also a valid case in PTS. If an object reference is of type list<string or null>, then elements in the list may (possibly all) be null.
Daniele Rota Nodari wrote:
in .NET, along with IsNullOrEmpty, we also have IsNullOrWhiteSpace
Thanks, good to know.
Daniele Rota Nodari wrote:
should we enforce null rectangles (Rectangle? or Nullable<Rectangle>) in place of empty rectangles?
Could you please explain what you mean by "empty rectangle"?
If you have a type Rectangle, then you can either have an object reference of type Rectangle (a struct with width, height, etc.), or an object reference of type Rectangle or null (or Rectangle?), in which case the object reference might point to null (i.e. no rectangle data available).
Daniele Rota Nodari wrote:
I prefer to use null to represent missing or unavailable information and empty values when the information is available and explicitly not populated.
Then you are assigning different meanings to null and empty, which, as explained in the article, can be error-prone and should be avoided (see "Argument #3" in section Can We Do It?)
Microsoft puts it like this in Guidelines for Collections: "The general rule is that null and empty (0 item) collections or arrays should be treated the same."
Side note: I really like your efforts in PTS to make it strong in handling null or empty objects (especially when it comes to immutable and mutable counterparts).
ChristianNeumanns wrote:
Could you please explain what you mean by "empty rectangle"?
An empty rectangle is a rectangle object that geometrically has no area and/or perimeter; that is, its width and/or height are 0.
Depending on the implementation, all its properties, including x and y, can be set to 0, or only some of them.
For example, in .NET:
- a Rectangle struct is empty if all of its members (X, Y, Width, Height) are 0.
- a RectangleF struct is empty if either of its size-related members (Width or Height) is 0 or negative. (Note that the documentation, including the XML in the source code, is inaccurate regarding the actual implementation).
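The two definitions of "empty" described above can be written down side by side. This Java sketch is modeled only on the description in this comment, not on the .NET source code, and the names are hypothetical.

```java
class EmptyRectangleDemo {
    // "Empty" in the sense described for Rectangle: every member is 0.
    static boolean isEmptyAllZero(int x, int y, int width, int height) {
        return x == 0 && y == 0 && width == 0 && height == 0;
    }

    // "Empty" in the sense described for RectangleF: no positive area,
    // regardless of the position.
    static boolean isEmptyNoArea(float width, float height) {
        return width <= 0 || height <= 0;
    }
}
```

A zero-sized rectangle positioned at (5, 5) is not empty under the first definition but is empty under the second, which illustrates the inconsistency.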
ChristianNeumanns wrote:
assigning different meanings to null and empty, which, as explained in the article, can be error-prone
Yes, I agree that it can be error-prone: it depends on implementation practices and on the role of the specific object (input data, optional value, etc.).
ChristianNeumanns wrote:
The general rule is that null and empty (0 item) collections or arrays should be treated the same
Indeed, I usually test for both cases before performing certain operations.
Clearly, I have to test for both because the data is allowed to be in either state; otherwise, I would need to test for only one of them (presumably null).
Just for reference, I usually don't trust any input: unless previously validated as appropriate, I write the proper checks in the methods.
I don't even trust Microsoft guidelines because they are often inconsistent with their own method implementations (as you can see from the IsEmpty properties of Rectangle and RectangleF).