I am changing my previous opinion because I think I read your article too quickly!
I focused on the differences between pXML and JSON, and not on HTML, where indeed everything is a character string...
The format that you offer does not guarantee the restitution of the original data
I'm sorry, but I don't understand what you mean. Could you give us an example please, and explain exactly what you mean by "does not guarantee the restitution of the original data".
DidierO wrote:
no guarantee on the type
All values in XML documents are just strings. That's how XML works, and therefore the same is true in pXML. There is no native way in standard XML to specify 'types'. You can add 'type information' with metadata, and you can define XML schemas to validate string values. But it's not like in JSON, where native values can be strings, numbers, booleans, or null. It seems that you are not aware of the fundamental differences between XML and other formats.
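To make the difference concrete, here is a minimal Python sketch using only the standard library. The element names and the `type` attribute are made up for illustration; the attribute is just one possible metadata convention that an application could interpret, not something defined by XML or pXML.

```python
import json
import xml.etree.ElementTree as ET

# JSON has native value types: the parser returns int, bool and None.
data = json.loads('{"count": 42, "active": true, "note": null}')
json_types = {key: type(value).__name__ for key, value in data.items()}

# XML only has text: "42" and "true" both come back as strings.
root = ET.fromstring("<item><count>42</count><active>true</active></item>")
xml_types = {child.tag: type(child.text).__name__ for child in root}

# Hypothetical convention (an assumption for this sketch): a 'type'
# attribute carries metadata, and the application converts the string.
typed = ET.fromstring('<count type="int">42</count>')
converters = {"int": int, "float": float, "str": str}
value = converters[typed.get("type", "str")](typed.text)

print(json_types)
print(xml_types)
print(value, type(value).__name__)
```

The XML parser itself never produces anything but strings; any conversion happens in application code or through a schema-aware tool.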
DidierO wrote:
loss of white characters at the ends of the values
That's simply not true, unless I totally misunderstand your point. If you write [name foo ] in pXML, then the trailing space after "foo" is part of the value of name. Please provide an example if this is not what you are talking about.
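As a rough illustration on the XML side only: the sketch below assumes that `[name foo ]` corresponds to `<name>foo </name>` once parsed, and uses Python's standard `xml.etree.ElementTree` rather than a pXML parser. It checks that the trailing space survives parsing and a serialize/parse round trip.

```python
import xml.etree.ElementTree as ET

# Assumed XML equivalent of the pXML snippet [name foo ]
element = ET.fromstring("<name>foo </name>")
value = element.text

# Serialize back to text and parse again to check the round trip.
serialized = ET.tostring(element, encoding="unicode")
round_tripped = ET.fromstring(serialized).text

print(repr(value))
print(repr(round_tripped))
```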
I honestly think that your vote is totally unjustified (because your arguments are wrong). You might consider reevaluating your arguments and vote.
Isn't that true for any format when you serialize for storage?
Yes, true for most formats.
My intention is to (later) add types as an optional extension to pXML. Besides predefined types like boolean, number variations, date, time, list, map, etc., it must be easy for a user to add customized types. I have a very concrete idea about how to do that (without changing pXML's syntax), and I might publish a "Suggestion for types in pXML" article in the future, and consider feedback from the community.
Most large documents are created by WYSIWYG editors. Style is preset by the developers of the editor and difficult to change. The current solution is Cascading Style Sheets (CSS). These can easily become a maintenance nightmare. What is needed is named blocks — sort of like subroutines in code. A syntax is needed to define a name and its pXML code block, both with and without the use of an external style file.
An important feature to simplify maintenance is to prevent redefinition of a block name using different code within a document. This prevents a block named StyleFoo from being redefined in a sub-sub-document and screwing up the formatting from that point on. This problem often arises when multiple documents are merged into a larger document, such as short stories into an anthology or chapters into a user manual.
In my experience, the designers of XML documents design a style sheet which they know and understand and use very effectively. Years later, maintenance must modify the document, but the time to understand the style sheets is not available, so the maintainers use local formatting for the modifications. When the style sheets change, such as happens when two companies merge or the company's graphics change, the document becomes an instant mess. I have never seen management budget for the time required to fix these document issues.
Cascading Style Sheets are a good example. They were not mentioned in the description of the proposed syntax.
The problem with CSS is that styles can be redefined, causing the document to screw up after the redefinition. For a style that is used only occasionally, finding the redefinition can be time-consuming, and management never allocates sufficient (if any) time for document modification.
I suggest that if a named block definition is repeated identically, a warning should be displayed. If the definition differs, an error should be displayed and the original definition should be retained.
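The rule could be sketched roughly like this in Python. The `BlockRegistry` class, the `BlockRedefinitionError` exception, and the block contents are all hypothetical names invented for this example; nothing here is part of pXML.

```python
import warnings


class BlockRedefinitionError(Exception):
    """Raised when a block name is redefined with different code."""


class BlockRegistry:
    """Hypothetical store of named blocks that refuses conflicting redefinitions."""

    def __init__(self):
        self._blocks = {}

    def define(self, name, code):
        if name not in self._blocks:
            self._blocks[name] = code
        elif self._blocks[name] == code:
            # Identical repeat: harmless, but worth telling the author.
            warnings.warn(f"block {name!r} is defined again with identical code")
        else:
            # Conflicting repeat: report an error; the original is not overwritten.
            raise BlockRedefinitionError(
                f"block {name!r} is already defined with different code"
            )

    def get(self, name):
        return self._blocks[name]


registry = BlockRegistry()
registry.define("StyleFoo", "[b bold text]")      # first definition
try:
    registry.define("StyleFoo", "[i italic text]")  # conflicting redefinition
except BlockRedefinitionError as error:
    print("error:", error)
print(registry.get("StyleFoo"))  # the original definition is still in place
```

Because the conflicting definition is rejected before anything is stored, a sub-document merged in later cannot silently change the formatting of what came before it.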
I have seen a hierarchy of CSS files redefine the style for the same element — usually <title> or <hn> and, of course, various table elements — multiple times. Of course, changing a CSS file to fix one document may well break another document that relies on the same file.
Disclaimer:
I am a software developer and maintainer who uses XML codes in documentation. I do not have the time to study the chain of CSS files used by existing documents that I have to modify. I am not, by any stretch of the imagination, an expert in XML document tags.
XML is definitely not terse. If just representing data is the goal there are, as you mention, many other syntaxes to use.
But the reason to use XML is because there can be a schema or (for the old school) a DTD. These definitions can describe in great detail the structure of XML instance documents (the ones with tags and data). This allows the creator of an instance document to check that its content is valid, covering not just structure but also element values. The recipient of the document can also verify that the document is valid.
Because the XML specification is as old as the hills, most languages include features to validate an XML instance document against a schema document.
the reason to use XML is because there can be a schema
Yes, that's one of the very useful additions to XML. As said in the article, an XML schema can also be applied to a document using the pXML syntax. Once a pXML document is parsed into an XML structure, all these great XML additions and tools can still be used (including XML schema). I plan to publish a follow-up article to show examples of how XML technology can be used with pXML as well.