|
If I drop the non-clustered index, will I get any benefit on insertion? If I find a matching record I update it; otherwise I insert.
|
|
|
|
|
kali siddhu wrote: "If i found matched record i am updating otherwise insert"
Right there is your problem: for every record you check for an existing record, and when you do the update the system has to check referential integrity.
Split your data into two sets: those to be inserted and those to be updated. Or delete the records to be updated and insert all records, although this may not be viable as it will break existing RI.
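A rough sketch of the "split into two sets" idea, using Python's built-in sqlite3 module purely for illustration. The table `items`, its columns and the sample rows are made up for this example; on the database engine in the question the same idea would normally be written as set-based SQL.

```python
import sqlite3

def upsert_in_two_sets(conn, rows):
    """Split incoming (id, name) rows into updates and inserts, then apply each set in bulk."""
    ids = [row_id for row_id, _ in rows]
    placeholders = ",".join("?" * len(ids))
    # One query finds every key that already exists, instead of one lookup per row.
    existing = {
        r[0]
        for r in conn.execute(
            f"SELECT id FROM items WHERE id IN ({placeholders})", ids
        )
    }
    to_update = [(name, row_id) for row_id, name in rows if row_id in existing]
    to_insert = [(row_id, name) for row_id, name in rows if row_id not in existing]
    conn.executemany("UPDATE items SET name = ? WHERE id = ?", to_update)
    conn.executemany("INSERT INTO items (id, name) VALUES (?, ?)", to_insert)
    conn.commit()
    return len(to_update), len(to_insert)

# Hypothetical sample data: ids 1 and 2 exist, id 3 is new.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [(1, "old-a"), (2, "old-b")])
updated, inserted = upsert_in_two_sets(conn, [(2, "new-b"), (3, "new-c")])
```

For large batches the key list would have to be chunked or loaded into a staging table, since engines limit the number of parameters in one statement.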
Never underestimate the power of human stupidity -
RAH
I'm old. I know stuff - JSOP
|
|
|
|
|
What kind of designer program is used to build applications like Fruity Loops or other multimedia programs? I get the feeling it is not Visual Basic or any other programming language I know of.
|
|
|
|
|
I want to create a data notation (used the way JSON is).
1) Is it good enough to use a language's built-in features (types, notation, etc.), extend them (e.g. with additional types), and output JSON?
2) Or should I build it from scratch, parse all the built-in features plus my additions, and then output JSON?
By using 1) I don't have to implement the core things. If there are fixes, that's good. If there are changes that I don't like, I can deal with them case by case... I guess.
However, I'm tied to one programming language, so users would have to use that language (instead of a library).
By using 2) I have to build everything, but I'm not tied to one particular language.
Maybe I can mix the two: 1) for that language, 2) for other languages.
What are your thoughts on this topic?
PS. I was thinking about using Red (red-lang.org/). It's in alpha, but I don't think it will change a lot.
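To make option 1) concrete, here is a minimal sketch in Python (chosen only for illustration; the post itself is considering Red). It reuses the language's built-in types, adds two extra types, and outputs plain JSON. The `$type` wrapper key is an invented convention for this example, not part of JSON or of any standard.

```python
import json
from datetime import date
from decimal import Decimal

def encode_extended(value):
    """Called by json.dumps for any object the built-in encoder cannot handle."""
    if isinstance(value, date):
        return {"$type": "date", "value": value.isoformat()}
    if isinstance(value, Decimal):
        return {"$type": "decimal", "value": str(value)}
    raise TypeError(f"unsupported type: {type(value).__name__}")

def decode_extended(obj):
    """Called by json.loads for every decoded JSON object; restores tagged values."""
    kind = obj.get("$type")
    if kind == "date":
        return date.fromisoformat(obj["value"])
    if kind == "decimal":
        return Decimal(obj["value"])
    return obj

# Hypothetical record mixing built-in and extended types.
record = {"name": "widget", "price": Decimal("9.99"), "released": date(2019, 5, 19)}
text = json.dumps(record, default=encode_extended, sort_keys=True)
restored = json.loads(text, object_hook=decode_extended)
```

The output is ordinary JSON, so other languages can read it, but only a reader that knows the `$type` convention gets the extended types back. That is the trade-off between 1) and 2) described above.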
|
|
|
|
|
Why reinvent the wheel? Both JSON and XML already provide everything you could need. And they are both supported by nearly all major programming languages.
|
|
|
|
|
JSON and XML and YAML and ... Isn't the whole bunch of them wheel reinventions? When everybody else is creating new wheels that are better suited to the purpose than all the old ones, why shouldn't I do the same?
Now I have personally come to one conclusion, in particular from many years of exposure to XML: Data description languages are for computers, not for humans. This kind of stuff you, a human, do not handle better than a computer does. You make typos, you do not structure it according to the rules, in brief: You mess it up. So keep humans out of it!
The best way of doing that is to make it unreadable. Binary. I know that is a highly Politically Incorrect statement; yet I think that what humans should not mess up should not be made available for messing up - especially not with as simple a tool as a plain text editor. You can also mess up by using a binary generator (/editor), but that takes a lot more deliberate action. The mess comes from "You asked for it, you got it" - not from "Ooooops!"
So when I need to store data for my own applications (and there are no requirements for sharing the data files with other applications), I do it as binary files. Always in a Tag-Length-Value format, evading all sorts of escape mechanisms. No need to search for the end of the field. Arbitrary binary data. Space allocation for the value can be made before it is actually read. Parsing the file is extremely fast. The space overhead is quite moderate.
Details of how you do the TLV format may vary slightly. E.g. in some applications there will never be more than a couple hundred distinct tags, so the tag is stored in 15 bits; the "sign bit" is a flag indicating that the Value is in fact a sequence of TLV values. If values are small, the length is 16 bits, too. If there is any risk at all of overflow, I use the BER style of variable length handling: The length of the enclosing TLV is 0, each member carries its own length, and the member sequence is terminated by an all-zero TLV. (Then you cannot preallocate space for the entire structure without reading it, but usually a composite value won't be stored as a single unit anyway.)
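A minimal sketch of the fixed-size variant described above, in Python: a 16-bit field holding a 15-bit tag plus a composite flag, followed by a 16-bit length. The big-endian byte order, the function names and the sample tags are assumptions made for this example; the zero-length, terminator-based variant is left out.

```python
import struct

COMPOSITE = 0x8000  # the "sign bit" of the 16-bit tag field marks a nested TLV sequence

def encode_tlv(tag, value):
    """Encode one item; `value` is bytes, or a list of (tag, value) pairs for a composite."""
    if not 0 <= tag < COMPOSITE:
        raise ValueError("tag must fit in 15 bits")
    if isinstance(value, list):
        payload = b"".join(encode_tlv(t, v) for t, v in value)
        tag |= COMPOSITE
    else:
        payload = bytes(value)
    if len(payload) > 0xFFFF:
        raise ValueError("value too long for a 16-bit length field")
    return struct.pack(">HH", tag, len(payload)) + payload

def decode_tlv(data):
    """Decode a byte string holding zero or more TLV items into (tag, value) pairs."""
    items, pos = [], 0
    while pos < len(data):
        raw_tag, length = struct.unpack_from(">HH", data, pos)
        pos += 4
        payload = data[pos:pos + length]
        if len(payload) != length:
            raise ValueError("truncated TLV item")
        pos += length
        if raw_tag & COMPOSITE:
            items.append((raw_tag & 0x7FFF, decode_tlv(payload)))
        else:
            items.append((raw_tag, payload))
    return items

# Hypothetical tags: 1 = person (composite), 2 = name, 3 = birth year.
person = (1, [(2, b"Ada"), (3, struct.pack(">H", 1815))])
blob = encode_tlv(*person)
```

Because every item starts with its length, the decoder never scans for a terminator and never needs an escape mechanism, which is the property the post emphasizes.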
Just as all class definitions have a ToString, they have a ToTLV. And a FromTLV. The "Schema" is represented by these ToTLV functions. If any other application needs data in other formats, adding ToXML, ToJSON, ToYAML, ... alongside ToString and ToTLV is straightforward. But for the application's private file, the binary ToTLV is used.
|
|
|
|
|
My book discusses the advantages of TLV over text-based encodings. The latter typically require more space and processing time. Readability is touted as an advantage of text, but it's often as you say.
Text's advantage is in interoperability between big and little endian systems. If that's a requirement, TLV is a non-starter unless all the fields are the same length. A protocol standard has to consider this, but a proprietary system can standardize on one endianism and use TLV more freely, although it still has to maintain protocol backward compatibility unless it's OK to shut down the entire network during an upgrade.
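A small illustration of the byte-order point, using Python's standard struct module. The sample value is arbitrary.

```python
import struct

value = 0x12345678
big = struct.pack(">I", value)         # most significant byte first
little = struct.pack("<I", value)      # least significant byte first
misread = struct.unpack("<I", big)[0]  # big-endian bytes read as little-endian
as_text = str(value).encode("ascii")   # decimal digits, one byte per digit
```

Here `misread` comes out as 0x78563412 instead of 0x12345678, while the text form reads the same on any CPU at the cost of nine bytes instead of four.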
|
|
|
|
|
Text representation does not completely evade endianism - at least not with UTF-16!
If you consider UTF-8 an alternative: The multibyte encoding of a code point is just a compression method for an integer. You can use that for the Tag and Length fields as well - that could save you a few bytes when tags are few and lengths short, and it solves endianness equally well for 32 bit tags and lengths as it does for text files. I have been considering this solution, but there hasn't been any need for it yet.
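A sketch of a byte-oriented variable-length integer that could serve for Tag and Length fields. Note that this is not UTF-8's exact bit layout: it uses the simpler 7-bits-per-byte scheme (often called LEB128 or "varint"), swapped in here because it shows the same property, namely that the encoding is defined byte by byte and so does not depend on CPU byte order. The function names are made up for this example.

```python
def encode_varint(n):
    """Encode a non-negative integer, 7 bits per byte, low-order group first."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        group = n & 0x7F
        n >>= 7
        if n:
            out.append(group | 0x80)  # high bit set: more bytes follow
        else:
            out.append(group)
            return bytes(out)

def decode_varint(data, pos=0):
    """Return (value, next_position) for the varint starting at data[pos]."""
    value, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7
```

Values below 128 take a single byte, so small tags and short lengths cost one byte each instead of a fixed two or four.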
You obviously still have an endianness issue if the value field contains any binary numeric value at all (including UTF-16 characters). A large group of application/data formats are mainly targeted at user environments where CPUs of one given endianness are dominant. Defining that as The byte order for your format, and clearly indicating to those readers / writers in the opposite endianness that they have to flip bytes (some CPUs have special instructions for that!), is, in my opinion, a far better solution than converting everything to text.
Text doesn't solve all format problems either, unless you define one of many alternate formats as The format (analogous to defining the endianness of the format). How do you represent dates? 05/19 is unambiguous (but must be converted to e.g. ISO standard before presenting to a Norwegian user). A week ago, 05/12, is ambiguous unless the representation is explicitly defined. Time: AM/PM is virtually unknown in many languages/cultures. Numerics: Is 1,500 one and a half, or fifteen hundred?
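The date ambiguity is easy to demonstrate with Python's standard datetime module. The year is an arbitrary value added only so the strings parse; the post gives just month and day.

```python
from datetime import datetime

ambiguous = "05/12/2019"
as_month_first = datetime.strptime(ambiguous, "%m/%d/%Y").date()  # read as 12 May
as_day_first = datetime.strptime(ambiguous, "%d/%m/%Y").date()    # read as 5 December

def parses_as_day_first(text):
    """Return True if `text` is a valid date when read as day/month/year."""
    try:
        datetime.strptime(text, "%d/%m/%Y")
        return True
    except ValueError:
        return False

iso = as_month_first.isoformat()  # an explicitly defined representation
```

"05/19/2019" fails the day-first reading because there is no month 19, which is why that date is unambiguous, while "05/12/2019" silently parses both ways.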
Text: How do you represent characters beyond ASCII? 8859-1? 8859-x, with x specified in metadata? UTF-16? UTF-8? Maybe you will stick to ASCII and use QP, or Base64? HTML character entities? (named, # or either?) Backslash escapes? (hex, decimal, octal, or any of those?) URL percent-encoding? Which characters do not need to be escaped? How are newline and end of string represented - is NUL accepted as a fill byte, in accordance with ISO standards?
And so on and so on. Text representation certainly doesn't solve all problems. (I'd say that binary encoding solves more!)
In the days when I was working with ASN.1 and BER, a BER string had to be inspected using a BER reader (which should have access to the ASN.1 to provide symbolic names). The readability was a lot better than with XML! When I went from BER to XML, I was considering making a similar XML reader to make it readable; I never got around to doing that.
Today, most systems for displaying plain text have some facilities for improving readability, starting with collapsing inner structures, then highlighting of tags, and so on. You could say that such functions illustrate that the plain text format is not good enough. If I need a display tool that parses and transforms XML or whatever into something readable, it might as well transform some TLV format into something readable.
There is one issue that still remains, though: How self-describing the file should be. TLV tags are usually opaque, just some integer number. When you see an XML "p" tag, you know that it may have to do with a person, a product, a paragraph or something associated with the "p" (usually as the initial letter). At one presentation on handling of arbitrary XML documents, I had a Sami colleague give me Northern Sami terms for chapter, section, picture and so on, for me to use in the examples: The tags were just for illustration (something like Lorem ipsum), but it was difficult for the audience to relate to this as a document.
I made one TLV format a few years ago: The file contained zero or more tag name tables, providing symbolic tags for presentation purposes; each table was headed by a language code. For simplicity, in that format, tags were unique. If partial structures could have had "locally defined" tags (as allowed e.g. in ASN.1), a more complex scheme would be required, easily growing into a complete schema representation. In this case, that would be overkill; global tags were far easier and fully acceptable.
Such issues do not arise at all with textual tags; they are at least at some level self-describing. And they raise issues of e.g. case significance, allowed character set, and a bunch of other things that a numeric tag evades.
When ASN.1/BER was at war with other alternatives, the lack of symbolic tag names in BER, mandating the receiver to have access to the ASN.1 schema for interpretation, was one of the strongest criticisms of BER (/DER/CER). Later, we got XML and JSON encoding rules, encoding symbolic names from the ASN.1 schema into the stream, but this was only a half-way solution: Matching (and keeping in synchronization) an ASN.1 schema to an XML schema is, for all practical purposes, impossible, certainly over time. So it mostly served as a poor man's BER reader.
I see a lot of areas where computer guys are rather unwilling to seriously assess the commonly used solutions, asking critically if they really are the best. Textual encoding is one of those. We use it because that's the way we do it. Because textual encoding is there, not because it came out with the highest evaluation score. Sure, it is there, we have to accept it when exchanging data with others. But in "local" contexts (such as private files for an application), I tend to use other alternatives.
|
|
|
|
|
Member 7989122 wrote: "I see a lot of areas where computer guys are rather unwilling to seriously assess the commonly used solutions, asking critically if they really are the best. Textual encoding is one of those. We use it because that's the way we do it. Because textual encoding is there, not because it came out with the highest evaluation score."
The .NET framework works by default with UTF-16. We need not think about limiting ourselves to ASCII, because we're no longer limited by the space on a floppy. Not much to gain there, and hardly worth the money for the time spent on it.
And no, you don't go back to questioning the design of the screws if you're building a car. You take the industry standard, take a brief glance at other screws, and try to realize there's a reason why it is the current standard.
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
"If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
|
|
|
|
|
Eddy Vluggen wrote: "And no, you don't go back questioning the design of the screws if you're building a car. You take the industry standard, take a brief glance at other screws, and try to realize there's a reason why it is the current standard."
That is certainly true. Sometimes there are reasons for that component design that you do not realize, and if you try to "improve" it, you may be doing the opposite. When a partial solution is given, it is given.
Textual encoding may be that way, in particular when you are exchanging data with others.
But when you are not bound to one specific solution, e.g. you are defining a storage format for the private data of an application, or you have several alternatives to choose from, e.g. 8 bits text is given but you need to select an escape mechanism either for extended characters or characters with special semantics, then you should know the pluses and minuses of the alternatives.
"Because we used it in that other product" is not an assessment. Yet, I often have the feeling that we are arguing like that. We should spend some of our efforts on learning why these other alternatives were developed at all. There must be some reason why someone preferred it another way! Maybe those reasons will pop up in some future situation; then you should not select an inferior solution because "that is what we always do".
What I (optimistically) expect from my colleagues is that they are prepared to relate to the advantages and disadvantages of text and binary encoding. If they are network guys: that they know enough to explain the greatness of IP routing vs. virtual circuit routing, the advantage of layer-3 routing over layer-1 switching. Application developers should relate to explicit heap management vs. automatic garbage collection, use of threads vs. processes, semaphores vs. critical regions. And so on.
Surprisingly often, developers know well the solution they have chosen - but that is the only alternative they know well. They cannot give any (well) qualified explanation why other alternatives were rejected. I think it is as important (in any field, both engineering ones and others) to be capable of defending the rejection of other alternatives as it is to defend the selected one. If you cannot, then I get the impression that you have not really considered the alternatives, just ignored them. And that is what worries me.
For UTF-16: yes, that is given, as an internal working format. Yet you should consider what you will be using as an external format: UTF-8 is far more widespread for interchange of text info. When is it more appropriate? If you go for UTF-16, will you be prepared to read both big- and little-endian variants, or assume that you will exchange files only with other .NET-based applications? Will you be prepared to handle characters outside the Basic Multilingual Plane, i.e. with code points >64Ki?
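The two UTF-16 questions can be seen in a few lines of Python. The sample string is arbitrary: one ASCII letter plus one code point above U+FFFF.

```python
text = "A\U0001F600"  # one BMP character plus one code point outside the BMP

utf16_le = text.encode("utf-16-le")  # little-endian code units, no byte order mark
utf16_be = text.encode("utf-16-be")  # big-endian code units, no byte order mark
utf8 = text.encode("utf-8")          # byte-oriented, no byte order to choose
with_bom = text.encode("utf-16")     # Python prepends a byte order mark
code_units = len(utf16_le) // 2      # number of 16-bit units in the UTF-16 form
```

The two code points need three 16-bit units, because the non-BMP character becomes a surrogate pair, and the LE and BE byte strings differ. A reader that assumes one byte order and one unit per character gets both wrong.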
Even if your response is: We will assume little-endian, we will assume that we never need to handle non-BMP-characters, we will assume that 640K is enough for everyone, these should be deliberate decisions, not made by defaulting.
When Bill Gates was confronted with the 640k-quote, he didn't positively confirm it, but certainly didn't deny it: He might very well have made that remark in the discussion of how to split the available 1 Mbyte among the OS and user processes. Given that 1 MB limit, giving 384 kB to the OS and 640 kB to application code should be a big enough share for the applications, otherwise the OS will be cramped in too little space. 640k is enough for everyone. - In such a context, where the reasoning is explained, the quote suddenly makes a lot more sense. Actually, it is quite reasonable!
That is how I like it. Knowing why you make the decisions you do, when there is a decision to make. Part of this includes awareness of when there is a decision to make - do not ignore that you actually do have a choice between your default alternative and something else.
|
|
|
|
|
Member 7989122 wrote: "We should spend some of our efforts on learning why these other alternatives were developed at all."
8 bit is not developed as an alternative. ASCII is not an alternative for UTF-16.
Member 7989122 wrote: "If you cannot, then I get the impression that you have not really considered the alternatives, just ignored them. And that is what worries me."
What worries me is that you see improvements of the wheel (with a documented history) as alternatives for more modern standards.
Member 7989122 wrote: "or assume that you will exchange files only with other .NET-based applications?"
No, you don't assume; you define an exchange protocol in a specific text encoding. Should be part of the specs.
Member 7989122 wrote: "the quote suddenly makes a lot more sense."
The quote that's not his, you mean?
Member 7989122 wrote: "That is how I like it. Knowing why you make the decisions you do, when there is a decision to make."
Aw, can't argue with that. I assume all your databases are in BCNF?
|
|
|
|
|
Member 7989122 wrote: "Isn't the whole bunch of them wheel reinventions?"
No, they're refinements of said wheel.
Member 7989122 wrote: "yet I think that what humans should not mess up, should not be made available for messing up"
Reading and writing aren't the same thing; making human validation impossible does not help with ensuring a correct write - after all, your application might have a bug and write the wrong stuff. The only thing that making it unreadable does is prevent human validation.
|
|
|
|
|
A binary format certainly does not mean that the information and its structure cannot be inspected at all! You do have a tool for inspecting e.g. a binary ASN.1/BER format that lets you navigate in the structure, detect format errors (and the reader should support you in that!) etc.
As I mentioned in another post: I made an XML document example using tags in Northern Sami, making no sense to the audience (nor to me - I got the Sami terms from a colleague). Then, there is very little value in the "textual" format, when all you know is that "something" is nested within "something else". I also used an example with a "p" tag, where "p" represented a person p (in one part of the schema), ordering a product p (in another part), and in the payment information, p indicated a paragraph in the text. Understanding the XML record properly suffers from the use of seemingly readable, but highly ambiguous tag names.
You may limit your application or data format to English format, just to ensure that you as an English speaker can make sense of it. But please state that explicitly as a limitation, then! "This data specification format should not be used in any non-English context". That could be valid for software development tools used by IT professionals only, but certainly not in a general document context. Administration, business. Home use. Educational material... Be prepared for Chinese macro names. Russian XML tags. ÆØÅ in variable names. Dates in ISO format and 24 hour clock. Those are more or less absolute requirements as soon as you move your application out of the computer lab.
For multi-lingual applications, binary formats give a lot of flexibility compared to text formats. Of course you can translate on-the-fly, but using a plain integer as an index into a language table is a lot easier than word-to-word translation. And you may supply extra info in that language table, e.g. indicating plural forms, gender etc., giving a much better translation.
|
|
|
|
|
Member 7989122 wrote: "Be prepared for Chinese macro names. Russian XML tags. ÆØÅ in variable names."
We are, since we're no longer limited to ASCII.
Member 7989122 wrote: "Dates in ISO format and 24 hour clock."
Date formats are another topic; you should save in ISO, but display nicely in the format that the user has set as his preference in Windows. That's not a suggestion, nor is there a discussion.
Member 7989122 wrote: "For multi-lingual applications, binary formats give a lot of flexibility compared to text formats."
Ehr.. no. You could have ASCII in binary, with a completely useless date format.
Member 7989122 wrote: "Of course you can translate on-the-fly, but using a plain integer as an index into a language table is a lot easier than word-to-word translation. And you may supply extra info in that language table, e.g. indicating plural forms, gender etc., giving a much better translation."
We use keys, not integers, and resource files.
You started with a wheel, now you're also including a dashboard and brakes. I have no idea what you are trying to say.
|
|
|
|
|
Eddy Vluggen wrote: "We are, since we're no longer limited to ASCII."
I was primarily thinking of readability and comprehension, not representation. If you are receiving a support request or error report, and all supporting documentation uses characters that make no sense to you, you may have great difficulties in interpreting the bug report or error request.
And: The alternative to UTF-16 (which is hardly used at all in files) is UTF-8, not ASCII. In the Windows world, you may still see some 8859-x (x given by the language version of the 16-bit Windows), but to see 7-bit ASCII, you must go to legacy *nix applications. Some old *nix-based software and old compilers may still be limited to ASCII - I have had .ini files that did not even allow 8859-1 in comments! But you must of course be prepared for 8859 when you read plain text files from an arbitrary source (and ASCII is the lower half of 8859).
"you should save in ISO, but display nicely in the format that the user has set as his preference in Windows"
Then we are talking about not reading a text representation as a text file, but using an interpreter program to present the information. Just as you would do with a binary format file.
"Ehr.. no. You could have ASCII in binary, with a completely useless date format."
I am not getting this "ASCII in binary". Lots of *nix files with binary data use Unix epoch to store date and time. If your data is primarily intended for the Windows market, you might choose to store it as 100 ns ticks since 1601-01-01T00:00:00Z - then you can use standard Windows functions to present it in any format. Conversion to Unix epoch is one subtraction, one division. If you insist on ISO 8601 character format, you may store it in any encoding you want, all the way down to 5-bit Baudot code.
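The "one subtraction, one division" conversion can be sketched in Python as follows. The function names are made up for this example; the constant is the number of 100 ns ticks between 1601-01-01 and 1970-01-01 (134,774 days).

```python
from datetime import datetime, timezone

TICKS_PER_SECOND = 10_000_000               # one tick is 100 ns
EPOCH_GAP_TICKS = 116_444_736_000_000_000   # ticks from 1601-01-01 to 1970-01-01 (UTC)

def filetime_to_unix(ticks):
    """Convert 100 ns ticks since 1601-01-01T00:00:00Z to whole Unix seconds."""
    return (ticks - EPOCH_GAP_TICKS) // TICKS_PER_SECOND

def unix_to_filetime(seconds):
    """Inverse conversion, for whole seconds."""
    return seconds * TICKS_PER_SECOND + EPOCH_GAP_TICKS

def filetime_to_iso(ticks):
    """Present the stored binary value as ISO 8601 text in UTC."""
    return datetime.fromtimestamp(filetime_to_unix(ticks), tz=timezone.utc).isoformat()
```

The stored value stays a single 64-bit integer; any textual form, ISO 8601 or a locale-specific one, is produced only at presentation time.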
"You started with a wheel, now you're also including a dashboard and brakes."
Did you ever roll snowballs to make a snowman when you were a kid?
"I have no idea what you are trying to say"
One major point is that binary data file formats, as opposed to character representations, are underestimated; most programmers are stuck in the *nix style of representing all sorts of data in a character format, where a binary format would be more suitable. (The same goes for network protocols!) I am surprised that you haven't discovered that point.
|
|
|
|
|
Member 7989122 wrote: "I was primarily thinking of readability and comprehension, not representation."
Readability can't be without representation.
Member 7989122 wrote: "If you are receiving a support request or error report, and all supporting documentation uses characters that make no sense to you, you may have great difficulties in interpreting the bug report or error request."
No, I mail the provider of said and burn them for not documenting.
Member 7989122 wrote: "And: The alternative to UTF-16 (which is hardly used at all in files) is UTF-8, not ASCII."
That's not an alternative. One is a more limited version of the wheel than the other.
Member 7989122 wrote: "But you must of course be prepared for 8859"
No, in general I'm not; the specs specify what I should support, and outdated isn't supported.
Member 7989122 wrote: "Then we are talking about not reading a text representation as a text file, but using an interpreter program to present the information. Just as you would do with a binary format file."
Bin nor text need an interpreter.
Member 7989122 wrote: "I am not getting this "ASCII in binary". Lots of *nix files with binary data use Unix epoch to store date and time."
ASCII is a text representation that is stored as bits. Unix epoch has nothing to do with any discussion of text formats.
Member 7989122 wrote: "Did you ever roll snowballs to make a snowman when you were a kid?"
No. What's the use of that?
Member 7989122 wrote: "One major point is that binary data file formats, as opposed to a character representation, is underestimated"
A representation is not a format. They're all stored as bytes. Google for an ASCII table; it shows what bytes are used for each character.
Member 7989122 wrote: "I am surprised that you haven't discovered that point."
I deduce you're not asking a question, but trying to make a point. Mixing text encodings and date encodings, trying to prove that non-human-readable binary is somehow superior.
You fail to give a simple example to prove so, and your explanation isn't helping me.
|
|
|
|
|
While the binary format you describe is interesting, it's not what I asked about.
I'll try creating one in the future nevertheless.
|
|
|
|
|
XML is very verbose and JSON doesn't have extendable types.
|
|
|
|
|
XML existed before JSON.
And data interchange formats benefit from being verbose. Due to readability; it's not a binary format.
Come to the point please.
|
|
|
|
|
How does "XML existed before JSON" relate to either "XML is very verbose" or "JSON doesn't have extendable types"?
In which ways do "data interchange formats benefit from being verbose"?
Most users today do not read the raw data interchange format directly, as-is - they process it with software that e.g. highlights labels, closing tags etc., and allows collapsing of substructures. When you pass it through software anyway, what impact on readability does the format of the input to this display processor have? With semantically identical information, but binary coded, as input to the display processor, why would the readability be better with a character encoding of the information than with a binary encoding?
|
|
|
|
|
Semantical bullshit, aka wordsmithing. I've been on that train before.
You're acting as if binary is the solution to formats; it's not. Anything, text or date, is stored as bits, and is thus in binary. ASCII is a representation of that, UTF is a better form of ASCII. Dates are stored as floats.
I don't care what university. You can either learn or be ridiculed. And damn right I will, at every opportunity.
And yes, being "kind"
|
|
|
|
|
If you really want me to explain to you the difference between storing an integer, say, as a 32 bit binary number vs. storing it as a series of digit characters, because "ASCII is bits, hence digital", then I give up. Sorry.
|
|
|
|
|
Member 7989122 wrote: "If you really want me to explain to you the difference between storing an integer, say, as a 32 bit binary number vs. storing it as a series of digit characters"
I didn't say that; and not going to explain either. I've no need to, nor any desire.
Member 7989122 wrote: then I give up. Sorry.
Good timing. And please do.
|
|
|
|
|
They are not good enough, so I won't use them.
|
|
|
|
|
They might not be efficient enough for you, but lots of us use them, both, where appropriate.
Try to explain why XML isn't good enough, and how many floppy disks you're limited to that you need that optimization.
Do elaborate, please.
|
|
|
|
|