Hi,

In an ASCII file, 'space' is stored as 32, whereas 'next line' is a carriage return/line feed combination (Windows) or a single line feed (Linux).

In binary files, how are "space" and "next line" stored? If they are stored the same way as above, how are binary files compatible across different platforms (OS)?

Cheers,
Yuvaraj
Updated 28-Aug-10 23:27pm

There is no standard way to store strings (i.e. text) in binary files: a binary file is application specific. For example, Microsoft Word saves its text documents in a binary format, and inside the file it saves the text together with its formatting (font name and size, justification, paragraph and page format, and so on).
Writing a binary file that holds just plain text is the same as writing a text file.
 
Comments
Yuvaraj Gogoi 29-Aug-10 6:03am    
Thanks for the answer!
But I still didn't get an answer to my second question!
If we open the binary file in Notepad on Windows, we see a magic character for "space" that looks something like '0' but is not '0'. Does that mean Notepad is unable to convert from binary to ASCII? If so, how does it do the conversion for other ASCII characters like "A", "B", ...?
I created the binary file in C. Thanks again for your reply!
We have to go back in history to understand the reasons for these differences.
In the days of CP/M (the grandfather of DOS), "binary file" stood for "bulk copy of the memory", while "text file" meant "something suitable for the command line editor / line printer".

The difference between DOS (and later Windows) and Unix (the inspiration for Linux) mainly comes from the different interpretations their developers gave to "line editor / line printer": DOS doesn't do any sort of translation during a copy, hence a command like "copy file.txt lpt1" needs CR and LF to stay paired in the source file so that the printer carriage moves properly (you have to both advance the paper and move the carriage back).
Unix lets the device driver do the translation, hence a single LF is enough: the equivalent of the LPT file driver will insert the CR if the printer requires it.

When strings are in memory, it's up to the program to decide what is the most efficient representation for its purposes.
C programmers, by K&R tradition, write code using \n (10) as "newline", hence the text holds a 10 wherever a C '\n' appears.

We don't know how Notepad internally manages the concept of "newlines": it just relies on Windows to read what is supposed to be a Windows text file. So CR/LF becomes '\n', a lone CR remains '\r' and a lone LF remains '\n'.

Your second question cannot be answered. In binary files everyone does what they like. The OS (whatever it is) just copies the bytes that are given, so, typically, space is 32, carriage return is 13, line feed is 10.
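To make the "the OS just copies the bytes" point concrete, here is a minimal C sketch. The helper name `roundtrip_binary` and the file name used in the usage note are made up for illustration; only standard `<stdio.h>` calls are used. It writes a buffer in binary mode, reads it back in binary mode, and hands the bytes back so you can inspect them.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not from the original answer): write `len` bytes to
   `path` in binary mode ("wb"), then read the file back in binary mode ("rb")
   into `out`. Returns the number of bytes read back, or 0 on failure.
   In binary mode the C runtime is not supposed to translate anything. */
size_t roundtrip_binary(const char *path,
                        const unsigned char *data, size_t len,
                        unsigned char *out, size_t out_cap)
{
    FILE *f = fopen(path, "wb");
    if (!f) return 0;
    size_t written = fwrite(data, 1, len, f);
    fclose(f);
    if (written != len) return 0;

    f = fopen(path, "rb");
    if (!f) return 0;
    size_t got = fread(out, 1, out_cap, f);
    fclose(f);
    return got;
}
```

Called with the five bytes of "A B\r\n", this should hand back 65, 32, 66, 13, 10 on both Windows and Linux: the space is 32, CR is 13 and LF is 10, exactly as they were in memory.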

Reading/writing binary files across different OSs is in fact like telling the OS "don't act on it". At that point it's up to the programs that read and write the file to adhere to the same convention.
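One way to observe what "don't act on it" means in C is to compare the two `fopen` modes. This is only a sketch: `stored_size` is a hypothetical helper, and the behaviour described for text mode is the usual one for Windows C runtimes, not something the C standard fixes in detail.

```c
#include <stdio.h>

/* Hypothetical helper: write `text` to `path` using the given fopen mode
   ("w" for text mode, "wb" for binary mode), then count the bytes actually
   stored by reading the file back in binary mode. Returns -1 on failure. */
long stored_size(const char *path, const char *mode, const char *text)
{
    FILE *f = fopen(path, mode);
    if (!f) return -1;
    if (fputs(text, f) == EOF) { fclose(f); return -1; }
    if (fclose(f) != 0) return -1;

    f = fopen(path, "rb");
    if (!f) return -1;
    long count = 0;
    while (fgetc(f) != EOF) count++;   /* count raw bytes, no translation */
    fclose(f);
    return count;
}
```

Writing "a\n" with mode "wb" stores 2 bytes on any platform. With mode "w", a Windows runtime typically stores 3 bytes (the '\n' is expanded to CR/LF), while on Linux it still stores 2. A file written in binary mode therefore has the same bytes everywhere; whether those bytes mean the same thing to the reader depends on the convention both programs follow.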
 
Comments
Yuvaraj Gogoi 3-Sep-10 3:37am    
////When strings are in memory, it's up to the program to decide what is the most efficient representation
////for its purposes.
Does it mean that for space (" "), Notepad on Windows is not able to find the efficient representation and so displays some magic characters?

Thanks a lot for the answer!

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900