I have a major doubt that I would like to have cleared up.

I am just writing what I know, please correct me if I am wrong.

A 32bit register can hold 32bits or 4 bytes of data only, but when I write and execute the statement
msg db 'Hello, world!',0xa
mov ecx,msg

How is it that it is valid? The string 'hello world!', even in ASCII representation is 11 bytes and then there is a null byte also at the end (please correct me if I am wrong).
So how does this happen? Is it that only the pointer to the memory address where this string is located is put into ecx?

I never paid attention to this before, it will be great if you can help me.

Thanks.

Edit: I also want to know what does the keyword
db
do. What I know is that it means 'define byte', how do you store a 11 byte data into 1 byte memory?
Updated 9-Jul-15 1:37am

"how do you store a 11 byte data into 1 byte memory?"

:laugh:
You don't.

db allocates bytes, not just a byte - so if you use
msg db 'Hello, world!',0xa
then it will allocate a fourteen-byte area (thirteen characters plus the 0xa line feed) with the data filled in, not a single byte. The label msg stands for the address of the first byte of that string.
So when you execute
mov ecx,msg
The address of the first byte is placed in the register.
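
To see the difference between the address and the bytes stored there, here is a minimal sketch, assuming NASM syntax and 32-bit code. The section names and the use of al are my own choices for illustration, and this is a fragment rather than a complete program:

```nasm
section .data
msg     db  'Hello, world!', 0xa   ; 13 characters + 1 line feed = 14 bytes

section .text
        mov ecx, msg        ; ecx = address of the first byte (the 'H')
        mov al, [ecx]       ; al  = the byte stored at that address, 0x48 ('H')
        mov al, [ecx + 1]   ; al  = the next byte along, 0x65 ('e')
```

Without the square brackets you get the address; with them you get what is stored at that address. The register never holds the whole string.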


"Another thing that I forgot to ask is that if 'db' allocates bytes, what is the purpose of having 'dw', 'dd', etc.?"


They define data in units of different sizes:
db emits each item as a single byte.
dw emits each item as a two-byte word.
dd emits each item as a four-byte doubleword.

In NASM (which is what your syntax looks like) these directives do not align anything by themselves: the data simply goes at the next free address, and you ask for a boundary with the align directive. The alignment still matters. When you access a byte, you can use any address. When you access a word at an odd address, an x86 processor will still do it, though possibly more slowly, but some other processors either fault or ignore the least significant bit of the address, so the read or write happens at the wrong location.
Suppose the address starts at hex 100:
x   db   1
y   dw   1001

With no alignment, x is at "0x100" and y is at "0x101" (We'll ignore segments, and 32 / 64 bit addressing here).
On a processor that "rounds" a word address down by ignoring the least significant bit, a write to y would land on 0x100 and overwrite x as well!

Putting an align 2 in front of y makes the layout correct for word operations by leaving x at "0x100" and moving y to "0x102" - align 4 does the same for 4 byte data and so forth.
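
A minimal sketch of the three sizes with explicit alignment, assuming NASM. The names x, y and z and the values are only for illustration, and the offsets in the comments are counted from the start of the section:

```nasm
section .data
x       db  1           ; offset 0: one byte
        align 2, db 0   ; pad with zero bytes up to the next even offset
y       dw  1001        ; offset 2: one two-byte word
        align 4, db 0   ; offset 4 is already a multiple of four, so no padding
z       dd  100000      ; offset 4: one four-byte doubleword
```

The ", db 0" part tells NASM to pad with zero bytes; if you leave it out, NASM's align pads with NOP instructions, which is rarely what you want in a data section.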

Make sense?
 
Comments
0xF4 9-Jul-15 7:49am    
But how does the CPU know how long the string is?
OriginalGriff 9-Jul-15 7:54am    
You told it how long it was when you provided the string data as a parameter to the db directive - the assembler looks at the characters between the quotes and counts them! Strictly speaking it is the assembler that knows the length, not the CPU.
0xF4 9-Jul-15 7:56am    
But it will (maybe) have to save the length either in a register or the ram?
OriginalGriff 9-Jul-15 8:15am    
It could - but it doesn't in this case (you would have to do that yourself if you wanted to). Normally a string will have a terminating character (such as a null or newline) and the function that processes it will stop working when it reaches it.
In your case, I suspect that the termination character is the 0xA at the end of the string, being a Line Feed character.

This is one of the really good things about assembler work: nothing is assumed for you, you are in charge and have to make all the decisions, terminate your strings, allocate your space, make sure your variables are on the right address boundaries, all of it. And if you get it wrong, there are no friendly little messages to say "index out of bounds" as you get with a high-level language - your code just continues until something catastrophic happens!

Seriously, you need to read up on addresses, registers, and addressing modes - if you don't understand the basics of this stuff you will just confuse yourself horribly trying to read code and work out what is happening. This can be very complicated to start with, and you need to get it all very clear in your head first.
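
To make the length point concrete, here is a minimal sketch, assuming NASM and 32-bit Linux. The int 0x80 system calls and the name len are my assumptions; they are not in the question. The length is not stored with the string: the assembler works it out and the code hands it to the write call in edx.

```nasm
section .data
msg     db  'Hello, world!', 0xa   ; 14 bytes: 13 characters + line feed
len     equ $ - msg                ; assemble-time constant: current address minus msg = 14

section .text
global _start
_start:
        mov edx, len        ; how many bytes to write
        mov ecx, msg        ; where the bytes start
        mov ebx, 1          ; file descriptor 1 (stdout)
        mov eax, 4          ; system call 4 (write)
        int 0x80            ; ask the kernel to do it
        xor ebx, ebx        ; exit status 0
        mov eax, 1          ; system call 1 (exit)
        int 0x80
```

If you save this as hello.asm (a name I made up), it should build with "nasm -f elf32 hello.asm -o hello.o" followed by "ld -m elf_i386 hello.o -o hello". Note that write simply outputs len bytes starting at msg; it does not look for a terminator.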
0xF4 9-Jul-15 8:21am    
Can you suggest a good book?
msg db "ABC"


The above places "ABC" at a memory address, and that memory address is labelled "msg". So if my code is at memory address 10000 then the following assembly

msg db "ABC"
NOP

will result in

10000 A
10001 B
10002 C
10003 90 (hex 90 being the x86 NOP)

Memory address "10000" is now known as "msg", and my "NOP" operation is stored at 10003, because db uses one byte per character. If my code was

msg db "ABCD"
NOP

I would have

10000 A
10001 B
10002 C
10003 D
10004 90 (hex 90 being the x86 NOP)

The "mov ecx, msg" means make the register ecx equal to the value msg. Remember msg is 10000, so ECX is 10000, as 10000 is the address where my string can be found. So it doesn't matter how long my string is; all my code needs to know is where it starts, and it starts at 10000.

How does it know how long the string is? It doesn't. In assembly you need to do your own memory management. Some functions that process the string might consider the string to have ended when they encounter a special value, like "0" or "$", or however that function is coded. Some functions might require both the address of the string and the length of the string so they know how many characters to process, and you'll have to tell that function that the address is 10000 and the length is 3, 4, or whatever.

When doing assembler you have to remember that memory allocation, string variables knowing how long they are, and classes knowing how big they are are all features provided by the higher-level language so you don't need to worry about them as a coder. With assembler there are no such niceties and you have to do your own variable and memory management.
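
A function that stops at a special value can be as simple as the loop below. This is a minimal sketch, assuming NASM, 32-bit code, and a 0 byte as the terminator; the labels count and done and the choice of edx as the counter are mine, and it is a fragment rather than a complete program:

```nasm
section .data
msg     db  'ABC', 0        ; three characters followed by a 0 terminator

section .text
        mov ecx, msg        ; ecx = address of the first byte
        xor edx, edx        ; edx = running length, starts at 0
count:
        cmp byte [ecx + edx], 0   ; have we reached the terminator?
        je  done
        inc edx             ; no: count this character and look at the next
        jmp count
done:
        ; at this point edx = 3, the length without the terminator
```

If you forget the terminator, the loop just keeps walking through whatever bytes follow the string until it happens to find a 0, which is exactly the kind of thing assembler will not warn you about.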
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


