Code Snippet:

int Convertchar_wchar(char* pData, int pDataLength)
{
    wchar_t wcsQuery[4096*2 + 2];
    // Raw byte copy: no character-set conversion happens here.
    memcpy((void*)wcsQuery, pData, pDataLength);
    wcout << wcsQuery << endl;
    return 0;
}


I am trying to add multilingual support to this code, so I need to handle non-English languages on an English OS (here Win2K3). The problem is that when I pass any non-English characters (I have tried Japanese), they come out as ?????. I have confirmed it is not a display problem: the value changes when memcpy is called. The value passed through char* pData is Unicode, yet it still ends up as wrong values.

Can someone help me understand why memcpy is converting the value? Does memcpy internally use the default code page? How can I pass the correct value to the wide char pointer?


I have already tried wcscpy and RtlCopyMemory. I am not sure which code page to pass to MultiByteToWideChar so that it supports all languages.

Waiting for some input ASAP.

Thanks

Your starting point seems incorrect: you cannot have Unicode characters in an array defined as char* pData. This is (at best) multibyte data, so a straight memcpy() to a wchar_t array still leaves you with multibyte characters. As Superman mentioned, you need to convert it to Unicode (assuming that is what you are trying to achieve).

However, I suspect the basic problem is your use of wcout to display the characters. This stream accepts Unicode characters, but you are not passing Unicode characters, so your data gets converted to garbage. Try the following:
int Convertchar_wchar(char* pData, int pDataLength)
{
    cout << pData << endl;
    return 0;
}

// or if you want to do the conversion

int Convertchar_wchar(char* pData, int pDataLength)
{
    wchar_t* wcsQuery = new wchar_t[pDataLength + 1];
    // With an explicit input length the output is not null-terminated, so terminate it here.
    int nChars = MultiByteToWideChar(CP_UTF8, 0, pData, pDataLength, wcsQuery, pDataLength);
    wcsQuery[nChars] = L'\0'; // nChars is 0 if the conversion failed
    wcout << wcsQuery << endl;
    delete [] wcsQuery; // Don't forget to deallocate the buffer!
    return 0;
}
 
If you're dealing with Unicode characters, you should be using wchar_t buffers from the very beginning, so that no conversion is needed.
If that cannot be avoided, you should do this:
MultiByteToWideChar(CP_UTF8, 0, charArray, -1, wcharArray, wcharArrayLen);
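To choose wcharArrayLen, MultiByteToWideChar can be called first with a null output buffer and an output size of 0, in which case it returns the number of wide characters required. As a rough illustration of what that count amounts to for UTF-8 input, here is a portable sketch. The helper name utf16UnitsForUtf8 is made up for this example, and it assumes the input is well-formed UTF-8 (it does not validate anything):

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts the UTF-16 code units needed to hold
// the given UTF-8 text. Assumes well-formed UTF-8; malformed input
// is not detected. The terminating null is not counted.
std::size_t utf16UnitsForUtf8(const std::string& utf8)
{
    std::size_t units = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) == 0x80) {
            continue; // continuation byte: belongs to a character already counted
        }
        // A 4-byte sequence (lead byte 0xF0 or above) becomes a surrogate pair.
        units += (byte >= 0xF0) ? 2 : 1;
    }
    return units;
}
```

The count never exceeds the number of input bytes, which is why a buffer of byte length plus one wide characters is always large enough for UTF-8 input.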
 
 
The thing is that this is legacy code that I need to enhance. The data actually arrives as Unicode from a Java application over the wire.

Until now, Unicode has worked fine for Japanese data on a Japanese OS and Chinese data on a Chinese OS. The problem I am facing is supporting these languages on an English OS. As far as I understand, memcpy should just copy the data regardless of what kind of data it is, but in practice that does not seem to be what happens.

Hence I need to know: does memcpy internally use the default code page?

Also, the data coming over the wire will not always be UTF-8. If I need to determine the type of data beforehand, can you help me with how to do that?
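On determining the type of data: there is no fully reliable way to guess an encoding from raw bytes, so the safest option is to have the sender state it as part of the protocol. If the sender happens to prepend a byte order mark (an assumption; nothing in this thread says the Java side does), a check along these lines could serve as a first step. The names WireEncoding and detectBom are invented for this sketch:

```cpp
#include <cstddef>

// Hypothetical result type for this sketch.
enum class WireEncoding { Utf8Bom, Utf16LE, Utf16BE, Unknown };

// Looks only at a leading byte order mark. Data without a BOM is
// reported as Unknown; it may still be UTF-8 or an ANSI code page.
// Note: a UTF-32LE BOM (FF FE 00 00) would be reported as Utf16LE.
WireEncoding detectBom(const unsigned char* data, std::size_t length)
{
    if (length >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF) {
        return WireEncoding::Utf8Bom;
    }
    if (length >= 2 && data[0] == 0xFF && data[1] == 0xFE) {
        return WireEncoding::Utf16LE;
    }
    if (length >= 2 && data[0] == 0xFE && data[1] == 0xFF) {
        return WireEncoding::Utf16BE;
    }
    return WireEncoding::Unknown;
}
```

When the result is Unknown, the encoding has to come from somewhere else, such as a header field agreed with the sending application.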
 
 
Comments
Richard MacCutchan 17-Aug-10 7:07am    
memcpy does what it says on the tin: it copies memory byte for byte, and it neither knows nor cares what the content is. If you need to convert characters from WCHAR to MBCS or vice versa, then you need to know in advance what format the source is in and choose the appropriate conversion method and the correct code page. Check the MSDN entries for the conversion functions mentioned in the previous answers for full details.
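Because memcpy preserves the bytes exactly, the result only makes sense if the source bytes are already laid out the way wchar_t expects. On Windows, wchar_t holds 16-bit little-endian UTF-16 code units. If the Java side sends UTF-16 in big-endian order (an assumption; the thread does not say which form is on the wire), a straight copy leaves the two bytes of every character swapped. A minimal sketch of decoding such data by hand, with decodeUtf16BE being a name made up for this example:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: assembles UTF-16 code units from big-endian
// byte pairs, independent of the host byte order. A trailing odd
// byte, if any, is ignored.
std::u16string decodeUtf16BE(const unsigned char* data, std::size_t length)
{
    std::u16string out;
    out.reserve(length / 2);
    for (std::size_t i = 0; i + 1 < length; i += 2) {
        // High byte comes first on the wire in big-endian order.
        out.push_back(static_cast<char16_t>((data[i] << 8) | data[i + 1]));
    }
    return out;
}
```

For example, the four bytes 65 E5 67 2C decode to the two code units U+65E5 and U+672C, whereas a memcpy on a little-endian machine would produce U+E565 and U+2C67.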

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


