Hi friends,
I have a file that contains Japanese strings. My code does the following:

It reads the ANSI text and converts it to Unicode. I originally hard-coded the buffer size as 1024, which caused problems, so I decided to get the length line by line and allocate the char and TCHAR buffers to match, but that is not working either.

Here is the code:
C++
do
    {
        Lengith = GetFileContent() + 1;

        HANDLE hFileMac = m_pFile;

        char*szMBuf = new char[Lengith];
        memset(szMBuf, 0, Lengith + 1);
        TCHAR*cszMBuf = new TCHAR[Lengith];
        memset(cszMBuf, 0, Lengith + 1);


        if ((lBytes = ::PReadFromFile(hFileMac, szMBuf, Lengith, TRUE)) < 0L)
        {
            ::CloseHandle(hFileMac);
            //  return SetMacroError(NMLANG_MacExcErr, szMBuf);
        }

        str1 = szMBuf;
        int nLen = MultiByteToWideChar(932, 0, str1, -1, NULL, NULL);
        int i = MultiByteToWideChar(932, 0, str1, -1, cszMBuf, nLen);

        if (nBOM == 0) { arcOut.Write(&bom, 2); }

        arcOut.WriteString(cszMBuf);


        memset(szMBuf, 0, lBytes + 1);


    } while (lBytes == Lengith);
Comments
Kornfeld Eliyahu Peter 19-May-15 8:26am    
Why not use CString?
[no name] 19-May-15 8:32am    
CString is not working.
Richard MacCutchan 19-May-15 8:37am    
Your memset calls will corrupt memory, as you are clearing one more byte than you have reserved. Also, what does PReadFromFile actually read? I can find no MSDN reference for it. And in your calls to MultiByteToWideChar, if nLen happens to be larger than Lengith, you will corrupt some more memory.
[no name] 19-May-15 8:40am    
inline LONG PReadFromFile(HANDLE hFile, LPVOID lpBuffer, DWORD nNumberOfBytesToRead,
                          BOOL bGetFileError = FALSE)
{
    DWORD dwNumberOfBytesRead = 0L;
    if (!::ReadFile(hFile, lpBuffer, nNumberOfBytesToRead, &dwNumberOfBytesRead, NULL))
    {
        if (bGetFileError)
            //PGetFileError((LPTSTR)lpBuffer, nNumberOfBytesToRead);
            return -1L;
    }
    return (LONG)dwNumberOfBytesRead;
}
[no name] 19-May-15 8:40am    
nLen will always be the same as Lengith ....

There is more than one error in your code, and several possible reasons for failure:
1. Reading an arbitrary number of bytes from a multibyte stream does not guarantee that the read ends on a character boundary; it can cut a multibyte sequence in half.
2. The assumption that the Unicode buffer never needs more characters than the multibyte buffer has bytes is correct, but the initialization to 0 is wrong. memset() clears the given number of bytes (a char is 1 byte), while a wide char is 2 bytes, so half of your buffer is not cleared. Use wmemset() here instead.
3. As Richard said, your buffers are too small. If you want the whole string plus the terminating null in them, allocate the number of TCHARs you need plus 1 for the terminating null!
4. For the reason given in point 1, a partial multibyte sequence read from disk can take the terminating null as its second byte. In that case the parsing either runs on through memory until it reaches a consistent multibyte sequence, or stops without writing the terminating null to the Unicode string.
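
Points 1 and 4 come down to making sure a chunk never ends in the middle of a character. Here is a minimal sketch of such a check, assuming the data is Shift-JIS / code page 932 with lead bytes in 0x81-0x9F and 0xE0-0xFC, and that the buffer starts on a character boundary. The names IsCp932LeadByte and CompletePrefixLength are made up for this example; on Windows, IsDBCSLeadByteEx(932, b) could take the place of the hand-written range test.

```cpp
#include <cstddef>

// Lead-byte test for code page 932 (assumed ranges: 0x81-0x9F, 0xE0-0xFC).
inline bool IsCp932LeadByte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Returns how many bytes at the start of buf form whole characters.
// The scan runs from the start of the buffer, because a trail byte
// can have the same value as a lead byte.
inline std::size_t CompletePrefixLength(const unsigned char* buf, std::size_t len)
{
    std::size_t i = 0;
    while (i < len)
    {
        if (IsCp932LeadByte(buf[i]))
        {
            if (i + 1 >= len)
                return i;   // lead byte without its trail byte: stop before it
            i += 2;         // skip lead byte and trail byte together
        }
        else
        {
            i += 1;         // single-byte character
        }
    }
    return len;
}
```

Only the first CompletePrefixLength(buf, len) bytes of a chunk would be converted; any leftover byte would be moved to the front of the buffer before the next read, so the split character is converted whole in the following pass.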

Unless you can pre-evaluate the multibyte sequence, it would be better to read and convert the file in one pass.
Allocate the buffers with the correct size and initialize them correctly.
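
As a rough illustration of the sizing and initialization advice, the sketch below assumes a Unicode build, where TCHAR is wchar_t. MakeWideBuffer and ClearWideBuffer are hypothetical helper names, not part of the original code.

```cpp
#include <cstddef>
#include <cwchar>
#include <vector>

// Allocates room for `count` wide characters plus the terminating null.
// std::vector value-initializes its elements, so every element starts as L'\0'
// and no separate clearing call is needed.
inline std::vector<wchar_t> MakeWideBuffer(std::size_t count)
{
    return std::vector<wchar_t>(count + 1);
}

// Clears a raw wide buffer. wmemset counts in wchar_t elements, whereas
// memset(buf, 0, count) would only clear `count` bytes.
inline void ClearWideBuffer(wchar_t* buf, std::size_t count)
{
    std::wmemset(buf, L'\0', count);
}
```

With a vector, the pointer from data() and the element count from size() can be passed to the conversion call, and the memory is released automatically when the vector goes out of scope.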
 
Comments
[no name] 19-May-15 9:58am    
If the size is more than 10000 bytes, then it will be a problem.
Frankie-C 19-May-15 10:08am    
I agree.
Then maybe you have to prescan the buffer, i.e. look for the end of a sentence or the like ... :(
BTW, 10k is not a threat for a modern OS ... unless you do this simultaneously on many concurrent threads...
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


