Doubt - please help me

Question

Hi friends,
I have a file that contains Japanese strings. What my code does is:

read the ANSI text and convert it to Unicode. I hard-coded the buffer size as 1024, and that is causing problems. So I thought I would get the length line by line and allocate memory for the char and TCHAR buffers accordingly, but it is not working.

Here is the code:

C++

do
    {
        Lengith = GetFileContent() + 1;

        HANDLE hFileMac = m_pFile;

        char*szMBuf = new char[Lengith];
        memset(szMBuf, 0, Lengith + 1);
        TCHAR*cszMBuf = new TCHAR[Lengith];
        memset(cszMBuf, 0, Lengith + 1);


        if ((lBytes = ::PReadFromFile(hFileMac, szMBuf, Lengith, TRUE)) < 0L)
        {
            ::CloseHandle(hFileMac);
            //  return SetMacroError(NMLANG_MacExcErr, szMBuf);
        }

        str1 = szMBuf;
        int nLen = MultiByteToWideChar(932, 0, str1, -1, NULL, NULL);
        int i = MultiByteToWideChar(932, 0, str1, -1, cszMBuf, nLen);

        if (nBOM == 0) { arcOut.Write(&bom, 2); }

        arcOut.WriteString(cszMBuf);


        memset(szMBuf, 0, lBytes + 1);


    } while (lBytes == Lengith);

Posted 19-May-15 2:18am

imran.prdc

Comments

Kornfeld Eliyahu Peter 19-May-15 8:26am

Why not use CString?

[no name] 19-May-15 8:32am

CString not working

Richard MacCutchan 19-May-15 8:37am

Your memset calls will corrupt memory, as you are clearing one more byte than you have reserved. Also, what does PReadFromFile actually read? I can find no MSDN reference for it. And in your calls to MultiByteToWideChar, if nLen happens to be larger than Lengith then you will corrupt some more memory.

[no name] 19-May-15 8:40am

inline LONG PReadFromFile(HANDLE hFile, LPVOID lpBuffer, DWORD nNumberOfBytesToRead,
                          BOOL bGetFileError = FALSE)
{
    DWORD dwNumberOfBytesRead = 0L;
    if (!::ReadFile(hFile, lpBuffer, nNumberOfBytesToRead, &dwNumberOfBytesRead, NULL))
    {
        if (bGetFileError)
            //PGetFileError((LPTSTR)lpBuffer, nNumberOfBytesToRead);
            return -1L;
    }
    return (LONG)dwNumberOfBytesRead;
}

[no name] 19-May-15 8:40am

nLen will always be the same as Lengith ....

2 solutions

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Frankie-C · Answer 1 · 2015-05-19T03:32:00

There is more than one error in your code, and many possible reasons for failure:
1. Reading an arbitrary number of bytes from a multibyte stream does not guarantee that you aren't cutting a multibyte sequence in half.
2. The assumption that the Unicode buffer never needs more characters than the multibyte buffer has bytes is correct, but the initialization to 0 is wrong. memset() clears a number of chars (1 byte each), while a wide char is 2 bytes, so half of your buffer is not cleared. You have to use wmemset() here instead.
3. As Richard said, your buffers are too small. If you want to hold the whole string plus the terminating null, allocate the number of TCHARs you want plus 1 for the terminating null!
4. Because of point 1, a partial multibyte sequence read from disk can consume the terminating null as its trail byte. In that case parsing continues through memory until it reaches a consistent multibyte sequence, or it stops without writing the terminating null to the Unicode string.
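
To illustrate points 1 and 4, here is a minimal sketch of how a chunk could be trimmed so that it never ends in the middle of a double-byte character. The helper names IsCp932LeadByte and CompleteCp932Prefix are hypothetical, not part of your code or of the Windows API. The sketch assumes the data is code page 932 (Shift-JIS) and that the buffer starts on a character boundary; the lead-byte ranges used are 0x81-0x9F and 0xE0-0xFC.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: true if b starts a double-byte character in
// code page 932 (lead-byte ranges 0x81-0x9F and 0xE0-0xFC).
inline bool IsCp932LeadByte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Hypothetical helper: returns how many bytes at the start of buf form
// whole characters. If the chunk ends with a lone lead byte, that byte
// is left out so the caller can carry it over to the next read.
// Assumes buf starts on a character boundary.
inline std::size_t CompleteCp932Prefix(const char* buf, std::size_t len)
{
    assert(buf != nullptr || len == 0);
    std::size_t i = 0;
    while (i < len)
    {
        const unsigned char b = static_cast<unsigned char>(buf[i]);
        const std::size_t step = IsCp932LeadByte(b) ? 2 : 1;
        if (i + step > len)
            break;  // trailing lead byte whose trail byte was not read yet
        i += step;
    }
    return i;
}
```

Only the first CompleteCp932Prefix(buf, bytesRead) bytes would then be handed to MultiByteToWideChar, and the leftover byte (if any) would be copied to the front of the buffer before the next read. The scan has to start from the beginning of the chunk, because a trail byte can have the same value as a lead byte.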

Unless you can pre-evaluate the multibyte sequence, it would be better to read and convert it in one pass.
Allocate buffers with correct size and correct initialization.
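
As a rough sketch of what correct size and correct initialization could look like, assuming a Unicode build where TCHAR is wchar_t: the helper names MakeWideBuffer and ClearWide are made up for illustration. The point is that memset() counts bytes, so an element count has to be multiplied by sizeof(wchar_t), or wmemset() has to be used instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>
#include <vector>

// Hypothetical helper: a buffer for `chars` wide characters plus the
// terminating null. std::vector zero-fills the elements and frees the
// memory on its own, so no memset and no delete[] are needed.
inline std::vector<wchar_t> MakeWideBuffer(std::size_t chars)
{
    return std::vector<wchar_t>(chars + 1, L'\0');
}

// Hypothetical helper: clears a raw wide buffer of `elements` wchar_t.
// memset takes a byte count, hence the multiplication by sizeof(wchar_t).
inline void ClearWide(wchar_t* buf, std::size_t elements)
{
    assert(buf != nullptr || elements == 0);
    if (elements == 0)
        return;
    std::memset(buf, 0, elements * sizeof(wchar_t));
    // Equivalent, counting elements instead of bytes:
    // std::wmemset(buf, L'\0', elements);
}
```

With a buffer from MakeWideBuffer(n), the pointer from data() and a size of n + 1 can be passed wherever a TCHAR* and a character count are expected.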

imran.prdc · Answer 2 · 2015-05-19T03:58:00

If the size is more than 10000 bytes, then it will be a problem.

Posted 19-May-15 3:58am