I am not new to coding, but I am fairly new to VB.NET.

While experimenting with StreamReader in VB.NET on .NET Framework 2, I noticed a problem when reading a file's characters with UTF-8 encoding while keeping my own integer counter for the byte position. The counter drifts from the real position: at the end for smaller files, and anywhere for much larger files, so the lengths returned by the reads do not add up to BaseStream.Length. Each time, I confirmed that the encoding is UTF-8 by checking the BOM/preamble in a small helper function, and that matches what other editors detect for the file. My understanding is that somewhere along the way a multi-byte character is being split, which causes the mismatch.

My ultimate goal is to index the positions of certain characters/strings within the text file, so knowing the correct position is paramount. I appreciate that the buffer position can differ from the read position in StreamReader, but discarding the buffer whenever the mismatch occurs doesn't help (I detect the mismatch when my counter and BaseStream.Position disagree). I found that using UTF-7 encoding on these files brings the sum of the returned lengths much closer to BaseStream.Position. I am using ReadBlock, by the way, and for efficiency/speed I don't intend to parse bytes into characters myself, since ReadBlock provides that out of the box.
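To make the byte counting described above concrete, here is a minimal, language-neutral sketch in Python. It is not VB.NET and does not use the StreamReader API. It assumes the position of each character can be recovered by adding up the UTF-8 byte length of every decoded character, starting after the 3-byte BOM, rather than by counting returned characters. The names `char_byte_offsets` and `chunk_size` are hypothetical. The deliberately small chunk size forces multi-byte characters to be split across reads, which is the situation suspected of causing the mismatch.

```python
import codecs
import io

BOM = codecs.BOM_UTF8  # the 3-byte UTF-8 preamble: EF BB BF


def char_byte_offsets(raw, chunk_size=4):
    """Decode UTF-8 bytes in small chunks and record each character's byte offset.

    Returns (offsets, total), where offsets is a list of (char, byte_offset)
    pairs and total is the byte position reached after the last character.
    """
    offset = len(BOM) if raw.startswith(BOM) else 0
    stream = io.BytesIO(raw[offset:])
    # The incremental decoder buffers an incomplete multi-byte sequence
    # until the rest of it arrives in a later chunk.
    decoder = codecs.getincrementaldecoder("utf-8")()
    offsets = []
    while True:
        chunk = stream.read(chunk_size)
        text = decoder.decode(chunk, final=not chunk)
        for ch in text:
            offsets.append((ch, offset))
            # Advance by the encoded size of the character, not by 1.
            offset += len(ch.encode("utf-8"))
        if not chunk:
            break
    return offsets, offset


if __name__ == "__main__":
    sample = BOM + "aé€😀z".encode("utf-8")
    offsets, total = char_byte_offsets(sample)
    print(offsets)
    print(total == len(sample))
    # UTF-7 and UTF-8 only agree on plain ASCII text.
    print("abc".encode("utf-7") == "abc".encode("utf-8"))
    print("é".encode("utf-7") == "é".encode("utf-8"))
```

In this sketch the running total ends up equal to the length of the data, BOM included, however the chunks fall. In .NET the rough counterpart would presumably be calling `Encoding.UTF8.GetByteCount` on the characters returned by each ReadBlock call, but that mapping is an assumption and has not been tested against .NET Framework 2. The last two lines of the usage example compare the byte sequences the two encodings produce for ASCII and non-ASCII text.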

So my question is: how can I prevent this mismatch? Can UTF-7 be used interchangeably with UTF-8 (with BOM) in StreamReader and StreamWriter without corrupting any part of the data?

A solution in C# would also help, in addition to VB, since I can follow the concept. I really appreciate your patience. Thanks.

What I have tried:

Web research, extensive trial-and-error testing
Updated 9-Nov-16 19:01pm

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


