I have a program that uses Asc() to extract a byte from a string received via a DLL:

These two lines create the string ReadString from a USB device (the string is built from 7 bytes received from the device):

VB
ReadString = "       " '(7 space placeholders)
ftResult = FT_Read(ftHandle, ReadString, 7, bytesRead)


and then this (and similar code) extracts the byte values

VB
ver1 = Asc(Mid(ReadString, 5, 1))


The code has worked fine for years (I've verified that every possible byte value is correctly received), but now one user installing on a new laptop with Windows 10 is experiencing this error:
System.ArgumentException: The output byte buffer is too small to contain the encoded data, encoding 'Unicode (UTF-8)' fallback 'System.Text.EncoderReplacementFallback'.
Parameter name: bytes
   at Microsoft.VisualBasic.Strings.Asc(Char String)
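
The message and the "bytes" parameter name look like what Encoding.GetBytes reports when the destination array is too short. A minimal sketch of that condition, assuming (not confirmed by the post) that Asc() encodes a non-ASCII character with the system ANSI code page into a buffer of at most two bytes, and that an undecodable device byte has turned into U+FFFD, which needs three bytes in UTF-8. The module name is hypothetical:

```vbnet
Imports System.Text

Module BufferTooSmallDemo
    Sub Main()
        ' U+FFFD (the replacement character) stands in for a device byte
        ' that could not be decoded under a UTF-8 ANSI code page.
        Dim chars As Char() = {ChrW(&HFFFD)}

        ' ASSUMPTION: a two-byte destination, mirroring what Asc() is
        ' believed to allocate for multi-byte code pages.
        Dim twoBytes(1) As Byte

        ' UTF-8 needs three bytes (EF BF BD) for this character.
        Console.WriteLine(Encoding.UTF8.GetByteCount(chars))

        Try
            Encoding.UTF8.GetBytes(chars, 0, 1, twoBytes, 0)
            Console.WriteLine("Encoded without error")
        Catch ex As ArgumentException
            ' The destination array cannot hold the encoded character.
            Console.WriteLine("ArgumentException, parameter: " & ex.ParamName)
        End Try
    End Sub
End Module
```

If this is the mechanism, characters that encode to one or two UTF-8 bytes would not throw but would still return the wrong value, which would fit failures that depend on the byte received.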


What I have tried:

I've googled for days now and gone down a few rabbit holes. What I think I've narrowed it down to is that most machines encode the string ReadString as UTF-16 (VB.NET says this is the default) and this particular user's machine is encoding ReadString as UTF-8.
I was hoping to force my code to use UTF-16 regardless of the machine used, but I can't figure out how to achieve this, and I'm also worried that the DLL that is called to receive the string might be using its own encoding that I can't change.

Anyone have any useful guidance on this?
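
One way to sidestep the encoding question entirely is to receive the data into a byte array rather than a string, so no text conversion happens at the DLL boundary. The sketch below assumes the DLL is FTDI's D2XX driver (ftd2xx.dll), which the post does not state; the handle type, the ByteAt helper and the sample data are also assumptions made for illustration:

```vbnet
Imports System.Runtime.InteropServices

Module ByteBufferRead
    ' ASSUMPTION: the DLL is FTDI's D2XX driver (ftd2xx.dll); the post does
    ' not name it. Passing a Byte() instead of a String means the marshaller
    ' copies raw bytes and applies no text encoding.
    <DllImport("ftd2xx.dll")>
    Private Function FT_Read(ByVal ftHandle As IntPtr,
                             ByVal buffer As Byte(),
                             ByVal bytesToRead As UInteger,
                             ByRef bytesRead As UInteger) As UInteger
    End Function

    ' Hypothetical helper: returns the byte at a 1-based position, matching
    ' the indexing of Mid(ReadString, position, 1) in the original code.
    Function ByteAt(ByVal buffer As Byte(), ByVal position As Integer) As Integer
        Return buffer(position - 1)
    End Function

    Sub Main()
        ' Simulated 7-byte reply so the sketch runs without a device attached.
        ' With a device, this array would be filled by:
        '   ftResult = FT_Read(ftHandle, readBuffer, 7UI, bytesRead)
        Dim readBuffer As Byte() = {&H1, &H7F, &H80, &H9D, &HC8, &HFF, &H0}

        ' Equivalent of: ver1 = Asc(Mid(ReadString, 5, 1))
        Dim ver1 As Integer = ByteAt(readBuffer, 5)
        Console.WriteLine(ver1) ' 200
    End Sub
End Module
```

With this shape the value never passes through a code page, so the result should not depend on the user's regional settings.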
Updated 24-Feb-22 1:46am
Comments
Maciej Los 23-Feb-22 10:33am    
Is this old VB (<=VB6) or modern VB (VB.NET)? Please clarify.
Ian Burton 2021 23-Feb-22 11:00am    
As per the title this is in VB.net
Thanks for formatting my post properly.
Maciej Los 23-Feb-22 11:02am    
Take a look at the tag. It's VB instead of VB.NET.
Please improve your question and change the tag appropriately.
Ian Burton 2021 23-Feb-22 11:05am    
Yes, for some reason it won't accept VB.net as a tag so I went with VB and put the .net in the title.
Maciej Los 23-Feb-22 11:15am    
OK

 
 
Comments
Maciej Los 24-Feb-22 13:15pm    
Sorry, I forgot to upvote your answer. It's done now. :)
Ralf Meier 24-Feb-22 15:05pm    
Thank you, Maciej ... :)
In addition to solution #1 by Ralf Meier, I'd suggest reading this: How to use character encoding classes in .NET | Microsoft Docs

More at:
UTF8Encoding.GetString(Byte[], Int32, Int32) Method (System.Text) | Microsoft Docs
UTF8Encoding.GetBytes Method (System.Text) | Microsoft Docs

VB.NET
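'Requires: Imports System.Text
Dim enc As Encoding = Encoding.GetEncoding(1250) 'Windows-1250, the default Polish code page
Dim dec As Encoding = Encoding.UTF8

Dim input As String = "ŁOŚ"

'Encode with code page 1250, then decode the same bytes as UTF-8 to show the mismatch
Dim cp1250Bytes = enc.GetBytes(input)
Dim output As String = dec.GetString(cp1250Bytes, 0, cp1250Bytes.Length)
Console.WriteLine(output)
'prints: �O�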


Here is a complete list of Windows Code Pages (encodings): Encoding.WindowsCodePage Property (System.Text) | Microsoft Docs
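
For raw device bytes, the same Encoding classes can be used with a fixed single-byte encoding instead of the machine default. A minimal sketch, assuming ISO-8859-1 (Latin-1) is an acceptable choice because it maps each byte 0 to 255 to the code point with the same number; the module and variable names are hypothetical:

```vbnet
Imports System.Text

Module Latin1RoundTrip
    Sub Main()
        ' A fixed encoding, independent of the system ANSI code page.
        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")

        ' Every possible byte value, 0 to 255.
        Dim raw(255) As Byte
        For i As Integer = 0 To 255
            raw(i) = CByte(i)
        Next

        ' Decode to a string and encode back again.
        Dim decoded As String = latin1.GetString(raw)
        Dim back As Byte() = latin1.GetBytes(decoded)

        ' Check that both the round trip and AscW recover each byte value.
        Dim allMatch As Boolean = True
        For i As Integer = 0 To 255
            If back(i) <> raw(i) OrElse AscW(decoded(i)) <> i Then
                allMatch = False
            End If
        Next
        Console.WriteLine(allMatch) ' True
    End Sub
End Module
```

This only helps when the application itself performs the byte-to-string conversion. A string that was already converted at the DLL boundary with the system code page cannot be repaired this way.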
 
 
Comments
Ian Burton 2021 23-Feb-22 13:52pm    
Thanks for the references.
This page is also very enlightening https://www.codeproject.com/tips/144792/string-to-byte-conversion-using-net-encoders-8bit
I can now confirm that my system is using the default code page 1252, which maps every byte value one for one to a character. So I must assume the user's system is using a different code page (probably UTF-8), and the function Asc() is then choking on a double byte for certain characters.
My question now is whether there is a simple way to force my application to use codepage 1252 rather than the system default codepage.
I'm imagining a code line in the FormLoad event to the effect "Stop using the system default code page and use the 1252 code page". Any insights welcome!
Maciej Los 24-Feb-22 12:00pm    
Your application should use UTF-8 encoding. This is very easy to achieve: in each function responsible for encoding or decoding data, use UTF8. ;)
Ian Burton 2021 24-Feb-22 12:27pm    
I'm struggling to figure this out. I have a string called ReadString; each character's code point (I hope that's the right term) will be a value from 0 to 255, as it represents a byte received from a USB device. With code page 1252 I could use Asc() to retrieve the code point (and thus my byte). But what function would retrieve the code point from the UTF-8 encoded string? Asc() fails above 127 and AscW() gives incorrect results.
Maciej Los 24-Feb-22 12:33pm    
As I mentioned, you should STOP using the Asc, Mid, etc. functions! Use proper, modern functions. Get the text from the text box, get the user's code page, convert the text to UTF-8 and voilà! Then use the same logic in the opposite direction.

BTW: you can get the user/OS encoding this way:
OSEncoding = Encoding.Default
Ian Burton 2021 24-Feb-22 13:06pm    
Write proper code, why did I not think of that!
Thanks for your help, I'm sure I'll get it figured out eventually.
While this does not exactly resolve the issue, it is a workaround that gets my user back up and running while I create a permanent solution. Perhaps it will help someone else with a similar issue:
In Control Panel > Region > Administrative tab, clicking "Change system locale" exposes a check box titled "Beta: Use Unicode UTF-8 for worldwide language support". The UTF-8 default code page that this option enables is incompatible with the Asc() function in the ranges my application uses.
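
Until a permanent fix is in place, the application could detect this setting at startup and warn the user. A minimal sketch, assuming a .NET Framework application, where Encoding.Default reflects the system ANSI code page (on .NET Core and later it always reports UTF-8, so the check would not be meaningful there). The function name and the warning text are hypothetical:

```vbnet
Imports System.Text

Module CodePageCheck
    ' Hypothetical helper: True when the given encoding is UTF-8,
    ' which Windows identifies as code page 65001.
    Function IsUtf8CodePage(ByVal enc As Encoding) As Boolean
        Return enc.CodePage = 65001
    End Function

    Sub Main()
        ' On .NET Framework this is the system ANSI code page,
        ' for example 1252 on a typical Western European setup.
        Console.WriteLine("ANSI code page: " & Encoding.Default.CodePage)

        If IsUtf8CodePage(Encoding.Default) Then
            Console.WriteLine("Warning: the UTF-8 system locale option " &
                              "appears to be enabled; Asc() may fail on device data.")
        End If
    End Sub
End Module
```

This only reports the condition and does not change how Asc() behaves.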
 
Comments
Maciej Los 24-Feb-22 11:58am    
Well... What if user of your application does not select this checkbox?
Ian Burton 2021 24-Feb-22 12:17pm    
Thanks for your observation Maciej. I've amended my "solution" appropriately.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


