You should never assume any particular UTF encoding. The .NET API is well abstracted from this representation. Remember that in all UTFs except UTF-32, characters are represented using a varying number of bytes. In UTF-16, characters beyond the BMP are encoded as surrogate pairs. You can serialize text to any of these encodings using the Encoding class. As to Unicode code points, they should be understood as pure mathematical integers, fully abstracted from their computer representation, in natural order.
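To make the point about varying byte counts concrete, here is a minimal C# sketch. The class name EncodingSizes and the helper ByteCount are made up for illustration; the sample code points are arbitrary picks from inside and outside the BMP.

```csharp
using System;
using System.Text;

static class EncodingSizes
{
    // Hypothetical helper: number of bytes the given encoding needs for one code point.
    public static int ByteCount(Encoding encoding, int codePoint)
    {
        // char.ConvertFromUtf32 turns a code point into a .NET string:
        // one Char inside the BMP, a surrogate pair beyond it.
        string text = char.ConvertFromUtf32(codePoint);
        return encoding.GetByteCount(text);
    }

    static void Main()
    {
        // 'A', euro sign, musical symbol G clef (beyond the BMP)
        int[] codePoints = { 0x41, 0x20AC, 0x1D11E };
        foreach (int cp in codePoints)
        {
            Console.WriteLine("U+{0:X}: UTF-8 {1}, UTF-16 {2}, UTF-32 {3} bytes",
                cp,
                ByteCount(Encoding.UTF8, cp),
                ByteCount(Encoding.Unicode, cp),   // Encoding.Unicode is UTF-16LE
                ByteCount(Encoding.UTF32, cp));
        }
    }
}
```

Only the UTF-32 column stays at 4 bytes for every code point; the other two grow as the code point gets larger.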
As to your particular problem, your approach is correct, because UTF-32LE encodes every code point in the range 0..10FFFF exactly as the code point itself would be stored in a 32-bit little-endian integer. However, I have never seen a Windows font supporting the musical notation range (http://unicode.org/charts/PDF/U1D100.pdf). Maybe this is the only problem.
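The UTF-32LE approach can be sketched as follows. This is not your code, only an illustration under my assumptions: the class Utf32Demo and the method FromCodePoint are hypothetical names, and U+1D100 is simply the first code point of the chart linked above.

```csharp
using System;
using System.Text;

static class Utf32Demo
{
    // Hypothetical helper: writes the code point as four little-endian bytes
    // and decodes those bytes as UTF-32LE.
    public static string FromCodePoint(int codePoint)
    {
        byte[] bytes =
        {
            (byte)(codePoint & 0xFF),
            (byte)((codePoint >> 8) & 0xFF),
            (byte)((codePoint >> 16) & 0xFF),
            (byte)((codePoint >> 24) & 0xFF),
        };
        return Encoding.UTF32.GetString(bytes); // Encoding.UTF32 is little-endian
    }

    static void Main()
    {
        int codePoint = 0x1D100; // first code point of the Musical Symbols block
        string viaBytes = FromCodePoint(codePoint);
        string viaApi = char.ConvertFromUtf32(codePoint);
        Console.WriteLine(viaBytes == viaApi); // True
    }
}
```

If all you need is the string, char.ConvertFromUtf32 gives the same result without going through a byte array.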
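As to the second question: you converted the code point to a surrogate pair, right? A surrogate pair is one character (one code point), so a length of 1 is correct for any property that counts characters or text elements, such as StringInfo.LengthInTextElements. Be aware that String.Length counts 16-bit Char values (UTF-16 code units), so the same surrogate pair gives 2 there.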
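Which number you get depends on the property you query. A short C# sketch (the class name LengthDemo is made up, and U+1D100 is just a sample code point beyond the BMP) contrasts the count of 16-bit Char values with the count of text elements:

```csharp
using System;
using System.Globalization;

static class LengthDemo
{
    static void Main()
    {
        // One code point beyond the BMP, stored as a surrogate pair.
        string s = char.ConvertFromUtf32(0x1D100);

        Console.WriteLine(s.Length);                                // 2: UTF-16 code units
        Console.WriteLine(char.IsSurrogatePair(s, 0));              // True
        Console.WriteLine(new StringInfo(s).LengthInTextElements);  // 1: text elements
        Console.WriteLine(char.ConvertToUtf32(s, 0).ToString("X")); // 1D100: original code point
    }
}
```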
—SA