First of all: strictly speaking, a character is not 8, 16 or 32-bit. A character is characterized by its Unicode code point, which is an integer understood in its abstract mathematical sense, regardless of its binary representation in a computer, endianness, etc. The machine representation is defined by the Unicode UTFs (Unicode Transformation Formats).
With .NET, the UTF-16 encoding is used. Characters beyond the BMP (Basic Multilingual Plane, code points 0 to 0xFFFF) are represented by surrogate pairs, i.e. two 16-bit words.
See http://unicode.org/ and http://unicode.org/faq/utf_bom.html.
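To illustrate, here is a minimal C# sketch (U+1D11E, the musical G clef, is just an arbitrary example of a non-BMP code point):

using System;

class SurrogateDemo
{
    static void Main()
    {
        // U+1D11E lies beyond the BMP, so in a .NET (UTF-16) string
        // it occupies two chars forming a surrogate pair:
        string clef = "\U0001D11E";
        Console.WriteLine(clef.Length);                   // 2
        Console.WriteLine(char.IsHighSurrogate(clef[0])); // True
        Console.WriteLine(char.IsLowSurrogate(clef[1]));  // True
        // The single abstract code point can be recovered:
        Console.WriteLine(char.ConvertToUtf32(clef, 0));  // 119070 = 0x1D11E
    }
}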
Now, encryption is performed in a way totally agnostic to the structure of the data: all data is presented to it as an array of bytes. So the first step, obtaining an array of bytes, is performed by System.Text.Encoding.GetBytes. It is important that, after decryption, the same encoding is used to call the GetChars method. See http://msdn.microsoft.com/en-us/library/system.text.encoding.aspx.
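For example, a minimal sketch of such a round trip (UTF-8 is chosen arbitrarily here; the point is only that both calls use the same Encoding instance):

using System;
using System.Text;

class RoundTripDemo
{
    static void Main()
    {
        Encoding encoding = Encoding.UTF8; // must be the same on both sides

        string original = "Any Unicode text, e.g. \U0001D11E";
        byte[] bytes = encoding.GetBytes(original);  // before encryption

        // ... encrypt bytes, store or transmit, decrypt them back ...

        string restored = new string(encoding.GetChars(bytes)); // after decryption
        Console.WriteLine(restored == original);     // True
    }
}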
For any Unicode string, four encodings will do, no matter whether the characters fit in the BMP or go outside of it: System.Text.UnicodeEncoding (in fact UTF-16; "Unicode" here is ugly Windows jargon), System.Text.UTF32Encoding, System.Text.UTF7Encoding (pretty much outdated) and System.Text.UTF8Encoding (the most used for streams, interop, the Web, etc.).
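A quick sketch showing that these encodings all handle a non-BMP string and differ only in byte layout and length (the counts in the comments are for this particular string):

using System;
using System.Text;

class EncodingComparison
{
    static void Main()
    {
        string s = "A\U0001D11E"; // one BMP character, one beyond the BMP

        Console.WriteLine(Encoding.Unicode.GetBytes(s).Length); // 6: UTF-16, 2 + 4 bytes
        Console.WriteLine(Encoding.UTF32.GetBytes(s).Length);   // 8: 4 + 4 bytes
        Console.WriteLine(Encoding.UTF8.GetBytes(s).Length);    // 5: 1 + 4 bytes
        // Encoding.UTF7 would also round-trip the string, but it is obsolete.
    }
}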
Once the serialization of Unicode strings is clear, we're ready for encryption.
For encryption topics, see the System.Security.Cryptography namespace: http://msdn.microsoft.com/en-us/library/system.security.cryptography.aspx.
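As an illustration, here is a minimal sketch putting the two steps together. AES is just one of many algorithms available in that namespace, picked here arbitrarily; a real application would also need proper key and IV management:

using System;
using System.Security.Cryptography;
using System.Text;

class EncryptStringDemo
{
    static void Main()
    {
        string plaintext = "Unicode text, including non-BMP: \U0001D11E";

        using (Aes aes = Aes.Create()) // a random Key and IV are generated
        {
            // Step 1: serialize the string; the cipher sees only bytes.
            byte[] plainBytes = Encoding.UTF8.GetBytes(plaintext);

            byte[] cipherBytes;
            using (ICryptoTransform enc = aes.CreateEncryptor())
                cipherBytes = enc.TransformFinalBlock(plainBytes, 0, plainBytes.Length);

            byte[] decryptedBytes;
            using (ICryptoTransform dec = aes.CreateDecryptor())
                decryptedBytes = dec.TransformFinalBlock(cipherBytes, 0, cipherBytes.Length);

            // Step 2: the same encoding turns the bytes back into characters.
            string restored = new string(Encoding.UTF8.GetChars(decryptedBytes));
            Console.WriteLine(restored == plaintext); // True
        }
    }
}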
—SA