See the
UTF-8 encoding at Wikipedia.
According to the table, a single-byte character has the most significant bit of its only byte cleared (
0
). You can test this condition by
AND
ing the byte with
0x80
(that is
10000000
in binary) and checking that the result is zero.
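Similarly, the first byte of every two-byte character starts with the
110
marker, and you can test for it with
(b & 0xE0) == 0xC0
(that is
(b & 11100000b) == 11000000b
). The parentheses are needed in C-like languages, where == binds more tightly than &.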
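And so on: the first byte of a three-byte character starts with 1110 (mask 0xF0, value 0xE0), the first byte of a four-byte character starts with 11110 (mask 0xF8, value 0xF0), and every continuation byte starts with 10 (mask 0xC0, value 0x80).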
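The masks above can be put together in one small helper. This is only a sketch, assuming C; the function name utf8_seq_len is made up for illustration and is not part of any library. It classifies a byte purely by its leading bits and does not validate the rest of the sequence.

```c
/* Hypothetical helper: returns the sequence length (1 to 4) implied by
   the leading bits of a UTF-8 byte, or 0 if the byte is a continuation
   byte (10xxxxxx) or matches no lead-byte pattern.
   It checks the prefix only, so bytes that strict UTF-8 forbids
   (for example 0xC0 or 0xF5) are still classified by their pattern. */
int utf8_seq_len(unsigned char b)
{
    if ((b & 0x80) == 0x00) return 1;  /* 0xxxxxxx: single byte   */
    if ((b & 0xE0) == 0xC0) return 2;  /* 110xxxxx: two bytes     */
    if ((b & 0xF0) == 0xE0) return 3;  /* 1110xxxx: three bytes   */
    if ((b & 0xF8) == 0xF0) return 4;  /* 11110xxx: four bytes    */
    return 0;                          /* continuation or invalid */
}
```

To walk a buffer you would call it on the current byte and advance by the returned length, treating 0 as an error at that position.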