UTF-16 encoded files should always start with a byte order mark (BOM); see "Byte order mark" on Wikipedia. With UTF-8 files this is optional.
So you should check first for a BOM. If there is none, you might check for valid UTF-8.
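The BOM check could look roughly like the following sketch. The names `Bom` and `detect_bom` are made up for illustration and do not come from any library, and the file content is assumed to be already loaded into a `std::string`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names for illustration only (not from any library).
enum class Bom { None, Utf8, Utf16LE, Utf16BE };

// Inspect the first bytes of a buffer for a byte order mark.
// Note: a UTF-32 LE BOM (FF FE 00 00) also starts with FF FE and is
// not distinguished here; this sketch only covers UTF-8 and UTF-16.
Bom detect_bom(const std::string& data) {
    const auto byte = [&data](std::size_t i) {
        return static_cast<unsigned char>(data[i]);
    };
    if (data.size() >= 3 && byte(0) == 0xEF && byte(1) == 0xBB && byte(2) == 0xBF)
        return Bom::Utf8;      // EF BB BF
    if (data.size() >= 2 && byte(0) == 0xFF && byte(1) == 0xFE)
        return Bom::Utf16LE;   // FF FE
    if (data.size() >= 2 && byte(0) == 0xFE && byte(1) == 0xFF)
        return Bom::Utf16BE;   // FE FF
    return Bom::None;
}
```

If this returns `Bom::None`, the next step would be the check for valid UTF-8.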
On Windows you can use the MultiByteToWideChar function to check for valid UTF-8: call it with CP_UTF8 and the MB_ERR_INVALID_CHARS flag so that it fails on invalid input. It probably has to be called anyway to convert UTF-8 text to UTF-16, which is what Windows uses.
Another option is the ICU converter library (see "Using Converters" in the ICU User Guide).
There are also some projects that provide converters and check functions, such as UTF8-CPP ("UTF-8 with C++ in a Portable Way").
Or write your own according to the allowed code points. I once found a sample implementation based on the Unicode recommendations, but I cannot find it anymore.
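A hand-written check might look like the sketch below. This is not the sample implementation mentioned above, just one possible way to do it; `is_valid_utf8` is a made-up name. It assumes the usual well-formedness rules: correct lead and continuation bytes, no overlong encodings, no surrogates (U+D800 to U+DFFF), and nothing above U+10FFFF.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `s` is well-formed UTF-8.
bool is_valid_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (c < 0x80) { ++i; continue; }          // plain ASCII byte

        std::size_t len = 0;       // total bytes in this sequence
        unsigned long cp = 0;      // decoded code point
        unsigned long min_cp = 0;  // smallest code point allowed for this length
        if ((c & 0xE0) == 0xC0)      { len = 2; cp = c & 0x1F; min_cp = 0x80; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; min_cp = 0x800; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; min_cp = 0x10000; }
        else return false;         // stray continuation byte or invalid lead byte

        if (n - i < len) return false;            // sequence is truncated
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false; // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min_cp) return false;                  // overlong encoding
        if (cp > 0x10FFFF) return false;                // beyond Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false; // UTF-16 surrogate
        i += len;
    }
    return true;
}
```

Text in a single-byte encoding such as Latin-1 that contains accented characters will usually fail this check, because a byte >= 0x80 followed by an ASCII letter is not a valid UTF-8 sequence.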
Note that all checks will return true (valid UTF-8) for plain ASCII files, because ASCII is a subset of UTF-8. So it might be necessary to check first for bytes >= 0x80.
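Such a pre-check is only a few lines. Again just a sketch; `has_non_ascii` is a made-up name.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: true if any byte in `s` is >= 0x80.
bool has_non_ascii(const std::string& s) {
    return std::any_of(s.begin(), s.end(), [](char ch) {
        // Cast first: plain char may be signed, so 0x80..0xFF could be negative.
        return static_cast<unsigned char>(ch) >= 0x80;
    });
}
```

If this returns false, the file is plain ASCII and the UTF-8 check tells you nothing about the intended encoding.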