Having just looked at ASCII strings to Unicode in C++[^], here's my preferred solution to this part of the never-ending story of string conversion:
#include <locale>
#include <string>

std::wstring widen(const std::string& str)
{
    std::wstring wstr(str.size(), 0);
    if (str.empty())     // avoid taking &str[0] / &wstr[0] on empty strings
        return wstr;
#if _MSC_VER >= 1400 // use Microsoft's safe libraries if possible (>= VS2005)
    std::use_facet<std::ctype<wchar_t> >(std::locale())._Widen_s
        (&str[0], &str[0] + str.size(), &wstr[0], wstr.size());
#else
    std::use_facet<std::ctype<wchar_t> >(std::locale()).widen
        (&str[0], &str[0] + str.size(), &wstr[0]);
#endif
    return wstr;
}

std::string narrow(const std::wstring& wstr, char rep = '_')
{
    std::string str(wstr.size(), 0);
    if (wstr.empty())    // same guard for the reverse direction
        return str;
#if _MSC_VER >= 1400
    std::use_facet<std::ctype<wchar_t> >(std::locale())._Narrow_s
        (&wstr[0], &wstr[0] + wstr.size(), rep, &str[0], str.size());
#else
    std::use_facet<std::ctype<wchar_t> >(std::locale()).narrow
        (&wstr[0], &wstr[0] + wstr.size(), rep, &str[0]);
#endif
    return str;
}
Yes, it does look nasty, but it is the way to go in pure C++. Funnily enough, I have never found any good, comprehensive documentation on C++ locales; most books tend to leave the topic untouched.
Because the functions use the default constructor of std::locale, the classic "C" locale defines the codepage for the conversion. The user's current codepage can be applied by calling std::locale::global(std::locale("")); before any call to narrow(...) or widen(...).
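To illustrate, here is a minimal usage sketch (the main() function and the sample strings are my own illustration, not part of the original solution); it assumes widen() and narrow() from above are defined in the same source file:

#include <iostream>

int main()
{
    // Switch to the user's current codepage; without this line the
    // conversions run under the classic "C" locale.
    std::locale::global(std::locale(""));

    std::wstring wide = widen("Hello, wide world!");
    std::string  back = narrow(wide);

    std::wcout << wide << std::endl;
    std::cout  << back << std::endl;
    return 0;
}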
One possible problem with this code is the use of multi-byte character sets: the output strings are pre-sized on the assumption of a 1:1 relationship in size() between the two string formats, which no longer holds when a single character occupies more than one byte.
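If multi-byte input is a realistic scenario, one possible workaround (a sketch of an alternative, not part of the solution above) is to let the C runtime report the required length first, for example with std::mbstowcs, instead of assuming one wide character per byte:

#include <cstdlib>    // std::mbstowcs
#include <stdexcept>
#include <string>

// Hypothetical multi-byte-aware variant: query the required length,
// then convert. It relies on the C locale that
// std::locale::global(std::locale("")) also installs.
std::wstring widen_mb(const std::string& str)
{
    std::size_t len = std::mbstowcs(NULL, str.c_str(), 0);
    if (len == static_cast<std::size_t>(-1))
        throw std::runtime_error("invalid multi-byte sequence");

    std::wstring wstr(len, L'\0');
    if (len > 0)
        std::mbstowcs(&wstr[0], str.c_str(), len);
    return wstr;
}

The reverse direction can be sized the same way with std::wcstombs(NULL, wstr.c_str(), 0).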