Having just looked at ASCII strings to Unicode in C++[^], here's my preferred solution to this part of the never-ending story of string conversion:
#include <locale>
#include <string>

std::wstring widen(const std::string& str)
{
    std::wstring wstr(str.size(), 0);
    if (str.empty())     // avoid taking &str[0] / &wstr[0] on empty strings
        return wstr;
#if _MSC_VER >= 1400 // use Microsoft's safe libraries if possible (>= VS2005)
    std::use_facet<std::ctype<wchar_t> >(std::locale())._Widen_s
        (&str[0], &str[0] + str.size(), &wstr[0], wstr.size());
#else
    std::use_facet<std::ctype<wchar_t> >(std::locale()).widen
        (&str[0], &str[0] + str.size(), &wstr[0]);
#endif
    return wstr;
}

std::string narrow(const std::wstring& wstr, char rep = '_')
{
    std::string str(wstr.size(), 0);
    if (wstr.empty())    // same guard for the reverse direction
        return str;
#if _MSC_VER >= 1400
    std::use_facet<std::ctype<wchar_t> >(std::locale())._Narrow_s
        (&wstr[0], &wstr[0] + wstr.size(), rep, &str[0], str.size());
#else
    std::use_facet<std::ctype<wchar_t> >(std::locale()).narrow
        (&wstr[0], &wstr[0] + wstr.size(), rep, &str[0]);
#endif
    return str;
}
Yes, it does look nasty, but it is the way to go in pure C++. Funnily enough, I have never found any good, comprehensive documentation on C++ locales; most books tend to leave the topic untouched.
Because the functions use the default constructor of std::locale, the classic "C" locale defines the codepage for the conversion. The user's current codepage can be applied by calling std::locale::global(std::locale("")); before any call to narrow(...) or widen(...).
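To illustrate, here is a minimal usage sketch (the main() function and the sample strings are my own illustration, not part of the original solution); it assumes widen() and narrow() from above are defined in the same source file:

#include <iostream>

int main()
{
    // Switch to the user's current codepage; without this line the
    // conversions run under the classic "C" locale.
    std::locale::global(std::locale(""));

    std::wstring wide = widen("Hello, wide world!");
    std::string  back = narrow(wide);

    std::wcout << wide << std::endl;
    std::cout  << back << std::endl;
    return 0;
}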
One possible problem with this code is the use of multi-byte character sets: the output strings are pre-sized on the assumption of a 1:1 relationship in size() between the two string formats, which no longer holds when a single character occupies more than one byte.
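If multi-byte input is a realistic scenario, one possible workaround (a sketch of an alternative, not part of the solution above) is to let the C runtime report the required length first, for example with std::mbstowcs, instead of assuming one wide character per byte:

#include <cstdlib>    // std::mbstowcs
#include <stdexcept>
#include <string>

// Hypothetical multi-byte-aware variant: query the required length,
// then convert. It relies on the C locale that
// std::locale::global(std::locale("")) also installs.
std::wstring widen_mb(const std::string& str)
{
    std::size_t len = std::mbstowcs(NULL, str.c_str(), 0);
    if (len == static_cast<std::size_t>(-1))
        throw std::runtime_error("invalid multi-byte sequence");

    std::wstring wstr(len, L'\0');
    if (len > 0)
        std::mbstowcs(&wstr[0], str.c_str(), len);
    return wstr;
}

The reverse direction can be sized the same way with std::wcstombs(NULL, wstr.c_str(), 0).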