
Converting a C-style string to wide characters: source code for printing Chinese wide-character output

2011-08-18 17:24


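I right-clicked to look up "code pages" in MSDN and came across setlocale ( LC_ALL, "" );
// With this call, wide-character output works with both the GCC and VC compilers, so the search paid off.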

Code Pages

A code page is a character set, which can include numbers, punctuation marks, and other glyphs. Different languages and locales may use different code pages. For example, ANSI code page 1252 is used for American English and most European languages; OEM code page 932 is used for Japanese Kanji.

A code page can be represented in a table as a mapping of characters to single-byte values or multibyte values. Many code pages share the ASCII character set for characters in the range 0x00 – 0x7F.
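To make the single-byte versus multibyte mapping concrete, here is a minimal sketch that counts characters in a byte string. It assumes the bytes are encoded in code page 936 (GBK), where bytes 0x00 to 0x7F are ASCII and a lead byte in the range 0x81 to 0xFE is followed by one trail byte. The helper name count_gbk_chars is hypothetical and is not part of any library.

```cpp
#include <cstddef>

// Hypothetical helper: count characters in a null-terminated byte string,
// ASSUMING it is encoded in code page 936 (GBK).
// Bytes 0x00-0x7F are single-byte ASCII; a lead byte 0x81-0xFE
// starts a two-byte character.
std::size_t count_gbk_chars(const char* s) {
    std::size_t count = 0;
    const unsigned char* p = reinterpret_cast<const unsigned char*>(s);
    while (*p != 0) {
        if (*p >= 0x81 && *p <= 0xFE && p[1] != 0) {
            p += 2;  // lead byte plus trail byte form one character
        } else {
            p += 1;  // ASCII byte (or a truncated lead byte at the end)
        }
        ++count;
    }
    return count;
}
```

For example, the four bytes 'A', 0xD2, 0xBB, 'B' (the middle pair being the GBK encoding of 一) count as three characters, while strlen would report four bytes.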

The Microsoft run-time library uses the following types of code pages:

System-default ANSI code page. By default, at startup the run-time system automatically sets the multibyte code page to the system-default ANSI code page, which is obtained from the operating system. The call setlocale ( LC_ALL, "" ); also sets the locale to the system-default ANSI code page.

Locale code page. The behavior of a number of run-time routines is dependent on the current locale setting, which includes the locale code page. (For more information, see Locale-Dependent Routines.) By default, all locale-dependent routines in the Microsoft run-time library use the code page that corresponds to the “C” locale. At run-time you can change or query the locale code page in use with a call to setlocale.

Multibyte code page. The behavior of most of the multibyte-character routines in the run-time library depends on the current multibyte code page setting. By default, these routines use the system-default ANSI code page. At run-time you can query and change the multibyte code page with _getmbcp and _setmbcp, respectively.
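Querying and changing the locale with setlocale can be sketched roughly as follows. This uses only the standard std::setlocale; the two wrapper names are hypothetical. The exact locale string returned after switching to the system default depends on the platform, so the sketch does not assume any particular value.

```cpp
#include <clocale>
#include <string>

// Hypothetical wrapper: return the name of the current locale.
// Passing a null pointer queries the locale without changing it.
std::string current_locale_name() {
    const char* name = std::setlocale(LC_ALL, nullptr);
    return name ? std::string(name) : std::string();
}

// Hypothetical wrapper: switch to the user's/system's default locale.
// Returns false if the runtime could not honour the request.
bool use_system_default_locale() {
    return std::setlocale(LC_ALL, "") != nullptr;
}
```

At program startup current_locale_name() reports "C"; after a successful use_system_default_locale() it reports whatever the operating system is configured for.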

The “C” locale is defined by ANSI to correspond to the locale in which C programs have traditionally executed. The code page for the “C” locale (“C” code page) corresponds to the ASCII character set. For example, in the “C” locale, islower returns true for the values 0x61 – 0x7A only. In another locale, islower may return true for these as well as other values, as defined by that locale.

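// Source code: convert a C-style string to wide characters and print it
#include <clocale>
#include <cstring>
#include <iostream>
#include <windows.h>
using namespace std;
int main()
{
    setlocale(LC_ALL, "");  // with this call, wide-character output works under both GCC and VC; the line below works only with VC
    // wcout.imbue(locale("chs"));  // alternative: imbue a Chinese locale so that wcout can print Chinese characters

    const char *s = "一二三四五六七八";
    // First call: query the required buffer size in wide characters, including the
    // terminating null. This gives 9 (8 + 1), assuming the source file and the
    // system ANSI code page use the same Chinese encoding.
    int num = MultiByteToWideChar(CP_ACP, 0, s, -1, NULL, 0);
    if (num == 0)
        return 1;  // conversion failed
    WCHAR *wch = new WCHAR[num];  // allocate the buffer in wide characters
    memset(wch, 0, num * sizeof(wchar_t));  // zero the whole buffer
    // Note: memset(wch, 0, sizeof(wch)) would be wrong here, because wch is a pointer and
    // sizeof(wch) is only the pointer size. memset(the_array, '\0', sizeof(the_array))
    // is a handy way to zero every element only when the_array is a true array.

    // Second call: convert s into wch. The last argument is the buffer size in
    // wide characters, not in bytes.
    MultiByteToWideChar(CP_ACP, 0, s, -1, wch, num);

    wcout << wch << endl;  // wide-character output
    delete[] wch;
    return 0;
}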
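
The listing above depends on windows.h. As a rough portable alternative, the same two-pass pattern (query the size, then convert) can be sketched with the standard std::mbsrtowcs, which interprets the input bytes according to the current C locale rather than CP_ACP. This is a swapped-in technique and not what the original program uses; the helper name to_wide is hypothetical.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: convert a multibyte string, encoded according to the
// current C locale, into a std::wstring. Returns an empty string if the
// input contains an invalid multibyte sequence.
std::wstring to_wide(const char* s) {
    // Pass 1: a null destination asks only for the required length
    // (in wide characters, excluding the terminating null).
    std::mbstate_t state = std::mbstate_t();
    const char* src = s;
    std::size_t len = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (len == static_cast<std::size_t>(-1)) {
        return std::wstring();
    }

    // Pass 2: convert into a buffer with room for the terminating null.
    // The size argument counts wide characters, not bytes.
    std::vector<wchar_t> buf(len + 1);
    state = std::mbstate_t();
    src = s;
    std::mbsrtowcs(buf.data(), &src, buf.size(), &state);
    return std::wstring(buf.data(), len);
}
```

A caller would normally run setlocale(LC_ALL, "") first, so that the bytes are decoded with the system's encoding, and then print the result with wcout << to_wide(s).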