Saving a file as UTF-8 encoded XML (Writing UTF-8 files in C++)
2015-12-29 14:44
Let’s say you need to write an XML file with this content:
    <?xml version="1.0" encoding="UTF-8"?>
    <root description="this is a naïve example">
    </root>
How do we write that in C++?
At first glance, you could be tempted to write it like this:
    #include <fstream>
    #include <string>

    int main()
    {
        std::ofstream testFile;
        testFile.open("demo.xml", std::ios::out | std::ios::binary);

        std::string text =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            "<root description=\"this is a naïve example\">\n</root>";

        testFile << text;
        testFile.close();

        return 0;
    }
When you open the file in IE, for instance, surprise! It's not rendered correctly.
So you could be tempted to say "let's switch to wstring and wofstream".
    #include <fstream>
    #include <string>

    int main()
    {
        std::wofstream testFile;
        testFile.open("demo.xml", std::ios::out | std::ios::binary);

        std::wstring text =
            L"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            L"<root description=\"this is a naïve example\">\n</root>";

        testFile << text;
        testFile.close();

        return 0;
    }
And when you run it and open the file again, nothing has changed. So, where is the problem? Well, the problem is that by default neither ofstream nor wofstream writes the text in UTF-8. If you want the file to really be in UTF-8 format, you have to encode the output buffer in UTF-8 yourself. On Windows we can do that with WideCharToMultiByte(). This Windows API maps a wide character (UTF-16) string to a new character string (which is not necessarily from a multibyte character set). The first argument indicates the code page; for UTF-8 we need to specify CP_UTF8.
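To make "encode the output buffer in UTF-8" concrete, here is a minimal portable sketch of what such a conversion does at the byte level. It is not the article's code and does not use the Windows API; the names append_utf8, encode_utf8 and utf8_encoding_examples are hypothetical, and the input is assumed to be already decoded into Unicode code points (char32_t).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not from the article): append the UTF-8 encoding of
// one Unicode code point to `out`. Returns false for values that are not
// valid scalar values (surrogate halves or anything above U+10FFFF).
bool append_utf8(char32_t cp, std::string& out)
{
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate halves
    if (cp > 0x10FFFF) return false;                 // outside Unicode

    if (cp < 0x80) {                                 // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {                         // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {                       // 3 bytes: 1110xxxx + 2 continuation bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                                         // 4 bytes: 11110xxx + 3 continuation bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return true;
}

// Encode a whole UTF-32 string; invalid code points become U+FFFD.
std::string encode_utf8(const std::u32string& text)
{
    std::string out;
    for (char32_t cp : text) {
        if (!append_utf8(cp, out)) out += "\xEF\xBF\xBD";  // U+FFFD in UTF-8
    }
    return out;
}

// Usage: the 'ï' in "naïve" (U+00EF) becomes the two bytes 0xC3 0xAF.
void utf8_encoding_examples()
{
    assert(encode_utf8(U"\u00EF") == "\xC3\xAF");
    assert(encode_utf8(U"na\u00EFve") == "na\xC3\xAF" "ve");
}
```

The point of the sketch is that every non-ASCII character turns into two to four bytes. A file that declares encoding="UTF-8" but stores 'ï' as a single byte is therefore not what an XML parser expects.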
The following helper functions encode a std::wstring into a UTF-8 byte sequence, returned as a std::string.
    #include <windows.h>
    #include <string>

    // Convert a wide-character buffer of len wchar_t units to UTF-8.
    std::string to_utf8(const wchar_t* buffer, int len)
    {
        // First call: ask how many bytes the UTF-8 output needs.
        int nChars = ::WideCharToMultiByte(CP_UTF8, 0, buffer, len, NULL, 0, NULL, NULL);
        if (nChars == 0) return "";

        std::string newbuffer;
        newbuffer.resize(nChars);
        // Second call: write the UTF-8 bytes into the string's storage.
        ::WideCharToMultiByte(CP_UTF8, 0, buffer, len, &newbuffer[0], nChars, NULL, NULL);

        return newbuffer;
    }

    std::string to_utf8(const std::wstring& str)
    {
        return to_utf8(str.c_str(), (int)str.size());
    }
With that in hand, all you have to do is make the following changes:
    int main()
    {
        std::ofstream testFile;
        testFile.open("demo.xml", std::ios::out | std::ios::binary);

        std::wstring text =
            L"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            L"<root description=\"this is a naïve example\">\n</root>";

        // Encode to UTF-8 first, then write the raw bytes.
        std::string outtext = to_utf8(text);

        testFile << outtext;
        testFile.close();

        return 0;
    }
And now when you open the file, you get what you wanted in the first place.
And that is all!
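If you would rather check the result without opening a browser, one option is to read the file back and test whether its bytes are well-formed UTF-8. The following is a rough sketch under that assumption, not part of the original article; is_valid_utf8, file_is_utf8 and utf8_check_examples are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical checker: returns true when `bytes` is well-formed UTF-8.
bool is_valid_utf8(const std::string& bytes)
{
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra;     // continuation bytes expected after the lead byte
        unsigned long cp;      // code point decoded so far
        unsigned long min_cp;  // smallest code point allowed for this length
        if (lead < 0x80)                { extra = 0; cp = lead;        min_cp = 0; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; min_cp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; min_cp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; min_cp = 0x10000; }
        else return false;                       // stray continuation or invalid lead byte

        if (n - i - 1 < extra) return false;     // sequence cut off at end of input
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a 10xxxxxx byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min_cp) return false;                    // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // surrogate half
        if (cp > 0x10FFFF) return false;                  // outside Unicode
        i += extra + 1;
    }
    return true;
}

// Read a whole file in binary mode and validate it.
// Returns false if the file cannot be opened.
bool file_is_utf8(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) return false;
    const std::string bytes((std::istreambuf_iterator<char>(in)),
                            std::istreambuf_iterator<char>());
    return is_valid_utf8(bytes);
}

// Usage: 'ï' as the two bytes C3 AF passes; a lone 0xEF byte
// (as a single-byte code page such as Windows-1252 stores it) does not.
void utf8_check_examples()
{
    assert(is_valid_utf8("na\xC3\xAF" "ve"));
    assert(!is_valid_utf8("na\xEF" "ve"));
}
```

Calling file_is_utf8("demo.xml") after each of the three programs above is one way to see which of them produced a file that matches its own encoding declaration.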