Handling Invalid Characters in an XML String (repost)
2012-11-15 17:41
There are 5 predefined entity references in XML:
&lt; | < | less than |
&gt; | > | greater than |
&amp; | & | ampersand |
&apos; | ' | apostrophe |
&quot; | " | quotation mark |
Note: Strictly speaking, only the characters "<" and "&" are illegal in XML text. Apostrophes, quotation marks, and greater-than signs are legal, but it is a good habit to replace all five of the characters listed above.
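To make the table concrete, here is a minimal sketch of replacing the five characters by hand. The class name XmlEscapeSketch and the method Escape are hypothetical names introduced only for illustration; they are not part of the recipe that follows, which relies on the framework's XML classes instead.

```csharp
using System;
using System.Text;

public static class XmlEscapeSketch
{
    // Hypothetical helper (an assumption, not from the recipe): replaces each
    // of the five characters that have a predefined entity reference.
    public static string Escape(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        StringBuilder sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            switch (c)
            {
                case '<': sb.Append("&lt;"); break;
                case '>': sb.Append("&gt;"); break;
                case '&': sb.Append("&amp;"); break;
                case '\'': sb.Append("&apos;"); break;
                case '"': sb.Append("&quot;"); break;
                default: sb.Append(c); break;
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Prints: &lt;a href=&quot;x&quot;&gt; Tom &amp; Jerry&apos;s
        Console.WriteLine(Escape("<a href=\"x\"> Tom & Jerry's"));
    }
}
```

Walking the string once avoids the classic mistake of chained Replace calls that escape "&" last and so double-escape the entities already produced. In practice, letting the XML classes do the escaping is usually the safer choice.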
Recipe 15.7. Handling Invalid Characters in an XML String
Problem
You are creating an XML string. Before adding a tag containing a text element, you want to check it to determine whether the string contains any of the following invalid characters:
< > " ' &
If any of these characters are encountered, you want them to be replaced with their escaped form:
&lt; &gt; &quot; &apos; &amp;
Solution
There are different ways to accomplish this, depending on which XML-creation approach you are using. If you are using XmlWriter, the WriteCData, WriteString, WriteAttributeString, WriteValue, and WriteElementString methods take care of this for you. If you are using XmlDocument and XmlElement, the XmlElement.InnerText property will handle these characters.
The two ways to handle this using an XmlWriter work like this. The WriteCData method wraps the invalid character text in a CDATA section, as shown in the creation of the InvalidChars1 element in the example that follows. The other way is to use the WriteElementString method, which automatically escapes the text for you, as shown while creating the InvalidChars2 element.
// Requires: using System; using System.Xml;

// Set up a string with our invalid chars.
string invalidChars = @"<>\&'";
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
using (XmlWriter writer = XmlWriter.Create(Console.Out, settings))
{
    writer.WriteStartElement("Root");

    // Wrap the text in a CDATA section.
    writer.WriteStartElement("InvalidChars1");
    writer.WriteCData(invalidChars);
    writer.WriteEndElement();

    // Let the writer escape the text.
    writer.WriteElementString("InvalidChars2", invalidChars);

    writer.WriteEndElement();
}
The output from this is:
<?xml version="1.0" encoding="IBM437"?>
<Root>
  <InvalidChars1><![CDATA[<>\&']]></InvalidChars1>
  <InvalidChars2>&lt;&gt;\&amp;'</InvalidChars2>
</Root>
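The Solution also lists WriteAttributeString among the methods that escape for you, but the example above only writes element text. The following is a minimal sketch of the attribute case. The class and method names, and the attribute name InvalidChars, are assumptions made for illustration. Rather than asserting the exact escaped form, it reads the attribute back and checks that the original text survives the round trip.

```csharp
using System;
using System.IO;
using System.Xml;

public static class AttributeEscapeSketch
{
    // Hypothetical helper: writes a single <Root> element that carries
    // the given text in an attribute, and returns the resulting XML.
    public static string WriteWithAttribute(string value)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;

        StringWriter output = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement("Root");
            // The writer escapes whatever is not legal inside an attribute value.
            writer.WriteAttributeString("InvalidChars", value);
            writer.WriteEndElement();
        }
        return output.ToString();
    }

    // Hypothetical helper: parses the XML and returns the unescaped attribute value.
    public static string ReadAttribute(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.GetAttribute("InvalidChars");
    }

    public static void Main()
    {
        string invalidChars = "<>\"'&";
        string xml = WriteWithAttribute(invalidChars);
        Console.WriteLine(xml);
        // Prints True when the value round-trips unchanged.
        Console.WriteLine(ReadAttribute(xml) == invalidChars);
    }
}
```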
There are two ways you can handle this problem with XmlDocument and XmlElement. The first way is to surround the text you are adding to the XML element with a CDATA section, either by assigning the wrapped text to the InnerXml property of the XmlElement or by appending a node created with XmlDocument.CreateCDataSection:
// Set up a string with our invalid chars.
string invalidChars = @"<>\&'";
XmlElement invalidElement1 = xmlDoc.CreateElement("InvalidChars1");
invalidElement1.AppendChild(xmlDoc.CreateCDataSection(invalidChars));
The second way is to let the XmlElement class escape the data for you by assigning the text directly to the InnerText property like this:
// Set up a string with our invalid chars.
string invalidChars = @"<>\&'";
XmlElement invalidElement2 = xmlDoc.CreateElement("InvalidChars2");
invalidElement2.InnerText = invalidChars;
The whole XmlDocument is created with these XmlElements in this code:
// Requires: using System; using System.Xml;
public static void HandlingInvalidChars()
{
    // Set up a string with our invalid chars.
    string invalidChars = @"<>\&'";
    XmlDocument xmlDoc = new XmlDocument();

    // Create a root node for the document.
    XmlElement root = xmlDoc.CreateElement("Root");
    xmlDoc.AppendChild(root);

    // Create the first invalid character node.
    XmlElement invalidElement1 = xmlDoc.CreateElement("InvalidChars1");
    // Wrap the invalid chars in a CDATA section and use the
    // InnerXml property to assign the value as it doesn't
    // escape the values, just passes in the text provided.
    invalidElement1.InnerXml = "<![CDATA[" + invalidChars + "]]>";
    // Append the element to the root node.
    root.AppendChild(invalidElement1);

    // Create the second invalid character node.
    XmlElement invalidElement2 = xmlDoc.CreateElement("InvalidChars2");
    // Add the invalid chars directly using the InnerText
    // property to assign the value as it will automatically
    // escape the values.
    invalidElement2.InnerText = invalidChars;
    // Append the element to the root node.
    root.AppendChild(invalidElement2);

    Console.WriteLine("Generated XML with Invalid Chars:\r\n{0}", xmlDoc.OuterXml);
    Console.WriteLine();
}
The XML created by this procedure (and output to the console) looks like this:
Generated XML with Invalid Chars:
<Root><InvalidChars1><![CDATA[<>\&']]></InvalidChars1><InvalidChars2>&lt;&gt;\&amp;'</InvalidChars2></Root>
Discussion
The CDATA node allows you to represent the items in the text section as character data, not as escaped XML, for ease of entry. Normally these characters would need to be in their escaped format (&lt; for < and so on), but the CDATA section allows you to enter them as regular text.
When the CDATA tag is used in conjunction with the InnerXml property of the XmlElement class, you can submit characters that would normally need to be escaped first. The XmlElement class also has an InnerText property that will automatically escape any markup found in the string assigned. This allows you to add these characters without having to worry about them.
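Both approaches are only different spellings of the same character data, so a reader of the document should get the same text back from either element. The sketch below checks that under stated assumptions: the class name CDataRoundTripSketch and the method ReadBack are hypothetical, and the input is assumed not to contain the sequence "]]>", which cannot appear inside a single CDATA section.

```csharp
using System;
using System.Xml;

public static class CDataRoundTripSketch
{
    // Hypothetical helper: stores the text once in a CDATA section and once
    // through InnerText, serializes the document, parses it again, and
    // returns the text read back from each element.
    public static string[] ReadBack(string text)
    {
        XmlDocument doc = new XmlDocument();
        XmlElement root = doc.CreateElement("Root");
        doc.AppendChild(root);

        // First element: raw text inside a CDATA section.
        XmlElement viaCData = doc.CreateElement("InvalidChars1");
        viaCData.AppendChild(doc.CreateCDataSection(text));
        root.AppendChild(viaCData);

        // Second element: text escaped automatically on serialization.
        XmlElement viaInnerText = doc.CreateElement("InvalidChars2");
        viaInnerText.InnerText = text;
        root.AppendChild(viaInnerText);

        // Serialize and reparse, as a consumer of the XML would.
        XmlDocument reparsed = new XmlDocument();
        reparsed.LoadXml(doc.OuterXml);

        return new[]
        {
            reparsed.DocumentElement["InvalidChars1"].InnerText,
            reparsed.DocumentElement["InvalidChars2"].InnerText
        };
    }

    public static void Main()
    {
        string invalidChars = @"<>\&'";
        string[] values = ReadBack(invalidChars);
        // Each line prints True when the original text is recovered.
        Console.WriteLine(values[0] == invalidChars);
        Console.WriteLine(values[1] == invalidChars);
    }
}
```

The choice between the two is therefore mostly about how the serialized XML looks: CDATA keeps the text readable in the file, while escaping works for any text without special cases.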
See Also
See the "XmlDocument Class," "XmlWriter Class," "XmlElement Class," and "CDATA Sections" topics in the MSDN documentation.