
PRB: Strings Passed to loadXML must be UTF-16 Encoded BSTRs


Symptoms

When you use the loadXML method of the MSXML parser, including non-UTF-16 character sequences in the BSTR parameter passed to loadXML may result in the following error message:
An Invalid character was found in text content.

Furthermore, attempting to change the encoding of the string, for example by specifying an "encoding" attribute on the main XML processing instruction (the XML declaration), results in the following error message:
Switch from current encoding to specified encoding not supported.
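
The second symptom can be anticipated by inspecting the string before it is handed to loadXML. The following is a minimal Python sketch, not part of MSXML: the helper names declared_encoding and conflicts_with_utf16 are hypothetical, and treating only the name "UTF-16" as compatible with a BSTR is an assumption made for illustration.

```python
import re

# Matches the "encoding" pseudo-attribute inside an XML declaration
# located at the very start of the text.
_DECL_ENCODING = re.compile(
    r'<\?xml\s[^>]*?\bencoding\s*=\s*(["\'])([A-Za-z][A-Za-z0-9._-]*)\1'
)


def declared_encoding(xml_text):
    """Return the encoding named in the XML declaration, or None if absent."""
    # A leading byte order mark, if present, is ignored.
    match = _DECL_ENCODING.match(xml_text.lstrip("\ufeff"))
    return match.group(2) if match else None


def conflicts_with_utf16(xml_text):
    """Return True if the declaration names an encoding other than UTF-16.

    Assumption: only the name "UTF-16" is treated as compatible with a BSTR.
    """
    name = declared_encoding(xml_text)
    return name is not None and name.upper() != "UTF-16"
```

A string for which conflicts_with_utf16 returns True declares one encoding while being stored as another, which is the situation described above.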



Cause

The string parameter must be in BSTR format, and BSTR strings are always UTF-16 encoded.



Resolution

The MSXML 3.0 or later parser does not have this restriction and can accept XML strings with other encodings, such as UTF-8.

To download the latest MSXML parser, go to

The following workarounds are available for parser versions prior to 3.0:

Scripting developers have two options available:

  1. Convert your XML documents to UTF-16-formatted Unicode, either automatically or by hand.
  2. Escape all non-ASCII characters inside the XML document using XML numeric character references. Any XML character can be written in plain ASCII using the form &#nnnn; (decimal) or &#xhhhh; (hexadecimal), where the number is the character's index into the Unicode character set.
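
The two scripting options above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive implementation: the function names to_utf16_text and escape_non_ascii are hypothetical, the source document is assumed to be UTF-8 unless stated otherwise, and dropping the "encoding" attribute from the declaration is one possible way to keep the declaration from contradicting the UTF-16 string.

```python
import re

# Matches the "encoding" pseudo-attribute of an XML declaration at the
# start of the text, capturing everything that precedes it.
_ENCODING_ATTR = re.compile(
    r'\A(<\?xml\s[^>]*?)\s+encoding\s*=\s*(["\'])[^"\']*\2'
)


def to_utf16_text(raw, source_encoding="utf-8"):
    """Option 1: decode the bytes to Unicode text.

    The encoding attribute is removed from the declaration because it
    no longer describes the decoded string (assumption: the caller
    passes the result on as a UTF-16 string).
    """
    text = raw.decode(source_encoding).lstrip("\ufeff")
    return _ENCODING_ATTR.sub(r"\1", text, count=1)


def escape_non_ascii(text):
    """Option 2: replace each non-ASCII character with a decimal
    numeric character reference of the form &#nnnn;.

    Note: references are only valid in character data and attribute
    values, not in element or attribute names.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

To store the result of to_utf16_text as a UTF-16 file, the text can be written out with text.encode("utf-16"), which prepends a byte order mark.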

Microsoft Visual C++ developers have a third option: load data into MSXML using a method other than loadXML. loadXML is typically misused out of a desire to load XML data from memory; the IXMLDOMDocument::load method accepts several kinds of input source and is a better alternative to loadXML in that case.

See the following Knowledge Base article for more information:
223337 INFO: Loading/Saving XML Data with Internet Explorer XML Parser

Specifically, the load method can be passed a SAFEARRAY containing XML data in any encoding that the parser supports.



Status

This behavior is by design.

The MSXML 3.0 or later parser does not have this restriction.



Keywords: kbbug, kbdsupport, kbmsxml300fix, kbmsxmlnosweep, kbprb, kbbillprodsweep, kb


Article Info
Article ID : 247708
Revision : 1
Created on : 1/7/2017
Published on : 6/19/2014
Exists online : False