There are two methods for producing XML documents by using
XSL transformation from MSXML. You can call the
transformNode method of the Document Object Model (DOM) document,
or you can call the
transformNodeToObject method.
Method 1: Use the transformNode method
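The
transformNode method always returns a Unicode (UTF-16) string. For characters in the ASCII range, such as
English text, the second byte of each two-byte character is 0x00.
Because the result is a string, the encoding attribute of the
<xsl:output> element does not change the returned data. If you set that attribute to UTF-8, the data is still UTF-16 and is not converted to UTF-8.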
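The byte layout described above can be checked with a short sketch. Python is used here only as an illustration. It does not call MSXML, and the sample text "Hello" is an assumption standing in for the string that transformNode returns.

```python
# Illustration only: "Hello" stands in for a string returned by transformNode.
text = "Hello"

# Encode as UTF-16 little-endian without a byte order mark:
# every character in this sample occupies two bytes.
utf16_bytes = text.encode("utf-16-le")

# Split each two-byte unit into its first and second byte.
first_bytes = utf16_bytes[0::2]   # the ASCII code of each character
second_bytes = utf16_bytes[1::2]  # 0x00 for every ASCII-range character

print(utf16_bytes.hex(" "))  # 48 00 65 00 6c 00 6c 00 6f 00
```

Every second byte is 0x00 because all five characters fall in the ASCII range. Characters outside that range would produce nonzero second bytes.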
If no output method is specified, the default format depends on the kind of output (XML or HTML). If the output is HTML, a META tag that uses "charset=UTF-16" to set the output encoding is inserted into the HTML. For
example, the following META tag may be inserted.
<META http-equiv="Content-Type" content="text/html; charset=UTF-16">
If the
Response.Write method is used in a classic ASP page to
write the stream data back to the browser, the UTF-16 string that is
returned from the
transformNode method
is converted to ISO-8859-1 characters.
This behavior also occurs if the
Scripting.FileSystemObject object is
used to save the output to a file. ISO-8859-1 characters are one byte long, not two.
Therefore, the actual encoding of the data no longer matches the specified encoding,
even if you specified UTF-16 or Unicode
encoding. However, if you save the output to a file as Unicode text that includes a
suitable byte order mark, the file uses Unicode encoding.
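The mismatch described above can be made concrete with a small sketch. Python stands in for the ASP and FileSystemObject layers here. The sample HTML string and the helper name matches_declared_utf16 are hypothetical and are not part of MSXML.

```python
import codecs

# Hypothetical transform result: a string whose META tag declares UTF-16.
html = ('<META http-equiv="Content-Type" '
        'content="text/html; charset=UTF-16"><H1>Hello</H1>')

# Writing the string through a single-byte character set yields one byte
# per character, so the bytes no longer match the declared charset.
single_byte = html.encode("iso-8859-1")

# Writing it as UTF-16 with a byte order mark yields a two-byte BOM
# followed by two bytes per character.
with_bom = codecs.BOM_UTF16_LE + html.encode("utf-16-le")


def matches_declared_utf16(data: bytes) -> bool:
    """Return True if data starts with a UTF-16 BOM and decodes as UTF-16."""
    if not data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return False
    try:
        data.decode("utf-16")
    except UnicodeDecodeError:
        return False
    return True


print(matches_declared_utf16(single_byte))  # False
print(matches_declared_utf16(with_bom))     # True
```

A check like this is only a rough heuristic. It shows why the declared charset and the bytes on disk have to be produced by the same step.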
Method 2: Use the transformNodeToObject method
The
transformNodeToObject method preserves the requested encoding. However, if you specify
UTF-16 encoding and then save the document to
a file by using a method that writes UTF-8 or ANSI text, a mismatch
occurs. To preserve the
encoding that is specified in the style sheet, use the
save method of the generated MSXML DOMDocument object.
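The idea of letting the document object write its own bytes can be sketched by analogy. The example below uses Python's xml.etree.ElementTree as a stand-in for the MSXML DOMDocument. It is not the MSXML API, and the helper name decodes_as_declared is hypothetical.

```python
import xml.etree.ElementTree as ET

# Build a small document similar to the input of the sample style sheets.
root = ET.Element("hello-world")
ET.SubElement(root, "greeting").text = "Hello, World!"

# Let the serializer produce the bytes: the XML declaration and the
# actual byte encoding are generated together, so they agree.
consistent = ET.tostring(root, encoding="utf-16", xml_declaration=True)

# Mismatch: the same text still declares utf-16, but a later step
# re-saves it as UTF-8 bytes.
mismatched = consistent.decode("utf-16").encode("utf-8")


def decodes_as_declared(data: bytes, declared: str) -> bool:
    """Return True if data decodes with the declared encoding to XML text."""
    try:
        return data.decode(declared).startswith("<?xml")
    except UnicodeError:
        return False


print(decodes_as_declared(consistent, "utf-16"))  # True
print(decodes_as_declared(mismatched, "utf-16"))  # False
```

The second result is False because the declaration was written for one encoding and the bytes were produced by another. This is the same kind of mismatch that occurs when a UTF-16 result is saved through a UTF-8 or ANSI text writer.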
Best practice
We recommend that you make sure that the specified encoding
matches the actual encoding at every level in a document. The following example creates an HTML document that has "charset=UTF-16" defined in the HTML META tag.
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="html" encoding="UTF-16" />
<xsl:template match="/hello-world">
<HTML>
<HEAD>
<TITLE></TITLE>
</HEAD>
<BODY>
<H1>
<xsl:value-of select="greeting"/>
</H1>
<xsl:apply-templates select="greeter"/>
</BODY>
</HTML>
</xsl:template>
<xsl:template match="greeter">
<DIV>from <I><xsl:value-of select="."/></I></DIV>
</xsl:template>
</xsl:stylesheet>
To correctly define the HTML document as a UTF-8
encoded document, add an
<xsl:output> element that specifies UTF-8 encoding to the style sheet,
and then use an XML API that preserves the encoding.
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" encoding="UTF-8"/>
<xsl:template match="/hello-world">
<HTML>
<HEAD>
<TITLE></TITLE>
</HEAD>
<BODY>
<H1>
<xsl:value-of select="greeting"/>
</H1>
<xsl:apply-templates select="greeter"/>
</BODY>
</HTML>
</xsl:template>
<xsl:template match="greeter">
<DIV>from <I><xsl:value-of select="."/></I></DIV>
</xsl:template>
</xsl:stylesheet>