With the increasing use of separate services working on the same data, the need for portable data formats arose. XML was one of the first to be widely used, but lately JSON has been booming.
I don’t have a particular bias here; both serve well in the appropriate environment, although JSON carrying the same data could result in about a 20% size reduction.
So are they interchangeable? Just recently, I needed to convert XML data into a JSON format for easier consumption on the client.
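To make the size remark concrete, here is a minimal sketch that compares the byte counts of one small record written by hand in both formats. The sample data and the helper names `byteLength` and `savedPercent` are made up for illustration; the actual saving depends heavily on the shape of the data.

```javascript
// Two hand-written encodings of the same record (sample data, assumed for illustration).
const xml = "<person><name>Ann</name><age>30</age><city>Oslo</city></person>";
const json = '{"person":{"name":"Ann","age":"30","city":"Oslo"}}';

// Count UTF-8 bytes rather than characters, since bytes are what gets transferred.
function byteLength(text) {
  return new TextEncoder().encode(text).length;
}

// Percentage of bytes saved when going from `original` to `converted`.
function savedPercent(original, converted) {
  const before = byteLength(original);
  const after = byteLength(converted);
  return ((before - after) / before) * 100;
}

console.log(`XML: ${byteLength(xml)} bytes, JSON: ${byteLength(json)} bytes`);
console.log(`saved: ${savedPercent(xml, json).toFixed(1)}%`);
```

Documents with long element names or many closing tags tend to shrink more, while attribute-heavy XML may shrink less.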
The fastest and (sometimes) easiest way to process XML to another format is XSLT.
Downloads
- xml2json.xsl
- xsltr.exe (command line utility to perform XSLT transformations)
- xml2json.exe & xml2json-xsl.dll (compiled XSLT, better performance)
Links
The full XSLT code is at the bottom of this post and on GitHub.
Performing the transformation
Using XSLT to transform XML to another format is pretty easy, as it’s meant to be. :)
Depending on your environment, there are different ways to perform the transformation.
What’s important to note is that the same XSLT is used in all of these methods.
Specifying the stylesheet in XML
The easiest would be to just add a stylesheet to your XML document.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml-stylesheet type="text/xsl" href="xml2json.xsl"?>
<!-- ... XML ... -->
Just open such an XML file in a browser, and the JSON will be there. Here’s an example page: you will see JSON when you open it, but if you view the source code, you will see only XML. The browser applies the transformation.
Use an executable to transform XML to JSON
Microsoft provides msxsl.exe, a free command line utility to perform transformations, but it works only with the MSXML 4.0 libraries (link), so it’s not really usable on Windows 7, for example.
I created a similar, .NET-based command line utility, xsltr.exe, that you can download.
C# code excerpt
It boils down to this…
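// requires: using System.Xml; using System.Xml.XPath; using System.Xml.Xsl;
XPathDocument doc = new XPathDocument(xmlfile);
XslCompiledTransform transform = new XslCompiledTransform(true); // true = enable debugging
transform.Load(xslfile);
XmlTextWriter writer = new XmlTextWriter(outfile, null);
transform.Transform(doc, null, writer);
writer.Close(); // flush and release the output file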
Compiling the XSLT
But if you need the performance, here is a command line utility together with the compiled XSLT.
I used xsltc.exe to create a compiled XSLT from the source code. It compiles the XSLT code into an IL assembly, which performs the transformation much faster.
Transform XML to JSON in a browser using JavaScript
To work with XML, DOMParser can be used in all the modern browsers – Firefox, Opera, Chrome, Safari… Of course, Internet Explorer has its own Microsoft.XMLDOM class.
Here’s a demo page that performs the transformation. There are a couple of XML files that you can transform, but you can also enter arbitrary XML and transform it.
If you prefer to work with libraries, I tried jsxml and it worked flawlessly.
The pure JavaScript code boils down to these pieces.
Load a string into an XML DOM JavaScript (code excerpt)
// code for regular browsers
if (window.DOMParser) {
var parser = new DOMParser();
demo.xml = parser.parseFromString(xmlString, "application/xml");
}
// code for IE
if (window.ActiveXObject) {
demo.xml = new ActiveXObject("Microsoft.XMLDOM");
demo.xml.async = false;
demo.xml.loadXML(xmlString);
}
Apply the XSLT JavaScript code excerpt
// code for regular browsers
if (document.implementation && document.implementation.createDocument)
{
var xsltProcessor = new XSLTProcessor();
xsltProcessor.importStylesheet(demo.xslt);
result = xsltProcessor.transformToFragment(demo.xml, document);
}
else if (window.ActiveXObject) {
// code for IE
result = demo.xml.transformNode(demo.xslt);
}
You can see this in action on the demo page.
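The two excerpts can be combined into one helper for browsers that support DOMParser and XSLTProcessor. This is only a sketch, not the demo page's actual code: the function name `xmlToJsonText` and the `env` parameter are assumptions, and `env` exists only so the function can be exercised outside a browser. It also assumes that, with `method="text"`, the transformed output is available as the fragment's text content.

```javascript
// Sketch: parse the XML and XSLT strings, apply the stylesheet, return the output text.
// `env` defaults to the browser globals (DOMParser, XSLTProcessor, document).
function xmlToJsonText(xmlString, xsltString, env = globalThis) {
  const parser = new env.DOMParser();
  const xmlDoc = parser.parseFromString(xmlString, "application/xml");
  const xsltDoc = parser.parseFromString(xsltString, "application/xml");

  const processor = new env.XSLTProcessor();
  processor.importStylesheet(xsltDoc);

  // transformToFragment returns a DocumentFragment owned by the given document;
  // the generated JSON is read from its text content.
  const fragment = processor.transformToFragment(xmlDoc, env.document);
  return fragment.textContent;
}
```

The returned string can then be passed to `JSON.parse`. Note that the stylesheet does not escape double quotes or backslashes inside values, so parsing may fail for input that contains them.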
XSLT Code
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="utf-8"/>
<xsl:template match="/*[node()]">
<xsl:text>{</xsl:text>
<xsl:apply-templates select="." mode="detect" />
<xsl:text>}</xsl:text>
</xsl:template>
<xsl:template match="*" mode="detect">
<xsl:choose>
<xsl:when test="name(preceding-sibling::*[1]) = name(current()) and name(following-sibling::*[1]) != name(current())">
<xsl:apply-templates select="." mode="obj-content" />
<xsl:text>]</xsl:text>
<xsl:if test="count(following-sibling::*[name() != name(current())]) > 0">, </xsl:if>
</xsl:when>
<xsl:when test="name(preceding-sibling::*[1]) = name(current())">
<xsl:apply-templates select="." mode="obj-content" />
<xsl:if test="name(following-sibling::*) = name(current())">, </xsl:if>
</xsl:when>
<xsl:when test="following-sibling::*[1][name() = name(current())]">
<xsl:text>"</xsl:text><xsl:value-of select="name()"/><xsl:text>" : [</xsl:text>
<xsl:apply-templates select="." mode="obj-content" /><xsl:text>, </xsl:text>
</xsl:when>
<xsl:when test="count(./child::*) > 0 or count(@*) > 0">
<xsl:text>"</xsl:text><xsl:value-of select="name()"/>" : <xsl:apply-templates select="." mode="obj-content" />
<xsl:if test="count(following-sibling::*) > 0">, </xsl:if>
</xsl:when>
<xsl:when test="count(./child::*) = 0">
<xsl:text>"</xsl:text><xsl:value-of select="name()"/>" : "<xsl:apply-templates select="."/><xsl:text>"</xsl:text>
<xsl:if test="count(following-sibling::*) > 0">, </xsl:if>
</xsl:when>
</xsl:choose>
</xsl:template>
<xsl:template match="*" mode="obj-content">
<xsl:text>{</xsl:text>
<xsl:apply-templates select="@*" mode="attr" />
<xsl:if test="count(@*) > 0 and (count(child::*) > 0 or text())">, </xsl:if>
<xsl:apply-templates select="./*" mode="detect" />
<xsl:if test="count(child::*) = 0 and text() and not(@*)">
<xsl:text>"</xsl:text><xsl:value-of select="name()"/>" : "<xsl:value-of select="text()"/><xsl:text>"</xsl:text>
</xsl:if>
<xsl:if test="count(child::*) = 0 and text() and @*">
<xsl:text>"text" : "</xsl:text><xsl:value-of select="text()"/><xsl:text>"</xsl:text>
</xsl:if>
<xsl:text>}</xsl:text>
<xsl:if test="position() &lt; last()">, </xsl:if>
</xsl:template>
<xsl:template match="@*" mode="attr">
<xsl:text>"</xsl:text><xsl:value-of select="name()"/>" : "<xsl:value-of select="."/><xsl:text>"</xsl:text>
<xsl:if test="position() &lt; last()">,</xsl:if>
</xsl:template>
<xsl:template match="node/@TEXT | text()" name="removeBreaks">
<xsl:param name="pText" select="normalize-space(.)"/>
<xsl:choose>
<xsl:when test="not(contains($pText, '&#xA;'))"><xsl:copy-of select="$pText"/></xsl:when>
<xsl:otherwise>
<xsl:value-of select="concat(substring-before($pText, '&#xA;'), ' ')"/>
<xsl:call-template name="removeBreaks">
<xsl:with-param name="pText" select="substring-after($pText, '&#xA;')"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
The XSLT code turned out to be more complicated than I expected. I imagined that the transformation would be more natural and not so case-based, but it just isn’t possible (or I don’t see the way).
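To show why the mapping ends up case-based, here is a simplified JavaScript sketch of the same kinds of decisions, working on a plain object tree instead of real XML. It is not a port of the stylesheet and does not reproduce its exact output: the node shape `{ name, attrs, text, children }` and the function names are hypothetical. It also groups repeated names anywhere among the children, whereas the stylesheet only looks at adjacent siblings.

```javascript
// Hypothetical node shape, assumed for this sketch: { name, attrs, text, children }.
function nodeToValue(node) {
  const attrs = node.attrs || {};
  const children = node.children || [];
  const hasAttrs = Object.keys(attrs).length > 0;

  // Case: a leaf element without attributes becomes a plain string.
  if (children.length === 0 && !hasAttrs) {
    return node.text || "";
  }

  // Case: attributes become ordinary properties.
  const result = { ...attrs };

  // Case: attributes plus text; the text goes under a "text" key.
  if (children.length === 0 && node.text) {
    result.text = node.text;
  }

  // Case: repeated child names are collected into an array.
  for (const child of children) {
    const value = nodeToValue(child);
    if (!Object.prototype.hasOwnProperty.call(result, child.name)) {
      result[child.name] = value;
    } else if (Array.isArray(result[child.name])) {
      result[child.name].push(value);
    } else {
      result[child.name] = [result[child.name], value];
    }
  }
  return result;
}

// Wrap the root element's value under its own name and serialize it.
function treeToJson(root) {
  return JSON.stringify({ [root.name]: nodeToValue(root) });
}

const library = {
  name: "library",
  children: [
    { name: "book", attrs: { id: "1" }, text: "XSLT" },
    { name: "book", attrs: { id: "2" }, text: "JSON" },
    { name: "owner", text: "Bojan" }
  ]
};
console.log(treeToJson(library));
```

Even in this reduced form there are four separate cases, and an attribute that shares a name with a child element would still need its own rule.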
License
MIT License
Copyright (c) 2012 Bojan Bjelic
Full license text (on Choose a license site).
Resources
Links to the resources I used in the process.