Friday, October 23, 2009

Section 11.1. XSLT










11.1. XSLT


Transformations are an idea as old as human thought. Primitive societies had werewolves, werebears, and weretigers. The Greeks had warnings against seeing goddesses bathe, unless one was interested in going to parties stag, literally. During the Renaissance, there was Shakespeare's A Midsummer Night's Dream, in which Bottom was made an ass of. Today we have Jack Chalker's Midnight at the Well of Souls and the Borg from Star Trek. And although the transformations in each of these stories dealt with the physical world and XSLT can affect only XML, they all share one characteristic: Without change, the story can progress no further.


As one who has been working in the programming field for a number of years, I can attest to one thing: About 40 percent of the time, the data is in the wrong format. In ancient times, when great beasts with names such as System major problem. Programs had to be changed or written from scratch to massage the data into a usable form. Changing or creating programs has always been a costly undertaking.


Now things are different, as time seems to be speeding up. The great beasts are all either dead or behind glass in museums, where people can stare in awe, never realizing that the old 486 machine that they gave to their kids had more power.


Today much of the information that we deal with is in the form of XML, which, interestingly enough, can be transformed by XSLT in much the same manner as Lon Chaney was by the full moon. Thankfully, however, the XML doesn't get hairy, unless, of course, we want it to.



11.1.1. XML Magic


Here's the quandary: On the client side, we have XML and we want HTML. It's a real pain in the gluteus, isn't it?


Yes, we can write a script to perform the conversion, but it is a time-consuming task accomplished with ill-suited tools. Face it: The majority of scripting languages aren't really built to handle XML. Although it works just fine when messing around with individual nodes, JavaScript's XML support comes across like a Bose sound system in a Ford Pinto. I'm not saying that it doesn't work; it works just fine, but, unfortunately, six months later it has a tendency to cause questions like, "I wrote this?"


XSLT, as opposed to JavaScript, was designed from the ground up to handle XML. Come to think of it, XSLT is itself a dialect of XML. This has a tendency to lead to some really interesting style sheets when working with XSLT, but that is a topic for another day. Another interesting thing is that although the input has to be XML, nothing says that the output needs to be XML. This means that if you want to transform XML into HTML as opposed to XHTML, by all means do it, but just remember that if you're using SOAP, the package must be well formed.




11.1.2. How Microsoft Shot Itself in the Foot


Back in the old days, during the first browser wars, Microsoft released Internet Explorer version 5.0, the first web browser with XSLT support. It would have been a major victory for Microsoft, if it had not been for one little detail. In their haste, they forgot one little thing about the World Wide Web Consortium's recommendations. You see, recommendations are often vastly different from drafts. In an effort to produce the first browser with XSLT support, Microsoft used a draft as a guide.


For this reason, you sometimes see references to the namespace http://www.w3.org/TR/WD-xsl instead of http://www.w3.org/1999/XSL/Transform.


It was only with the advent of Microsoft Internet Explorer 6 that Internet Explorer started following the recommendation instead of the draft. Personally, I believe that it is a good idea to ignore the old namespace entirely; I think that Microsoft would like to. And although these versions are, at most, currently considered the third most popular browser, individuals using versions 5.0, 5.01, and 5.5 of Internet Explorer make up only a fraction of the general population. It is a pretty safe bet that you can ignore these web browsers entirely without alienating anyone but technophobes, the White House, and project leaders who use the term blink.




11.1.3. XPath, or I Left It Around Here Someplace


Earlier I stated that XPath was the targeting device for XSLT, which is essentially true. XPath is used to describe the XML node or nodes that we're looking for. As the name suggests, XPath describes the path to the node that we're looking for. For example, let's say that we want the state_name node in the XML document shown in Listing 11-1. A number of different ways exist for locating it, some of which are shown in Listing 11-2.


Listing 11-1. A Sample XML Document






<states>
  <state>
    <state_abbreviation>AB</state_abbreviation>
    <state_name>Alberta</state_name>
  </state>
  <state>
    <state_abbreviation>AK</state_abbreviation>
    <state_name>Alaska</state_name>
  </state>
  <state>
    <state_abbreviation>AL</state_abbreviation>
    <state_name>Alabama</state_name>
  </state>
  <state>
    <state_abbreviation>AR</state_abbreviation>
    <state_name>Arkansas</state_name>
  </state>
</states>




Listing 11-2. Sample XPath






/states/state/state_name
/*/*/state_name
/*/*/*[name(.) = 'state_name']
/states/state/*[2]
//state_name




Why so many? With XPath, it is possible to describe complete paths, paths with wildcards, and paths based on a node's position, or to describe only the node itself. From a high level, such as an orbital view, it works as shown in Table 11-1.


Table 11-1. High-Level View of XPath

XPath Notation              Description
--------------------------  ------------------------------------------------------------
/                           Either the root node, in the case of the first slash, or a
                            separator between nodes
//                          Anywhere in the document that meets the criteria
*                           Wildcard (I know that there is a node here, but I don't
                            know its name)
.                           The context node (where we are at this point)
[2]                         A predicate stating that the second node is the one we want
states                      Qualified node name
state                       Qualified node name
state_name                  Qualified node name
name()                      A function that returns the name of the passed node
[name(.) = 'state_name']    A predicate stating that the desired node name is state_name
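
To make the notation in Table 11-1 concrete, here is a minimal sketch in JavaScript that evaluates a small subset of it against a hand-built copy of the document in Listing 11-1. It is a toy, not a real XPath engine: el and selectNodes are hypothetical helpers invented for this illustration, and only "/", "//", node names, "*", and positional predicates such as [2] are handled. Function calls such as name(.) are deliberately left out and cause an error.

```javascript
// Build a toy node: a name plus either child nodes or text content.
function el(name, content) {
  return Array.isArray(content)
    ? { name: name, children: content, text: '' }
    : { name: name, children: [], text: content };
}

// Hand-built copy of the document in Listing 11-1.
var doc = el('states', [
  el('state', [el('state_abbreviation', 'AB'), el('state_name', 'Alberta')]),
  el('state', [el('state_abbreviation', 'AK'), el('state_name', 'Alaska')]),
  el('state', [el('state_abbreviation', 'AL'), el('state_name', 'Alabama')]),
  el('state', [el('state_abbreviation', 'AR'), el('state_name', 'Arkansas')])
]);

// Collect a node and all of its descendants, in document order.
function selfAndDescendants(node) {
  var out = [node];
  node.children.forEach(function (child) {
    out = out.concat(selfAndDescendants(child));
  });
  return out;
}

// Evaluate an absolute path built only from "/", "//", names, "*", and "[n]".
function selectNodes(root, path) {
  // A virtual document node sits above the root element, as in XPath.
  var context = [{ name: '#document', children: [root], text: '' }];
  var stepPattern = /(\/\/?)([A-Za-z_][\w.-]*|\*)(?:\[(\d+)\])?/g;
  var consumed = 0;
  var match;

  while ((match = stepPattern.exec(path)) !== null) {
    if (match.index !== consumed) { break; }  // something unsupported was skipped
    consumed = stepPattern.lastIndex;

    // "//" means: start from the context nodes and everything beneath them.
    var parents = context;
    if (match[1] === '//') {
      parents = [];
      context.forEach(function (node) {
        selfAndDescendants(node).forEach(function (n) {
          if (parents.indexOf(n) === -1) { parents.push(n); }
        });
      });
    }

    // Child step: apply the name test, then the positional predicate per parent.
    var next = [];
    parents.forEach(function (parent) {
      var matched = parent.children.filter(function (child) {
        return match[2] === '*' || child.name === match[2];
      });
      if (match[3] !== undefined) {
        matched = matched.slice(Number(match[3]) - 1, Number(match[3]));
      }
      next = next.concat(matched);
    });
    context = next;
  }

  if (consumed !== path.length) {
    throw new Error('Unsupported path: ' + path);
  }
  return context;
}

// Usage: collect the text of every node selected by a path.
var names = selectNodes(doc, '/states/state/*[2]').map(function (node) {
  return node.text;
});
```

Under these assumptions, the first, second, fourth, and fifth paths in Listing 11-2 all select the same four state_name nodes. In a browser, the real work would be left to the XSLT processor or the browser's own XPath support.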



Alright, that should be enough XPath to get started. Now let's take a gander at the XSLT shown in Listing 11-3, whose purpose is to build an HTML select object using the XML from Listing 11-1.


Listing 11-3. Sample XSL Style Sheet






<?xml version='1.0'?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes" media-type="text/html"/>

  <xsl:template match="/">

    <select id="myselect" name="myselect">
      <xsl:for-each select="/states/state">
        <xsl:element name="option">
          <xsl:attribute name="value">
            <xsl:value-of select="state_abbreviation" />
          </xsl:attribute>
          <xsl:value-of select="state_name" />
        </xsl:element>
      </xsl:for-each>
    </select>

  </xsl:template>

</xsl:stylesheet>




Pretty cool, isn't it? At first glance, not only is it nicely indented, but it also has the advantage of being one of the most obscure things that you've ever laid your eyes upon. A second glance reveals some details that you might have missed the first time; for example, the select element looks remarkably like HTML. There is a very good reason for the resemblance: It is HTML. In fact, the xsl:output statement even says that it is HTML, and you can take it from me, xsl:output statements don't lie.


Upon closer examination, some other details might pop out, such as the xsl:template with match="/". From what we covered earlier, the slash means that we're looking for the root node. And while we're examining XPath, you'll find xsl:for-each with select="/states/state". Just in case you're wondering, for-each means exactly what you think it does: Iterate once for every node that matches the XPath expression in the select attribute.


Another thing that might jump out is the xsl:element node with name="option". This is an alternate method of specifying an output element. The xsl:attribute also does exactly what you'd expect from its name; it defines an attribute of the enclosing xsl:element. Finally, the xsl:value-of simply copies the text of the selected node from the source document to the output document. In a nutshell, that's pretty much the basics of XSLT and XPath. The next question, of course, is, "So, what does the output HTML look like?" For the answer, check out Listing 11-4.


Listing 11-4. HTML Output






<select id="myselect" name="myselect">
<option value="AB">Alberta</option>
<option value="AK">Alaska</option>
<option value="AL">Alabama</option>
<option value="AR">Arkansas</option>
</select>
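
For comparison, here is a rough sketch of the hand-written alternative: a JavaScript function that builds the same elements as Listing 11-4 by string concatenation. The states array, escapeHtml, and buildSelect are hypothetical names made up for this illustration, and the sketch assumes the XML has already been picked apart into plain objects, which is exactly the chore that the style sheet in Listing 11-3 takes care of.

```javascript
// Hypothetical in-memory form of the records in Listing 11-1.
var states = [
  { abbreviation: 'AB', name: 'Alberta' },
  { abbreviation: 'AK', name: 'Alaska' },
  { abbreviation: 'AL', name: 'Alabama' },
  { abbreviation: 'AR', name: 'Arkansas' }
];

// Escape the characters that are significant in HTML text and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Plays the role of the xsl:for-each loop: one option element per record.
function buildSelect(id, records) {
  var lines = ['<select id="' + escapeHtml(id) + '" name="' + escapeHtml(id) + '">'];
  records.forEach(function (record) {
    lines.push('<option value="' + escapeHtml(record.abbreviation) + '">' +
      escapeHtml(record.name) + '</option>');
  });
  lines.push('</select>');
  return lines.join('\n');
}

var html = buildSelect('myselect', states);
```

It works, but the structure of the output is buried in quotes and plus signs, and the escaping has to be remembered by hand. In the style sheet, the output markup is visible as markup.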




Later, both in this chapter and in others, you'll find more detailed examples of client-side XSLT.





11.1.4. What I Learned from the Gecko


Back when I was first learning XSLT, I was developing with the bare minimum, a text editor and a copy of Microsoft Internet Explorer version 5.01, and I was happy! Well, at least for about 20 minutes or so, right up to the point I read the World Wide Web Consortium's XSLT recommendation. But we've already covered that, and after I downloaded a copy of Internet Explorer version 6, I was happy again, at least until I found Mozilla and then Firefox.


My first impression was that there was something wrong with the Gecko XSLT processor, but there was a gnawing doubt. The reason for this was that I'd never personally found an error in a Gecko-based browser, and I had found several in Internet Explorer. So with a critical eye and a hard copy of the recommendation, I began to examine the "bugs" that I had found in the Gecko XSLT processor.


The results came as no surprise to me. Gecko strictly followed the published recommendation, whereas IE seemed somewhat looser around the edges. My problem was that I had developed some bad habits developing in a microcosm and had a tendency to tailor my code to that microcosm. Because of this, I now try out my style sheets in at least two different XSLT processors before I consider them even partially tested.


Let's take a look at how to create an instance of the XSLT processor in Microsoft Internet Explorer and every other web browser on the planet, er, I mean Firefox, yeah, Firefox. Listing 11-5 shows a little cross-browser web page that uses one XML Data Island to hold the XML, while the XSLT is loaded from the server via the XMLHttpRequest object. This is nothing flashy, merely a "proof of concept." It just creates an HTML select object and plops it on a page.


Listing 11-5. XSLT Cross-Browser Web Page Example






<html>
  <head>
    <title>XML Data Island Test</title>
    <style type="text/css">
      xml
      {
        display: none;
        font-size: 0px
      }
    </style>
    <script language="JavaScript">
      var _IE = (new RegExp('internet explorer','gi')).test(navigator.appName);
      var _XMLHTTP;                   // XMLHttpRequest object
      var _objXML;                    // XML DOM document
      var _objXSL;                    // Stylesheet
      var _objXSLTProcessor;          // XSL Processor
      var _xslt = 'stateSelect.xsl';  // Path to style sheet

      /*
        Function: initialize
        Programmer: Edmond Woychowsky
        Purpose: Perform page initialization.
      */
      function initialize() {
        if(_IE) {
          _XMLHTTP = new ActiveXObject('Microsoft.XMLHTTP');

          _objXML =
            new ActiveXObject('MSXML2.FreeThreadedDOMDocument.3.0');
          _objXSL =
            new ActiveXObject('MSXML2.FreeThreadedDOMDocument.3.0');

          _objXML.async = false;
          _objXSL.async = false;

          // Load the XML straight from the data island's document
          _objXML.load(document.getElementById('xmlDI').XMLDocument);
        } else {
          var _objParser = new DOMParser();

          _XMLHTTP = new XMLHttpRequest();

          _objXSLTProcessor = new XSLTProcessor();

          // No data island support here, so parse the element's markup
          _objXML =
            _objParser.parseFromString(
              document.getElementById('xmlDI').innerHTML,
              "text/xml");
        }

        _XMLHTTP.onreadystatechange = stateChangeHandler;

        // Request the style sheet asynchronously
        _XMLHTTP.open('GET',_xslt,true);
        _XMLHTTP.send(null);
      }

      /*
        Function: stateChangeHandler
        Programmer: Edmond Woychowsky
        Purpose: Handle the asynchronous response to an
                 XMLHttpRequest, transform the XML Data Island and
                 display the resulting XHTML.
      */
      function stateChangeHandler() {
        var strXHTML;

        // readyState 4 means that the response is complete
        if(_XMLHTTP.readyState == 4) {
          if(_IE) {
            var _objXSLTemplate =
              new ActiveXObject('MSXML2.XSLTemplate.3.0');

            _objXSL.loadXML(_XMLHTTP.responseText);
            _objXSLTemplate.stylesheet = _objXSL;

            // createProcessor is a method, so it needs the parentheses
            _objXSLTProcessor = _objXSLTemplate.createProcessor();
            _objXSLTProcessor.input = _objXML;

            _objXSLTProcessor.transform();

            strXHTML = _objXSLTProcessor.output;
          } else {
            var _objSerializer = new XMLSerializer();

            _objXSL = _XMLHTTP.responseXML;

            _objXSLTProcessor.importStylesheet(_objXSL);

            strXHTML =
              _objSerializer.serializeToString(
                _objXSLTProcessor.transformToFragment(_objXML, document));
          }

          document.getElementById('target').innerHTML = strXHTML;
        }
      }
    </script>
  </head>
  <body onload="initialize()">
    <xml id="xmlDI">
      <states>
        <state>
          <state_abbreviation>AB</state_abbreviation>
          <state_name>Alberta</state_name>
        </state>
        <state>
          <state_abbreviation>AK</state_abbreviation>
          <state_name>Alaska</state_name>
        </state>
        <state>
          <state_abbreviation>AL</state_abbreviation>
          <state_name>Alabama</state_name>
        </state>
        <state>
          <state_abbreviation>AR</state_abbreviation>
          <state_name>Arkansas</state_name>
        </state>
      </states>
    </xml>
    <b>XML client-side transformation test</b>
    <div id="target"></div>
  </body>
</html>
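
Listing 11-5 decides which branch to take by looking at navigator.appName. A common alternative, in keeping with the lesson of not tailoring code to one browser, is to ask the environment what it actually supports. The following is only a sketch of that idea: detectXsltSupport is a hypothetical helper, and the env parameter stands in for the browser's window object so that the function can also be exercised outside a browser.

```javascript
// Decide which XSLT API to use by checking what the environment offers,
// rather than by inspecting the browser's name.
function detectXsltSupport(env) {
  if (typeof env.XSLTProcessor !== 'undefined') {
    return 'XSLTProcessor';  // the API used in the non-IE branch of Listing 11-5
  }
  if ('ActiveXObject' in env) {
    return 'MSXML';          // the ActiveX route used in the IE branch
  }
  return 'none';             // no client-side XSLT; transform on the server instead
}

// In a browser the global object is window; elsewhere fall back to an empty object.
var support = detectXsltSupport(typeof window !== 'undefined' ? window : {});
```

The value of support could then take the place of the _IE flag when choosing a branch.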




Alright, now that the proof of concept has been successfully completed, all that remains is to see how it can be applied to our e-commerce website.




A Problem Revisited


Now that we have some of the basics down, let's take a look at how XSLT can be used to provide additional functionality to our e-commerce website. I should point out, however, that when I originally proposed this idea to a client, I was called insane. The comments were that it would be unworkable and that nobody in their right mind would have even suggested it. In my defense, this was the client that used terms such as blink and was "looking into" converting all web applications into COBOL so that developers other than the consultants could understand them.


That's enough introductions; without further ado, allow me to describe what I consider the ultimate "mad scientist" website.


Excluding pop-ups, the site would be a single web page, with all communication between the server and the client taking place using the XMLHttpRequest object. Instead of subjecting the visitor to an endless cycle of unloads and reloads, the page would simply request whatever it needed directly. In addition, when a particular XSLT was obtained from the server, the client would cache it, meaning that the next time it was needed, it would already be there. It was within the realm of possibility that eventually the client would have all the XSLT cached on the web browser. The more the visitor did, the better the shopping experience would become.
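
The caching idea can be sketched in a few lines of JavaScript. This is an illustration under assumptions rather than the design that was proposed to the client: createStylesheetCache is a hypothetical helper, and loadText(url, callback) is a stand-in for an XMLHttpRequest GET that calls callback(error, text) when the response arrives.

```javascript
// Wrap a loader function with a cache so that each style sheet is fetched once.
function createStylesheetCache(loadText) {
  var cache = Object.create(null);    // url -> style sheet text already fetched
  var waiting = Object.create(null);  // url -> callbacks queued during a fetch

  return function getStylesheet(url, callback) {
    if (url in cache) {
      callback(null, cache[url]);     // cache hit: no trip to the server
      return;
    }
    if (url in waiting) {
      waiting[url].push(callback);    // fetch already under way: wait for it
      return;
    }

    waiting[url] = [callback];
    loadText(url, function (error, text) {
      var callbacks = waiting[url];
      delete waiting[url];
      if (!error) {
        cache[url] = text;            // only successful loads are cached
      }
      callbacks.forEach(function (queued) {
        queued(error, text);
      });
    });
  };
}
```

With something like this in place, the first request for stateSelect.xsl goes to the server and every later request is answered from memory, so the page gets faster the more the visitor does.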


Needless to say, the website was never created, alas, and my contract was terminated because they felt that resources could be better used supporting their mainframe applications. Personally, I think that they lacked foresight, and if they had pursued the concept to its logical conclusion, they'd now be mentioned in the same breath as Google. Instead, they decided to regress into the future of the 1960s as opposed to the future of the twenty-first century. But I'm hardly an objective observer.














