Grouping With XSLT 2.0
2006-06-21 15:18
by Bob DuCharme
November 05, 2003
Relational databases have always offered a feature known as grouping: sorting a collection of records on a field or combination of fields and then treating each subcollection that has the same value in that sort key as a unit. For example, if the following XML document were stored in a relational database table, grouping the records by project value would let us print the records with a subhead for each project name at the beginning of that project's group of records, and it would let us find statistics such as the average or total size of the files in each project.
<files>
<file name="swablr.eps" size="4313" project="mars"/>
<file name="batboy.wks" size="424" project="neptune"/>
<file name="potrzebie.dbf" size="1102" project="jupiter"/>
<file name="kwatz.xom" size="43" project="jupiter"/>
<file name="paisley.doc" size="988" project="neptune"/>
<file name="ummagumma.zip" size="2441" project="mars"/>
<file name="schtroumpf.txt" size="389" project="mars"/>
<file name="mondegreen.doc" size="1993" project="neptune"/>
<file name="gadabout.pas" size="685" project="jupiter"/>
</files>
While XSLT 1.0 lets you sort elements (see the July 2002 column for an introduction), it still forces you to jump through several hoops to do anything extra with the groups that result from the sort. Oracle's lead XML Technical Evangelist Steve Muench developed an approach using the xsl:key element, and this became so popular that it's known as the "Muenchian Method." Jeni Tennison has a fine explanation of it on her site.
XSLT 2.0 makes grouping even easier than Steve did. The XSLT 2.0 xsl:for-each-group instruction iterates across a series of groups, with the criteria for grouping specified by its attributes. The required select attribute identifies the elements to sort and group, and one of the group-by, group-adjacent, group-starting-with, or group-ending-with attributes describes how to sort and group them.
Let's look at a simple example. The single template rule in the following XSLT 2.0 stylesheet tells the XSLT processor that when it finds a files element it should select all the file children of that element and sort them into groups based on the value of each file element's project attribute. (All examples in this column are available in this zip file. To run them, use Saxon 7, the only XSLT processor currently offering support for 2.0.)
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:output method="text"/>

  <xsl:template match="files">
    <xsl:for-each-group select="file" group-by="@project">
      <xsl:value-of select="current-grouping-key()"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
Just as the XSLT 1.0 xsl:for-each instruction iterates across a node set, with child elements of the xsl:for-each element specifying what you want done to each node in the set, the xsl:for-each-group instruction iterates across the groups, with children of the xsl:for-each-group element specifying what you want done to each group. The example above does two simple things as it finds each group:
- It outputs the value of the current-grouping-key() function, which returns the grouping key value shared by the members of the group.
- It outputs a carriage return.
Using the XML document shown earlier as a source document, the stylesheet creates this result:
mars
neptune
jupiter
It lists the grouping values. This ability to list all the different project values with no repeats in the list may seem simple, but it would have taken a lot more code in XSLT 1.0.
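For comparison, here is a rough sketch of how the same list of project names might be produced in XSLT 1.0 with the Muenchian Method mentioned earlier. The key name files-by-project is my own choice for illustration, and this is only one common way to write the idiom:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output method="text"/>

  <!-- Index every file element by its project attribute.
       The key name "files-by-project" is hypothetical. -->
  <xsl:key name="files-by-project" match="file" use="@project"/>

  <xsl:template match="files">
    <!-- Keep only the file that is first in its project's key list,
         so each project value is visited exactly once. -->
    <xsl:for-each select="file[generate-id() =
                          generate-id(key('files-by-project', @project)[1])]">
      <xsl:value-of select="@project"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The generate-id() comparison does the work that group-by and current-grouping-key() do in the 2.0 version: it picks one representative node per distinct key value.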
Let's replace the template rule with one that does a bit more:
<xsl:template match="files">
  <xsl:for-each-group select="file" group-by="@project">
    <xsl:for-each select="current-group()">
      <xsl:value-of select="@name"/>, <xsl:value-of select="@size"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each>
    <xsl:text>average size for </xsl:text>
    <xsl:value-of select="current-grouping-key()"/>
    <xsl:text> group: </xsl:text>
    <xsl:value-of select="avg(current-group()/@size)"/>
    <xsl:text>&#xA;</xsl:text>
  </xsl:for-each-group>
</xsl:template>
The contents of this xsl:for-each-group element begin with an XSLT 1.0 xsl:for-each element which, as I mentioned, iterates across a set of nodes. By selecting the current-group() node set, the xsl:for-each element iterates over the nodes of the "mars" group in the first xsl:for-each-group pass, the nodes of the "neptune" group in the second pass, and those of the "jupiter" group in the final pass. Each iteration of the xsl:for-each instruction outputs the value of the name attribute of the context node (the node being processed by the loop), a comma, and the value of the context node's size attribute, finishing with a carriage return added with an xsl:text element.
After the xsl:for-each element iterates across the group being processed by the xsl:for-each-group element, the template outputs a message about the average size value within each group. To do this, it uses the current-grouping-key() function that we saw in our first stylesheet to name the group and the avg() function to compute the average. The argument to the avg() function is the node set consisting of the size attribute values of all the nodes in the current group.
Applied to the same source document, this second stylesheet produces this result:
swablr.eps, 4313
ummagumma.zip, 2441
schtroumpf.txt, 389
average size for mars group: 2381
batboy.wks, 424
paisley.doc, 988
mondegreen.doc, 1993
average size for neptune group: 1135
potrzebie.dbf, 1102
kwatz.xom, 43
gadabout.pas, 685
average size for jupiter group: 610
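The same pattern should extend to the other statistics mentioned at the start, such as a total per project. The following is a minimal sketch, assuming the same files document and using the XPath 2.0 count() and sum() functions in place of avg(); the wording of the output line is my own:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:output method="text"/>

  <xsl:template match="files">
    <xsl:for-each-group select="file" group-by="@project">
      <!-- One summary line per project group. -->
      <xsl:value-of select="current-grouping-key()"/>
      <xsl:text>: </xsl:text>
      <!-- Number of file elements in the current group. -->
      <xsl:value-of select="count(current-group())"/>
      <xsl:text> files, total size </xsl:text>
      <!-- Sum of the size attributes in the current group. -->
      <xsl:value-of select="sum(current-group()/@size)"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

Adding up the size values by hand gives 7143 for mars, 3405 for neptune, and 1830 for jupiter, with three files in each group, so those are the figures to check the output against.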
If the xsl:for-each-group element uses a group-adjacent attribute instead of a group-by attribute, it doesn't sort the selected elements, leaving them in their original order and grouping adjacent elements with the same key value together. For example, if we revise the previous stylesheet's template to look like this (note also the removal of the instructions that compute average file sizes),
<xsl:template match="files">
  <xsl:for-each-group select="file" group-adjacent="@project">
    <xsl:for-each select="current-group()">
      <xsl:value-of select="@name"/>, <xsl:value-of select="@size"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each>
    <xsl:text>&#xA;</xsl:text>
  </xsl:for-each-group>
</xsl:template>
it only groups together the potrzebie.dbf/kwatz.xom pair and the ummagumma.zip/schtroumpf.txt pair, since those were the only contiguous file elements in our source document that had the same project attribute value: "jupiter" for potrzebie.dbf and kwatz.xom and "mars" for ummagumma.zip and schtroumpf.txt.
swablr.eps, 4313

batboy.wks, 424

potrzebie.dbf, 1102
kwatz.xom, 43

paisley.doc, 988

ummagumma.zip, 2441
schtroumpf.txt, 389

mondegreen.doc, 1993

gadabout.pas, 685
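To make the adjacent runs easier to see, a variation could wrap each one in a container element that records its key and length. This is only a sketch: the run element and its project and count attributes are names I made up for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="files">
    <files>
      <xsl:for-each-group select="file" group-adjacent="@project">
        <!-- "run" is a hypothetical wrapper element: one per stretch of
             neighboring file elements that share a project value. -->
        <run project="{current-grouping-key()}"
             count="{count(current-group())}">
          <xsl:copy-of select="current-group()"/>
        </run>
      </xsl:for-each-group>
    </files>
  </xsl:template>

</xsl:stylesheet>
```

With the files document used here, this should yield seven run elements, two of them holding a pair of files and the rest holding a single file each.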
The group-starting-with attribute names a pattern; the xsl:for-each-group element treats each node matching it as the beginning of a new group. This can add depth to a flat list of elements by enclosing groups of those elements in container elements. HTML documents, in which h1, h2, h3, and the p elements after any of these headers are usually siblings, can benefit a lot from this; their flat structure makes it difficult for a stream-based parser to know which section of a document is ending when, and containing elements make this much easier. To add some depth to the following HTML document, the group-starting-with attribute can let us specify that each h1 element starts a new chapter:
<html>
  <body>
    <h1>Loomings</h1>
    <p>par 1</p>
    <p>par 2</p>
    <p>par 3</p>
    <h1>The Whiteness of the Whale</h1>
    <p>par 4</p>
    <p>par 5</p>
    <p>par 6</p>
  </body>
</html>
The following template rule does this to elements within a body element by specifying "h1" as the node starting each group that the XSLT processor should enclose in a chapter element. Note how the select attribute doesn't specify one kind of element to group, but all (*) children of the body element:
<xsl:template match="body">
  <body>
    <xsl:for-each-group select="*" group-starting-with="h1">
      <chapter>
        <xsl:for-each select="current-group()">
          <xsl:copy>
            <xsl:apply-templates/>
          </xsl:copy>
        </xsl:for-each>
      </chapter>
    </xsl:for-each-group>
  </body>
</xsl:template>
Applying it to the HTML document shown above gives us this result:
<html>
  <body>
    <chapter>
      <h1>Loomings</h1>
      <p>par 1</p>
      <p>par 2</p>
      <p>par 3</p>
    </chapter>
    <chapter>
      <h1>The Whiteness of the Whale</h1>
      <p>par 4</p>
      <p>par 5</p>
      <p>par 6</p>
    </chapter>
  </body>
</html>
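The same idea can be nested to handle a second heading level. The sketch below assumes an input like the one above but with h2 subheads under some h1 elements; the section element name is my own invention, and the identity template is there only so that the html wrapper is copied through.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity template: copies anything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="body">
    <body>
      <!-- Outer pass: each h1 starts a chapter. -->
      <xsl:for-each-group select="*" group-starting-with="h1">
        <chapter>
          <xsl:copy-of select="current-group()[self::h1]"/>
          <!-- Inner pass: within the chapter, each h2 starts a section. -->
          <xsl:for-each-group select="current-group()[not(self::h1)]"
                              group-starting-with="h2">
            <xsl:choose>
              <!-- The context node is the first node of the inner group. -->
              <xsl:when test="self::h2">
                <section>
                  <xsl:copy-of select="current-group()"/>
                </section>
              </xsl:when>
              <!-- Material before the first h2 stays directly in the chapter. -->
              <xsl:otherwise>
                <xsl:copy-of select="current-group()"/>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:for-each-group>
        </chapter>
      </xsl:for-each-group>
    </body>
  </xsl:template>

</xsl:stylesheet>
```

The xsl:choose is needed because the first inner group may not begin with an h2 at all: any elements that come before the first matching node still form a group of their own.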
The fourth and last way to specify a grouping is the group-ending-with attribute, which names a pattern that identifies nodes that should end each group. The following template rule specifies that a group ends when it finds an element with any name (*) whose position, modulo 3, equals 0; in other words, any element whose position within its parent is a multiple of 3. The template rule also encloses the whole result in a book element.
<xsl:template match="files">
  <book>
    <xsl:for-each-group select="*" group-ending-with="*[position() mod 3 = 0]">
      <chapter>
        <xsl:for-each select="current-group()">
          <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
          </xsl:copy>
        </xsl:for-each>
      </chapter>
    </xsl:for-each-group>
  </book>
</xsl:template>
A stylesheet with this template rule creates this result when using the files document we saw earlier:
<book>
  <chapter>
    <file name="swablr.eps" size="4313" project="mars"/>
    <file name="batboy.wks" size="424" project="neptune"/>
    <file name="potrzebie.dbf" size="1102" project="jupiter"/>
  </chapter>
  <chapter>
    <file name="kwatz.xom" size="43" project="jupiter"/>
    <file name="paisley.doc" size="988" project="neptune"/>
    <file name="ummagumma.zip" size="2441" project="mars"/>
  </chapter>
  <chapter>
    <file name="schtroumpf.txt" size="389" project="mars"/>
    <file name="mondegreen.doc" size="1993" project="neptune"/>
    <file name="gadabout.pas" size="685" project="jupiter"/>
  </chapter>
</book>
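Fixed-size batches like these can also be expressed with a computed key instead of a boundary pattern. The following sketch is an alternative I am adding for comparison, not part of the original examples; it relies on position() in the group-adjacent expression referring to each item's position within the selected sequence, and on the XPath 2.0 idiv integer-division operator.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="files">
    <book>
      <!-- Items 1-3 get key 0, items 4-6 get key 1, and so on,
           so each run of three neighbors shares a key. -->
      <xsl:for-each-group select="*"
                          group-adjacent="(position() - 1) idiv 3">
        <chapter>
          <xsl:copy-of select="current-group()"/>
        </chapter>
      </xsl:for-each-group>
    </book>
  </xsl:template>

</xsl:stylesheet>
```

One difference worth keeping in mind: in the group-ending-with pattern, position() counts a node among its sibling elements in the source tree, while here it counts within whatever the select attribute picked, so the two approaches can diverge if select filters out some siblings.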
The group-by, group-adjacent, group-starting-with, and group-ending-with attributes can all name an element or attribute as the criterion to determine grouping boundaries; but, as this last example shows, you can be more creative than that, using functions and XPath predicates to identify the source tree nodes that should be treated as group boundaries. The Examples section of the XSLT 2.0 Working Draft's section on grouping has additional good demonstrations of what you can do with these attributes to customize the xsl:for-each-group element's treatment of your documents.
Demonstrating XSLT 2.0's grouping capability is easiest with simple, flat data that would fit easily into a normalized relational table; remember, however, that you can take advantage of it with all kinds of data, as long as you can count on finding the fields and attributes you need where you need them. After all, sometimes the whole point of using XML is that you have data that won't fit easily into normalized tables; it's nice to see more and more of the tricks for manipulating those tables coming to the world of XML development.
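As one more illustration of a computed grouping key, here is a sketch that groups the files document by file extension rather than by a stored attribute value. It assumes each name contains exactly one period, which happens to be true of this sample data, and uses the XSLT 2.0 separator attribute of xsl:value-of to list the names in each group.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:output method="text"/>

  <xsl:template match="files">
    <!-- The key is whatever follows the first period in the name attribute. -->
    <xsl:for-each-group select="file"
                        group-by="substring-after(@name, '.')">
      <xsl:value-of select="current-grouping-key()"/>
      <xsl:text>: </xsl:text>
      <!-- All names in the group, joined with a comma and a space. -->
      <xsl:value-of select="current-group()/@name" separator=", "/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

In this data set only the doc extension occurs more than once, so that is the one group that should list two names.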