
XSLT Version 1.0 Vs 2.0

What is new in XSLT 2.0:
The following example illustrates several of the new features with a single task.
Given any XML document, produce a list of the words that appear in its text, together with the number of times each word appears (its frequency).

frequency.xml

<?xml version="1.0" encoding="iso-8859-1"?>
<?xml-stylesheet type="text/xsl" href="frequency.xsl"?>
<Sample>
<TITLE>a a a a a</TITLE>
<TITLE>1 1 1 1 1</TITLE>
<TITLE>A A A A A</TITLE>
<TITLE>! ! ! ! !</TITLE>
<TITLE>b b b b b</TITLE>
</Sample>

frequency.xsl

<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet
version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes"/>

<xsl:template match="/">
<wordcount>
<xsl:for-each-group group-by="." select="
for $w in //text()/tokenize(., '\W+')[.!=''] return lower-case($w)">
<xsl:sort select="count(current-group())" order="descending"/>
<word word="{current-grouping-key()}" frequency="{count(current-group())}"/>
</xsl:for-each-group>
</wordcount>
</xsl:template>

</xsl:stylesheet>

How to run:

c:\saxon>java -jar C:\saxon\saxon9ee.jar -a -s:sample/frequency.xml -o:sample/frequency.html

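The output file should look like the listing below. Despite the .html extension, the stylesheet uses the xml output method, so the content is XML.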

frequency.html

<?xml version="1.0" encoding="UTF-8"?>
<wordcount>
<word word="a" frequency="10"/>
<word word="1" frequency="5"/>
<word word="b" frequency="5"/>
</wordcount>

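The stylesheet counts only words. Special characters such as ! are treated as separators between words and are not counted.

Case is ignored: every word is converted to lower case before grouping, so upper-case and lower-case letters count as the same word. That is why a and A together give a frequency of 10.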
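To make the role of lower-case() visible, here is a minimal variant of the stylesheet that leaves it out. It is only a sketch and not part of the original example; the variable name tokens is made up for illustration. Without the case folding, a and A should fall into separate groups, so for the sample input each of a, 1, A and b is expected to come out with a frequency of 5.

```xml
<xsl:stylesheet
    version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- Split every text node on runs of non-word characters
         and drop the empty tokens that the split leaves behind. -->
    <xsl:variable name="tokens"
        select="//text()/tokenize(., '\W+')[. != '']"/>
    <wordcount>
      <!-- Group the raw tokens. There is no lower-case() here,
           so 'a' and 'A' are different grouping keys. -->
      <xsl:for-each-group select="$tokens" group-by=".">
        <xsl:sort select="count(current-group())" order="descending"/>
        <word word="{current-grouping-key()}"
              frequency="{count(current-group())}"/>
      </xsl:for-each-group>
    </wordcount>
  </xsl:template>

</xsl:stylesheet>
```

Comparing this output with frequency.html shows that the only difference between the two stylesheets is the lower-case() call applied to each token.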

Now to the new features added in XSLT 2.0:

1) The <xsl:for-each-group> instruction is new in XSLT 2.0.

2) The tokenize() function is new in XPath 2.0, and is therefore available in XSLT 2.0.

3) tokenize() takes a regular expression (here \W+). Regular-expression support is new in XSLT 2.0 and XPath 2.0.

4) The lower-case() function is also new in XPath 2.0.

5) The avg() function is also new in XPath 2.0, although the stylesheet above does not use it.
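The frequency.xsl stylesheet never calls avg(), so the following sketch shows one way it could be applied to the same data. It is an illustration under assumptions, not part of the original example: the element name wordstats, its attributes, and the variable names words and counts are all invented here. For the sample input the per-word counts are 10, 5 and 5, so the average should be about 6.67. The exact number of digits printed depends on the processor.

```xml
<xsl:stylesheet
    version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- All words in the document, lower-cased, as in frequency.xsl. -->
    <xsl:variable name="words" select="
        for $w in //text()/tokenize(., '\W+')[. != '']
        return lower-case($w)"/>
    <!-- One count per distinct word. -->
    <xsl:variable name="counts" select="
        for $k in distinct-values($words)
        return count($words[. = $k])"/>
    <!-- avg() is the XPath 2.0 function being demonstrated;
         count() and sum() already existed in XPath 1.0. -->
    <wordstats distinct="{count($counts)}"
               total="{sum($counts)}"
               average="{avg($counts)}"/>
  </xsl:template>

</xsl:stylesheet>
```

The for expressions and distinct-values() used here are XPath 2.0 features as well, so this sketch will not run on an XSLT 1.0 processor.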