
evaluateXPathExpression vs XML-rules: which is better?

The discussion is here.

Hi all,

After a few months I decided to answer my own question in the hope that it may come in handy for future generations.
Here’s a screenshot of the script’s dialog box I’m talking about:

 

I added a couple of radio buttons so the user can choose either the xml-rules or the evaluateXPathExpression option.
After some testing I came to the conclusion that the latter is much better because it gets around some (though not all) of the XPath limitations of XML rules.
Here’s a quote from the XML rules chapter (pages 200-201):
Due to the one-pass nature of this implementation, the following XPath expressions are specifically excluded:
- No ancestor or preceding-sibling axes, including .., ancestor::, preceding-sibling::.
- No path specifications in predicates; for example, foo[bar/c].
- No last() function.
- No text() function or text comparisons; however, you can use InDesign scripting to examine the text content of an XML element matched by a given XML rule.
- No compound Boolean predicates; for example, foo[@bar=font or @c=size].
- No relational predicates; for example, foo[@bar < font or @c > 3].
- No relative paths; for example, doc/chapter.
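
To make the quoted restrictions a bit more concrete, here is a rough sketch of a pre-check that flags which of them an XPath string appears to violate. The function name findXmlRuleViolations and the regular expressions are assumptions made for illustration; this is a plain string heuristic, not part of InDesign's scripting API, and it only covers the restrictions listed in the quote.

```javascript
// Heuristic pre-check (hypothetical helper, not an InDesign API):
// lists the documented XML-rules restrictions an XPath string seems to violate.
function findXmlRuleViolations(xpath) {
  var violations = [];
  // Gather the text of every predicate, i.e. everything between [ and ].
  var predicates = xpath.match(/\[[^\]]*\]/g) || [];
  var inside = predicates.join(" ");

  if (/\.\.|ancestor::|preceding-sibling::/.test(xpath)) {
    violations.push("ancestor or preceding-sibling axis");
  }
  if (/\//.test(inside)) {
    violations.push("path in predicate");
  }
  if (/last\(\)/.test(xpath)) {
    violations.push("last() function");
  }
  if (/text\(\)/.test(xpath)) {
    violations.push("text() function");
  }
  if (/\s(or|and)\s/.test(inside)) {
    violations.push("compound Boolean predicate");
  }
  if (/[<>]/.test(inside)) {
    violations.push("relational predicate");
  }
  if (xpath.charAt(0) !== "/") {
    violations.push("relative path");
  }
  return violations;
}
```

For example, findXmlRuleViolations("/bookstore/book[last()]") reports the last() function, while "/bookstore/book" comes back with an empty list. An empty list does not guarantee that XML rules will accept the expression; it only means none of the quoted restrictions was detected.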

Frankly speaking, I’m not an XPath man and don’t quite understand what all of that means. But my client wanted the script to work with the last() function.
When I tried to use it with XML rules, the script wrote the following error to the console: “Adobe InDesign cannot process^1XPath expression '^2', line: 43”.
With evaluateXPathExpression, however, it worked as expected.
Since I know almost nothing about XPath, I used the information on these two pages, XML and XPath and XPath Syntax, because it’s presented in a way that’s easy for me to understand. (I know some coding gurus here on the scripting forum will be angry at me for mentioning http://www.w3schools.com, but please have mercy on me, a mere mortal retoucher.) I created a couple of test documents by copying the XML structure from those pages and importing it into InDesign, then tested them against the path expressions given as examples on the pages. Here’s an archive with the InDesign documents (CC 2014; their IDML versions are also included) and the scripts I used for testing.

Here are the examples that work with evaluateXPathExpression but fail with XML rules, which report an error:
/bookstore/book[last()] Selects the last book element that is the child of the bookstore element
/bookstore/book[last()-1] Selects the last but one book element that is the child of the bookstore element
/bookstore/book[position()<3] Selects the first two book elements that are children of the bookstore element
/bookstore/book[price>35.00] Selects all the book elements of the bookstore element that have a price element with a value greater than 35.00
/bookstore/book[price>35.00]/title Selects all the title elements of the book elements of the bookstore element that have a price element with a value greater than 35.00
//book/title | //book/price Selects all the title AND price elements of all book elements
//title | //price Selects all the title AND price elements in the document
/bookstore/book/title | //price Selects all the title elements of the book element of the bookstore element AND all the price elements in the document
. (Selects the current node)
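
For anyone else who is not an XPath person, the positional and relational predicates above can be mimicked on a plain array to see what they pick. This is only an illustration of the selection logic, not InDesign code; the titles and prices are sample data modelled on the W3Schools bookstore example, and the helper name selectBooks is made up for this sketch.

```javascript
// Sample data modelled on the W3Schools bookstore example (assumed values).
var books = [
  { title: "Everyday Italian", price: 30.00 },
  { title: "Harry Potter", price: 29.99 },
  { title: "XQuery Kick Start", price: 49.99 },
  { title: "Learning XML", price: 39.95 }
];

// Hypothetical helper: returns the titles of the books for which
// test(book, position, last) is true. XPath positions are 1-based,
// so position is index + 1 and last is the number of books.
function selectBooks(list, test) {
  var result = [];
  for (var i = 0; i < list.length; i++) {
    if (test(list[i], i + 1, list.length)) {
      result.push(list[i].title);
    }
  }
  return result;
}

// /bookstore/book[last()]
var lastBook = selectBooks(books, function (book, position, last) {
  return position === last;
});

// /bookstore/book[last()-1]
var lastButOne = selectBooks(books, function (book, position, last) {
  return position === last - 1;
});

// /bookstore/book[position()<3]
var firstTwo = selectBooks(books, function (book, position) {
  return position < 3;
});

// /bookstore/book[price>35.00]/title
var expensiveTitles = selectBooks(books, function (book) {
  return book.price > 35.00;
});
```

With this sample data, lastBook holds "Learning XML", lastButOne holds "XQuery Kick Start", firstTwo holds the first two titles, and expensiveTitles holds the two books priced above 35.00.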

This one fails silently with XML rules (no error is reported):
bookstore//book Selects all book elements that are descendant of the bookstore element, no matter where they are under the bookstore element

The following works with neither: XML rules report an error, while evaluateXPathExpression fails without one:
//@lang Selects all attributes that are named lang

The following fail silently with both:
bookstore/book Selects all book elements that are children of bookstore
bookstore//book Selects all book elements that are descendant of the bookstore element, no matter where they are under the bookstore element

Here are the simple scripts I used for testing (I just copied and pasted the XPath expressions into the relevant lines):

evaluateXPathExpression
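
The original test script is not reproduced here. As a stand-in, a minimal sketch of such a test could look like the following, assuming the matched items are XML elements with a markup tag. The helper name tagNamesMatching is hypothetical; evaluateXPathExpression and markupTag.name are the InDesign scripting members it relies on.

```javascript
// Collects the tag names of the XML elements matched by an XPath expression.
// `root` is expected to be an InDesign XMLElement (or any object that offers
// the same evaluateXPathExpression method).
function tagNamesMatching(root, expression) {
  var matches = root.evaluateXPathExpression(expression);
  var names = [];
  for (var i = 0; i < matches.length; i++) {
    // Assumes every match is an element; other XML item types have no markupTag.
    names.push(matches[i].markupTag.name);
  }
  return names;
}

// Inside InDesign (will not run elsewhere):
// var root = app.activeDocument.xmlElements.item(0);
// $.writeln(tagNamesMatching(root, "/bookstore/book[last()]").join(", "));
```

Pasting a different expression into the call is enough to repeat the test; an empty result with no error corresponds to the "silent" failures listed above.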

XML-rules

Now a very important note: when I tested XPath expressions using evaluateXPathExpression, it worked well with my simple test files, but when my client gave me his “real” document with a complex XML structure, it stopped working for some reason.

A couple of years ago (or more) I found out that it was somehow associated with namespaces: no namespaces – no problem.
Now, after a more careful investigation, I’ve discovered that it depends on the attributes of all the parent XML elements, from the element we’re looking for up to the root. The problem is caused by the elements whose names contain colons.
For example, we’re looking for //sec/p[last()] (the last p element in every sec element). In the screenshot below, it’s marked in green.
The elements marked in red prevent the script from working properly. My idea was to temporarily replace the colons with, say, underscores (or any other valid character, in case underscores are already used in attribute names), so I came up with the following script:
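
The script itself is not reproduced here, so the following is only a sketch of the string-handling part of that idea, not the original code. It picks a replacement character that none of the names already uses and builds a rename plan, so the original names can be restored afterwards. The function names, the sample names and the candidate characters are all assumptions made for illustration.

```javascript
// Returns the first candidate that does not occur in any of the names,
// or null if every candidate is already in use.
function pickPlaceholder(names, candidates) {
  for (var i = 0; i < candidates.length; i++) {
    var used = false;
    for (var j = 0; j < names.length; j++) {
      if (names[j].indexOf(candidates[i]) !== -1) {
        used = true;
        break;
      }
    }
    if (!used) {
      return candidates[i];
    }
  }
  return null;
}

// Builds a rename plan: one entry per name containing ":", holding the
// original name and the masked name with every colon replaced.
function buildRenamePlan(names, placeholder) {
  var plan = [];
  for (var i = 0; i < names.length; i++) {
    if (names[i].indexOf(":") !== -1) {
      plan.push({
        original: names[i],
        masked: names[i].split(":").join(placeholder)
      });
    }
  }
  return plan;
}

// Hypothetical names collected from the document structure.
var names = ["xml:lang", "xlink:href", "id_ref", "sec"];
var placeholder = pickPlaceholder(names, ["_", "-", "."]);
var plan = buildRenamePlan(names, placeholder);
```

Here the underscore is skipped because "id_ref" already contains it, so the hyphen is used instead. In InDesign the masked names would be applied before calling evaluateXPathExpression and the original names written back afterwards; keeping both in the plan avoids having to guess which characters were colons. The sketch does not check whether a masked name collides with an existing one.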

 

 

 
