XPath Exercises

In the following exercises, you use the syntax and semantics of XPath as is given in the slides. Not as in the XPath book.

  1. Change the definition of Regular XPath as described above (only allow * on atoms).
  2. Real XPath also has the following and preceding axis. Show that this is syntactic sugar by expressing them using the other axis.
  3. In real XPath 1.0, you may only write axis::l[node-test], that is, the filter operation is only allowed on atoms (with an optional label test). Show that this restriction is not a real restriction by showing that you can rewrite any formula with arbitrary use of filters to an equivalent formula with restricted use of filters.
  4. Show that the restriction on the use of unions mentioned above is also not a real restriction. Do that by specifying a set of rewrite rules which distribute the unions out of the formulas, but do not change the meaning of the formulas. You should be familiar with such rules from your propositional logic course, e.g. (A or B) and C is equivalent to (A and C) or (B and C).
  5. Look at your rules. They preserve meaning but they do change the shape of the formulas. In particular, the formulas may grow. And even quite a lot! 
    Show that this growth in formula length can be exponential, by specifying a family of formulas F1,F2,....,Fn,... with the following property: the lenghth of Fi is polynomial in n, but the rewriting of Fi with your rules has length expontial in n.
  6. Can you also distribute unions out in Regular XPath? Give an example formula with a union inside which does not seem to be rewritable to a union of union-free formulas. Try to convince me why.
  7. Recall that every propositional formula can be written as a disjunction of clauses. And a clause is a conjunction of propositional letters and their negations. Inside the filter expressions we also have propositional logic. Show that P[A or B] is equivalent to P[A] union P[B], no matter what formula A and B are. Use the rules which specify the semantics.
  8. The last exercise combined with the one before makes us think that we could also completely eliminate the use of "or" from node-expressions, by distributing them out. But this is not possible, because the filter (node) expressions are not just propositional formulas. Give an example of a node-formula with a disjunction that cannot be distributed out.
  9. Let Conditional XPath be the version of Regular XPath in which application of * is restricted to formulas of the form atom[T]. Thus we can write (downarrow[p or q])*[r]. This expresses a downward path that ends in a node with label r and all nodes in between are labeled either p or q.
    1. Show that you can express this formula in Core XPath (with only star on the atoms), for formulas evaluated at the root.
    2. Explain why your trick does not work when the formula is evaluated in the middle of a tree?
    3. Show that you can express every Conditional XPath formula with the help of the XPath 2.0 operators intersect and/or except.
    4. Give an example of a natural XML file (with real data) on which you would like to use this conditional xpath construction. Give the query both in natural language and as a formula.