This document specifies an xpath1() scheme for use in XPointer-based fragment identifiers. This scheme, like other XPointer Framework[12] schemes, is designed primarily for use with the XML Media Types defined in RFC 3023[5], to identify locations within a given XML representation of a resource. The xpath1() scheme uses XPath 1.0 syntax.


1. Introduction

The XPath 1.0-based xpath1() scheme is intended to be used within the XPointer Framework[12] to support the addressing of nodes within XML documents. Like the xpointer() scheme[15], the xpath1() scheme supports addressing into the internal structures of XML documents and external parsed entities. It allows traversal of a document's hierarchical structure and selection of its internal parts based on various properties, such as element types, attribute values, character content, and relative position. Unlike the xpointer() scheme, the xpath1() scheme only identifies complete nodes and has no support for character-based ranges.


2. Justification

As the W3C's xpointer() scheme[15] already provides a superset of the functionality provided by the xpath1() scheme, some consideration of why the xpath1() scheme is useful seems worthwhile.

The xpointer() scheme, designed to support the out-of-line linking capabilities of XLink[11], provides support for character ranges which may arbitrarily cross node boundaries. While this is extremely useful for many hypertext applications, it is unnecessary for a wide variety of simpler projects, and XPath 1.0 is generally far more widely supported than the xpointer() scheme.

While the XPointer Framework explicitly supports multiple levels of conformance, the xpointer() scheme states that "Conforming XPointer processors claiming to support the xpointer() scheme must conform to the behavior defined in this specification and may conform to additional XPointer scheme specifications." Conforming xpointer() processors must therefore implement both XPath and the xpointer() scheme's own extensions. Applications might use only the subset of xpointer() that is pure XPath, but processors that implement only that subset are non-conformant.

The XPointer set of specifications also includes shorthand pointers (based on ID values with their own complications) and support for an element()[14] scheme that is effectively a subset of XPath, but these offer considerably less functionality than XPath.

The xpath1() scheme strikes a balance between the simple implementation but limited functionality of shorthand pointers and the element() scheme, and the complex implementation but great capabilities of the xpointer() scheme. Perhaps more importantly, it strikes that balance using processing capabilities that are already widely deployed.


3. Syntax

The scheme name is "xpath1". The scheme data syntax is as follows; if scheme data in a pointer part with the xpath1() scheme does not conform to the syntax defined in this section, it is an error and the pointer part fails.

xpath1() Scheme Syntax:

  ptrpart             ::=    xpath1( xpath1schemedata )
  xpath1schemedata    ::=    Expr

Expr is as defined in the XPath 1.0 Recommendation[7]. To support identifiers operating on external parsed entities with multiple root elements or text nodes, the xpath1() scheme extends the XPath data model to permit the root node to contain any sequence of nodes that an element node may contain. (XSLT provides the same extension.) Variable references, and function calls other than the core functions defined in XPath 1.0, are not allowed.
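The restriction on variable references and non-core functions can be illustrated with a lexical pre-check. The following Python sketch is illustrative only and is not part of this specification: the function name xpath1_scheme_violations is hypothetical, and the check is a heuristic scan rather than a full XPath 1.0 parser. In particular, it always treats "and", "or", "mod" and "div" as operator names, so it would not flag a function call that used one of those names.

```python
import re

# The 27 functions of the XPath 1.0 core function library.
CORE_FUNCTIONS = frozenset({
    "last", "position", "count", "id", "local-name", "namespace-uri", "name",
    "string", "concat", "starts-with", "contains", "substring-before",
    "substring-after", "substring", "string-length", "normalize-space",
    "translate", "boolean", "not", "true", "false", "lang",
    "number", "sum", "floor", "ceiling", "round",
})
# Node-type tests look like calls but are not functions.
NODE_TYPES = frozenset({"comment", "text", "processing-instruction", "node"})
# Operator names may legally be followed by a parenthesised operand.
OPERATOR_NAMES = frozenset({"and", "or", "mod", "div"})

_LITERAL = re.compile(r"\"[^\"]*\"|'[^']*'")
_CALL = re.compile(r"([A-Za-z_][\w.\-]*(?::[A-Za-z_][\w.\-]*)?)\s*\(")


def xpath1_scheme_violations(expr):
    """Return a list of constructs that the xpath1() scheme does not allow."""
    # Blank out string literals so their contents are never inspected.
    stripped = _LITERAL.sub("''", expr)
    problems = []
    if "$" in stripped:
        problems.append("variable reference")
    for match in _CALL.finditer(stripped):
        name = match.group(1)
        if name in CORE_FUNCTIONS or name in NODE_TYPES or name in OPERATOR_NAMES:
            continue
        problems.append("non-core function: " + name)
    return problems


print(xpath1_scheme_violations("//chapter[count(section) > 2]/title"))
print(xpath1_scheme_violations("//item[@id = $wanted]"))
print(xpath1_scheme_violations("ext:lookup(//item)"))
```

An empty list means only that the scan found nothing objectionable; a real processor would still need to parse the expression against the XPath 1.0 grammar.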


4. Processing

The XML namespace[10] context for the XPath should be set by previous XPointer parts using the XPointer xmlns() scheme[13]. The XPath expression is evaluated with a context position and size of 1, and a context node that is the element containing the XPointer.
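As an illustration of how the namespace context might be assembled, the Python sketch below splits a scheme-based pointer into its parts and collects the xmlns() bindings that appear to the left of a given part. It is a minimal sketch under stated assumptions, not a normative algorithm: the function names parse_pointer_parts and namespace_context are hypothetical, shorthand pointers are not handled, scheme names are only loosely validated, and the example namespace URI is invented. The circumflex escaping of parentheses follows the XPointer Framework[12].

```python
def parse_pointer_parts(pointer):
    """Split a scheme-based XPointer into (scheme_name, scheme_data) pairs."""
    parts = []
    i, n = 0, len(pointer)
    while i < n:
        if pointer[i].isspace():          # parts may be separated by whitespace
            i += 1
            continue
        open_idx = pointer.find("(", i)
        if open_idx == -1:
            raise ValueError("expected '(' after scheme name")
        name = pointer[i:open_idx]
        if not name or not all(c.isalnum() or c in "_-.:" for c in name):
            raise ValueError("bad scheme name: %r" % name)
        depth, j, data = 1, open_idx + 1, []
        while j < n:
            ch = pointer[j]
            if ch == "^":                 # ^( ^) ^^ escape the next character
                if j + 1 >= n or pointer[j + 1] not in "()^":
                    raise ValueError("bad circumflex escape")
                data.append(pointer[j + 1])
                j += 2
                continue
            if ch == "(":                 # balanced parentheses need no escape
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth == 0:            # closing parenthesis of this part
                    break
            data.append(ch)
            j += 1
        if depth != 0:
            raise ValueError("unbalanced parentheses in scheme data")
        parts.append((name, "".join(data)))
        i = j + 1
    return parts


def namespace_context(parts, index):
    """Collect xmlns() bindings declared to the left of parts[index]."""
    bindings = {}
    for name, data in parts[:index]:
        if name == "xmlns":
            prefix, sep, uri = data.partition("=")
            if sep:
                bindings[prefix.strip()] = uri.strip()  # later parts override
    return bindings


parts = parse_pointer_parts(
    "xmlns(x=http://example.com/ns) xpath1(//x:item[position()=2])")
print(parts[1])
print(namespace_context(parts, 1))
```

The resulting prefix-to-URI mapping is what an XPath processor would be given as the namespace context when evaluating the xpath1() part.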

Evaluating the expression contained in xpath1schemedata in that namespace and document context returns a location-set or an error. If the XPath expression identifies a single node, then the location-set contains a single location corresponding to that node. If the XPath expression identifies a node-set, this XPointer identifies a location-set corresponding to the node-set. If the result of the XPath expression is something other than a node or node-set, the XPointer result is an error.
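The result handling described above can be sketched in Python as follows. This is an illustration only. It assumes a hypothetical evaluate callable that wraps some XPath 1.0 processor, returns node-sets as Python lists, returns strings, numbers and booleans as ordinary Python values, and raises PointerPartFailure for its own errors. It also assumes the XPointer Framework[12] behavior of trying pointer parts from left to right until one identifies at least one location.

```python
class PointerPartFailure(Exception):
    """Signals that one pointer part failed; later parts may still succeed."""


def to_location_set(result):
    """Map an XPath 1.0 result onto a location-set, or fail the pointer part."""
    if isinstance(result, list):      # assumed representation of a node-set
        return list(result)
    # Strings, numbers and booleans are valid XPath results but not locations.
    raise PointerPartFailure("result is not a node-set: %r" % (result,))


def resolve(parts, evaluate):
    """Return the location-set of the first xpath1() part that finds nodes.

    `parts` is a list of (scheme_name, scheme_data) pairs in pointer order.
    """
    for scheme, data in parts:
        if scheme != "xpath1":
            continue                  # other schemes are outside this sketch
        try:
            locations = to_location_set(evaluate(data))
        except PointerPartFailure:
            continue                  # this part failed; try the next one
        if locations:
            return locations
    raise LookupError("no pointer part identified any location")


# Stand-in evaluator: a lookup table instead of a real XPath processor.
first, second = object(), object()
table = {"count(//a)": 2.0, "//missing": [], "//a": [first, second]}
parts = [("xpath1", "count(//a)"), ("xpath1", "//missing"), ("xpath1", "//a")]
print(len(resolve(parts, table.__getitem__)))
```

In this example the first part fails because its result is a number, the second selects nothing, and the third supplies the location-set.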

XPath resolution may return varying results depending on the parser used to feed the XPath processor. As non-validating parsers are not required to retrieve and incorporate external parsed entities, the document presented to the XPath processor may be incomplete. Differing interpretations of XInclude[9] processing may also cause difficulties, though that may be addressed using the xinclude1()[16] scheme.


5. Relation to MIME Media Types

MIME Media type registrations should indicate whether or not the xpath1() scheme is applicable to their contents. While this scheme is obviously most directly connected to XML registrations made in accordance with RFC 3023[5], it could conceivably be used with any registration made in accordance with RFC 2046[1] and RFC 2048[2], provided that the registration provides an explicit mapping between an XML structure and the contents of the type.


6. XPath Versions

At the time of writing, the W3C is developing Version 2.0 of XPath[8]. While this version has many more features, it also has many more dependencies and will require a much more sophisticated processor. Developers who want to use XPath 2.0 for fragment identifiers should define a new XPointer scheme (perhaps xpath2()). All xpath1() scheme processors must reject expressions which do not conform to the syntax and feature set of XPath 1.0[7].


7. Considerations for Streaming Processing

Current versions of XPath provide support for navigation from later points in a document to prior points. This is a useful set of tools, but only in contexts where the complete document (or a substantial portion of it) remains available throughout expression processing. The xpath1() scheme supports that functionality and therefore makes a purely streaming processor difficult to write. Developers who need to be able to support streaming processing of fragment identifiers should consider or create other schemes.
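A processor that buffers only when necessary might first scan an expression for steps that look backward in the document. The Python sketch below is a heuristic illustration, not a streamability test: the function name backward_steps and the choice of which axes to flag are assumptions made here, and an expression that passes may still need buffering (for example, a predicate that uses last() or the following axis).

```python
import re

_LITERAL = re.compile(r"\"[^\"]*\"|'[^']*'")
# Axes that move toward earlier or enclosing nodes, plus the ".." abbreviation.
_BACKWARD = re.compile(
    r"(ancestor-or-self|ancestor|preceding-sibling|preceding|parent)\s*::|\.\.")


def backward_steps(expr):
    """List the backward-looking steps found in an XPath 1.0 expression."""
    # Blank out string literals so their contents are never inspected.
    stripped = _LITERAL.sub("''", expr)
    return [m.group(1) or ".." for m in _BACKWARD.finditer(stripped)]


print(backward_steps("//section/title"))
print(backward_steps("//title/../preceding-sibling::para"))
```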


8. Conformance

This specification normatively depends on the XPointer Framework[12], except insofar as it rejects the claim in Section 3.3 that "this specification reserves all scheme names for definition in additional W3C XPointer scheme specifications", and also normatively depends on XPath 1.0[7]. It also normatively depends on the XPointer xmlns() scheme[13]. XPointer processors claiming to conform to this specification must also conform to the xmlns() specification.

The scheme data for the xpath1() scheme conforms to this specification if it is processable using XPath 1.0[7], with support for multi-rooted external parsed entities if applied in that context.

Conforming XPointer processors claiming to support the xpath1() scheme must conform to the behavior defined in this specification and may conform to additional XPointer scheme specifications, including in particular the xpointer() scheme[15] which is a superset of the xpath1() scheme.


9. Security Considerations

While it is conceivable that faulty XPath expressions could be used to overflow or overload XPath processors used in processing of the xpath1() scheme data, the author is not aware of any such assaults having taken place. XPath processors should be robust enough to handle or reject XPaths of all kinds in order to process this scheme.


10. IANA Considerations




Appendix A. Acknowledgements

Thanks to Uche Ogbuji and Eric van der Vlist for inspiration. Michael Kay provided much guidance on defining the XPath processing context, and discussion on xml-dev has made some complications (especially around XInclude and entity processing) much clearer.

