Ool - Out-of-line XML

Packages
com.simonstl.ool: Code for converting XML documents into separate markup and text - and back again.

 

Introduction | Examples | Warnings | Future Directions | License | Acknowledgments | Download

Introduction - Why Ool?

XML pretty much thrives on embedded markup, with tags intermingled with content. While this has worked well, it's certainly not the only way and perhaps not the best way. Separating markup from text offers a different set of possibilities. Ool creates "out-of-line" markup, which points into a separate file containing the text.

Ool is a first step, and only a first step. Ool makes it possible to work with markup and text separately, but it's a very simple framework. You can do lots of funky tricks with Ool, especially on the recombination side, but fundamentally it's extremely simple, even brain-dead.

Examples

Ool is a set of filters which separate the textual content of XML documents (just the content in the elements, not the attributes) from the markup. One set of filters converts an XML document into separate markup and text files; another filter recombines the two parts.

We'll start with a very simple document:

<test xmlns="http://simonstl.com/ns/test/">
<message>Hello!  This document contains a gYearMonth.</message>
<gYearMonth >1970-11</gYearMonth>
<myYearMonth>1970-11</myYearMonth>
</test>

The text in the document, when separated from the markup, looks like:

Hello!  This document contains a gYearMonth.
1970-11
1970-11

A first pass over the document (with the OolBaseSAXFilter) produces:

<?xml version="1.0" standalone="yes"?>

<test xmlns="http://simonstl.com/ns/test/">
<ool:text ool:start="0" ool:end="1" xmlns:ool="http://simonstl.com/ns/ool/"></ool:text>
<message>
<ool:text ool:start="1" ool:end="45" xmlns:ool="http://simonstl.com/ns/ool/"></ool:text>
</message>
<ool:text ool:start="45" ool:end="46" xmlns:ool="http://simonstl.com/ns/ool/"></ool:text>
<gYearMonth>
<ool:text ool:start="46" ool:end="53" xmlns:ool="http://simonstl.com/ns/ool/"></ool:text>
</gYearMonth>
<ool:text ool:start="53" ool:end="54" xmlns:ool="http://simonstl.com/ns/ool/"></ool:text>
<myYearMonth>
<ool:text ool:start="54" ool:end="61" xmlns:ool="http://simonstl.com/ns/ool/"></ool:text>
</myYearMonth>
<ool:text ool:start="61" ool:end="62" xmlns:ool="http://simonstl.com/ns/ool/"></ool:text>
</test>
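
The offsets in that output can be reproduced with an ordinary SAX handler. What follows is a minimal sketch of the idea, not the OolBaseSAXFilter code itself: the class name OffsetSketch and its Collector and collect members are hypothetical, and it assumes that every start or end tag closes the current run of text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch, not part of com.simonstl.ool.
public class OffsetSketch {

    /** Gathers all element text plus the [start, end) offsets of each text run. */
    public static class Collector extends DefaultHandler {
        public final StringBuilder text = new StringBuilder();
        public final List<int[]> spans = new ArrayList<>();
        private int runStart = -1; // -1 means no text run is open

        @Override
        public void characters(char[] ch, int start, int length) {
            if (length == 0) return;
            // A parser may deliver one run in several chunks, so only mark the first.
            if (runStart < 0) runStart = text.length();
            text.append(ch, start, length);
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            closeRun();
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            closeRun();
        }

        // Any tag ends the current run; record where it started and stopped.
        private void closeRun() {
            if (runStart >= 0) {
                spans.add(new int[] {runStart, text.length()});
                runStart = -1;
            }
        }
    }

    /** Parses the document and returns the collected text and spans. */
    public static Collector collect(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            Collector collector = new Collector();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), collector);
            return collector;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<test xmlns=\"http://simonstl.com/ns/test/\">\n"
                + "<message>Hello!  This document contains a gYearMonth.</message>\n"
                + "<gYearMonth >1970-11</gYearMonth>\n"
                + "<myYearMonth>1970-11</myYearMonth>\n"
                + "</test>";
        // Prints one "start to end" pair per text run, including the line breaks.
        for (int[] span : collect(xml).spans) {
            System.out.println(span[0] + " to " + span[1]);
        }
    }
}
```

The one-character runs are the line breaks between elements, which is why the first pass produces so many small ool:text elements.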

That's pretty verbose, so a second filter, OolCleanSAXFilter (still under construction), combines some of the ool:text elements with their parents:

<?xml version="1.0" standalone="yes"?>

<test xmlns="http://simonstl.com/ns/test/">
<ool:text ool:start="0" ool:end="1" xmlns:ool="http://simonstl.com/ns/ool/"></ool:text>
<message ool:start="1" ool:end="45" xmlns:ool="http://simonstl.com/ns/ool/"></message>
<ool:text ool:start="45" ool:end="46" xmlns:ool="http://simonstl.com/ns/ool/"></ool:text>
<gYearMonth ool:start="46" ool:end="53" xmlns:ool="http://simonstl.com/ns/ool/"></gYearMonth>
<ool:text ool:start="53" ool:end="54" xmlns:ool="http://simonstl.com/ns/ool/"></ool:text>
<myYearMonth ool:start="54" ool:end="61" xmlns:ool="http://simonstl.com/ns/ool/"></myYearMonth>
</test>

Recombining the text with the document (using oolRemesh and the -v option) produces:

<?xml version="1.0" standalone="yes"?>

<test xmlns="http://simonstl.com/ns/test/">
<ool:text ool:start="0" ool:end="1" xmlns:ool="http://simonstl.com/ns/ool/">
</ool:text>
<message ool:start="1" ool:end="45" xmlns:ool="http://simonstl.com/ns/ool/">Hello!  This document contains a gYearMonth.</message>
<ool:text ool:start="45" ool:end="46" xmlns:ool="http://simonstl.com/ns/ool/">
</ool:text>
<gYearMonth ool:start="46" ool:end="53" xmlns:ool="http://simonstl.com/ns/ool/">1970-11</gYearMonth>
<ool:text ool:start="53" ool:end="54" xmlns:ool="http://simonstl.com/ns/ool/">
</ool:text>
<myYearMonth ool:start="54" ool:end="61" xmlns:ool="http://simonstl.com/ns/ool/">1970-11</myYearMonth>
</test>

Since the recombined document doesn't need the ool elements and attributes, the recombination step can also produce:

<?xml version="1.0" standalone="yes"?>

<test xmlns="http://simonstl.com/ns/test/">


<message>Hello!  This document contains a gYearMonth.</message>


<gYearMonth>1970-11</gYearMonth>


<myYearMonth>1970-11</myYearMonth>
</test>

(The extra whitespace comes from the output tool used by the SAX filter at the command-line, and won't appear if the filters are used in a strictly SAX environment.)
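
Recombination amounts to looking up each start/end pair in the text file. The sketch below shows that idea with the DOM instead of SAX. It is an illustration under assumptions, not the oolRemesh code: the RemeshSketch class and its remesh method are hypothetical names, and the sketch always strips the ool markup rather than offering a -v style option.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch, not part of com.simonstl.ool.
public class RemeshSketch {
    static final String OOL_NS = "http://simonstl.com/ns/ool/";

    /** Parses the out-of-line markup and fills in text from the separate text string. */
    public static Document remesh(String ooledXml, String text) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(ooledXml)));
            fill(doc.getDocumentElement(), text);
            return doc;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not recombine document", e);
        }
    }

    private static void fill(Element el, String text) {
        // Copy the child list first, because ool:text children get replaced below.
        NodeList kids = el.getChildNodes();
        List<Node> snapshot = new ArrayList<>();
        for (int i = 0; i < kids.getLength(); i++) snapshot.add(kids.item(i));
        for (Node kid : snapshot) {
            if (kid instanceof Element) fill((Element) kid, text);
        }

        if (!el.hasAttributeNS(OOL_NS, "start") || !el.hasAttributeNS(OOL_NS, "end")) return;
        int start = Integer.parseInt(el.getAttributeNS(OOL_NS, "start"));
        int end = Integer.parseInt(el.getAttributeNS(OOL_NS, "end"));
        String slice = text.substring(start, end);

        boolean isOolText = OOL_NS.equals(el.getNamespaceURI()) && "text".equals(el.getLocalName());
        if (isOolText && el.getParentNode() instanceof Element) {
            // A standalone ool:text element becomes a plain text node.
            el.getParentNode().replaceChild(el.getOwnerDocument().createTextNode(slice), el);
        } else {
            // A cleaned element keeps its name and receives the text as its content.
            el.removeAttributeNS(OOL_NS, "start");
            el.removeAttributeNS(OOL_NS, "end");
            el.setTextContent(slice);
        }
    }

    public static void main(String[] args) {
        String text = "\nHello!  This document contains a gYearMonth.";
        String ooled = "<test xmlns=\"http://simonstl.com/ns/test/\""
                + " xmlns:ool=\"http://simonstl.com/ns/ool/\">"
                + "<ool:text ool:start=\"0\" ool:end=\"1\"/>"
                + "<message ool:start=\"1\" ool:end=\"45\"/></test>";
        Document doc = remesh(ooled, text);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

The sample markup in main has no line breaks between elements, so the only whitespace in the result is what came from the text string.";
org.w3c.dom.Document doc = RemeshSketch.remesh(ooled, text);
if (!doc.getDocumentElement().getTextContent().equals(text)) throw new AssertionError("full text");
org.w3c.dom.Element msg = (org.w3c.dom.Element) doc.getElementsByTagNameNS("http://simonstl.com/ns/test/", "message").item(0);
if (!msg.getTextContent().equals("Hello!  This document contains a gYearMonth.")) throw new AssertionError("message text");
if (msg.hasAttributeNS("http://simonstl.com/ns/ool/", "start")) throw new AssertionError("attribute not removed");
if (doc.getElementsByTagNameNS("http://simonstl.com/ns/ool/", "text").getLength() != 0) throw new AssertionError("ool:text remains");
</test>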

Warnings

Ool is thoroughly experimental, liable to change, and idiosyncratic. Ool is primarily a thought experiment which may not correspond to your needs.

Documents created with Ool contain fragile pointers. You cannot change the text document and expect the pointers to point to the right place. Eventually I'm hoping to build tools which do a better job of keeping track, but they're not here yet.
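
To see how easily the pointers break, take the text from the example above and change one word. This is a small illustrative sketch (the FragileOffsets class is hypothetical): it shows the same start/end pair returning different characters once the text before it grows.

```java
// Hypothetical sketch, not part of com.simonstl.ool.
public class FragileOffsets {

    /** Looks up a start/end pair the way an ool pointer would. */
    public static String slice(String text, int start, int end) {
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String original = "\nHello!  This document contains a gYearMonth.\n1970-11\n1970-11\n";
        // In the untouched text, 46 to 53 is the content of gYearMonth.
        System.out.println(slice(original, 46, 53));

        // Lengthen the greeting by six characters without updating the markup.
        String edited = original.replace("Hello!", "Hello there!");
        // The same pointer now lands six characters too early.
        System.out.println(slice(edited, 46, 53));
        // The date is still there, but only at the shifted offsets.
        System.out.println(slice(edited, 52, 59));
    }
}
```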

Future Directions

Ool is currently in alpha. The core functionality works, but I still plan to add a way to specify source files inside the "ooled" document. There's still potential for improvement, expansion, and, as always, better documentation. A MOE version is also planned.

License

Ool is distributed under the Mozilla Public License 1.1. For more information, see the javadoc.

Acknowledgments

Ted Nelson provided inspiration.

Download

A download is available.
