TreePath -- a universal tree navigation language
I'm not a die-hard XML fan. It has its place, but it's not the best thing ever. But there is one thing about the XML milieu that I really do like.
XPath.
When I got a grip on XPath, I was very impressed. I could immediately see the relationship between an XPath interpreter and the Unix glob() function; not that they have a lot of similarity, but that they perform similar roles in tree navigation.
However, each of them seems to have advantages that the other lacks. But this gives me an idea...
TreePath, a Universal Tree Navigation Language
If a universal tree navigation language were designed, it would be possible to attach it to an XML backend (this would be like XPath), or to a filesystem backend (this would be glob()), or to an LDAP backend, and possibly other backends in the future; all kinds of people could use it to navigate their trees.
My plan is to discuss the advantages and disadvantages of both XPath and glob(), and then see if I can't come up with a better idea.
XPath advantages
The advantages of XPath over glob() are legion. XPath allows you to do filtering at each node based on its attributes. It allows a wide variety of comparison functions and operators, and even allows the user to define their own callback functions for use with it. This gives it even more flexibility than the Unix "find" program, as far as node selection goes (although "find" has other abilities). But imagine being able to do something like:
/home/fred/*[IsLink()]/*[IsDir()]/*[User = "john" and ends-with(Name, ".html")]
That would find fred's home directory, get all links, get all the directories contained in the links, and then get all files where the user was john and the filename ends with ".html".
I won't go further into the advantages of XPath, but the example above should hopefully show what glob() would be like if it had some of XPath's power.
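To make the step-by-step filtering concrete, here is a minimal Python sketch of how such a path might be evaluated against a real filesystem. The helper names (step, html_under_linked_dirs) are invented for this illustration and are not part of any existing library or of TreePath itself. The User = "john" test is left out because looking up file ownership is platform dependent.

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import Callable, Iterable, List

Predicate = Callable[[Path], bool]


def step(nodes: Iterable[Path], pattern: str, keep: Predicate) -> List[Path]:
    """Apply one path step: children of each node that match both
    the glob-style `pattern` and the `keep` predicate."""
    found: List[Path] = []
    for node in nodes:
        if not node.is_dir():  # is_dir() follows symlinks; leaves have no children
            continue
        for child in sorted(node.iterdir()):
            if fnmatch(child.name, pattern) and keep(child):
                found.append(child)
    return found


def html_under_linked_dirs(start: Path) -> List[Path]:
    """Rough equivalent of: start/*[IsLink()]/*[IsDir()]/*[ends-with(Name, ".html")]"""
    links = step([start], "*", lambda p: p.is_symlink())
    dirs = step(links, "*", lambda p: p.is_dir())
    return step(dirs, "*", lambda p: p.name.endswith(".html"))
```

Calling html_under_linked_dirs(Path("/home/fred")) would walk the same three steps as the path expression above, one predicate per step.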
glob() advantages
However, glob() has at least three big advantages:
- Everyone understands it already
- It's dead simple
- When it maps a glob specification to a node, it can give a single path that will identify that one node and no other.
There's not a lot we can do about point 1 above, but point 2 bears some thinking about. The example in the section above is how an XPath enthusiast would've explained it. But what you'd really end up with is more like:
/Directory[Name = "home"]/Directory[Name = "fred"]/Link/Directory/*[User = "john" and ends-with(Name, ".html")]
That's all a bit confusing for someone familiar only with glob() and not XPath. But it gets worse in simpler cases. For example, take this file path:
/home/fred/file.txt
With an XPath-like notation, it would be:
/Directory[Name = "home"]/Directory[Name = "fred"]/File[Name = "file.txt"]
...and seriously, who wants to select files like that.
As for what to do with point 3 above, I'll come back to that later.
Solutions
I want to propose not one tree navigation path specification, but two, named:
- Simple TreePath
- Complex TreePath
I'll discuss the complex one first, because the article will make more sense that way.
Complex TreePath
The idea of Complex TreePath is that it is a framework within which XPath can be coded. While it won't have many of the functions that XPath specifies, it will have hooks so that these functions can be implemented relatively easily.
It would also be possible to implement for filesystems (or LDAP), for use in those cases where Simple TreePath just isn't up to the job.
Or to put it very simply, it makes XPath work with any kind of tree.
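As a rough illustration of what "hooks" could mean here, the Python sketch below separates a generic path evaluator from a backend that knows how to list children and read attributes. Everything in it (TreeBackend, Step, evaluate, DictBackend, attr_equals) is a hypothetical name made up for this example, not a proposed API. The backend shown is a toy in-memory tree; a filesystem, XML, or LDAP backend would supply the same four methods.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List, Optional, Protocol


class TreeBackend(Protocol):
    """Hooks a backend (XML, filesystem, LDAP, ...) would have to provide."""
    def root(self) -> Any: ...
    def children(self, node: Any) -> Iterable[Any]: ...
    def node_class(self, node: Any) -> str: ...
    def attribute(self, node: Any, name: str) -> Any: ...


Predicate = Callable[[TreeBackend, Any], bool]


@dataclass
class Step:
    node_class: str                       # "*" matches any class
    predicate: Optional[Predicate] = None  # the part inside [...]


def evaluate(backend: TreeBackend, steps: List[Step]) -> List[Any]:
    """Walk the tree one step at a time, filtering children at each level."""
    current = [backend.root()]
    for step in steps:
        matched = []
        for node in current:
            for child in backend.children(node):
                if step.node_class not in ("*", backend.node_class(child)):
                    continue
                if step.predicate is None or step.predicate(backend, child):
                    matched.append(child)
        current = matched
    return current


def attr_equals(name: str, value: Any) -> Predicate:
    """Build a predicate equivalent to [name = value]."""
    return lambda backend, node: backend.attribute(node, name) == value


class DictBackend:
    """Toy backend: nodes are dicts with 'class', 'attrs' and 'children'."""
    def __init__(self, tree: dict) -> None:
        self._tree = tree

    def root(self) -> dict:
        return self._tree

    def children(self, node: dict) -> list:
        return node.get("children", [])

    def node_class(self, node: dict) -> str:
        return node["class"]

    def attribute(self, node: dict, name: str) -> Any:
        return node["attrs"].get(name)


TREE = {"class": "Root", "attrs": {}, "children": [
    {"class": "Directory", "attrs": {"Name": "home"}, "children": [
        {"class": "Directory", "attrs": {"Name": "fred"}, "children": [
            {"class": "File", "attrs": {"Name": "file.txt"}},
            {"class": "File", "attrs": {"Name": "index.html"}},
        ]},
    ]},
]}

# /Directory[Name = "home"]/Directory[Name = "fred"]/File[Name = "file.txt"]
matches = evaluate(DictBackend(TREE), [
    Step("Directory", attr_equals("Name", "home")),
    Step("Directory", attr_equals("Name", "fred")),
    Step("File", attr_equals("Name", "file.txt")),
])
```

The evaluator never touches the tree directly, which is the point: swapping DictBackend for another backend would leave evaluate unchanged.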
Simple TreePath
Simple TreePath is a much simpler language that would be translated into Complex TreePath via a few simple transforms.
The basic idea is that, for each node class, there is a default attribute. Rather than specifying the class for each node and then selecting by attribute (which created the mess above in the "glob() advantages" section), you would simply specify the value of the default attribute.
This would mean that filenames could again be specified as /home/fred/file.txt. But it would still be much more powerful than a standard glob(), as most of the full power of Complex TreePath would still be available inside the brackets (square brackets) after each element in the path.
It could also be used in selecting nodes from XML documents and LDAP databases, for ease of access.
Or, to make it simple, paths like filesystem paths would work with any kind of tree.
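The translation from Simple to Complex TreePath described in this section might look something like the following Python sketch. It rests on several assumptions that the article does not settle: a single default attribute called Name for every node class, absolute paths only, at most one bracketed predicate per step, and an output that uses * for the node class because the class is not known without asking the backend. The function names (split_steps, expand_step, to_complex) are invented for the example.

```python
from typing import List


def split_steps(path: str) -> List[str]:
    """Split on "/" but leave anything inside [...] untouched."""
    steps: List[str] = []
    depth = 0
    current = ""
    for ch in path:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "/" and depth == 0:
            if current:
                steps.append(current)
            current = ""
        else:
            current += ch
    if current:
        steps.append(current)
    return steps


def expand_step(step: str, default_attr: str = "Name") -> str:
    """Turn  value[extra]  into  *[default_attr = "value" and extra]."""
    value, bracket, rest = step.partition("[")
    extra = rest[:-1] if bracket else ""  # drop the closing "]"
    tests = []
    if value != "*":  # a bare * means "any node", so no default-attribute test
        tests.append(f'{default_attr} = "{value}"')
    if extra:
        tests.append(extra)
    return "*" + (f"[{' and '.join(tests)}]" if tests else "")


def to_complex(path: str, default_attr: str = "Name") -> str:
    """Translate an absolute Simple TreePath into a Complex TreePath string."""
    return "/" + "/".join(expand_step(s, default_attr) for s in split_steps(path))


print(to_complex("/home/fred/file.txt"))
print(to_complex('/home/fred/*[IsLink()]'))
```

The first call prints /*[Name = "home"]/*[Name = "fred"]/*[Name = "file.txt"], and the second leaves the bracketed predicate on the wildcard step as it was.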
Summary
The XML community and LDAP community could benefit from a simpler path language. Every programmer and shell user could benefit from a more powerful path language. Other programmers working with trees could benefit from a selection language that is already mostly implemented for them. The ideas above would cover all of these cases.
More specifics will follow in future articles.
- wayland's blog