TreePath and glob()

glob(), as I said in the article "TreePath -- a universal tree navigation language", has its good points but lacks power. It is time for an upgrade.

Replacing glob()

The idea here is not to replace the glob() function at an API level, since using TreePath can only ever be about 95% compatible with glob() as it currently stands.

Instead, I recommend creating a pair of functions (maybe simplefspath and complexfspath) to be used alongside glob(), which will hopefully gradually supersede glob() in most programs.

simplefspath/Simple TreePath

The functionality of glob() is a natural match for the Simple TreePath function. The difficulty is that there needs to be a way of specifying a default attribute type and a default child type for each node class. However, this problem is not insurmountable.

To minimise trouble for other programmers, I would suggest that those implementing simplefspath do their best to make it behave like glob(). This would mean that:

  • The default attribute would always be Name (i.e. the filename).
  • The default child would be "*" (i.e. anything).
  • The default comparison function would need to be a string comparison function that allows most glob() wildcards, such as "*" and "?". However, the one wildcard that would be utterly incompatible with TreePath is the "[]" wildcard, since TreePath puts square brackets to a different use.
  • When a set of nodes is returned by the Simple TreePath function, the nodes should be translated into path strings before being returned from the simplefspath function.

Following the suggestions outlined above would mean that, in 90% or more of cases, a call to glob() could simply be replaced with a call to simplefspath() (with the appropriate initialisation function called beforehand), and the program would continue to work as it always has.
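To make the list above a little more concrete, here is a rough Python sketch of the glob()-compatible subset only. It is not the TreePath API (which has not been specified yet). The function body, the helper _name_matcher and the decision to skip hidden files the way glob() does are all my own assumptions for illustration. Each step of the path is treated as a comparison against the Name attribute, only "*" and "?" act as wildcards, "[]" has no special meaning, and the resulting nodes are handed back as path strings.

```python
import os
import re


def _name_matcher(pattern):
    """Hypothetical helper: compile a Name pattern where only '*' and '?'
    are wildcards. '[' and ']' are matched literally (no character classes)."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts) + r"\Z", re.DOTALL)


def simplefspath(path):
    """Sketch of a glob()-like lookup: returns matching paths as strings.

    Assumes '/' as the separator. Predicates and every other TreePath
    feature are deliberately left out.
    """
    steps = [s for s in path.split("/") if s]
    # Start from the root for absolute paths, else the current directory.
    nodes = ["/" if path.startswith("/") else ""]
    for step in steps:
        match = _name_matcher(step).match
        next_nodes = []
        for node in nodes:
            directory = node or "."
            if not os.path.isdir(directory):
                continue
            for name in sorted(os.listdir(directory)):
                # Assumption: mimic glob(), where wildcards skip hidden files
                # unless the pattern itself starts with a dot.
                if name.startswith(".") and not step.startswith("."):
                    continue
                if match(name):
                    next_nodes.append(os.path.join(node, name))
        nodes = next_nodes
    return nodes
```

A call such as simplefspath("/home/fred/*.txt") would then behave much like the equivalent glob() call, while a pattern containing "[" is compared literally instead of being read as a character class.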

Shells

The real problem would arise for those who write programming languages whose syntax includes globs. I'm thinking here particularly of shells. I'll speak about "bash" (the Bourne Again SHell) from now on, as that's what I'm familiar with.

Basically, each shell would need a way to specify which type of path is being used. I don't have any bright ideas for them, unfortunately. I see their options as being:

  1. Ignore TreePath altogether (not the recommended option).
  2. Change things, and break every shell script that uses [] in its globbing (also not the recommended option).
  3. Have a way of specifying TreePath paths. This might involve Unicode characters, or they might have a spare punctuation mark up their sleeve. It might also work to have the word simplefspath or complexfspath, followed by a colon (possibly inside a string?), specify a path of that type. For example: simplefspath:/home/fred/*.txt[CanRead("John")]
  4. In addition to the above, change things so that the path type can be specified as being one of the three (glob, simplefspath or complexfspath), but leave the default as "glob". Deprecate the "glob" type. Then, at some future point (after issuing warnings for many years), change the default to simplefspath (but leave the default configurable). This will give people time to change their shell scripts, and will mean that those who don't want to switch their glob type can maintain it by configuring a different default.
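Options 3 and 4 can be sketched in a few lines of Python. This is only an illustration of the prefix idea, not a proposal for how any real shell should parse its input. The names PATH_TYPES and split_path_type are hypothetical, and I am assuming the prefix is separated from the path by a single colon.

```python
# Hypothetical set of path types a shell might recognise.
PATH_TYPES = ("glob", "simplefspath", "complexfspath")


def split_path_type(word, default="glob"):
    """Split a shell word into (path_type, path).

    A recognised prefix followed by a colon selects the path type.
    Anything else falls back to the configurable default.
    """
    if default not in PATH_TYPES:
        raise ValueError("unknown default path type: %r" % (default,))
    prefix, sep, rest = word.partition(":")
    if sep and prefix in PATH_TYPES:
        return prefix, rest
    # No recognised prefix: the whole word is a path of the default type.
    return default, word
```

With the default left as "glob", split_path_type('simplefspath:/home/fred/*.txt[CanRead("John")]') selects simplefspath for that one word, while an unprefixed word such as /home/fred/*.txt is still treated as a glob. Changing the default argument is what the eventual switch in option 4 would amount to.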

complexfspath/Complex TreePath

Having considered the details above, I hope the implementation of complexfspath will be obvious once the API is available. The TreePath API will be specified later in this series.