Hierarchical Configurations

Many sources of configuration data have a hierarchical or tree-like nature. They can represent data that is structured in many ways. Such configuration sources are represented by classes implementing the HierarchicalConfiguration interface. With BaseHierarchicalConfiguration there is a fully functional implementation of the interface from which most of the hierarchical configuration classes shipped with Commons Configuration are derived.

Prominent examples of hierarchical configuration sources are XML documents. They can be read and written using the XMLConfiguration class. This section explains how to deal with such structured data and demonstrates the enhanced query facilities supported by HierarchicalConfiguration. We use XML documents as examples for structured configuration sources, but the information provided here (especially the rules for accessing properties) applies to other hierarchical configurations as well. Examples for other hierarchical configuration classes are

Accessing properties in hierarchical configurations

We will start with a simple XML document to show some basics about accessing properties. The following file named gui.xml is used as example document:


<?xml version="1.0" encoding="ISO-8859-1" ?>
<gui-definition>
  <colors>
    <background>#808080</background>
    <text>#000000</text>
    <header>#008000</header>
    <link normal="#000080" visited="#800080"/>
    <default>${colors.header}</default>
  </colors>
  <rowsPerPage>15</rowsPerPage>
  <buttons>
    <name>OK</name>
    <name>Cancel</name>
    <name>Help</name>
  </buttons>
  <numberFormat pattern="###,###.##"/>
</gui-definition>

(As becomes obvious, this tutorial does not bother with good design of XML documents; the example file is only meant to demonstrate the different ways of accessing properties.) To access the data stored in this document, it must be loaded by XMLConfiguration. As with other file-based configuration classes, a FileBasedConfigurationBuilder is used for reading the source file, as shown in the following code fragment:


Parameters params = new Parameters();
FileBasedConfigurationBuilder<XMLConfiguration> builder =
    new FileBasedConfigurationBuilder<XMLConfiguration>(XMLConfiguration.class)
    .configure(params.xml()
        .setFileName("gui.xml"));
try
{
    XMLConfiguration config = builder.getConfiguration();
    ...
}
catch(ConfigurationException cex)
{
    // loading of the configuration file failed
}

If no exception was thrown, the properties defined in the XML document are now available in the configuration object. Other hierarchical configuration classes that operate on files can be loaded in an analogous way. The following fragment shows how the properties in the configuration object can be accessed:


String backColor = config.getString("colors.background");
String textColor = config.getString("colors.text");
String linkNormal = config.getString("colors.link[@normal]");
String defColor = config.getString("colors.default");
int rowsPerPage = config.getInt("rowsPerPage");
List<Object> buttons = config.getList("buttons.name");

This listing demonstrates some important points about constructing the keys for accessing properties in hierarchical configuration sources and about features of HierarchicalConfiguration in general:

Nested elements are accessed using a dot notation. In the example document there is an element <text> in the body of the <colors> element. The corresponding key is colors.text.
The root element is ignored when constructing keys. In the example you do not write gui-definition.colors.text, but only colors.text.
Attributes of XML elements are accessed in an XPath-like notation.
Interpolation can be used in the same way as for all other standard configuration implementations. Here the <default> element in the colors section refers to another color.
Lists of properties can be defined by just repeating elements. In this example the buttons.name property has the three values OK, Cancel, and Help, so it is queried using the getList() method. This works with attributes, too. In addition, a special ListDelimiterHandler implementation can be set which supports splitting texts at a specific list delimiter character. This works in the same way as described in the section about properties configuration. If this mode was used, the three button names could be defined in a single XML element. However, then the pattern attribute of the <numberFormat> element needs to escape the list delimiter which occurs in its content using a backslash character. With these changes the affected part of the XML document would look as follows:
```
  <buttons>
    <name>OK, Cancel, Help</name>
  </buttons>
  <numberFormat pattern="###\,###.##"/>
```
Because repeating elements is a natural pattern for XML documents, list splitting is rather unusual for this format.
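The effect of a list delimiter and its backslash escape can be illustrated with a small stand-alone sketch. This is not the library's ListDelimiterHandler implementation; the class ListSplitSketch and its split() method are hypothetical, and trimming the resulting parts is an assumption made so that the output matches the button names used above.

```java
import java.util.ArrayList;
import java.util.List;

final class ListSplitSketch {
    /**
     * Splits text at the delimiter. A backslash directly in front of the
     * delimiter makes it a literal character instead of a separator.
     */
    static List<String> split(String text, char delimiter) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean escapedDelimiter = c == '\\'
                && i + 1 < text.length()
                && text.charAt(i + 1) == delimiter;
            if (escapedDelimiter) {
                // keep the delimiter, drop the backslash
                current.append(delimiter);
                i++;
            } else if (c == delimiter) {
                // unescaped delimiter: finish the current value
                result.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        result.add(current.toString().trim());
        return result;
    }
}
```

With this sketch, split("OK, Cancel, Help", ',') yields the three values OK, Cancel, and Help, while the escaped pattern ###\,###.## stays a single value with the backslash removed.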

The next section will show how data in a more complex XML document can be processed.

Complex hierarchical structures

Consider the following scenario: An application operates on database tables and wants to load a definition of the database schema from its configuration. An XML document provides this information. It could look as follows:


<?xml version="1.0" encoding="ISO-8859-1" ?>

<database>
  <tables>
    <table tableType="system">
      <name>users</name>
      <fields>
        <field>
          <name>uid</name>
          <type>long</type>
        </field>
        <field>
          <name>uname</name>
          <type>java.lang.String</type>
        </field>
        <field>
          <name>firstName</name>
          <type>java.lang.String</type>
        </field>
        <field>
          <name>lastName</name>
          <type>java.lang.String</type>
        </field>
        <field>
          <name>email</name>
          <type>java.lang.String</type>
        </field>
      </fields>
    </table>
    <table tableType="application">
      <name>documents</name>
      <fields>
        <field>
          <name>docid</name>
          <type>long</type>
        </field>
        <field>
          <name>name</name>
          <type>java.lang.String</type>
        </field>
        <field>
          <name>creationDate</name>
          <type>java.util.Date</type>
        </field>
        <field>
          <name>authorID</name>
          <type>long</type>
        </field>
        <field>
          <name>version</name>
          <type>int</type>
        </field>
      </fields>
    </table>
  </tables>
</database>

This XML is quite self-explanatory: there is an arbitrary number of table elements, each of which has a name and a list of fields. A field in turn consists of a name and a data type. This XML document (let's call it tables.xml) can be loaded in exactly the same way as the simple document in the previous section.

When we now want to access some of the properties we face a problem: the syntax for constructing configuration keys we learned so far is not powerful enough to access all of the data stored in the tables document.

Because the document contains a list of tables some properties are defined more than once. E.g. the configuration key tables.table.name refers to a name element inside a table element inside a tables element. This constellation happens to occur twice in the tables document.

Multiple definitions of a property do not cause problems and are supported by all Configuration implementations. If such a property is queried using getProperty(), the method recognizes that there are multiple values for that property and returns a collection with all these values. So we could write


Object prop = config.getProperty("tables.table.name");
if(prop instanceof Collection)
{
	System.out.println("Number of tables: " + ((Collection<?>) prop).size());
}

An alternative to this code would be the getList() method of Configuration. If a property is known to have multiple values (as is the table name property in this example), getList() allows retrieving all values at once. Note: it is legal to call getString() or one of the other getter methods on a property with multiple values; it returns the first element of the list.

Accessing structured properties

Okay, we can obtain a list with the names of all defined tables. In the same way we can retrieve a list with the names of all table fields: just pass the key tables.table.fields.field.name to the getList() method. In our example this list would contain 10 elements, the names of all fields of all tables. This is fine, but how do we know which field belongs to which table?

When working with such hierarchical structures, the configuration keys used to query properties can have an extended syntax. Each component of a key can be followed by a numerical value in parentheses that determines the index of the affected property. So if we have two table elements, we can specify exactly which one we want to address by appending the corresponding index. This is explained best by some examples:

We will now provide some configuration keys and show the results of a getProperty() call with these keys as arguments.

tables.table(0).name: Returns the name of the first table (all indices are 0 based), in this example the string users.
tables.table(0)[@tableType]: Returns the value of the tableType attribute of the first table (system).
tables.table(1).name: Analogous to the first example returns the name of the second table (documents).
tables.table(2).name: Here the name of a third table is queried, but because there are only two tables the result is null. The fact that a null value is returned for invalid indices can be used to find out how many values are defined for a certain property: just increment the index in a loop as long as valid objects are returned.
tables.table(1).fields.field.name: Returns a collection with the names of all fields that belong to the second table. With keys of this kind it is now possible to find out which fields belong to which table.
tables.table(1).fields.field(2).name: The additional index after field selects a certain field. This expression represents the name of the third field in the second table (creationDate).
tables.table.fields.field(0).type: This key may be a bit unusual but nevertheless completely valid. It selects the data types of the first fields in all tables. So here a collection would be returned with the values [long, long].

These examples should make the usage of indices quite clear. Because each configuration key can contain an arbitrary number of indices it is possible to navigate through complex structures of hierarchical configurations; each property can be uniquely identified.
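How indices narrow down the set of matching nodes can be modeled with a toy tree. The following sketch is an illustration only and does not reflect how DefaultExpressionEngine is implemented: the classes IndexedKeySketch and Node, and the helpers table() and sample(), are hypothetical. Attributes are not supported, and an empty list is returned where the real getProperty() would return null.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IndexedKeySketch {
    /** Hypothetical toy node: a name, an optional value, and ordered children. */
    static final class Node {
        final String name;
        final String value;
        final List<Node> children = new ArrayList<>();

        Node(String name, String value) {
            this.name = name;
            this.value = value;
        }

        Node add(Node... newChildren) {
            Collections.addAll(children, newChildren);
            return this;
        }
    }

    // A key part is a name, optionally followed by a zero-based index in parentheses.
    private static final Pattern PART = Pattern.compile("([^.()]+)(?:\\((\\d+)\\))?");

    /** Returns the values of all nodes selected by the key, relative to the root. */
    static List<String> resolve(Node root, String key) {
        List<Node> current = Collections.singletonList(root);
        for (String part : key.split("\\.")) {
            Matcher m = PART.matcher(part);
            if (!m.matches()) {
                throw new IllegalArgumentException("Unsupported key part: " + part);
            }
            String name = m.group(1);
            Integer index = m.group(2) == null ? null : Integer.valueOf(m.group(2));
            List<Node> next = new ArrayList<>();
            for (Node parent : current) {
                List<Node> matches = new ArrayList<>();
                for (Node child : parent.children) {
                    if (child.name.equals(name)) {
                        matches.add(child);
                    }
                }
                if (index == null) {
                    next.addAll(matches);          // no index: follow all matches
                } else if (index < matches.size()) {
                    next.add(matches.get(index));  // index: follow exactly one match
                }
            }
            current = next;
        }
        List<String> values = new ArrayList<>();
        for (Node node : current) {
            if (node.value != null) {
                values.add(node.value);
            }
        }
        return values;
    }

    /** Builds a table node from alternating field names and types. */
    static Node table(String name, String... fieldNamesAndTypes) {
        Node fields = new Node("fields", null);
        for (int i = 0; i < fieldNamesAndTypes.length; i += 2) {
            fields.add(new Node("field", null).add(
                new Node("name", fieldNamesAndTypes[i]),
                new Node("type", fieldNamesAndTypes[i + 1])));
        }
        return new Node("table", null).add(new Node("name", name), fields);
    }

    /** A shortened version of the tables document used in this section. */
    static Node sample() {
        return new Node("database", null).add(new Node("tables", null).add(
            table("users", "uid", "long", "uname", "java.lang.String"),
            table("documents", "docid", "long", "name", "java.lang.String",
                "creationDate", "java.util.Date")));
    }
}
```

Calling resolve(sample(), "tables.table(1).fields.field(2).name") selects the single value creationDate, whereas resolve(sample(), "tables.table.fields.field(0).type") collects the type of the first field of every table.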

Sub Configurations

Sometimes dealing with long property keys may become inconvenient, especially if the same properties are accessed over and over. For this case HierarchicalConfiguration provides a shortcut with the configurationAt() method. This method can be passed a key that selects exactly one node of the hierarchy of nodes contained in a hierarchical configuration. Then a new hierarchical configuration will be returned whose root node is the selected node. So all property keys passed into that configuration should be relative to the new root node. For instance, if we are only interested in information about the first database table, we could do something like this:


HierarchicalConfiguration<ImmutableNode> sub = config.configurationAt("tables.table(0)");
String tableName = sub.getString("name");  // only need to provide relative path
List<Object> fieldNames = sub.getList("fields.field.name");

For dealing with complex list-like structures there is another shortcut. Often it will be necessary to iterate over all items in the list and access their (sub) properties. A good example is the fields of the tables in our demo configuration. When you want to process all fields of a table (e.g. for constructing a CREATE TABLE statement), you will need all information stored for them in the configuration. An option would be to use the getList() method to fetch the required data one by one:


List<Object> fieldNames = config.getList("tables.table(0).fields.field.name");
List<Object> fieldTypes = config.getList("tables.table(0).fields.field.type");
List<Object> ... // further calls for other data that might be stored in the config

But this is not very readable and will fail if not all field elements contain the same set of data (for instance, the type property may be optional, in which case the list of types can contain fewer elements than the other lists). A solution to these problems is the configurationsAt() method, a close relative of the configurationAt() method covered above. This method evaluates the passed-in key and collects all configuration nodes that match this criterion. Then for each node a HierarchicalConfiguration object is created with this node as root node. A list with these configuration objects is returned. As the following example shows, this comes in very handy when processing list-like structures:


List<HierarchicalConfiguration<ImmutableNode>> fields =
    config.configurationsAt("tables.table(0).fields.field");
for(HierarchicalConfiguration<ImmutableNode> sub : fields)
{
    // sub contains all data about a single field
    String fieldName = sub.getString("name");
    String fieldType = sub.getString("type");
    ...
}

By default, the configurations returned by the configurationAt() and configurationsAt() methods are snapshots of the data stored in the original configuration. If the original configuration is later changed, these changes are not visible in the sub configuration and vice versa. If configuration settings just need to be read, this is fine.

It is also possible to connect a sub configuration more directly to its original configuration. This is done by using overloaded versions of configurationAt() and configurationsAt() which accept an additional boolean parameter. If the value true is passed here, a special configuration implementation is returned (in fact, an instance of the SubnodeConfiguration class) that operates on the same data structures as the original configuration. Therefore, changes made on one configuration are directly reflected by the other one.

Connecting a sub configuration with its parent configuration in the described way is useful in use cases in which configurations are updated. However, there can be pretty drastic updates which break such a connection. As an example, consider the case that a sub configuration is created for a certain sub tree of an original configuration. Now this sub tree gets removed from the original configuration. In this case, the sub configuration becomes detached from its parent. Its content is not changed, but it is now again like a snapshot or a copy of the original. This is demonstrated again in the following example:


// sub points to the 2nd table
HierarchicalConfiguration<ImmutableNode> sub = config.configurationAt("tables.table(1)", true);
assertEquals("documents", sub.getString("name"));

// Now change name in parent configuration => should be visible in sub config
config.setProperty("tables.table(1).name", "tasks");
assertEquals("tasks", sub.getString("name"));

// Clear the whole content of the 2nd table
config.clearTree("tables.table(1)");
// The key used to create the sub configuration is no longer valid,
// so it is now detached; it contains the recent data.
assertEquals("tasks", sub.getString("name"));

This example uses the clearTree() method of HierarchicalConfiguration to remove all information about the second database table from the configuration data. While clearProperty() only removes the value of a property, clearTree() also removes all child elements and their children recursively. After this operation the key tables.table(1) specified when the sub configuration was created no longer points to an existing element; therefore, the sub configuration gets detached. Once detached, a sub configuration cannot be reconnected to its parent again. Even if another table element was added (making the sub key valid again), the sub configuration remains detached.

Adding new properties

So far we have learned how to use indices to avoid ambiguities when querying properties. The same problem occurs when adding new properties to a structured configuration. As an example let's assume we want to add a new field to the second table. New properties can be added to a configuration using the addProperty() method. Of course, we have to exactly specify where in the tree like structure new data is to be inserted. A statement like


// Warning: This might cause trouble!
config.addProperty("tables.table.fields.field.name", "size");

would not be sufficient because it does not contain all needed information. How is such a statement processed by the addProperty() method?

addProperty() splits the provided key into its single parts and navigates through the properties tree along the corresponding element names. In this example it will start at the root element and then find the tables element. The next key part to be processed is table, but here a problem occurs: the configuration contains two table properties below the tables element. To get rid of this ambiguity, an index can be specified at this position in the key that makes clear which of the two properties should be followed. For example, tables.table(1).fields.field.name would select the second table property. If an index is missing, addProperty() always follows the last available element. In our example this would be the second table, too.

The following parts of the key are processed in exactly the same manner. Under the selected table property there is exactly one fields property, so this step is not problematic at all. In the next step the field part has to be processed. At the current position in the properties tree there are multiple field (sub) properties. So here we have the same situation as for the table part. Because no explicit index is defined, the last field property is selected. The last part of the key passed to addProperty() (name in this example) will always be added as a new property at the position that has been reached in the preceding processing steps. So in our example the last field property of the second table would be given a new name sub property, and the resulting structure would look like the following listing:


	...
    <table tableType="application">
      <name>documents</name>
      <fields>
        <field>
          <name>docid</name>
          <type>long</type>
        </field>
        <field>
          <name>name</name>
          <type>java.lang.String</type>
        </field>
        <field>
          <name>creationDate</name>
          <type>java.util.Date</type>
        </field>
        <field>
          <name>authorID</name>
          <type>long</type>
        </field>
        <field>
          <name>version</name>
          <name>size</name>    <== Newly added property
          <type>int</type>
        </field>
      </fields>
    </table>
  </tables>
</database>

This result is obviously not what was desired, but it demonstrates how addProperty() works: the method follows an existing branch in the properties tree and adds new leaves to it. (If the passed in key does not match a branch in the existing tree, a new branch will be added. E.g. if we pass the key tables.table.data.first.test, the existing tree can be navigated until the data part of the key. From here a new branch is started with the remaining parts data, first and test.)

If we want a different behavior, we must explicitly tell addProperty() what to do. In our example with the new field, our intention was to create a new branch for the field part in the key, so that a new field property is added to the structure rather than adding sub properties to the last existing field property. This can be achieved by specifying the special index (-1) at the corresponding position in the key as shown below:


config.addProperty("tables.table(1).fields.field(-1).name", "size");
config.addProperty("tables.table(1).fields.field.type", "int");

The first line in this fragment specifies that a new branch is to be created for the field property (index -1). In the second line no index is specified for the field, so the last one is used - which happens to be the field that has just been created. So these two statements add a fully defined field to the second table. This is the default pattern for adding new properties or whole hierarchies of properties: first create a new branch in the properties tree and then populate its sub properties. As an additional example let's add a complete new table definition to our example configuration:


// Add a new table element and define the name
config.addProperty("tables.table(-1).name", "versions");

// Add a new field to the new table
// (an index for the table is not necessary because the latest is used)
config.addProperty("tables.table.fields.field(-1).name", "id");
config.addProperty("tables.table.fields.field.type", "int");

// Add another field to the new table
config.addProperty("tables.table.fields.field(-1).name", "date");
config.addProperty("tables.table.fields.field.type", "java.sql.Date");
...
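The navigation rules described in this section (follow the last matching element unless an index is given, start a new branch for the index -1 or when nothing matches, and always add the final key part as a new node) can be modeled with a toy tree. This sketch only mirrors the behavior as described in the text; it is not the library's implementation, and the names AddPropertySketch, Node, addChild(), and childrenNamed() are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class AddPropertySketch {
    /** Hypothetical toy node with a mutable value and ordered children. */
    static final class Node {
        final String name;
        String value;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        Node addChild(String childName) {
            Node child = new Node(childName);
            children.add(child);
            return child;
        }

        List<Node> childrenNamed(String childName) {
            List<Node> result = new ArrayList<>();
            for (Node child : children) {
                if (child.name.equals(childName)) {
                    result.add(child);
                }
            }
            return result;
        }
    }

    /** Adds a value below root, following the navigation rules described above. */
    static void addProperty(Node root, String key, String value) {
        String[] parts = key.split("\\.");
        Node current = root;
        // All parts except the last one select (or create) the branch to follow.
        for (int i = 0; i < parts.length - 1; i++) {
            String name = parts[i];
            Integer index = null;
            int open = name.indexOf('(');
            if (open >= 0) {
                index = Integer.valueOf(name.substring(open + 1, name.length() - 1));
                name = name.substring(0, open);
            }
            List<Node> matches = current.childrenNamed(name);
            if ((index != null && index == -1) || matches.isEmpty()) {
                current = current.addChild(name);            // start a new branch
            } else if (index != null) {
                current = matches.get(index);                // explicit index
            } else {
                current = matches.get(matches.size() - 1);   // last available element
            }
        }
        // The last part is always added as a new node.
        current.addChild(parts[parts.length - 1]).value = value;
    }
}
```

Applied to a tree with existing field nodes, addProperty(root, "tables.table.fields.field(-1).name", "size") followed by addProperty(root, "tables.table.fields.field.type", "int") creates one additional field node carrying both sub properties.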

For more information about adding properties to a hierarchical configuration also have a look at the javadocs for HierarchicalConfiguration.

Escaping special characters

Some characters in property keys or values require a special treatment.

By default, the dot character is used as delimiter by most configuration classes (we will learn how to change this for hierarchical configurations in a later section). In some configuration formats, however, dots can be contained in the names of properties. For instance, in XML the dot is a legal character that can occur in any tag. The same is true for the names of properties in Windows ini files. So the following XML document is completely valid:


<?xml version="1.0" encoding="ISO-8859-1" ?>

<configuration>
  <test.value>42</test.value>
  <test.complex>
    <test.sub.element>many dots</test.sub.element>
  </test.complex>
</configuration>

This XML document can be loaded by XMLConfiguration without trouble, but when we want to access certain properties we face a problem: The configuration claims that it does not store any values for the properties with the keys test.value or test.complex.test.sub.element!

Of course, it is the dot character contained in the property names, which causes this problem. A dot is always interpreted as a delimiter between elements. So given the property key test.value the configuration would look for an element named test and then for a sub element with the name value. To change this behavior it is possible to escape a dot character, thus telling the configuration that it is really part of an element name. This is simply done by duplicating the dot. So the following statements will return the desired property values:


int testVal = config.getInt("test..value");
String complex = config.getString("test..complex.test..sub..element");

Note the duplicated dots wherever the dot does not act as delimiter. This way it is possible to access properties containing dots in arbitrary combination. However, as you can see, the escaping can be confusing sometimes. So if you have a choice, you should avoid dots in the tag names of your XML configuration files or other configuration sources.
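The escaping rule can be made concrete with a small stand-alone sketch that splits a key into element names, treating a doubled dot as a literal dot. It is a simplified illustration, not the parser used by the library; the class KeyEscapeSketch and its splitKey() method are hypothetical, and indices and attributes are ignored.

```java
import java.util.ArrayList;
import java.util.List;

final class KeyEscapeSketch {
    /** Splits a key at single dots; a doubled dot becomes a literal dot in the name. */
    static List<String> splitKey(String key) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            boolean escapedDot = c == '.'
                && i + 1 < key.length()
                && key.charAt(i + 1) == '.';
            if (escapedDot) {
                current.append('.');   // escaped dot belongs to the element name
                i++;
            } else if (c == '.') {
                parts.add(current.toString());   // delimiter: finish the element name
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }
}
```

For the key test..complex.test..sub..element this produces the two element names test.complex and test.sub.element, which correspond to the nesting in the XML document above.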

Internal Representation

You might have noted that the HierarchicalConfiguration interface has a type parameter defining the type of nodes it operates on. Internally, the nodes build up a tree structure on which queries or manipulating operations can be executed. There is an abstract base class AbstractHierarchicalConfiguration implementing a major part of the provided functionality in terms of abstract node objects. These nodes are not directly accessed, but via a so-called NodeHandler. The NodeHandler interface defines a number of methods for accessing properties of a node like its name, its value, or its children in a generic way.

This constellation makes it possible to integrate hierarchical configurations with different hierarchical data structures, e.g. file systems, JNDI, etc. The standard configurations shipped with Commons Configuration mainly use an in-memory representation of their data based on the ImmutableNode class. There is a special base class for configurations of this type called BaseHierarchicalConfiguration. As the name implies, the nodes used by these configurations are immutable; this is beneficial especially when it comes to concurrent access. It is possible to obtain the root node of a BaseHierarchicalConfiguration object as shown in the following example, but this is necessary for very special use cases only because the most typical queries and manipulations can be done via the HierarchicalConfiguration interface.


// config is of type BaseHierarchicalConfiguration
ImmutableNode root = config.getNodeModel().getNodeHandler().getRootNode();

Expression engines

In the previous chapters we saw many examples of how properties in an XMLConfiguration object (or, more generally, in a HierarchicalConfiguration object, because this is the base interface that defines this functionality) can be queried or modified using a special syntax for the property keys. Well, this was not the full truth. Actually, property keys are not processed by the configuration object itself, but are delegated to a helper object, a so-called expression engine.

The separation of the task of interpreting property keys into a helper object is a typical application of the Strategy design pattern. In this case it also has the advantage that it becomes possible to plug in different expression engines into a HierarchicalConfiguration object. So by providing different implementations of the ExpressionEngine interface hierarchical configurations can support alternative expression languages for accessing their data.

Before we discuss the available expression engines that ship with Commons Configuration, it should be explained how an expression engine can be associated with a configuration object. HierarchicalConfiguration and all implementing classes provide a setExpressionEngine() method, which expects an implementation of the ExpressionEngine interface as argument. After this method was called, the configuration object will use the passed expression engine, which means that all property keys passed to methods like getProperty(), getString(), or addProperty() must conform to the syntax supported by this engine. Property keys returned by the getKeys() method will follow this syntax, too.
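The delegation of key interpretation to an exchangeable helper object can be sketched in a few lines. The names below (ExpressionStrategySketch, KeySyntax, ToyConfig, setSyntax()) are hypothetical stand-ins chosen for this illustration and are far simpler than the real ExpressionEngine interface; the toy configuration stores a single value under a fixed path.

```java
import java.util.Arrays;
import java.util.List;

final class ExpressionStrategySketch {
    /** Hypothetical strategy: turns a property key into a list of path elements. */
    interface KeySyntax {
        List<String> parts(String key);
    }

    static final KeySyntax DOTTED = key -> Arrays.asList(key.split("\\."));
    static final KeySyntax SLASHED = key -> Arrays.asList(key.split("/"));

    /** Toy configuration that delegates key parsing to the current strategy. */
    static final class ToyConfig {
        private final List<String> storedPath;
        private final String storedValue;
        private KeySyntax syntax = DOTTED;   // default strategy

        ToyConfig(List<String> storedPath, String storedValue) {
            this.storedPath = storedPath;
            this.storedValue = storedValue;
        }

        void setSyntax(KeySyntax syntax) {
            this.syntax = syntax;
        }

        /** Returns the stored value if the parsed key matches the stored path, else null. */
        String get(String key) {
            return storedPath.equals(syntax.parts(key)) ? storedValue : null;
        }
    }
}
```

After setSyntax(SLASHED) has been called, the toy configuration resolves colors/text instead of colors.text, which mirrors the statement that all keys must conform to the syntax of the engine currently set.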

The recommended approach is that a configuration object is fully initialized by the configuration builder which creates it. The initialization parameters for hierarchical configurations allow setting the expression engine as shown in the following code fragment (more information about initialization parameters for hierarchical and XML configurations is provided in a later section in this chapter):


Parameters params = new Parameters();
BasicConfigurationBuilder<BaseHierarchicalConfiguration> builder =
    new BasicConfigurationBuilder<BaseHierarchicalConfiguration>(BaseHierarchicalConfiguration.class)
    .configure(params.hierarchical()
        .setExpressionEngine(new MyExpressionEngine()));

Remember that it is possible to define Default Initialization Parameters for specific configuration classes. Using this mechanism, it is possible, for instance, to set a special expression engine for all XML configurations used by an application.

The default expression engine

The syntax described so far for property keys of hierarchical configurations is implemented by a specific implementation of the ExpressionEngine interface called DefaultExpressionEngine. An instance of this class is used by the base implementation of HierarchicalConfiguration if no specific expression engine was set (which is the reason why our examples above worked).

After reading the examples of property keys provided so far in this document, you should have a sound understanding of the features and the syntax supported by the DefaultExpressionEngine class. But it can do a little bit more for you: it defines a number of settings which can be used to customize most tokens that can appear in a valid property key. You prefer curly brackets over parentheses as index markers? You find the duplicated dot as escaped property delimiter counter-intuitive? Well, simply go ahead and change it! The following example shows how the syntax of a DefaultExpressionEngine object is modified. The key is to create an instance of the DefaultExpressionEngineSymbols class and to initialize it with the desired syntax elements. This is done using a builder approach:


DefaultExpressionEngineSymbols symbols =
    new DefaultExpressionEngineSymbols.Builder(
        DefaultExpressionEngineSymbols.DEFAULT_SYMBOLS)
        // Use a slash as property delimiter
        .setPropertyDelimiter("/")
        // Indices should be specified in curly brackets
        .setIndexStart("{")
        .setIndexEnd("}")
        // For attributes use simply a @
        .setAttributeStart("@")
        .setAttributeEnd(null)
        // A Backslash is used for escaping property delimiters
        .setEscapedDelimiter("\\/")
        .create();
DefaultExpressionEngine engine = new DefaultExpressionEngine(symbols);

// Now create a configuration using this expression engine
Parameters params = new Parameters();
FileBasedConfigurationBuilder<XMLConfiguration> builder =
    new FileBasedConfigurationBuilder<XMLConfiguration>(XMLConfiguration.class)
    .configure(params.xml()
        .setFileName("tables.xml")
        .setExpressionEngine(engine));
XMLConfiguration config = builder.getConfiguration();

// Access properties using the new syntax
String tableName = config.getString("tables/table{0}/name");
String tableType = config.getString("tables/table{0}@tableType");

DefaultExpressionEngineSymbols objects are immutable; the same is true for DefaultExpressionEngine. Thus a single expression engine instance can be shared between multiple configuration instances. The example fragment shows the typical usage pattern for constructing new DefaultExpressionEngine instances with an alternative syntax: A builder for a symbols object is constructed, passing in an instance that serves as starting point. Here the constant DEFAULT_SYMBOLS is used, which defines the standard syntax. Then methods of the builder are used to modify only the settings which are to be adapted.

Tip: Sometimes when processing an XML document you don't want to distinguish between attributes and "normal" child nodes. You can achieve this by setting the AttributeEnd property to null and the AttributeStart property to the same value as the PropertyDelimiter property. Then the syntax for accessing attributes is the same as the syntax for other properties:


DefaultExpressionEngineSymbols symbolsNoAttributes =
    new DefaultExpressionEngineSymbols.Builder(
        DefaultExpressionEngineSymbols.DEFAULT_SYMBOLS)
        .setAttributeStart(
            DefaultExpressionEngineSymbols.DEFAULT_SYMBOLS.getPropertyDelimiter())
        .setAttributeEnd(null)
        .create();
DefaultExpressionEngine engine = new DefaultExpressionEngine(symbolsNoAttributes);
...
Object value = config.getProperty("tables.table(0).name");
// name can either be a child node of table or an attribute

There is another property which can be used to customize an instance of DefaultExpressionEngine: the node name matcher. This is an object implementing the NodeMatcher interface which controls how the names of configuration nodes are matched against the single parts of a configuration key. It can be passed as an optional second argument to the constructor. By default, an exact match is performed, i.e. in order to successfully resolve a key like tables.table.name, there has to be a node named tables with a child node named table which in turn has a child with the name name. This is what most people would expect.

There are use cases, however, in which more flexibility or tolerance is desired. For instance, applications under Windows storing their settings in ini files sometimes expect that they can access keys in a case-insensitive manner. The node name matcher can help here. The enumeration class NodeNameMatchers defines some standard matchers which can collaborate with DefaultExpressionEngine. In addition to the EQUALS matcher used by default, there is also an EQUALS_IGNORE_CASE matcher. As its name suggests, this matcher ignores case when matching configuration keys against nodes. The following fragment shows how this matcher can be enabled (we use an INIConfiguration as an example here, but this technique works with all other hierarchical configurations, too):


DefaultExpressionEngine engine = new DefaultExpressionEngine(
  DefaultExpressionEngineSymbols.DEFAULT_SYMBOLS,
  NodeNameMatchers.EQUALS_IGNORE_CASE);

Parameters params = new Parameters();
FileBasedConfigurationBuilder<INIConfiguration> builder =
    new FileBasedConfigurationBuilder<INIConfiguration>(INIConfiguration.class)
    .configure(params.hierarchical()
        .setFileName("settings.ini")
        .setExpressionEngine(engine));
INIConfiguration config = builder.getConfiguration();

// Access properties regardless of their case
String backGroundColor = config.getString("colors.background");
String foreGroundColor = config.getString("COLORS.ForeGround");
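
To illustrate the idea behind a node name matcher, here is a minimal, self-contained sketch that does not use Commons Configuration at all. It models a configuration as nested maps and resolves a dotted key with a pluggable name comparison. The class NameMatcherSketch, its resolve() and sample() methods, and the sample data are hypothetical and only mimic the concept; the library's own matchers operate on real configuration nodes.

```java
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical illustration only; not part of Commons Configuration.
class NameMatcherSketch {
    // Exact comparison of node name and key part.
    public static final BiPredicate<String, String> EQUALS = String::equals;
    // Comparison that ignores case.
    public static final BiPredicate<String, String> EQUALS_IGNORE_CASE = String::equalsIgnoreCase;

    // Assumed sample data: a "colors" section with two entries.
    public static Map<String, Object> sample() {
        return Map.of("colors", Map.of("background", "#808080", "foreground", "#000000"));
    }

    // Walks the nested maps part by part, using the matcher to compare names.
    // Returns null if any part of the key cannot be matched.
    public static Object resolve(Map<String, Object> root, String key,
                                 BiPredicate<String, String> matcher) {
        Object current = root;
        for (String part : key.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            Object next = null;
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) current).entrySet()) {
                if (matcher.test(String.valueOf(entry.getKey()), part)) {
                    next = entry.getValue();
                    break;
                }
            }
            if (next == null) {
                return null;
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> config = sample();
        System.out.println(resolve(config, "COLORS.ForeGround", EQUALS));
        System.out.println(resolve(config, "COLORS.ForeGround", EQUALS_IGNORE_CASE));
    }
}
```

With the exact matcher the mixed-case key is not resolved, whereas the case-insensitive matcher finds the value. This mirrors the difference between EQUALS and EQUALS_IGNORE_CASE described above.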

The XPATH expression engine

The expression language provided by the DefaultExpressionEngine class is powerful enough to address all properties in a hierarchical configuration, but it is not always convenient to use. Especially if list structures are involved, it is often necessary to iterate through the whole list to find a certain element.

Think about our example configuration that stores information about database tables. A use case could be to load all fields that belong to the "users" table. If you knew the index of this table, you could simply build a property key like tables.table(<index>).fields.field.name, but how do you find out the correct index? When using the default expression engine, the only solution to this problem is to iterate over all tables until you find the "users" table.

Life would be much easier with an expression language that directly supports queries of this kind. In the XML world, the XPATH syntax has grown popular as a powerful means of querying structured data. In XPATH a query that selects all field names of the "users" table would look something like tables/table[@name='users']/fields/field/name (here we assume that the table's name is modelled as an attribute). This is not only much simpler than an iteration over all tables, but also much more readable: it is quite obvious which fields are selected by this query.

Given the power of XPATH it is no wonder that we got many user requests to add XPATH support to Commons Configuration. Well, here it is!

To enable XPATH syntax for property keys, you need the XPathExpressionEngine class. This class implements the ExpressionEngine interface and can be plugged into a HierarchicalConfiguration object in the same way as described above. Because instances of XPathExpressionEngine are thread-safe and can be shared between multiple configuration objects, it is also possible to set an instance as the default expression engine in the default initialization parameters for configuration builders, so that all hierarchical configuration objects make use of XPATH syntax. The following code fragment shows how XPATH support can be enabled for a configuration object:


Parameters params = new Parameters();
FileBasedConfigurationBuilder<XMLConfiguration> builder =
    new FileBasedConfigurationBuilder<XMLConfiguration>(XMLConfiguration.class)
    .configure(params.xml()
        .setFileName("tables.xml")
        .setExpressionEngine(new XPathExpressionEngine()));
XMLConfiguration config = builder.getConfiguration();

// Now we can use XPATH queries:
List<Object> fields = config.getList("tables/table[1]/fields/field/name");
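
To get a feeling for what such queries select, the following sketch evaluates a few XPATH expressions with the JDK's own javax.xml.xpath API rather than with Commons Configuration. The sample document, the class XPathQueryDemo, and its select() method are assumptions made for illustration; the table name is modelled as an attribute here, and expressions are evaluated relative to the root element, roughly the way configuration keys do not mention the root element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration of XPATH semantics using only the JDK.
class XPathQueryDemo {
    // Assumed sample document with two tables.
    static final String XML =
        "<database><tables>"
        + "<table name='users'><fields>"
        + "<field><name>uid</name></field>"
        + "<field><name>uname</name></field>"
        + "</fields></table>"
        + "<table name='documents'><fields>"
        + "<field><name>docid</name></field>"
        + "</fields></table>"
        + "</tables></database>";

    // Evaluates the expression relative to the root element and
    // returns the text content of all selected nodes.
    public static List<String> select(String expression) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)))
                .getDocumentElement();
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, root, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Select by attribute value instead of by index.
        System.out.println(select("tables/table[@name='users']/fields/field/name"));
        // Indices in XPATH start at 1.
        System.out.println(select("tables/table[1]/fields/field/name"));
        // last() picks the last element of the given name.
        System.out.println(select("tables/table[last()]/fields/field/name"));
    }
}
```

In this sample document the first two queries select the same field names, because the "users" table happens to be the first table; the third query selects the fields of the last table.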

XPATH expressions are not only used for selecting properties (i.e. for the various getter methods), but also for adding new properties. For this purpose the keys passed into the addProperty() method must conform to a special syntax. They consist of two parts: the first part is an arbitrary XPATH expression that selects the node to which the new property is to be added; the second part defines the new element to be added. Both parts are separated by whitespace.

Okay, let's look at an example. Say we want to add a type property under the first table (as a sibling to the name element). Then the first part of our key will have to select the first table element, and the second part will simply be type, i.e. the name of the new property:


config.addProperty("tables/table[1] type", "system");

(Note that indices in XPATH are 1-based, while in the default expression language they are 0-based.) In this example the part tables/table[1] selects the target element of the add operation. This element must exist and must be unique, otherwise an exception will be thrown. type is the name of the new element that will be added. If an attribute should be added instead of a normal element, the example becomes:


config.addProperty("tables/table[1] @type", "system");
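
The two-part structure of such keys can be made explicit with a small sketch. The class AddKeySketch and its split() method are hypothetical and are not the library's implementation. The sketch splits at the last whitespace character, on the assumption that the selecting XPATH expression may itself contain blanks while the path of the new nodes does not.

```java
// Hypothetical illustration only; not part of Commons Configuration.
class AddKeySketch {
    // Splits an add key into { selecting expression, path of new nodes }.
    // Throws IllegalArgumentException if the key has no whitespace separator.
    public static String[] split(String key) {
        String trimmed = key.trim();
        int sep = trimmed.length() - 1;
        // Search backwards for the last whitespace character.
        while (sep >= 0 && !Character.isWhitespace(trimmed.charAt(sep))) {
            sep--;
        }
        if (sep < 0) {
            throw new IllegalArgumentException("No whitespace separator in key: " + key);
        }
        return new String[] {
            trimmed.substring(0, sep).trim(),
            trimmed.substring(sep + 1)
        };
    }

    public static void main(String[] args) {
        for (String key : new String[] {
                "tables/table[1] type",
                "tables/table[1] @type" }) {
            String[] parts = split(key);
            System.out.println("select: " + parts[0] + ", add: " + parts[1]);
        }
    }
}
```

For both keys the selecting part is tables/table[1]; only the second part differs, naming either a new element (type) or a new attribute (@type).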

It is possible to add complete paths at once. Then the single elements in the new path are separated by "/" characters. The following example shows how data about a new table can be added to the configuration. Here we use full paths:


// Add new table "tasks" with name element and type attribute
config.addProperty("tables table/name", "tasks");
// last() selects the last element of this name,
// which is the newest table element
config.addProperty("tables/table[last()] @type", "system");

// Now add fields
config.addProperty("tables/table[last()] fields/field/name", "taskid");
config.addProperty("tables/table[last()]/fields/field[last()] type", "int");
config.addProperty("tables/table[last()]/fields field/name", "name");
config.addProperty("tables/table[last()]/fields field/name", "startDate");
...

The first line of this example adds the path table/name to the tables element, i.e. a new table element will be created and added as the last child of the tables element. Then a new name element is added as a child of the new table element. The value "tasks" is assigned to this element. The next line adds a type attribute to the new table element. To obtain the correct table element, to which the attribute must be added, the XPATH function last() is used; this function selects the last element with a given name, which in this case is the new table element. The following lines all use the same approach to construct a new element hierarchy: first, complete new branches are added (fields/field/name), then further children are added to the newly created elements.

There is one gotcha with the keys described so far: they do not work with the setProperty() method! This is because setProperty() has to check whether the passed-in key already exists; therefore it needs a key which can be interpreted by query methods. If you want to use setProperty(), you can pass in regular keys (i.e. without a whitespace separator). The method then tries to figure out which part of the key already exists in the configuration and adds new nodes as necessary. In principle such regular keys can also be used with addProperty(). However, they do not contain sufficient information to decide where new nodes should be added.

To make this clearer let's go back to the example with the tables. Suppose there is a configuration which already contains information about some database tables. In order to add a new table element to the configuration, addProperty() could be used as follows:


config.addProperty("tables/table/name", "documents");

The configuration already contains a <tables> element, as well as <table> and <name> elements. How should the expression engine know where new node structures are to be added? The solution to this problem is to provide this information in the key by stating:


config.addProperty("tables table/name", "documents");

Now it is clear that new nodes should be added as children of the <tables> element. More information about keys and how they play together with addProperty() and setProperty() can be found in the Javadocs for XPathExpressionEngine.

Note: XPATH support is implemented through Commons JXPath. So when making use of this feature, be sure you include the commons-jxpath jar in your classpath.

In this tutorial we don't want to describe XPATH syntax and expressions in detail. Please refer to the corresponding documentation. It is important to mention that by embedding Commons JXPath the full extent of the XPATH 1.0 standard can be used for constructing property keys.

Builder Configuration Related to Hierarchical Configurations

There is special support for the initialization parameters of configuration builders for hierarchical configurations. The HierarchicalBuilderProperties interface defines additional settings applicable to hierarchical configurations. Currently, the expression engine can be set.

A parameters object for a hierarchical configuration can be obtained using the hierarchical() method of a Parameters instance. It returns an object implementing the HierarchicalBuilderParameters interface which contains set methods for all the available properties, including the ones inherited from base interfaces.