Migration Guide to Version 2.0

This document supports the migration from version 1.x of Commons Configuration to version 2.0. The target audience is users of an older version who want to upgrade. The document describes the areas in which major changes have been implemented, because these are where problems are most likely to be encountered during migration. It has the following content:

Introduction
Structural Changes
Accessing Configuration Properties
Creating Configurations
Reloading
Combining Configuration Sources
Concurrency Issues
Events

Introduction

Version 2.0 of Commons Configuration is the result of a major redesign of this library. While version 1.x has become pretty stable and does what it is expected to do, there are some limitations and design flaws which could not be fixed in a painless and compatible way.

In order to overcome these restrictions, version 2.0 has applied significant changes to some of the problematic concepts or even replaced them by alternative approaches. This has led to an ambivalent situation: On the one hand, you will recognize many similarities to the old version - classes with familiar names that continue to do what they have done before. On the other hand, completely new approaches have been introduced; in the affected areas Commons Configuration 2.0 will look like a completely new product rather than a plain upgrade.

Because of such major changes, you cannot simply drop the new jar into your classpath and expect everything to continue working. The remaining part of this document describes the most important changes. This should give you an impression of the effort required to integrate the new version with your application.

Also note that the user's guide has been fully reworked to cover all the new features and concepts offered by Commons Configuration 2.0. Because of that, this document will not describe interfaces or classes in detail, but simply refer to the corresponding sections of the user's guide.

Structural Changes

The most obvious change you will notice at the very beginning is that the root package was renamed to org.apache.commons.configuration2 - the major version is now part of the package name. This certainly makes migration harder, but it is the only way to avoid jar hell. Imagine for a moment that we had kept the old package name. This would work well for applications that are the only user of the Commons Configuration library. But as soon as there are third-party libraries that also use this component, but in version 1.x, there is real trouble: The class path then contains classes with identical names in different versions - results will be unpredictable! The change of the package name solves this problem because the new version can now coexist with an old version without interfering. The very first step you have to take when migrating to version 2.0 is to reorganize your imports to adapt them to the new package name. Modern IDEs will support you with this task.

For the same reason the Maven coordinates have been changed. Use the following dependency declaration in your pom:


<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-configuration2</artifactId>
  <version>2.7</version>
</dependency>

So, from Maven's point of view, version 2.0 is a completely different artifact. This allows a peaceful coexistence of Commons Configuration 1.x and 2.0 in the dependency set of a project.

Accessing Configuration Properties

The good news is that there are only minor changes in the central Configuration interface used for reading and writing configuration data. A few methods have been added to support new features, but the principal patterns for dealing with Configuration objects remain valid. These concepts are described in the user's guide in the sections Using Configuration and Basic features and AbstractConfiguration.

What has changed is the default implementation of list handling in AbstractConfiguration. In version 1.x list splitting was enabled by default; string properties containing a "," character were interpreted as lists with multiple elements. This was a frequent source of confusion and bug reports. In version 2.0 list splitting is disabled initially. The implementation has also been extended: it is no longer limited to providing a delimiter character; instead, an implementation of the ListDelimiterHandler interface can be set which controls all aspects of list handling. In order to enable list handling again, pass a DefaultListDelimiterHandler object to your AbstractConfiguration instance. This class supports splitting string properties at specific delimiter characters. However, its results are not 100% identical to the ones produced by Commons Configuration 1.x, because the old implementation contained some inconsistencies regarding the escaping of list delimiter characters. If you really need the same behavior in this area, then use the LegacyListDelimiterHandler class.
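The idea of pluggable list handling can be illustrated with a minimal, library-independent sketch in plain Java. The names ListHandlerSketch, DISABLED and splitAt are hypothetical and not part of Commons Configuration; the real extension point is the ListDelimiterHandler interface, and the escaping rule used here (a backslash protects the next character) is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Conceptual sketch only; this is not the library's implementation.
public class ListHandlerSketch {
    // Disabled list handling: the whole string stays a single value.
    public static final Function<String, List<String>> DISABLED =
            value -> Collections.singletonList(value);

    // Splits at the given delimiter; a backslash escapes the next character
    // (a simplified, assumed escaping rule).
    public static Function<String, List<String>> splitAt(char delimiter) {
        return value -> {
            List<String> parts = new ArrayList<>();
            StringBuilder current = new StringBuilder();
            boolean escaped = false;
            for (char c : value.toCharArray()) {
                if (escaped) {
                    current.append(c);
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == delimiter) {
                    parts.add(current.toString().trim());
                    current.setLength(0);
                } else {
                    current.append(c);
                }
            }
            parts.add(current.toString().trim());
            return parts;
        };
    }

    public static void main(String[] args) {
        // One element with handling disabled, three elements with splitting enabled.
        System.out.println(DISABLED.apply("red,green,blue"));
        System.out.println(splitAt(',').apply("red,green,blue"));
    }
}
```

The point of the sketch is that the splitting policy is an object handed to the configuration rather than a hard-wired delimiter character, which is why behavior can be switched off, customized, or kept compatible with the old rules.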

Version 2.0 also has changes related to hierarchical configurations. HierarchicalConfiguration, formerly the base class for all hierarchical configurations, is now an interface. The equivalent to the old base class is now named BaseHierarchicalConfiguration. It extends the abstract base class AbstractHierarchicalConfiguration. The difference between these classes is that AbstractHierarchicalConfiguration provides generic algorithms for dealing with an arbitrary hierarchical node structure. BaseHierarchicalConfiguration in contrast defines its own node structure based on objects kept in memory. In the future, it should be possible to support other kinds of hierarchical structures directly by creating specialized subclasses of AbstractHierarchicalConfiguration. Refer to section Internal Representation for further information. The node objects a hierarchical configuration deals with are now exposed as a generic type parameter; for instance, BaseHierarchicalConfiguration is actually an AbstractHierarchicalConfiguration<ImmutableNode>. For most applications, which are only interested in accessing configuration data via the typical access methods, this parameter is not relevant and can be replaced by a wildcard ("?") in variable declarations. Extended query facilities on hierarchical configurations work in the same way as in version 1.x, so applications need not be updated in this area.

Creating Configurations

A major difference between Commons Configuration 1.x and 2.0 is the way configuration objects are created, initialized, and managed. In version 1.x configurations are created directly using their constructor. Especially for file-based configuration implementations - like PropertiesConfiguration or XMLConfiguration - there were constructors which immediately populated the newly created instances from a data source. If additional settings were to be applied, this was done after the creation using bean-like set methods. For instance, in order to create an initialized PropertiesConfiguration object, the following code could be used:


// Version 1.x: Initializing a properties configuration
PropertiesConfiguration config = new PropertiesConfiguration("myconfig.properties");
config.setThrowExceptionOnMissing(true);
config.setIncludesAllowed(false);
config.setListDelimiter(';');

While this code is easy to write, there are some non-obvious problems:

Some settings influence the loading of the configuration data. In this example, the definition of the list delimiter and the includesAllowed flag fall into this category. However, because the data is directly loaded by the constructor these settings are applied too late and thus ignored by the load operation.
The constructor calls a protected method for loading the data. This can lead to subtle bugs because at this time the instance is not yet fully initialized.
The various set methods are not thread-safe; if this configuration instance is to be accessed from another thread, there may be problems.

To overcome these problems, Commons Configuration 2.0 uses a different approach for the creation of configuration objects, based on configuration builders. The basic idea is that a configuration builder is created and initialized with all parameters to be applied to the new configuration object. When the configuration instance is queried from its builder, it is guaranteed that it has been fully initialized in the correct order. In addition, access to configuration builders is thread-safe. Configuration builders offer a fluent API for setting the initialization parameters for the configuration to be created. The example above would become something like the following in version 2.0:


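// Version 2.0: Initializing a properties configuration via a builder
import org.apache.commons.configuration2.PropertiesConfiguration;
import org.apache.commons.configuration2.builder.FileBasedConfigurationBuilder;
import org.apache.commons.configuration2.builder.fluent.Parameters;
import org.apache.commons.configuration2.convert.DefaultListDelimiterHandler;

FileBasedConfigurationBuilder<PropertiesConfiguration> builder =
    new FileBasedConfigurationBuilder<PropertiesConfiguration>(PropertiesConfiguration.class)
    .configure(new Parameters().properties()
        .setFileName("myconfig.properties")
        .setThrowExceptionOnMissing(true)
        .setListDelimiterHandler(new DefaultListDelimiterHandler(';'))
        .setIncludesAllowed(false));
// getConfiguration() throws a ConfigurationException if the file cannot be loaded
PropertiesConfiguration config = builder.getConfiguration();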

Builders also offer increased flexibility regarding the management of configuration objects. While in version 1.x of Commons Configuration Configuration objects were typically kept centrally and accessed throughout an application, the recommended way in version 2.0 is to work with configuration builders. A builder not only creates a new configuration object but also caches a reference to it so that it can be accessed again and again. This makes it possible to add special functionality to the builder. For instance, it may decide to return a different configuration object under certain circumstances - e.g. when a change on an external configuration source is detected and a reload operation is performed. For the application this is fully transparent.
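The caching behavior described here can be pictured with a small, library-independent sketch in plain Java. CachingBuilderSketch and its methods get() and reset() are hypothetical names chosen for illustration; they only mimic the idea of a builder that hands out one managed instance until it is told to start over, and they are not the library's builder classes.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Conceptual sketch only; CachingBuilderSketch is not a library class.
public class CachingBuilderSketch<T> {
    private final Supplier<T> factory; // creates a fully initialized instance
    private T managed;                 // cached instance, null until first requested

    public CachingBuilderSketch(Supplier<T> factory) {
        this.factory = factory;
    }

    // Returns the cached instance, creating it on first access.
    public synchronized T get() {
        if (managed == null) {
            managed = factory.get();
        }
        return managed;
    }

    // Drops the cached instance so that the next get() creates a fresh one.
    public synchronized void reset() {
        managed = null;
    }

    public static void main(String[] args) {
        AtomicInteger created = new AtomicInteger();
        CachingBuilderSketch<String> builder =
                new CachingBuilderSketch<>(() -> "config#" + created.incrementAndGet());
        String first = builder.get();
        System.out.println(first == builder.get()); // repeated access yields the cached instance
        builder.reset();
        System.out.println(builder.get());          // a newly created instance after the reset
    }
}
```

Code that always asks the builder for the current instance, instead of holding on to a configuration reference, automatically picks up the replacement after a reset.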

Working with builders may seem a bit verbose at first. There are some ways to simplify their usage. Be sure to read the section Making it easier, which describes some useful shortcuts. It is also possible to define default values for initialization parameters. This simplifies builder configurations and can establish application-global standard settings for configuration objects. This mechanism is described in Default Initialization Parameters.

Reloading

Support for reloading of externally changed configuration sources was limited in Commons Configuration 1.x. There was a reloading strategy implementation that was triggered on each access to a configuration property and checked whether an underlying file was changed in the meantime. If this was the case, the configuration was automatically reloaded. CONFIGURATION-520 contains a discussion about the problems and limitations of this approach.

In version 2.0 reloading functionality has been completely redesigned. The new approaches are described in the chapter Automatic Reloading of Configuration Sources of the user's guide. In a nutshell, configuration builders play an important role here. There are builder implementations available which can be configured to monitor external configuration sources in a fairly generic way. When a change is detected, the builder resets its managed configuration so that the next time it is accessed a new instance is created. In addition, an event can be generated notifying the application that new configuration information might be available. The whole mechanism can be set up to perform reloading checks periodically and automatically in a background thread.
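The separation between property access and change detection can be sketched in plain Java without the library. All names below (ReloadCheckSketch, checkForReload, the lastModified supplier) are hypothetical; the sketch only illustrates the principle that an explicit check resets the managed instance, while ordinary access never inspects the source.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Conceptual sketch only; all names are hypothetical, not library classes.
public class ReloadCheckSketch<T> {
    private final Supplier<T> loader;        // loads a fresh instance from the source
    private final LongSupplier lastModified; // e.g. a file's modification time
    private long seenStamp;                  // stamp observed at the last check
    private T managed;                       // cached instance, null after a reset

    public ReloadCheckSketch(Supplier<T> loader, LongSupplier lastModified) {
        this.loader = loader;
        this.lastModified = lastModified;
        this.seenStamp = lastModified.getAsLong();
    }

    // Access path: no change detection here, just the cached instance.
    public synchronized T get() {
        if (managed == null) {
            managed = loader.get();
        }
        return managed;
    }

    // Reloading check: resets the cached instance if the source has changed.
    // Returns true on a detected change so that a caller could notify listeners.
    public synchronized boolean checkForReload() {
        long current = lastModified.getAsLong();
        if (current != seenStamp) {
            seenStamp = current;
            managed = null;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        AtomicLong stamp = new AtomicLong(1);
        ReloadCheckSketch<String> builder =
                new ReloadCheckSketch<>(() -> "loaded@" + stamp.get(), stamp::get);
        System.out.println(builder.get());
        stamp.set(2); // simulate an external change of the source
        System.out.println(builder.checkForReload());
        System.out.println(builder.get());
    }
}
```

In an application, a method like checkForReload() could be invoked periodically, for example from a java.util.concurrent.ScheduledExecutorService, which corresponds to the background checks mentioned above.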

The FileChangedReloadingStrategy class from version 1.x no longer exists. It is replaced by the new, more powerful reloading mechanisms. The mentioned chapter about reloading describes in detail how a reloading-aware configuration builder can be set up and fine-tuned to an application's needs.

Combining Configuration Sources

In Commons Configuration 1.x, there were two options for creating a combined configuration out of multiple sources:

The already deprecated ConfigurationFactory class
The DefaultConfigurationBuilder class

The former has been removed. The functionality provided by DefaultConfigurationBuilder is still available, but the class has been renamed to CombinedConfigurationBuilder (the old name was no longer appropriate as builders are now a common concept in the library) and adapted to other builder implementations.

In version 1.x DefaultConfigurationBuilder inherited from XMLConfiguration - it was itself a configuration and could be populated dynamically. CombinedConfigurationBuilder in contrast is a regular builder implementation. In its initialization parameters it can be passed another builder object from which the definition for the combined configuration is obtained. So a dynamic approach is possible here as well. In both cases, the getConfiguration() method is used to obtain the CombinedConfiguration object constructed by the builder. From a usage point of view there is not that much difference between these classes.

In both the old and the new version, an XML-based definition file is used to declare the different configuration sources that are to be combined plus some additional settings. The basic structure of this file has not changed - the full description of the new format is available at Configuration definition file reference.

A problem when migrating from DefaultConfigurationBuilder to CombinedConfigurationBuilder is that those definition files can contain bean definitions, i.e. references to classes which will be automatically instantiated by Commons Configuration. Because of the change of the package name, definition files written for version 1.x will not work with the new version if they make use of this feature and reference internal classes shipped with the library. In this case the fully-qualified class names in definition files have to be adapted.

A prominent example of bean definitions was the assignment of reloading strategies to specific configuration sources. As the whole reloading mechanism has changed significantly, such declarations are no longer supported. There is a much simpler replacement: just add the config-reload attribute to a configuration source declaration to enable reloading support for this source.

Another incompatible change is related to the extensibility of the definition file format. It used to be possible - and still is - to define custom tags for declaring special configuration sources. This is done by registering provider objects at the configuration builder. Because the internal implementation of CombinedConfigurationBuilder is very different from the old one, this also affects the interface used for providers. The main difference is that providers for the old version used to create configuration objects directly, while the new providers create configuration builders. If custom providers have been used in the past, additional migration effort should be planned for.

A complete description of CombinedConfigurationBuilder, its usage and supported extension points can be found in chapter Combining Configuration Sources of the user's guide.

Concurrency Issues

An important design goal of Commons Configuration 2.0 was to improve the behavior of configuration objects when accessed by multiple threads. In the 1.x series, support for concurrent access to configuration data has grown historically: The original intent was that a configuration object can be read by multiple threads in a safe way, but as soon as one thread modifies the data, the user has to ensure proper synchronization manually. Later on, also due to the reloading implementation, more and more synchronization was added. This even caused performance bottlenecks, for instance as reported in CONFIGURATION-530.

The changes in version 2.0 related to multi-threading include multiple aspects. The most obvious change is probably that synchronization of configurations is now much more flexible. A configuration instance is assigned a Synchronizer object which controls if and how locks are obtained when executing operations on this configuration. By changing the synchronizer, an application can adapt the locking behavior to its specific needs. For instance, if a configuration is only accessed by a single thread, there is no need for any synchronization. Typical usage modes are reflected by different default implementations of the Synchronizer interface:

NoOpSynchronizer does not use any synchronization at all. This is the option of choice for single-threaded access, but fails in a multi-threaded environment.
ReadWriteSynchronizer implements synchronization based on a read/write lock.

Note that the default option is NoOpSynchronizer. This means that configuration objects are not thread-safe by default! You have to change the synchronizer in order to make them safe for concurrent access. This can be done, for instance, by using a builder which is configured accordingly.
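The following plain-Java sketch illustrates the idea of delegating locking decisions to an exchangeable strategy object. It does not use the library: Sync, NoOpSync, ReadWriteSync and Store are hypothetical names that loosely mirror the roles of Synchronizer, NoOpSynchronizer, ReadWriteSynchronizer and a configuration object, and the method names are assumptions made for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Conceptual sketch only; the names below are hypothetical, not the library's classes.
public class SynchronizerSketch {
    // Strategy deciding if and how locks are taken around read and write operations.
    public interface Sync {
        void beginRead();
        void endRead();
        void beginWrite();
        void endWrite();
    }

    // No locking at all: only suitable for single-threaded access.
    public static final class NoOpSync implements Sync {
        public void beginRead() { }
        public void endRead() { }
        public void beginWrite() { }
        public void endWrite() { }
    }

    // Read/write lock: many concurrent readers, exclusive writers.
    public static final class ReadWriteSync implements Sync {
        private final ReadWriteLock lock = new ReentrantReadWriteLock();
        public void beginRead() { lock.readLock().lock(); }
        public void endRead() { lock.readLock().unlock(); }
        public void beginWrite() { lock.writeLock().lock(); }
        public void endWrite() { lock.writeLock().unlock(); }
    }

    // A tiny property store that leaves all locking decisions to its Sync.
    public static final class Store {
        private final Map<String, Object> data = new HashMap<>();
        private final Sync sync;

        public Store(Sync sync) { this.sync = sync; }

        public Object get(String key) {
            sync.beginRead();
            try { return data.get(key); } finally { sync.endRead(); }
        }

        public void set(String key, Object value) {
            sync.beginWrite();
            try { data.put(key, value); } finally { sync.endWrite(); }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Choosing ReadWriteSync makes the store safe to share between threads.
        Store store = new Store(new ReadWriteSync());
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 1000; i++) store.set("counter", i);
        });
        writer.start();
        writer.join();
        System.out.println(store.get("counter"));
    }
}
```

Swapping NoOpSync for ReadWriteSync changes the locking behavior without touching the store's own code, which is the flexibility the synchronizer concept is meant to provide.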

Talking about builders: This is another concept which supports access to configuration data by multiple threads. Access to a builder is always thread-safe. By shifting the responsibility for reloading operations from the configuration to the builder, the need for intensive locking on each property access could be eliminated.

Hierarchical configurations derived from BaseHierarchicalConfiguration now use a new implementation which allows for concurrent access without locking. So this group of configurations can be used without having to set a fully-functional synchronizer.

There are some other changes to classes with the goal of making them well-behaved citizens in a concurrent environment. These include:

Some classes have been made immutable, passing all information to the constructor rather than using bean-style properties for their initialization. An example is DefaultExpressionEngine whose instances can now be shared between different hierarchical configuration objects.
Static utility classes with state have been rewritten so that they can be instantiated. Mutable static fields are in general thread-hostile. Refer to CONFIGURATION-486 for further details.

Please refer to Configurations and Concurrent Access for a full description of this complex topic.

Events

Another area in which major changes took place is the support for event notifications. Commons Configuration 1.x had two types of event listeners for configuration update events and error events. Version 2.0 adds some more event sources - events generated by configuration builders and reloading events. Despite this increased number of event sources, there is now only a single event listener interface (EventListener), and the mechanisms for adding and removing event listeners are the same everywhere; the basic protocol is defined by the EventSource interface. (Note that EventSource used to be a class in version 1.x; it actually was the base class of AbstractConfiguration and therefore inherited by all concrete configuration implementations. In version 2.0 this role has been taken over by the BaseEventSource class.)

While the old version used numeric constants to define specific event types, the new events are classified by instances of the EventType class. Event types can be used to determine the semantic meaning of an event, but also to filter for specific events. They stand in a logical hierarchical relation to each other; an event listener that is registered for a base event type also receives notifications for derived types. This makes flexible and fine-grained event processing possible. The whole event mechanism is very similar to the one implemented in JavaFX.
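How a type hierarchy drives listener dispatch can be shown with a small, library-independent sketch in plain Java. The classes and methods below (EventTypeSketch, Type, addListener, fire) are hypothetical and much simpler than the library's event API; events are reduced to plain strings to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Conceptual sketch only; names are hypothetical and simpler than the library's event API.
public class EventTypeSketch {
    // An event type with an optional super type, forming a hierarchy.
    public static final class Type {
        final Type superType;
        final String name;

        public Type(Type superType, String name) {
            this.superType = superType;
            this.name = name;
        }

        // True if this type is the given base type or is derived from it.
        public boolean isInstanceOf(Type base) {
            for (Type t = this; t != null; t = t.superType) {
                if (t == base) {
                    return true;
                }
            }
            return false;
        }
    }

    // Pairs a listener with the event type it was registered for.
    private static final class Registration {
        final Type type;
        final Consumer<String> listener;

        Registration(Type type, Consumer<String> listener) {
            this.type = type;
            this.listener = listener;
        }
    }

    private final List<Registration> registrations = new ArrayList<>();

    public void addListener(Type type, Consumer<String> listener) {
        registrations.add(new Registration(type, listener));
    }

    // Delivers the message to every listener registered for the type or one of its super types.
    public void fire(Type type, String message) {
        for (Registration r : registrations) {
            if (type.isInstanceOf(r.type)) {
                r.listener.accept(message);
            }
        }
    }

    public static void main(String[] args) {
        Type any = new Type(null, "ANY");
        Type update = new Type(any, "UPDATE");
        Type error = new Type(any, "ERROR");

        EventTypeSketch source = new EventTypeSketch();
        source.addListener(any, msg -> System.out.println("any listener: " + msg));
        source.addListener(error, msg -> System.out.println("error listener: " + msg));

        source.fire(update, "property changed"); // reaches only the listener for ANY
        source.fire(error, "load failed");       // reaches both listeners
    }
}
```

Registering for the base type acts as a catch-all, while registering for a derived type filters out everything else; this is the behavior the hierarchy of event types is meant to enable.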

The most basic use case for event listeners in version 1.x was probably the registration of a change listener at a single configuration instance. To achieve an equivalent effect with the new API, one would implement an event listener and register it for the event type ConfigurationEvent.ANY. This listener will then receive notifications for all kinds of updates performed on the monitored configuration. The structure and content of these events are nearly identical to their counterparts in version 1.x.

There is, however, an important difference with the event listener registration: The recommended way is to add the listener to the configuration builder which creates the configuration rather than to the configuration itself. This ensures that registration is done at the correct moment in time and also updated when the builder decides to replace its managed configuration instance.

All in all the new event mechanism should be much more flexible and powerful than the old one.