Configurations and Concurrent Access

Configuration objects are often central resources of an application and are accessed by multiple components. If multiple threads are involved which read or even update configuration data, care has to be taken that access to a Configuration object is properly synchronized to avoid data corruption or spurious exceptions. This section of the user's guide deals with concurrency and describes the actions necessary to make a Configuration work in a multi-threaded environment.

Synchronizers

Whether a Configuration object has to be thread-safe strongly depends on the concrete use case. For an application which only reads some configuration properties in its main() method at startup, it does not matter whether this configuration can safely be accessed from multiple threads. In this case, the overhead of synchronizing access to the configuration is not needed, and thus operations on the Configuration object can be more efficient. On the other hand, if the Configuration object is accessed by multiple components running in different threads, it should be thread-safe.

To support these different use cases, Commons Configuration takes an approach similar to that of the Java Collections framework. There, collections are not thread-safe by default (and thus more efficient). If an application needs a thread-safe collection, it can "upgrade" an existing one by calling a method of the Collections class.
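
The Collections analogy can be made concrete with plain JDK code. The following is a minimal sketch, not part of Commons Configuration; the class name CollectionsUpgradeDemo and its methods are made up for illustration. It wraps an ordinary ArrayList with Collections.synchronizedList() so that several threads can append to it safely:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; only the java.util calls are real JDK API.
class CollectionsUpgradeDemo {

    /** Lets several threads append to the list and returns its final size. */
    static int fillConcurrently(List<Integer> target, int threadCount, int perThread) {
        Thread[] workers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    target.add(i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return target.size();
    }

    static int demo() {
        // A plain ArrayList is not thread-safe ...
        List<Integer> plain = new ArrayList<>();
        // ... but the synchronized wrapper serializes every call on it.
        List<Integer> safe = Collections.synchronizedList(plain);
        return fillConcurrently(safe, 4, 1000);
    }
}
```

The wrapper trades some speed for safety, which is the same trade-off a Configuration makes when it is given a synchronizer that actually locks.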

Objects implementing the Configuration interface can be associated with a Synchronizer object. This synchronizer is triggered on each access to the configuration (distinguishing between read and write access). It can decide whether access is allowed or block the calling thread until it is safe to continue. By default, a Configuration object uses a NoOpSynchronizer instance. As the name implies, this class does nothing to protect its associated configuration against concurrent access; its methods are empty. It is appropriate for use cases in which a configuration is only accessed by a single thread.

If multiple threads are involved, Configuration objects have to be thread-safe. For this purpose, there is another implementation of Synchronizer: ReadWriteSynchronizer. This class is based on the ReentrantReadWriteLock class from the JDK. It implements the typical behavior desired when accessing a configuration in a multi-threaded environment:

An arbitrary number of threads can read the configuration simultaneously.
Updates of a configuration can only happen with an exclusive lock; so if a thread changes configuration data, all other threads (readers and writers) are blocked until the update operation is complete.
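
To illustrate these two rules, here is a rough sketch of how a synchronizer could be layered on top of ReentrantReadWriteLock. It is not the library's source code. SketchSynchronizer is a local stand-in interface whose method names (beginRead(), endRead(), beginWrite(), endWrite()) are assumed to resemble those of the library's Synchronizer interface, and the two inspection helpers exist only for this demonstration:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Local stand-in for a synchronizer interface (method names are an assumption). */
interface SketchSynchronizer {
    void beginRead();
    void endRead();
    void beginWrite();
    void endWrite();
}

/** Sketch: many concurrent readers, but writers get exclusive access. */
class SketchReadWriteSynchronizer implements SketchSynchronizer {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    @Override public void beginRead()  { lock.readLock().lock(); }
    @Override public void endRead()    { lock.readLock().unlock(); }
    @Override public void beginWrite() { lock.writeLock().lock(); }
    @Override public void endWrite()   { lock.writeLock().unlock(); }

    // Inspection helpers for the demonstration only.
    int activeReadLocks()  { return lock.getReadLockCount(); }
    boolean isWriteLocked() { return lock.isWriteLocked(); }
}
```

A no-op variant would implement the same four methods with empty bodies, which is why it adds no overhead but also gives no protection.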

The synchronizer associated with a Configuration can be changed at any time by calling the setSynchronizer() method. The following example shows how this method is used to make a Configuration instance thread-safe:


config.setSynchronizer(new ReadWriteSynchronizer());

Rather than setting the synchronizer on an existing Configuration instance, it is usually better to configure the configuration builder responsible for the creation of the configuration to set the correct synchronizer directly after a new instance has been created. This is done in the usual way by setting the corresponding property of a parameters object passed to the builder's configure() method, for instance:


Parameters params = new Parameters();
BasicConfigurationBuilder<PropertiesConfiguration> builder =
        new BasicConfigurationBuilder<PropertiesConfiguration>(
                PropertiesConfiguration.class)
                // install the synchronizer on every configuration this builder creates
                .configure(params.basic()
                        .setSynchronizer(new ReadWriteSynchronizer()));
PropertiesConfiguration config = builder.getConfiguration();

It is also possible to set the synchronizer to null. In this case, the default NoOpSynchronizer is installed, which means that the configuration is no longer protected against concurrent access.

With the two classes NoOpSynchronizer and ReadWriteSynchronizer, the Commons Configuration library covers the basic use cases of no protection and full protection of multi-threaded access. As the Synchronizer interface is pretty simple, applications are free to provide their own implementations according to their specific needs. However, this requires a certain understanding of internal mechanisms in Configuration implementations. Some caveats are provided in the remainder of this chapter.

Basic operations and thread-safety

AbstractConfiguration already provides a major part of the implementation needed to interact correctly with a Synchronizer object. Methods for reading configuration data (such as getProperty(), isEmpty(), or getKeys()) and for changing properties (e.g. setProperty(), addProperty(), or clearProperty()) already call the correct methods of the Synchronizer. These methods are declared final so that subclasses cannot accidentally break thread-safety by using the Synchronizer incorrectly.

Classes derived from AbstractConfiguration sometimes offer specific methods for accessing properties. For instance, hierarchical configurations offer operations on whole subtrees, or INIConfiguration allows querying specific sections. These methods are also aware of the associated Synchronizer and invoke it correctly.

There is another pair of methods available for each Configuration object allowing direct control over the Synchronizer: lock() and unlock(). Both methods expect an argument of type LockMode which tells them whether the configuration is to be locked for read or write access. These methods can be used to extend the locking behavior of standard methods. For instance, if multiple properties are to be added in an atomic way, lock() can be called first, then all properties are added, and finally unlock() is called. Provided that a corresponding Synchronizer implementation is used, other threads will not interfere with this sequence. Note that it is important to always call unlock() after a lock() call; this is best done in a finally block as shown in the following example:


config.lock(LockMode.WRITE);
try
{
    config.addProperty("prop1", "value1");
    ...
    config.addProperty("prop_n", "value_n");
}
finally
{
    config.unlock(LockMode.WRITE);
}

So, in a nutshell: When accessing configuration data from standard configuration classes all operations are controlled via the configuration's Synchronizer object. Client code is only responsible for setting a correct Synchronizer object which is suitable for the intended use case.
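
Why an explicit write lock makes a group of updates atomic can be shown with a JDK-only model. The sketch below is an illustration under assumptions, not library code: LockableStore, its Mode enum, and all its methods are hypothetical stand-ins that imitate lock(), unlock(), and addProperty() on top of a ReentrantReadWriteLock. A reader that checks two keys under one read lock sees either none or both of them, never just one:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a configuration with lock()/unlock() methods.
class LockableStore {
    enum Mode { READ, WRITE }

    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private final Map<String, String> data = new HashMap<>();

    void lock(Mode mode) {
        if (mode == Mode.READ) rw.readLock().lock(); else rw.writeLock().lock();
    }

    void unlock(Mode mode) {
        if (mode == Mode.READ) rw.readLock().unlock(); else rw.writeLock().unlock();
    }

    /** Single update; re-acquires the write lock, which is reentrant. */
    void addProperty(String key, String value) {
        lock(Mode.WRITE);
        try {
            data.put(key, value);
        } finally {
            unlock(Mode.WRITE);
        }
    }

    /** Counts how many of the given keys exist, under a single read lock. */
    int countPresent(String... keys) {
        lock(Mode.READ);
        try {
            int present = 0;
            for (String key : keys) {
                if (data.containsKey(key)) present++;
            }
            return present;
        } finally {
            unlock(Mode.READ);
        }
    }

    /** Returns true if a concurrent reader never observed a half-finished update. */
    static boolean readerNeverSeesPartialUpdate() {
        LockableStore store = new LockableStore();
        Thread writer = new Thread(() -> {
            store.lock(Mode.WRITE);          // hold the lock across both additions
            try {
                store.addProperty("prop1", "value1");
                store.addProperty("prop2", "value2");
            } finally {
                store.unlock(Mode.WRITE);
            }
        });
        writer.start();
        int seen;
        do {
            seen = store.countPresent("prop1", "prop2");
            if (seen == 1) return false;     // would indicate a torn update
        } while (seen != 2);
        return true;
    }
}
```

Without the outer lock() call in the writer, each addProperty() would still be safe on its own, but a reader could slip in between the two additions.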

Other flags

In addition to the actual configuration data, each Configuration object has some flags controlling its behavior. One example of such a flag is the boolean throwExceptionOnMissing property. Other helper objects like the object responsible for interpolation or the expression engine for hierarchical configurations fall into the same category. The manipulation of those flags and helper objects is also related to thread-safety.

In contrast to configuration data, access to flags is not guarded by the Synchronizer. This means that when changing a flag in a multi-threaded environment, there is no guarantee that this change is visible to other threads.

The reason for this design is that the preferred way to create a Configuration object is using a configuration builder. The builder is responsible for fully initializing the configuration; afterwards, no further behavioral changes should be made. Because builders are always synchronized, the values of all flags are safely published to all involved threads.

If there really is the need to change a flag later on in the life-cycle of a Configuration object, the lock() and unlock() methods described in the previous section should be used to do the change with a write lock held.

Special cases

Thread-safety is certainly a complex topic. This section describes some corner cases which may occur when some of the more advanced configuration classes are involved.

All hierarchical configurations derived from BaseHierarchicalConfiguration internally operate on a node structure made up of immutable nodes. This is beneficial for concurrent access. It is even possible to share (sub) trees of configuration nodes between multiple configuration objects. Updates of these structures are implemented in a thread-safe and non-blocking way - even when using the default NoOpSynchronizer. The key point is that hierarchical configurations do not require a special synchronizer because safe concurrent access is already a basic feature of these classes. The only exception is that change events caused by updates of a configuration's data are not guaranteed to be delivered in a specific order. For instance, if one thread clears a configuration and immediately afterwards another thread adds a property, the clear event may arrive at an event listener after the add property event. If the listener relies on the configuration being empty at that point, it may be in for a surprise. In cases in which the sequence of generated configuration events is important, a fully functional synchronizer object should be set.
CombinedConfiguration is a bit special regarding lock handling. Although derived from BaseHierarchicalConfiguration, this class is not thread-safe by default. So if accessed by multiple threads, a suitable synchronizer has to be set. An instance manages a node tree which is constructed dynamically from the nodes of all contained configurations using the current node combiner. When one of the child configurations is changed, the node tree is reset so that it has to be re-constructed on next access. Because this operation changes the configuration's internal state, it is performed with a write lock held. So even if data is only read from a CombinedConfiguration, a write lock may temporarily be obtained for constructing the combined node tree. Note that the synchronizers used for the children of a combined configuration are independent. For instance, if configuration objects derived from BaseHierarchicalConfiguration are added as children to a CombinedConfiguration, they can continue using a NoOpSynchronizer.
Derived from CombinedConfiguration is DynamicCombinedConfiguration which extends its base class by the ability to manage multiple combined configuration instances. The current instance is selected based on a key constructed by a ConfigurationInterpolator instance. If this yields a key which has not been encountered before, a new CombinedConfiguration object is created. Here again it turns out that even a read access to a DynamicCombinedConfiguration may cause internal state changes which require a write lock to be held. When creating a new child combined configuration it is passed the Synchronizer of the owning DynamicCombinedConfiguration; so there is actually only a single Synchronizer controlling the access to all involved configurations.
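
The idea that a read operation may briefly need a write lock can be sketched with plain JDK classes. The following is a simplified model under assumptions, not the actual CombinedConfiguration code: LazyCombinedView and its members are hypothetical, and a joined string stands in for the combined node tree. Reads first try with a read lock and fall back to a write lock only when the cached result has been invalidated:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of a view that is rebuilt lazily from its children.
class LazyCombinedView {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> children = new ArrayList<>();
    private String combined;   // null means "must be rebuilt"; guarded by lock
    private int rebuildCount;  // guarded by lock

    /** Changing a child invalidates the combined structure. */
    void addChild(String child) {
        lock.writeLock().lock();
        try {
            children.add(child);
            combined = null;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** A logical read that may have to rebuild internal state first. */
    String read() {
        lock.readLock().lock();
        try {
            if (combined != null) {
                return combined;          // fast path: read lock only
            }
        } finally {
            lock.readLock().unlock();
        }
        // A read lock cannot be upgraded, so it was released before this point.
        lock.writeLock().lock();
        try {
            if (combined == null) {       // re-check: another thread may have rebuilt
                combined = String.join(",", children);
                rebuildCount++;
            }
            return combined;
        } finally {
            lock.writeLock().unlock();
        }
    }

    int rebuildCount() {
        lock.readLock().lock();
        try {
            return rebuildCount;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

With a synchronizer that does nothing, the rebuild step in such a design would run unprotected, which is why a combined configuration shared between threads needs a real synchronizer.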

Read-only configurations

Objects that are not changed typically play well in an environment with multiple threads - provided that they are initialized in a safe way. Specialized builders are responsible for the safe initialization of Configuration objects. These are classes derived from BasicConfigurationBuilder. Configuration builders are designed to be thread-safe: their getConfiguration() method is synchronized, so that configurations can be created and initialized in a safe way even if multiple threads are interacting with the builder. Synchronization also ensures that all values stored in member fields of newly created Configuration objects are safely published to all involved threads.
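
The effect of a synchronized getConfiguration() method can be modeled in a few lines of plain Java. This is a sketch under assumptions and not the BasicConfigurationBuilder implementation: SketchBuilder and SketchSettings are hypothetical classes, and the name field stands in for any state set during initialization. Every thread obtains the same fully initialized instance, and the object is created only once:

```java
// Hypothetical stand-in for a configuration with ordinary (non-final) fields.
class SketchSettings {
    private String name; // deliberately neither final nor volatile

    void setName(String name) { this.name = name; }
    String getName() { return name; }
}

// Hypothetical stand-in for a builder with a synchronized accessor.
class SketchBuilder {
    private SketchSettings instance; // guarded by "this"
    private int created;             // guarded by "this"

    /** Creates and initializes the instance on first access. */
    synchronized SketchSettings getConfiguration() {
        if (instance == null) {
            SketchSettings fresh = new SketchSettings();
            fresh.setName("initialized"); // finish initialization before publishing
            instance = fresh;
            created++;
        }
        return instance;
    }

    synchronized int createdCount() { return created; }

    /** Returns true if all threads received the same, fully initialized object. */
    static boolean allThreadsShareOneInstance(int threadCount) {
        SketchBuilder builder = new SketchBuilder();
        SketchSettings[] results = new SketchSettings[threadCount];
        Thread[] workers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            final int slot = t;
            workers[t] = new Thread(() -> results[slot] = builder.getConfiguration());
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        for (SketchSettings settings : results) {
            if (settings != results[0] || !"initialized".equals(settings.getName())) {
                return false;
            }
        }
        return builder.createdCount() == 1;
    }
}
```

Because each thread passes through the same monitor, the writes made during initialization happen-before the reads of any thread that later obtains the instance from the builder.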

As long as a configuration returned freshly from a builder is not changed in any way, it can be used without a special Synchronizer (this means that the default NoOpSynchronizer is used). As was discussed in the previous section, there are special cases in which read-only access to Configuration objects causes internal state changes. This would be critical without a fully functional Synchronizer object. However, the builders dealing with affected classes are implemented in such a way that they take care of these special cases and perform extra initialization steps which make write locks for later read operations unnecessary.

For instance, the builder for combined configurations explicitly accesses a newly created CombinedConfiguration object so that it is forced to construct its node tree. This happens in the builder's getConfiguration() method which is synchronized. So provided that the combined configuration is not changed (no other child configurations are added, no updates are performed on existing child configurations), no protection against concurrent access is needed - a simple NoOpSynchronizer can do the job.

The situation is similar for the other special cases described in the previous section. One exception is DynamicCombinedConfiguration: Whether an instance can be used in a read-only manner without a fully functional Synchronizer depends on the way it constructs its keys. If the keys remain constant during the lifetime of an instance (for instance, they are based on a system property specified as a startup option of the Java virtual machine), NoOpSynchronizer is sufficient. If the keys are more dynamic, a fully functional Synchronizer is required for concurrent access - even if only reads are performed.

So to sum up, except for very few cases, configurations can be read by multiple threads without having to use a special Synchronizer. For this to be safe, the configurations have to be created through a builder, and they must not be updated by any of these threads. A good way to prevent updates to a Configuration object is to wrap it in an immutable configuration.
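
The wrapping idea can be illustrated independently of the library. The sketch below uses hypothetical types only (ReadOnlySettings, MutableSettings, SettingsViews) and shows the general technique, not the library's own immutable wrapper: client code receives an object that exposes read methods only and cannot be cast back to the mutable type.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical read-only contract: no mutating methods at all. */
interface ReadOnlySettings {
    String get(String key);
    boolean containsKey(String key);
}

/** Hypothetical mutable implementation used during setup. */
class MutableSettings implements ReadOnlySettings {
    private final Map<String, String> data = new HashMap<>();

    void set(String key, String value) { data.put(key, value); }

    @Override public String get(String key) { return data.get(key); }
    @Override public boolean containsKey(String key) { return data.containsKey(key); }
}

class SettingsViews {
    /** Wraps the delegate so that callers cannot reach its mutating methods. */
    static ReadOnlySettings readOnly(ReadOnlySettings delegate) {
        return new ReadOnlySettings() {
            @Override public String get(String key) { return delegate.get(key); }
            @Override public boolean containsKey(String key) { return delegate.containsKey(key); }
        };
    }
}
```

Handing out only the wrapper documents the read-only intent in the type system, so accidental updates from worker threads fail at compile time rather than at runtime.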