Commons Lang

What's new in Commons Lang 3.0?

Commons Lang 3.0 is out, and the obvious question is: "So what? What's changed?"

The big story

Lang is now Java 5 based. We've generified the API, moved certain APIs to support varargs and thrown out any features that are now supported by Java itself. We've removed the deprecated parts of the API and have also removed some features that were deemed weak or unnecessary. All of this means that Lang 3.0 is not backwards compatible.

To that end we have changed the package name, allowing Lang 3.0 to sit side-by-side with your previous version of Lang without any bad side effects. The new package name is the exciting and original org.apache.commons.lang3. This also forces you to recompile your code, making sure the compiler can let you know if a backwards incompatibility affects you.

As you'd expect, there are also new features, enhancements and bugs fixed.

Migrating from 2.x

Java code

Despite the label of backwards incompatibility, in the vast majority of cases the simple addition of a '3' to an import statement will suffice for your migration.


Change: import org.apache.commons.lang -> import org.apache.commons.lang3

Maven

groupId: commons-lang -> org.apache.commons

artifactId: commons-lang -> commons-lang3

What's gone?

Enum package

Java 5 provides enums out of the box, so we dispensed with both the deprecated enum package and the enums package. You should migrate to the standard Java enum instead. An EnumUtils class has been born from the ashes of the old code, focused on providing additional functionality on top of the standard Java enum API.
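As a rough illustration of the migration target, a type-safe constant class built on the old enum package would typically become a plain Java 5 enum. The Color type and its fromName helper below are made up for this sketch and are not code from Lang; only JDK features are used. The helper shows the kind of null-returning lookup that a utility class layered on top of the standard enum API can add.

```java
import java.util.EnumSet;

// Hypothetical example type: what a Lang 2.x Enum subclass might become.
enum Color {
    RED, GREEN, BLUE;

    // Null-safe lookup by name (assumed helper, not part of the JDK):
    // returns null for unknown names instead of throwing
    // IllegalArgumentException as Color.valueOf would.
    static Color fromName(String name) {
        if (name == null) {
            return null;
        }
        for (Color c : values()) {
            if (c.name().equals(name)) {
                return c;
            }
        }
        return null;
    }
}

class EnumMigrationDemo {
    public static void main(String[] args) {
        System.out.println(Color.valueOf("RED"));       // built-in lookup
        System.out.println(Color.fromName("PURPLE"));   // unknown name, no exception
        System.out.println(EnumSet.allOf(Color.class)); // all constants, no hand-kept list
    }
}
```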

NestedExceptions

Java 1.4 introduced the notion that every Throwable can be linked to a cause. Lang had provided a NestedException framework to support the same feature, and now that we're jumping from Java 1.3 to Java 5 we are removing it. The deprecation section below covers one part of ExceptionUtils that remains until we are on Java 6, where the last remaining parts of the JDK appear to have embraced the new cause API.
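For readers who have not used the JDK cause API directly, here is a minimal JDK-only sketch of the chaining that replaces a nested-exception framework. The class and method names (CauseChainDemo, rootCause, wrap) are invented for this example.

```java
import java.io.IOException;

// JDK-only sketch of exception chaining via the standard cause API.
class CauseChainDemo {

    // Follows getCause() links until the innermost Throwable is reached.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    // Wraps a low-level failure in a higher-level exception, keeping the cause.
    static RuntimeException wrap(Exception lowLevel) {
        return new IllegalStateException("could not load configuration", lowLevel);
    }

    public static void main(String[] args) {
        RuntimeException e = wrap(new IOException("disk unavailable"));
        // The original failure is still reachable from the wrapper.
        System.out.println(rootCause(e).getMessage());
    }
}
```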

JVMRandom

This class was introduced in Lang 2.0 to support a Random object built around the system seed. This proved to be both an uncommon use case and a buggy one, so the class was dropped.

StringEscapeUtils.escapeSql

This was a misleading method, only handling the simplest of possible SQL cases. As SQL is not Lang's focus, it didn't make sense to maintain this method.

*Exceptions removed

Various Exception classes were removed - the lesson in defining more semantically relevant exception classes is that you can keep on coming up with more and more new classes. Simpler to focus on using the main JDK classes.

math.*Range

The various Range classes in the math package were removed in favour of a new generic Range class.

Previous Deprecations

All deprecated fields/methods/classes - with a new major version, all of the previously deprecated parts of the API could finally go away.

If you feel that something was unfairly taken away, please feel free to contact the list. In many cases the possibility exists to reintroduce code.

Deprecations

The lone deprecation in 3.0 is that of the notion of 'cause method names' in ExceptionUtils. In Java 5.0 it is still just about needed to handle some JDK classes that have not been migrated to the getCause API. In Java 6.0 things appear to be resolved and we will remove the related methods in Lang 4.0.

New packages

Two new packages have shown up: org.apache.commons.lang3.concurrent, which unsurprisingly provides support classes for multithreaded programming, and org.apache.commons.lang3.text.translate, which provides a pluggable API for text transformation.

concurrent.*

Java 5 added a wealth of functionality for multithreaded programming in the java.util.concurrent package. Commons Lang 3.0 provides some additional classes in this area which are intended to further simplify the development of concurrent applications.

The classes located in the concurrent package can be roughly divided into the following categories:

  • Utility classes
  • Initializer classes

Classes of the former category provide some basic functionality a developer typically has to implement manually again and again. Examples are a configurable ThreadFactory implementation or utility methods for the handling of ExecutionExceptions thrown by Java's executor service framework.
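To make the "implement manually again and again" point concrete, this is roughly the boilerplate a hand-written configurable ThreadFactory involves. NamedThreadFactory is a hypothetical JDK-only class written for this sketch; it is not the implementation shipped in Lang, and its constructor parameters are assumptions.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical JDK-only factory; not the class provided by Lang.
class NamedThreadFactory implements ThreadFactory {
    private final String namingPattern; // format string such as "worker-%d"
    private final boolean daemon;
    private final AtomicLong counter = new AtomicLong();

    NamedThreadFactory(String namingPattern, boolean daemon) {
        this.namingPattern = namingPattern;
        this.daemon = daemon;
    }

    @Override
    public Thread newThread(Runnable task) {
        Thread t = new Thread(task);
        // Threads are numbered from 1: worker-1, worker-2, ...
        t.setName(String.format(namingPattern, counter.incrementAndGet()));
        t.setDaemon(daemon);
        return t;
    }
}
```

Such a factory can be handed to any of the java.util.concurrent.Executors factory methods that accept a ThreadFactory.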

Initializer classes deal with the creation of objects in a multithreaded environment. There are several variants of initializer implementations serving different purposes. For instance, there are a couple of concrete initializers supporting lazy initialization of objects in a safe way. Another example is BackgroundInitializer which allows pushing the creation of an expensive object to a background thread while the application can continue with the execution of other tasks. Here is an example of the usage of BackgroundInitializer which creates an EntityManagerFactory object:

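    import javax.persistence.EntityManagerFactory;
    import javax.persistence.Persistence;
    import org.apache.commons.lang3.concurrent.BackgroundInitializer;
    // Creates the EntityManagerFactory on a background thread.
    public class DBInitializer extends BackgroundInitializer<EntityManagerFactory> {
        @Override
        protected EntityManagerFactory initialize() {
            return Persistence.createEntityManagerFactory("mypersistenceunit");
        }
    }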

An application creates an instance of the DBInitializer class and calls its start() method. When it later needs the EntityManagerFactory, it calls get(), which returns the object if it is already available and otherwise blocks until initialization is complete. Because get() declares a checked exception, the ConcurrentUtils class offers a convenience method that obtains the object from the initializer and hides that exception:

    DBInitializer init = new DBInitializer();
    init.start();

    // now do some other stuff

    EntityManagerFactory factory = ConcurrentUtils.initializeUnchecked(init);

Comprehensive documentation about the concurrent package is available in the user guide.
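For intuition about what happens between start() and get(), the same flow can be sketched with nothing but java.util.concurrent. SimpleBackgroundInit and SlowGreeting are hypothetical names invented here; this is a simplified stand-in under stated assumptions (one private single-thread executor per initializer), not the Lang implementation.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical JDK-only stand-in that mimics the start()/get() flow.
abstract class SimpleBackgroundInit<T> {
    private Future<T> future;

    // Subclasses build the expensive object here.
    protected abstract T initialize() throws Exception;

    // Submits initialize() to a background thread; later calls are no-ops.
    synchronized void start() {
        if (future != null) {
            return;
        }
        ExecutorService exec = Executors.newSingleThreadExecutor();
        future = exec.submit(new Callable<T>() {
            public T call() throws Exception {
                return initialize();
            }
        });
        exec.shutdown(); // the worker thread exits once the task has run
    }

    // Returns the result, blocking until initialization is complete.
    T get() throws InterruptedException, ExecutionException {
        Future<T> f;
        synchronized (this) {
            f = future;
        }
        if (f == null) {
            throw new IllegalStateException("start() must be called first");
        }
        return f.get();
    }

    // Same as get(), but wraps the checked exceptions in an unchecked one.
    T getUnchecked() {
        try {
            return get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}

// Example subclass with a simulated slow initialization.
class SlowGreeting extends SimpleBackgroundInit<String> {
    @Override
    protected String initialize() throws Exception {
        Thread.sleep(50); // pretend this is expensive
        return "hello";
    }
}
```

A caller would create a SlowGreeting, call start(), carry on with other work, and call get() or getUnchecked() when the value is needed.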

text.translate.*

A common complaint with StringEscapeUtils was that its escapeXml and escapeHtml methods should not be escaping non-ASCII characters. We agreed and made the change while creating a modular approach to let users define their own escaping constructs.

The simplest way to show this is to look at the code that implements escapeXml:

    return ESCAPE_XML.translate(input);

Very simple. Maybe a bit too simple, so let's look a bit deeper.

    public static final CharSequenceTranslator ESCAPE_XML =
        new AggregateTranslator(
            new LookupTranslator(EntityArrays.BASIC_ESCAPE()),
            new LookupTranslator(EntityArrays.APOS_ESCAPE())
        );

Here we see that ESCAPE_XML is a CharSequenceTranslator, specifically an AggregateTranslator made up of two lookup translators: one for the basic XML escapes and another for apostrophes. This shows one way to combine translators. Another is shown by the following example, which restores the old XML escaping behavior (escaping non-ASCII characters):

    StringEscapeUtils.ESCAPE_XML.with(NumericEntityEscaper.between(0x7f, Integer.MAX_VALUE));

That takes the standard escape functionality provided by Commons Lang and adds another translation layer on top. Another option requested in JIRA was to also escape non-printable ASCII; this is now achievable with a modification of the above:

    StringEscapeUtils.ESCAPE_XML.with(
        new AggregateTranslator(
            NumericEntityEscaper.between(0, 31),
            NumericEntityEscaper.between(0x80, Integer.MAX_VALUE)
        )
    );

You can also implement your own translators (be they for escaping, unescaping or some purpose of your own). See CharSequenceTranslator and its CodePointTranslator helper subclass for details. It is primarily a case of implementing the translate(CharSequence, int, Writer) method, which returns an int.
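The shape of that contract can be sketched without the library. SketchTranslator below is a hypothetical, simplified stand-in written for this example: its abstract method has the same parameter list, and in this sketch the returned int is the number of chars consumed (0 meaning "not handled here"). The real class treats supplementary characters more carefully, so check its Javadoc for the exact meaning of the return value before relying on this.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical stand-in for the translator contract; not Lang's class.
abstract class SketchTranslator {

    // Translate the text starting at index and write the result to out.
    // Returns how many chars were consumed, or 0 if nothing was handled.
    abstract int translate(CharSequence input, int index, Writer out) throws IOException;

    // Driver loop: copies chars the translator declines, skips what it consumed.
    final String translate(CharSequence input) {
        StringWriter out = new StringWriter(input.length() * 2);
        try {
            int pos = 0;
            while (pos < input.length()) {
                int consumed = translate(input, pos, out);
                if (consumed == 0) {
                    out.write(input.charAt(pos)); // pass through unchanged
                    pos++;
                } else {
                    pos += consumed;
                }
            }
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected with a StringWriter
        }
        return out.toString();
    }
}

// Example translator: replaces each tab with a backslash followed by 't'.
class TabEscaper extends SketchTranslator {
    @Override
    int translate(CharSequence input, int index, Writer out) throws IOException {
        if (input.charAt(index) == '\t') {
            out.write("\\t");
            return 1;
        }
        return 0;
    }
}
```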

New classes + methods

There are many new classes and methods in Lang 3.0 - the most complete way to see the changes is via this Lang2 to Lang3 Clirr report.

Here is a summary of the new classes:

  • AnnotationUtils
  • CharSequenceUtils
  • EnumUtils
  • JavaVersion - used in SystemUtils
  • Pair, ImmutablePair and MutablePair
  • Range - replaces the old math.*Range classes
  • builder.Builder
  • exception.ContextedException
  • exception.CloneFailedException
  • reflect.ConstructorUtils
  • reflect.FieldUtils
  • reflect.MethodUtils
  • reflect.TypeUtils
  • text.WordUtils

Bugfixes?

See the 3.0 changes report for the list of fixed bugs and other enhancements.

Other Notable Changes

  • StringUtils.isAlpha, isNumeric and isAlphanumeric now all return false when passed an empty String. Previously they returned true.
  • SystemUtils.isJavaVersionAtLeast now relies on the java.specification.version and not the java.version System property.
  • StringEscapeUtils.escapeXml and escapeHtml no longer escape high value Unicode characters by default. The text.translate package is available to recreate the old behavior.
  • Validate utility methods have been changed and genericized to return the validated argument where possible, to allow for inline use.
  • Validate utility methods handle validity violations arising from null values by throwing NullPointerExceptions. This better aligns with standard JDK behavior (lang is intended to complement java.lang, after all). Users upgrading from v2.x may need to adjust to this change. See Validate#isTrue() for a general-purpose mechanism to raise an IllegalArgumentException.
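The last two Validate points describe a style that is easy to picture with a small JDK-only sketch. Checks and Account are hypothetical classes invented for this illustration; the method names echo the ones mentioned above, but this is not the Lang source.

```java
// Hypothetical helper illustrating the "return the validated argument" style.
final class Checks {
    private Checks() { }

    // A null argument is reported with NullPointerException, as in the JDK.
    // The argument is returned so the check can be used inline.
    static <T> T notNull(T value, String message) {
        if (value == null) {
            throw new NullPointerException(message);
        }
        return value;
    }

    // Any other violation is reported with IllegalArgumentException.
    static void isTrue(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }
}

class Account {
    private final String owner;

    Account(String owner) {
        // Inline use: validate and assign in a single expression.
        this.owner = Checks.notNull(owner, "owner must not be null");
        Checks.isTrue(owner.length() > 0, "owner must not be empty");
    }

    String owner() {
        return owner;
    }
}
```

Code migrating from 2.x that catches IllegalArgumentException for null arguments would need to be reviewed against this change in exception type.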