The Commons Text Package
User's Guide

[Description] [text] [text.diff] [text.lookup] [text.similarity] [text.translate]

Description

The Commons Text library provides additions to the standard JDK's text handling. Our goal is to provide a consistent set of tools for processing text, from computing distances between Strings to efficiently escaping Strings in various ways.

Package org.apache.commons.text

The text package was originally added in Commons Lang 2.2; its new home is here. It provides, amongst other classes, a replacement for StringBuffer.

Beyond the text utilities ported over from Commons Lang, we have also included various string similarity and distance functions. Lastly, there are utilities for computing the differences between bodies of text so that those differences can be viewed.

Class StringEscapeUtils

As of Lang 3.5, StringEscapeUtils and StrTokenizer have moved into Text.
The package also provides ways to generate pieces of text, such as might be used for default passwords. StringEscapeUtils contains methods to escape and unescape Java, JavaScript, HTML and XML. It is worth noting that the package org.apache.commons.text.translate holds the building blocks underlying StringEscapeUtils.

Class StringSubstitutor

The simplest example is to use this class to replace Java System properties. For example:

    StringSubstitutor.replaceSystemProperties(
        "You are running with java.version = ${java.version} and os.name = ${os.name}.");

For details see StringSubstitutor.
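To illustrate what replacing ${...} placeholders with System properties involves, here is a minimal sketch that uses only the JDK. It is not the StringSubstitutor implementation: the class PlaceholderSketch and its replace method are hypothetical names invented for this example, and the sketch assumes that unresolved names should simply be left in place.

```java
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Commons Text: a minimal ${name} replacer.
final class PlaceholderSketch {

    // Matches ${name} and captures the name between the braces.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    private PlaceholderSketch() {
    }

    // Replaces each ${name} with resolver.apply(name).
    // Assumption of this sketch: names that resolve to null are left untouched.
    static String replace(String template, Function<String, String> resolver) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer(); // StringBuffer overload works on Java 8 too
        while (matcher.find()) {
            String value = resolver.apply(matcher.group(1));
            String replacement = value != null ? value : matcher.group();
            // quoteReplacement stops '$' and '\' in values from being read as group references
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replace(
            "You are running with java.version = ${java.version} and os.name = ${os.name}.",
            System::getProperty));
    }
}
```

The resolver is passed in as a function so that the same loop can be pointed at a Map, environment variables, or any other source of values.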
Use a StringSubstitutorReader to avoid reading a whole file into memory as a String. To build a default full-featured substitutor, use StringSubstitutor.createInterpolator().
The available substitutions are defined in org.apache.commons.text.lookup.StringLookupFactory.

Similarity and Distance

The "edit distances" currently supported are the implementations of the EditDistance interface in the org.apache.commons.text.similarity package.
Text diff'ing

Package org.apache.commons.text.diff

Provides algorithms for computing the differences between strings. The initial implementation of the Myers algorithm was adapted from the commons-collections sequence package.

Package org.apache.commons.text.lookup

Provides algorithms for looking up strings used by a StringSubstitutor. Standard lookups are defined in StringLookupFactory and the associated DefaultStringLookup enum.
The example below demonstrates use of the default lookups with a StringSubstitutor.

NOTE: The list of lookups available by default changed in version 1.10.0. See the documentation for StringLookupFactory for details and instructions on how to reproduce the previous behavior.

    final StringSubstitutor interpolator = StringSubstitutor.createInterpolator();
    final String text = interpolator.replace(
        "Base64 Decoder:        ${base64Decoder:SGVsbG9Xb3JsZCE=}\n" +
        "Base64 Encoder:        ${base64Encoder:HelloWorld!}\n" +
        "Java Constant:         ${const:java.awt.event.KeyEvent.VK_ESCAPE}\n" +
        "Date:                  ${date:yyyy-MM-dd}\n" +
        "Environment Variable:  ${env:USERNAME}\n" +
        "File Content:          ${file:UTF-8:src/test/resources/document.properties}\n" +
        "Java:                  ${java:version}\n" +
        "Local host:            ${localhost:canonical-name}\n" +
        "Loopback address:      ${loopbackAddress:canonical-name}\n" +
        "Properties File:       ${properties:src/test/resources/document.properties::mykey}\n" +
        "Resource Bundle:       ${resourceBundle:org.apache.commons.text.example.testResourceBundleLookup:mykey}\n" +
        "System Property:       ${sys:user.dir}\n" +
        "URL Decoder:           ${urlDecoder:Hello%20World%21}\n" +
        "URL Encoder:           ${urlEncoder:Hello World!}\n" +
        "XML Decoder:           ${xmlDecoder:&lt;element&gt;}\n" +
        "XML Encoder:           ${xmlEncoder:<element>}\n" +
        "XML XPath:             ${xml:src/test/resources/document.xml:/root/path/to/node}\n"
    );

Package org.apache.commons.text.similarity

Provides algorithms for string similarity. The algorithms that implement the EditDistance interface follow the same simple principle: the more similar (closer) two strings are, the lower the distance. For example, the words house and hose are closer than house and trousers. The algorithms available at the moment are listed in the package's API documentation.
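The principle that closer strings have a lower distance can be made concrete with a small, JDK-only sketch of the classic Levenshtein edit distance. This is an illustration under stated assumptions, not the library's code: EditDistanceSketch and its levenshtein method are hypothetical names, and the library's own classes may differ in API and optimizations.

```java
// Hypothetical helper for illustration; not the Commons Text implementation.
final class EditDistanceSketch {

    private EditDistanceSketch() {
    }

    // Dynamic-programming Levenshtein distance using two rolling rows.
    // Counts the single-character insertions, deletions and substitutions
    // needed to turn left into right.
    static int levenshtein(CharSequence left, CharSequence right) {
        int[] previous = new int[right.length() + 1];
        int[] current = new int[right.length() + 1];
        for (int j = 0; j <= right.length(); j++) {
            previous[j] = j; // empty prefix of left -> j insertions
        }
        for (int i = 1; i <= left.length(); i++) {
            current[0] = i; // empty prefix of right -> i deletions
            for (int j = 1; j <= right.length(); j++) {
                int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                current[j] = Math.min(
                    Math.min(current[j - 1] + 1, previous[j] + 1),
                    previous[j - 1] + cost);
            }
            // Reuse the arrays: the row just filled becomes "previous".
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[right.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("house", "hose"));     // prints 1
        System.out.println(levenshtein("house", "trousers")); // prints 4
    }
}
```

With this measure, house and hose are one deletion apart, while house and trousers need four edits, which matches the ordering described above.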
Package org.apache.commons.text.translate.*

An API for creating text translation routines from a set of smaller building blocks. It was initially created to make it possible for the user to customize the rules in the StringEscapeUtils class. These classes are immutable, and therefore thread-safe.
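The idea of assembling a translation routine from smaller, immutable building blocks can be sketched with the JDK alone. Everything below is hypothetical and written for this example: TranslateSketch, Translator, Lookup, Aggregate and xmlLikeEscaper are not the Commons Text classes, and the real API may be shaped differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types are the Commons Text classes.
final class TranslateSketch {

    // A building block: tries to translate the input starting at index.
    interface Translator {
        // Appends a translation to out and returns the chars consumed, or 0 if nothing matched.
        int translate(CharSequence input, int index, StringBuilder out);

        // Drives the translator over the whole input, copying unmatched chars as-is.
        default String translateAll(CharSequence input) {
            StringBuilder out = new StringBuilder(input.length());
            int i = 0;
            while (i < input.length()) {
                int consumed = translate(input, i, out);
                if (consumed == 0) {
                    out.append(input.charAt(i));
                    i++;
                } else {
                    i += consumed;
                }
            }
            return out.toString();
        }
    }

    // Replaces fixed keys with fixed values; state never changes after construction.
    static final class Lookup implements Translator {
        private final Map<String, String> table;

        Lookup(Map<String, String> table) {
            for (String key : table.keySet()) {
                if (key.isEmpty()) {
                    throw new IllegalArgumentException("empty key");
                }
            }
            // Defensive copy so later changes to the caller's map cannot leak in.
            this.table = Collections.unmodifiableMap(new LinkedHashMap<>(table));
        }

        @Override
        public int translate(CharSequence input, int index, StringBuilder out) {
            for (Map.Entry<String, String> entry : table.entrySet()) {
                String key = entry.getKey();
                int end = index + key.length();
                if (end <= input.length() && key.contentEquals(input.subSequence(index, end))) {
                    out.append(entry.getValue());
                    return key.length();
                }
            }
            return 0;
        }
    }

    // Tries each block in order; the first one that consumes input wins.
    static final class Aggregate implements Translator {
        private final List<Translator> translators;

        Aggregate(Translator... translators) {
            this.translators =
                Collections.unmodifiableList(new ArrayList<>(Arrays.asList(translators)));
        }

        @Override
        public int translate(CharSequence input, int index, StringBuilder out) {
            for (Translator translator : translators) {
                int consumed = translator.translate(input, index, out);
                if (consumed > 0) {
                    return consumed;
                }
            }
            return 0;
        }
    }

    // Composes two small lookup tables into one XML-like escaper.
    static Translator xmlLikeEscaper() {
        Map<String, String> markup = new LinkedHashMap<>();
        markup.put("&", "&amp;");
        markup.put("<", "&lt;");
        markup.put(">", "&gt;");
        Map<String, String> quotes = new LinkedHashMap<>();
        quotes.put("\"", "&quot;");
        return new Aggregate(new Lookup(markup), new Lookup(quotes));
    }

    public static void main(String[] args) {
        System.out.println(xmlLikeEscaper().translateAll("<a href=\"x\">Q&A</a>"));
    }
}
```

Because each block holds only final, unmodifiable state, a composed translator can be shared between threads without synchronization, and customizing the rules amounts to passing a different set of blocks to the aggregate.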