Proposal for Apache Commons Text Package
(0) Rationale
Algorithms for processing text, such as edit distance or string
similarity, are outside the scope of the standard Java libraries. The
Commons Text package provides these additional methods.
(1) Scope of the Package
This proposal is to create a package of Java utility classes implementing
well-known string algorithms and metrics.
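To illustrate the kind of metric in scope, the following is a minimal
sketch of the Levenshtein edit distance, using only JDK classes. The class
name EditDistanceSketch and the method name levenshtein are hypothetical
and chosen for this example only; they are not the proposed API of the
component.

```java
// Hypothetical illustration only; not part of the proposed package API.
final class EditDistanceSketch {

    private EditDistanceSketch() {
        // static utility, no instances
    }

    /**
     * Returns the minimum number of single-character insertions,
     * deletions and substitutions needed to turn left into right.
     */
    static int levenshtein(CharSequence left, CharSequence right) {
        int n = left.length();
        int m = right.length();

        // previous[j] holds the distance between the first i-1 chars of
        // left and the first j chars of right; current is row i.
        int[] previous = new int[m + 1];
        int[] current = new int[m + 1];

        // Distance from the empty prefix of left: j insertions.
        for (int j = 0; j <= m; j++) {
            previous[j] = j;
        }

        for (int i = 1; i <= n; i++) {
            // Distance to the empty prefix of right: i deletions.
            current[0] = i;
            for (int j = 1; j <= m; j++) {
                int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                int insertion = current[j - 1] + 1;
                int deletion = previous[j] + 1;
                int substitution = previous[j - 1] + cost;
                current[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[m];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting"));
    }
}
```

The sketch keeps only two rows of the dynamic-programming table, so memory
use grows with the length of the second argument rather than with the
product of both lengths.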
(1.5) Interaction With Other Packages
Commons Text relies only on standard JDK 7 (or later) APIs for
production deployment. It uses the JUnit unit testing framework and
the Hamcrest matcher library for developing and executing unit tests, but
this is of interest only to developers of the component. Commons Text may
become a dependency of existing open source components that implement
higher-order text processing.
No external configuration files are utilized.
(2) Initial Source of the Package
The initial classes came from the Commons Lang and Commons Codec subprojects.
The proposed package name for the new component is
org.apache.commons.text.
(3) Required Apache Commons Resources
- Git Repository - New repository commons-text.
- Mailing List - Discussions will take place on the general
dev@commons.apache.org mailing list. To help list subscribers
identify messages of interest, the subject line of messages about
this component should be prefixed with [text].
- Jira - New component "Commons Text" under the "Commons Sandbox" product.
- Confluence FAQ - New category "commons-text" (when available).
(4) Initial Committers
The initial committers on the Commons Text component shall be as follows:
- Benedikt Ritter (britter)
- Bruno P. Kinoshita (kinow)