org.apache.commons.codec.language.bm
Class Lang

java.lang.Object
  extended by org.apache.commons.codec.language.bm.Lang

public class Lang
extends Object

Language guessing utility.

This class encapsulates the rules used to guess which languages a word may originate from. The rules themselves are distributed in resource files.

Instances of this class are typically obtained through the static factory method instance(NameType). Unless you are developing your own language guessing rules, you will not need to interact with this class directly.

This class is intended to be immutable and thread-safe.

Lang resources

Language guessing rules are typically loaded from resource files. These are UTF-8 encoded text files. They are systematically named following the pattern:

org/apache/commons/codec/language/bm/lang.txt

The format of these resources is the following:

Port of lang.php

Since:
1.6
Version:
$Id: Lang.html 889935 2013-12-11 05:05:13Z ggregory $

Method Summary
 String guessLanguage(String text)
          Guesses the language of a word.
 Languages.LanguageSet guessLanguages(String input)
          Guesses the languages of a word.
static Lang instance(NameType nameType)
          Gets a Lang instance for one of the supported NameTypes.
static Lang loadFromResource(String languageRulesResourceName, Languages languages)
          Loads language rules from a resource.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

instance

public static Lang instance(NameType nameType)
Gets a Lang instance for one of the supported NameTypes.

Parameters:
nameType - the NameType to look up
Returns:
a Lang encapsulating the language guessing rules for that name type
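
The following is a minimal, self-contained sketch of how a factory keyed by name type can hand out shared, immutable instances. It is not the library's source: FactorySketch, Kind and Guesser are hypothetical stand-ins for Lang and NameType, and the caching strategy shown is an assumption about one reasonable design.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;

public class FactorySketch {
    /** Hypothetical stand-in for NameType. */
    public enum Kind { ASHKENAZI, GENERIC, SEPHARDIC }

    /** Hypothetical stand-in for Lang: all state is final, so instances can be shared. */
    public static final class Guesser {
        private final Kind kind;

        private Guesser(Kind kind) {
            this.kind = kind;
        }

        public Kind kind() {
            return kind;
        }
    }

    // One instance per Kind, built once when the class is initialized.
    private static final Map<Kind, Guesser> INSTANCES;

    static {
        Map<Kind, Guesser> built = new EnumMap<Kind, Guesser>(Kind.class);
        for (Kind kind : Kind.values()) {
            built.put(kind, new Guesser(kind));
        }
        INSTANCES = Collections.unmodifiableMap(built);
    }

    /** Returns the shared instance for the given kind. */
    public static Guesser instance(Kind kind) {
        return INSTANCES.get(kind);
    }

    public static void main(String[] args) {
        Guesser first = instance(Kind.GENERIC);
        Guesser second = instance(Kind.GENERIC);
        System.out.println(first == second); // true: the same object is returned each time
    }
}
```

With the library on the classpath, the corresponding call would be Lang.instance(NameType.GENERIC).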

loadFromResource

public static Lang loadFromResource(String languageRulesResourceName,
                                    Languages languages)
Loads language rules from a resource.

In normal use, you will obtain instances of Lang through the instance(NameType) method. You will only need to call this yourself if you are developing custom language mapping rules.

Parameters:
languageRulesResourceName - the fully-qualified resource name to load
languages - the languages that these rules will support
Returns:
a Lang encapsulating the loaded language-guessing rules

guessLanguage

public String guessLanguage(String text)
Guesses the language of a word.

Parameters:
text - the word
Returns:
the language that the word originates from, or Languages.ANY if there is no unique match
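
The "unique match or Languages.ANY" contract can be illustrated with a small, self-contained sketch. UniqueMatchSketch and pickUnique are hypothetical names, plain String sets stand in for the library's language sets, and the ANY constant is a local stand-in for Languages.ANY.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class UniqueMatchSketch {
    /** Local stand-in for Languages.ANY (assumption for this sketch). */
    public static final String ANY = "any";

    /** Returns the single candidate if there is exactly one, otherwise ANY. */
    public static String pickUnique(Set<String> candidates) {
        return candidates.size() == 1 ? candidates.iterator().next() : ANY;
    }

    public static void main(String[] args) {
        Set<String> one = Collections.singleton("french");
        Set<String> two = new HashSet<String>(Arrays.asList("french", "english"));

        System.out.println(pickUnique(one)); // french
        System.out.println(pickUnique(two)); // any
    }
}
```

Callers that need every candidate rather than a single answer would use guessLanguages(String) instead.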

guessLanguages

public Languages.LanguageSet guessLanguages(String input)
Guesses the languages of a word.

Parameters:
input - the word
Returns:
a Languages.LanguageSet containing the names of the languages that are potential matches for the input word
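
As a rough illustration of rule-based narrowing, the sketch below starts from every known language and lets each matching rule either rule languages in or rule them out. It is a simplified model and not the library's code: RuleNarrowingSketch, its Rule shape, the three languages and the three patterns are all invented for this example and are not taken from lang.txt.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

public class RuleNarrowingSketch {
    /** Hypothetical rule: a regex, the languages it concerns, and whether a match rules them in or out. */
    public static final class Rule {
        final Pattern pattern;
        final boolean acceptOnMatch;
        final Set<String> languages;

        Rule(String regex, boolean acceptOnMatch, String... languages) {
            this.pattern = Pattern.compile(regex);
            this.acceptOnMatch = acceptOnMatch;
            this.languages = new LinkedHashSet<String>(Arrays.asList(languages));
        }
    }

    // Invented data for illustration only.
    static final Set<String> ALL_LANGUAGES =
            new TreeSet<String>(Arrays.asList("english", "french", "german"));

    static final List<Rule> RULES = Arrays.asList(
            new Rule("eau$", true, "french"),   // a match keeps only french
            new Rule("^sch", true, "german"),   // a match keeps only german
            new Rule("w", false, "french"));    // a match removes french

    /** Narrows the full language set by applying every rule whose pattern occurs in the word. */
    public static Set<String> guess(String input) {
        String text = input.toLowerCase(Locale.ENGLISH);
        Set<String> remaining = new TreeSet<String>(ALL_LANGUAGES);
        for (Rule rule : RULES) {
            if (rule.pattern.matcher(text).find()) {
                if (rule.acceptOnMatch) {
                    remaining.retainAll(rule.languages);
                } else {
                    remaining.removeAll(rule.languages);
                }
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        System.out.println(guess("Rousseau")); // [french]
        System.out.println(guess("Schwarz"));  // [german]
        System.out.println(guess("Smith"));    // [english, french, german]
    }
}
```

A word that no rule matches keeps every language as a candidate, which is the situation where a single-answer lookup has no unique match to report.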


Copyright © 2002-2013 The Apache Software Foundation. All Rights Reserved.