/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.commons.text.similarity;

/**
 * Interface for the concept of a string similarity score.
 *
 * <p>
 * A string similarity score is intended to have <i>some</i> of the properties of a metric, while
 * allowing for exceptions, namely the Jaro-Winkler similarity score.
 * </p>
 * <p>
 * We define a SimilarityScore to be a function <code>d: [X * X] -&gt; [0, INFINITY)</code> with the
 * following properties:
 * </p>
 * <ul>
 *     <li><code>d(x,y) &gt;= 0</code>, non-negativity or separation axiom</li>
 *     <li><code>d(x,y) == d(y,x)</code>, symmetry.</li>
 * </ul>
 *
 * <p>
 * Notice that these are two of the properties that contribute to d being a metric.
 * </p>
 *
 * <p>
 * Further, this is intended to be a {@code BiFunction<CharSequence, CharSequence, R>}.
 * The <code>apply</code> method
 * accepts a pair of {@link CharSequence} parameters
 * and returns an <code>R</code> type similarity score. We have omitted the explicit
 * statement of extending BiFunction because BiFunction was only introduced in Java 1.8, and we
 * wish to maintain Java 1.7 compatibility.
 * </p>
 *
 * @param <R> The type of similarity score unit used by this SimilarityScore.
 * @since 1.0
 */
public interface SimilarityScore<R> {

    /**
     * Compares two CharSequences.
     *
     * @param left the first CharSequence
     * @param right the second CharSequence
     * @return the similarity score between two CharSequences
     */
    R apply(CharSequence left, CharSequence right);

}
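
The contract described in the Javadoc above (a non-negative, symmetric score, usable wherever a BiFunction is expected) can be illustrated with a minimal sketch. Everything here other than the shape of `SimilarityScore` is an assumption made for illustration: `CommonPrefixScore` and `SimilarityScoreDemo` are hypothetical names that are not part of the library, and the interface is re-declared locally only so that the example compiles on its own. The demo uses a Java 8 method reference, so it needs Java 8 or later even though the interface itself targets Java 1.7.

```java
import java.util.function.BiFunction;

// Local copy of the interface shape, repeated here only so the sketch is self-contained.
interface SimilarityScore<R> {
    R apply(CharSequence left, CharSequence right);
}

// Hypothetical implementation (not part of the library): the score is the
// length of the common prefix, which is non-negative and symmetric.
final class CommonPrefixScore implements SimilarityScore<Integer> {
    @Override
    public Integer apply(CharSequence left, CharSequence right) {
        if (left == null || right == null) {
            throw new IllegalArgumentException("CharSequences must not be null");
        }
        int limit = Math.min(left.length(), right.length());
        int i = 0;
        // Advance while both sequences agree character by character.
        while (i < limit && left.charAt(i) == right.charAt(i)) {
            i++;
        }
        return i;
    }
}

class SimilarityScoreDemo {
    public static void main(String[] args) {
        SimilarityScore<Integer> score = new CommonPrefixScore();

        // "similar" and "simple" share the prefix "sim".
        System.out.println(score.apply("similar", "simple")); // prints 3

        // Symmetry: swapping the arguments gives the same score.
        System.out.println(score.apply("simple", "similar")); // prints 3

        // On Java 8+, a method reference adapts the score to a BiFunction
        // without the interface having to extend BiFunction itself.
        BiFunction<CharSequence, CharSequence, Integer> fn = score::apply;
        System.out.println(fn.apply("abc", "abd")); // prints 2
    }
}
```

The method reference in the last step is one way to get BiFunction interoperability on newer runtimes while the interface stays compilable on Java 1.7.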