Metrics (Mathematics) - singa-bio/singa GitHub Wiki

The Metrics part of the Mathematics package handles functions that define distances between pairs of elements of the same kind.

The Metric interface defines a method that returns the distance between two elements. The convenience methods of the interface build on the implementation of this method. The different implementations of Metric are located in the implementations package, and ready-to-use metrics are provided by the static VectorMetricProvider class.
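To illustrate how convenience methods can rest on a single distance method, here is a minimal, self-contained sketch. It is not SiNGA's source: the interface name SketchMetric, the demo class and the absolute-difference metric are assumptions made for illustration, and only the method names calculateDistance and calculateClosestDistance are taken from the examples on this page.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;

/** Hypothetical stand-in for a metric interface; not SiNGA's actual code. */
interface SketchMetric<T> {

    /** The only method an implementation has to provide. */
    double calculateDistance(T first, T second);

    /** Convenience method built on calculateDistance: the closest candidate and its distance. */
    default Map.Entry<T, Double> calculateClosestDistance(Collection<T> candidates, T reference) {
        T closest = null;
        double minimum = Double.POSITIVE_INFINITY;
        for (T candidate : candidates) {
            double distance = calculateDistance(candidate, reference);
            if (distance < minimum) {
                minimum = distance;
                closest = candidate;
            }
        }
        return new AbstractMap.SimpleEntry<>(closest, minimum);
    }
}

class SketchMetricDemo {

    /** Absolute difference, used here as a simple metric on numbers (illustrative only). */
    static final SketchMetric<Double> ABSOLUTE_DIFFERENCE = (a, b) -> Math.abs(a - b);

    public static void main(String[] args) {
        Map.Entry<Double, Double> entry =
                ABSOLUTE_DIFFERENCE.calculateClosestDistance(Arrays.asList(1.0, 4.0, 9.0), 5.0);
        System.out.println(entry.getKey() + " = " + entry.getValue()); // prints "4.0 = 1.0"
    }
}
```

Because the interface has exactly one abstract method, a new metric can be written as a lambda and still gets the convenience method for free.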

The Metrizable interface gives an object the ability to calculate the distance between two of its instances. The MetrizableType defines the kind of attribute that is used to calculate the distance. For example, Atoms can be compared using their coordinates in the form of Vectors.
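The idea of an object that knows how to measure its distance to another instance can be sketched as follows. This page does not show the methods of Metrizable, so every name below (DistanceAware, distanceTo, SketchAtom) is hypothetical and only mirrors the atom-by-coordinates example from the text.

```java
/** Hypothetical interface: an object that can report its distance to another of its kind. */
interface DistanceAware<T> {
    double distanceTo(T other);
}

/** Hypothetical atom that is compared by its 3D coordinates (Euclidean distance). */
class SketchAtom implements DistanceAware<SketchAtom> {

    final String element;
    final double x;
    final double y;
    final double z;

    SketchAtom(String element, double x, double y, double z) {
        this.element = element;
        this.x = x;
        this.y = y;
        this.z = z;
    }

    @Override
    public double distanceTo(SketchAtom other) {
        double dx = x - other.x;
        double dy = y - other.y;
        double dz = z - other.z;
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    public static void main(String[] args) {
        SketchAtom carbon = new SketchAtom("C", 0.0, 0.0, 0.0);
        SketchAtom oxygen = new SketchAtom("O", 3.0, 4.0, 0.0);
        System.out.println(carbon.distanceTo(oxygen)); // prints 5.0
    }
}
```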

Here are some basic use cases of metrics:

List<Vector3D> vectors = new ArrayList<>();
vectors.add(new Vector3D(0.0, 0.0, 0.0));
vectors.add(new Vector3D(1.0, 0.0, 0.0));
vectors.add(new Vector3D(0.0, 1.0, 0.0));
vectors.add(new Vector3D(0.0, 0.0, 1.0));

A list of vectors to calculate distances from.

double angularDistance = VectorMetricProvider.ANGULAR_DISTANCE.calculateDistance(vectors.get(1), vectors.get(2));

Result: 0.4; the angular distance between the second and the third vector from the list (indices 1 and 2).

Vector3D reference = new Vector3D(0.3, 0.1, 0.4);
Map.Entry<Vector3D, Double> entry = VectorMetricProvider.MANHATTAN_METRIC.calculateClosestDistance(vectors, reference);

Result: The map entry (0.0, 0.0, 0.0) = 0.8, which holds the vector closest to the reference and its distance.
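The closest-distance result above can be checked by hand. The following sketch recomputes it without SiNGA, using plain double arrays in place of Vector3D. The class and method names are illustrative assumptions, not part of the library.

```java
import java.util.Arrays;
import java.util.List;

class ClosestManhattanSketch {

    /** Manhattan distance: the sum of the absolute coordinate differences. */
    static double manhattan(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    /** Index of the candidate with the smallest Manhattan distance to the reference. */
    static int closestIndex(List<double[]> candidates, double[] reference) {
        int best = -1;
        double minimum = Double.POSITIVE_INFINITY;
        for (int i = 0; i < candidates.size(); i++) {
            double distance = manhattan(candidates.get(i), reference);
            if (distance < minimum) {
                minimum = distance;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<double[]> vectors = Arrays.asList(
                new double[]{0.0, 0.0, 0.0},
                new double[]{1.0, 0.0, 0.0},
                new double[]{0.0, 1.0, 0.0},
                new double[]{0.0, 0.0, 1.0});
        double[] reference = {0.3, 0.1, 0.4};
        // Distances are roughly 0.8, 1.2, 1.6 and 1.0, so the origin (index 0) is closest.
        int index = closestIndex(vectors, reference);
        System.out.println(Arrays.toString(vectors.get(index)) + " = " + manhattan(vectors.get(index), reference));
    }
}
```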

Matrix distanceMatrix = VectorMetricProvider.MANHATTAN_METRIC.calculateDistancesPairwise(vectors);

Result: A distance matrix in which the distance was determined pairwise for all vectors.
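To make the shape of such a distance matrix concrete, here is a small sketch that builds it for the four example vectors. It uses a plain double[][] instead of SiNGA's Matrix type, and the class and method names are assumptions for illustration. How the library lays out its matrix is not specified on this page.

```java
import java.util.Arrays;

class PairwiseDistanceSketch {

    /** Manhattan distance: the sum of the absolute coordinate differences. */
    static double manhattan(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    /** Entry [i][j] holds the distance between vector i and vector j. */
    static double[][] pairwise(double[][] vectors) {
        int n = vectors.length;
        double[][] matrix = new double[n][n]; // the diagonal stays 0.0
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                double distance = manhattan(vectors[i], vectors[j]);
                matrix[i][j] = distance;
                matrix[j][i] = distance; // a metric is symmetric
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        double[][] vectors = {
                {0.0, 0.0, 0.0},
                {1.0, 0.0, 0.0},
                {0.0, 1.0, 0.0},
                {0.0, 0.0, 1.0}};
        for (double[] row : pairwise(vectors)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

For the example list, the origin is at distance 1.0 from each unit vector and any two unit vectors are at distance 2.0 from each other, so the matrix is symmetric with zeros on the diagonal.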

The class VectorMetricProvider provides some predefined metrics:

| Metric | Description |
| --- | --- |
| ANGULAR_DISTANCE | The angular distance uses the same calculation of similarity as the CosineSimilarity. The normalised angle between the vectors can be used as a bounded similarity function within [0,1]. |
| CHEBYCHEV_METRIC | The Chebyshev metric (also known as Tchebychev metric or maximum metric) is a metric defined on a vector space where the distance between two vectors is the greatest of their differences along any coordinate dimension. |
| COSINE_SIMILARITY | The cosine similarity is a similarity measure, not a distance measure. The resulting similarity ranges from −1, meaning exactly opposite, to 1, meaning exactly the same, with 0 indicating orthogonality (decorrelation) and in-between values indicating intermediate similarity or dissimilarity. |
| EUCLIDEAN_METRIC | The Euclidean metric is the "ordinary" (i.e. straight-line) distance between two points in Euclidean space. |
| MANHATTAN_METRIC | The Manhattan metric (also known as taxicab geometry, rectilinear distance, L1 distance, snake distance or city block distance) defines the distance between two points (p1 and p2) as the sum of the absolute differences of their coordinates. This equals the length of any path connecting p1 and p2 along horizontal and vertical segments without ever going back. |
| SQUARED_EUCLIDEAN_METRIC | Calculates the squared Euclidean distance between two Vectors. |
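The textbook formulas behind most of these entries fit in a few lines each. The sketch below implements them on plain double arrays. The class and method names are illustrative assumptions and the code is not taken from SiNGA, whose implementations may differ in detail.

```java
class VectorMetricFormulas {

    /** Manhattan: sum of the absolute coordinate differences. */
    static double manhattan(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    /** Chebyshev: the greatest absolute coordinate difference. */
    static double chebyshev(double[] a, double[] b) {
        double max = 0.0;
        for (int i = 0; i < a.length; i++) {
            max = Math.max(max, Math.abs(a[i] - b[i]));
        }
        return max;
    }

    /** Squared Euclidean: sum of the squared coordinate differences. */
    static double squaredEuclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double difference = a[i] - b[i];
            sum += difference * difference;
        }
        return sum;
    }

    /** Euclidean: square root of the squared Euclidean distance. */
    static double euclidean(double[] a, double[] b) {
        return Math.sqrt(squaredEuclidean(a, b));
    }

    /** Cosine similarity: dot product divided by the product of the vector lengths. */
    static double cosineSimilarity(double[] a, double[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.0, 3.0};
        double[] b = {4.0, 6.0, 3.0};
        // The coordinate differences are 3, 4 and 0.
        System.out.println(manhattan(a, b));        // prints 7.0
        System.out.println(chebyshev(a, b));        // prints 4.0
        System.out.println(squaredEuclidean(a, b)); // prints 25.0
        System.out.println(euclidean(a, b));        // prints 5.0
    }
}
```

Note that the cosine similarity is undefined for a zero-length vector. This sketch does not guard against that case.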

Further examples:

The Minkowski metric calculates the Minkowski distance of order p between two Vectors. For p >= 1, the Minkowski distance is a metric as a result of the Minkowski inequality. For p = 1 the Minkowski metric is the Manhattan or Taxicab metric, and for p = 2 it is the Euclidean metric. If p < 1, it is not a proper distance metric, since it does not satisfy the triangle inequality.

MinkowskiMetric<Vector3D> minkowski = new MinkowskiMetric<>(2.5);
double minkowskiMetric = minkowski.calculateDistance(vectors.get(1), vectors.get(2));
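The role of the order p can be made concrete with a short sketch of the standard formula, the p-th root of the sum of the p-th powers of the absolute coordinate differences. The class and method names are assumptions for illustration and the code runs without SiNGA. It also shows the triangle inequality failing for p = 0.5.

```java
class MinkowskiSketch {

    /** Minkowski distance of order p between two vectors of equal length. */
    static double minkowski(double[] a, double[] b, double p) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.pow(Math.abs(a[i] - b[i]), p);
        }
        return Math.pow(sum, 1.0 / p);
    }

    public static void main(String[] args) {
        double[] second = {1.0, 0.0, 0.0};
        double[] third = {0.0, 1.0, 0.0};
        System.out.println(minkowski(second, third, 1.0)); // Manhattan case: prints 2.0
        System.out.println(minkowski(second, third, 2.0)); // Euclidean case: the square root of 2
        System.out.println(minkowski(second, third, 2.5)); // 2 to the power of 0.4, roughly 1.32

        // For p = 0.5 the direct distance exceeds the detour via an intermediate point,
        // which violates the triangle inequality.
        double[] a = {0.0, 0.0};
        double[] b = {1.0, 0.0};
        double[] c = {1.0, 1.0};
        double direct = minkowski(a, c, 0.5);                      // 4.0
        double detour = minkowski(a, b, 0.5) + minkowski(b, c, 0.5); // 2.0
        System.out.println(direct > detour); // prints true
    }
}
```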

Using the Jaccard metric you can calculate the distance between two collections of Strings.

Metric<Collection<String>> jaccard = new JaccardMetric<>();

Set<String> first = new HashSet<>();
first.add("Apple");
first.add("Pear");
first.add("Banana");

Set<String> second = new HashSet<>();
second.add("Pear");
second.add("Cucumber");
second.add("Tomato");

double jaccardDistance = jaccard.calculateDistance(first, second);
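For reference, the standard Jaccard distance is one minus the size of the intersection divided by the size of the union. Assuming JaccardMetric follows that definition (this page does not state it explicitly), the two sets above share one element ("Pear") out of five distinct elements, which gives 1 - 1/5 = 0.8. The sketch below recomputes this without SiNGA. The class and method names, and the choice to return 0.0 for two empty sets, are assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class JaccardSketch {

    /** Standard Jaccard distance: 1 - |intersection| / |union|. */
    static <T> double jaccardDistance(Set<T> first, Set<T> second) {
        Set<T> union = new HashSet<>(first);
        union.addAll(second);
        if (union.isEmpty()) {
            return 0.0; // assumption: two empty sets are treated as identical
        }
        Set<T> intersection = new HashSet<>(first);
        intersection.retainAll(second);
        return 1.0 - (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        Set<String> first = new HashSet<>(Arrays.asList("Apple", "Pear", "Banana"));
        Set<String> second = new HashSet<>(Arrays.asList("Pear", "Cucumber", "Tomato"));
        // One shared element out of five distinct elements: 1 - 1/5
        System.out.println(jaccardDistance(first, second));
    }
}
```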