Package elki.distance

Class SqrtCosineUnitlengthDistance

  • All Implemented Interfaces:
    Distance<NumberVector>, NumberVectorDistance<NumberVector>, PrimitiveDistance<NumberVector>, SpatialPrimitiveDistance<NumberVector>

    public class SqrtCosineUnitlengthDistance
    extends CosineUnitlengthDistance
    Cosine distance function for unit length feature vectors using the square root.

    The cosine distance is computed from the cosine similarity as sqrt(2 - 2 * cosine similarity).

    Cosine similarity is defined as \[ \tfrac{\vec{x}\cdot\vec{y}}{||\vec{x}||\cdot||\vec{y}||} =_{||\vec{x}||=||\vec{y}||=1} \vec{x}\cdot\vec{y} \]

    Cosine distance is then defined as \[ \sqrt{2 - 2 \tfrac{\vec{x}\cdot\vec{y}}{||\vec{x}||\cdot||\vec{y}||}} =_{||\vec{x}||=||\vec{y}||=1} \sqrt{2 - 2\vec{x}\cdot\vec{y}} \in [0;2] \]

    This implementation assumes that \(||\vec{x}||=||\vec{y}||=1\). If this does not hold for your data, use SqrtCosineDistance instead!
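
    The unit-length assumption can be illustrated with a minimal, self-contained sketch on plain double arrays. It does not use the ELKI API: the class SqrtCosineUnitSketch and its helper methods are hypothetical names introduced for illustration only, and clamping at zero is an assumption of this sketch, not a statement about the actual implementation.

```java
// Hypothetical illustration; not part of ELKI.
final class SqrtCosineUnitSketch {
    /** Dot product of two equal-length vectors. */
    static double dot(double[] x, double[] y) {
        double s = 0.0;
        for (int i = 0; i < x.length; i++) {
            s += x[i] * y[i];
        }
        return s;
    }

    /** Returns a copy of v scaled to unit Euclidean length (v must be non-zero). */
    static double[] normalize(double[] v) {
        double norm = Math.sqrt(dot(v, v));
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] / norm;
        }
        return out;
    }

    /** sqrt(2 - 2 x.y); only meaningful if both inputs have unit length. */
    static double sqrtCosineUnit(double[] x, double[] y) {
        // Assumption of this sketch: clamp at 0 so rounding errors cannot
        // produce a slightly negative argument for the square root.
        return Math.sqrt(Math.max(0.0, 2.0 - 2.0 * dot(x, y)));
    }

    /** Plain Euclidean distance, used here only for comparison. */
    static double euclidean(double[] x, double[] y) {
        double s = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - y[i];
            s += d * d;
        }
        return Math.sqrt(s);
    }

    public static void main(String[] args) {
        // Normalize first; the formula is not valid on raw vectors.
        double[] x = normalize(new double[] {3, 4});
        double[] y = normalize(new double[] {4, 3});
        System.out.println(sqrtCosineUnit(x, y)); // about 0.2828
        System.out.println(euclidean(x, y));      // same value up to rounding
    }
}
```

    Without the normalization step, the dot product of the raw vectors (3, 4) and (4, 3) is 24, so 2 - 2 * 24 is negative and the formula is no longer meaningful. This is why un-normalized data needs SqrtCosineDistance instead.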

    The square root makes this more expensive than the regular cosine distance. In return, the result equals the Euclidean distance between the normalized vectors, so it is a metric on normalized vectors.
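
    The correspondence to Euclidean distance follows from expanding the squared Euclidean distance and using the unit-length assumption: \[ ||\vec{x}-\vec{y}||^2 = ||\vec{x}||^2 + ||\vec{y}||^2 - 2\,\vec{x}\cdot\vec{y} =_{||\vec{x}||=||\vec{y}||=1} 2 - 2\,\vec{x}\cdot\vec{y} \] Taking the square root of both sides gives \( ||\vec{x}-\vec{y}|| = \sqrt{2 - 2\,\vec{x}\cdot\vec{y}} \), so the triangle inequality carries over from the Euclidean distance.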

    Author: Erich Schubert