Package elki.data
Class BitVector
- java.lang.Object
-
- elki.data.BitVector
-
- All Implemented Interfaces:
FeatureVector<java.lang.Number>, NumberVector, SparseFeatureVector<java.lang.Number>, SparseNumberVector, SpatialComparable
public class BitVector extends java.lang.Object implements SparseNumberVector
Vector using a dense bit set encoding, based on long[] storage.
- Since:
- 0.1
- Author:
- Arthur Zimek
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | BitVector.Factory | Factory for bit vectors.
static class | BitVector.ShortSerializer | Serialization class for dense integer vectors with up to Short.MAX_VALUE dimensions, by using a short for storing the dimensionality.
-
Field Summary
Fields
Modifier and Type | Field | Description
private long[] | bits | Storing the bits.
private int | dimensionality | Dimensionality of this bit vector.
static BitVector.Factory | FACTORY | Static instance.
static ByteBufferSerializer<BitVector> | SHORT_SERIALIZER | Serializer for up to 2^15-1 dimensions.
-
Fields inherited from interface elki.data.FeatureVector
TYPE
-
Fields inherited from interface elki.data.NumberVector
ATTRIBUTE_SEPARATOR, FIELD_1D, FIELD_2D
-
Fields inherited from interface elki.data.SparseNumberVector
FIELD, VARIABLE_LENGTH
-
-
Constructor Summary
Constructors
Constructor | Description
BitVector(long[] bits, int dimensionality) | Create a new BitVector corresponding to the specified bits and of the specified dimensionality.
-
Method Summary
Modifier and Type | Method | Description
void | andOnto(long[] v) | Combine onto v using the AND operation, i.e. v &= this.
boolean | booleanValue(int dimension) | Get the value of a single bit.
int | cardinality() | Compute the vector cardinality (uncached!).
long[] | cloneBits() | Returns a copy of the bits currently set in this BitVector.
boolean | contains(long[] bitset) | Returns whether this BitVector contains all bits that are set to true in the specified bit set.
double | doubleValue(int dimension) | Returns the value in the specified dimension as double.
boolean | equals(java.lang.Object obj) | Indicates whether some other object is "equal to" this BitVector.
int | getDimensionality() | The dimensionality of the vector space of which this FeatureVector is an element.
Bit | getValue(int dimension) | Deprecated.
int | hammingDistance(BitVector v2) | Compute the Hamming distance of two bit vectors.
int | hashCode() |
boolean | intersect(BitVector v2) | Compute whether two vectors intersect.
int | intersectionSize(BitVector v2) | Compute the vector intersection size.
int | iter() | Iterator over non-zero features only, ascendingly.
int | iterAdvance(int iter) | Advance the iterator to the next position.
int | iterDim(int iter) | Get the dimension an iterator points to.
double | iterDoubleValue(int iter) | Get the value of the iterator's current dimension.
long | iterLongValue(int iter) | Get the value of the iterator's current dimension.
int | iterRetract(int iter) | Retract the iterator to the previous position.
boolean | iterValid(int iter) | Test the iterator position for validity.
double | jaccardSimilarity(BitVector v2) | Compute the Jaccard similarity of two bit vectors.
long | longValue(int dimension) | Returns the value in the specified dimension as long.
void | orOnto(long[] v) | Combine onto v using the OR operation, i.e. v |= this.
void | setDimensionality(int dimensionality) | Update the vector space dimensionality.
double[] | toArray() | Returns the values of this BitVector as a double array with getDimensionality() entries.
java.lang.String | toString() | Returns a String representation of this BitVector.
int | unionSize(BitVector v2) | Compute the vector union size.
void | xorOnto(long[] v) | Combine onto v using the XOR operation, i.e. v ^= this.
-
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface elki.data.NumberVector
getMax, getMin
-
Methods inherited from interface elki.data.SparseNumberVector
byteValue, floatValue, intValue, iterByteValue, iterFloatValue, iterIntValue, iterShortValue, shortValue
-
-
-
-
Field Detail
-
FACTORY
public static final BitVector.Factory FACTORY
Static instance.
-
SHORT_SERIALIZER
public static final ByteBufferSerializer<BitVector> SHORT_SERIALIZER
Serializer for up to 2^15-1 dimensions.
-
bits
private final long[] bits
Storing the bits.
-
dimensionality
private int dimensionality
Dimensionality of this bit vector.
-
-
Method Detail
-
getDimensionality
public int getDimensionality()
Description copied from interface: FeatureVector
The dimensionality of the vector space of which this FeatureVector is an element.
- Specified by:
getDimensionality in interface FeatureVector<java.lang.Number>
- Specified by:
getDimensionality in interface SpatialComparable
- Returns:
- the number of dimensions of this FeatureVector
-
setDimensionality
public void setDimensionality(int dimensionality)
Description copied from interface: SparseNumberVector
Update the vector space dimensionality.
- Specified by:
setDimensionality in interface SparseNumberVector
- Parameters:
dimensionality - New dimensionality
-
booleanValue
public boolean booleanValue(int dimension)
Get the value of a single bit.
- Parameters:
dimension - Bit number to get
- Returns:
true when set
-
getValue
@Deprecated public Bit getValue(int dimension)
Deprecated.
Description copied from interface: FeatureVector
Returns the value in the specified dimension.
- Specified by:
getValue in interface FeatureVector<java.lang.Number>
- Specified by:
getValue in interface NumberVector
- Parameters:
dimension - the desired dimension, where 0 ≤ dimension ≤ this.getDimensionality()-1
- Returns:
- the value in the specified dimension
-
doubleValue
public double doubleValue(int dimension)
Description copied from interface: NumberVector
Returns the value in the specified dimension as double.
Note: this might seem redundant with respect to getValue(dim).doubleValue(), but usually this is much more efficient due to boxing/unboxing cost.
- Specified by:
doubleValue in interface NumberVector
- Specified by:
doubleValue in interface SparseNumberVector
- Parameters:
dimension - the desired dimension, where 0 ≤ dimension < this.getDimensionality()
- Returns:
- the value in the specified dimension
-
longValue
public long longValue(int dimension)
Description copied from interface: NumberVector
Returns the value in the specified dimension as long.
Note: this might seem redundant with respect to getValue(dim).longValue(), but usually this is much more efficient due to boxing/unboxing cost.
- Specified by:
longValue in interface NumberVector
- Specified by:
longValue in interface SparseNumberVector
- Parameters:
dimension - the desired dimension, where 0 ≤ dimension < this.getDimensionality()
- Returns:
- the value in the specified dimension
-
iter
public int iter()
Description copied from interface: SparseNumberVector
Iterator over non-zero features only, ascendingly.
Note: depending on the underlying implementation, this may or may not be the dimension. Use SparseFeatureVector.iterDim(int) to get the actual dimension. In fact, usually this will be the ith non-zero value, assuming an array representation.
Think of this number as an iterator. For efficiency, it has a primitive type!
Intended usage:
  for (int iter = v.iter(); v.iterValid(iter); iter = v.iterAdvance(iter)) {
    final int dim = v.iterDim(iter);
    final double val = v.iterDoubleValue(iter);
    // Do something.
  }
- Specified by:
iter in interface SparseFeatureVector<java.lang.Number>
- Specified by:
iter in interface SparseNumberVector
- Returns:
- Identifier for the first non-zero dimension, not necessarily the dimension!
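As a rough illustration of how an int-valued iterator over a long[] bit store can work, the following sketch walks the set bits in ascending order. It is not ELKI's implementation: the class BitIterationSketch and its methods nextSetBit and setBits are hypothetical names, and the layout (bit i stored in word i >>> 6 at position i & 63) is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ELKI.
final class BitIterationSketch {
  /** First set bit at or after position start (start >= 0), or -1 if none. */
  static int nextSetBit(long[] bits, int start) {
    int word = start >>> 6;
    if (word >= bits.length) {
      return -1;
    }
    // Mask off the bits below 'start' in the first word
    // (Java uses only the low 6 bits of the shift count for long).
    long cur = bits[word] & (-1L << start);
    while (true) {
      if (cur != 0L) {
        return (word << 6) + Long.numberOfTrailingZeros(cur);
      }
      if (++word == bits.length) {
        return -1;
      }
      cur = bits[word];
    }
  }

  /** Collect the indices of all set bits in ascending order. */
  static List<Integer> setBits(long[] bits) {
    List<Integer> result = new ArrayList<>();
    // Same loop shape as the intended usage: start, validity test, advance.
    for (int i = nextSetBit(bits, 0); i >= 0; i = nextSetBit(bits, i + 1)) {
      result.add(i);
    }
    return result;
  }

  public static void main(String[] args) {
    long[] bits = { 0b1010L, 1L }; // bits 1, 3 and 64 are set
    System.out.println(setBits(bits)); // prints [1, 3, 64]
  }
}
```

In this sketch the iterator value happens to be the dimension itself, which is one of the cases the note above allows for; callers should still go through iterDim rather than rely on that.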
-
iterAdvance
public int iterAdvance(int iter)
Description copied from interface: SparseFeatureVector
Advance the iterator to the next position.
- Specified by:
iterAdvance in interface SparseFeatureVector<java.lang.Number>
- Parameters:
iter - Previous iterator position
- Returns:
- Next iterator position
-
iterRetract
public int iterRetract(int iter)
Description copied from interface: SparseFeatureVector
Retract the iterator to the previous position.
- Specified by:
iterRetract in interface SparseFeatureVector<java.lang.Number>
- Parameters:
iter - Next iterator position
- Returns:
- Previous iterator position
-
iterValid
public boolean iterValid(int iter)
Description copied from interface: SparseFeatureVector
Test the iterator position for validity.
- Specified by:
iterValid in interface SparseFeatureVector<java.lang.Number>
- Parameters:
iter - Iterator position
- Returns:
true when it refers to a valid position.
-
iterDim
public int iterDim(int iter)
Description copied from interface: SparseFeatureVector
Get the dimension an iterator points to.
- Specified by:
iterDim in interface SparseFeatureVector<java.lang.Number>
- Parameters:
iter - Iterator position
- Returns:
- Dimension the iterator refers to
-
iterDoubleValue
public double iterDoubleValue(int iter)
Description copied from interface: SparseNumberVector
Get the value of the iterator's current dimension.
- Specified by:
iterDoubleValue in interface SparseNumberVector
- Parameters:
iter - Iterator
- Returns:
- Value at the current position
-
iterLongValue
public long iterLongValue(int iter)
Description copied from interface: SparseNumberVector
Get the value of the iterator's current dimension.
- Specified by:
iterLongValue in interface SparseNumberVector
- Parameters:
iter - Iterator
- Returns:
- Value at the current position
-
toArray
public double[] toArray()
Returns the values of this BitVector as a double array with getDimensionality() entries.
- Specified by:
toArray in interface NumberVector
- Returns:
- a double array with getDimensionality() entries holding the values of this BitVector
- See Also:
NumberVector.toArray()
-
contains
public boolean contains(long[] bitset)
Returns whether this BitVector contains all bits that are set to true in the specified bit set.
- Parameters:
bitset - the bits to inspect in this BitVector
- Returns:
- true if this BitVector contains all bits that are set to true in the specified bit set, false otherwise
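A containment test of this kind reduces to checking that no bit is set in the argument but clear in the vector. The sketch below shows that check on raw long[] words; BitContainsSketch and containsAll are hypothetical names, and treating missing trailing words as zero is an assumption of the example, not documented library behavior.

```java
// Hypothetical helper, not part of ELKI.
final class BitContainsSketch {
  /** True if every bit set in 'other' is also set in 'self'. */
  static boolean containsAll(long[] self, long[] other) {
    for (int i = 0; i < other.length; i++) {
      // Assumption: words beyond the end of 'self' count as all-zero.
      long mine = i < self.length ? self[i] : 0L;
      // Any bit set in 'other' but not in 'mine' violates containment.
      if ((other[i] & ~mine) != 0L) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    long[] self = { 0b1110L };
    System.out.println(containsAll(self, new long[] { 0b0110L })); // prints true
    System.out.println(containsAll(self, new long[] { 0b0001L })); // prints false
  }
}
```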
-
cloneBits
public long[] cloneBits()
Returns a copy of the bits currently set in this BitVector.
- Returns:
- a copy of the bits currently set in this BitVector
-
cardinality
public int cardinality()
Compute the vector cardinality (uncached!).
- Returns:
- Vector cardinality
-
jaccardSimilarity
public double jaccardSimilarity(BitVector v2)
Compute the Jaccard similarity of two bit vectors.
- Parameters:
v2 - Second bit vector
- Returns:
- Jaccard similarity (intersection / union)
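The ratio intersection / union can be computed word by word with population counts. The following sketch works on plain long[] arrays and is only an illustration: JaccardSketch, cardinality and jaccard are hypothetical names, and returning 1.0 when both inputs are empty is a choice made for this example (how the library handles that case is not stated here).

```java
// Hypothetical helper, not part of ELKI.
final class JaccardSketch {
  /** Number of set bits in the array. */
  static int cardinality(long[] bits) {
    int count = 0;
    for (long word : bits) {
      count += Long.bitCount(word);
    }
    return count;
  }

  /** Jaccard similarity: size of intersection divided by size of union. */
  static double jaccard(long[] a, long[] b) {
    int intersection = 0, union = 0;
    int n = Math.max(a.length, b.length);
    for (int i = 0; i < n; i++) {
      long x = i < a.length ? a[i] : 0L;
      long y = i < b.length ? b[i] : 0L;
      intersection += Long.bitCount(x & y);
      union += Long.bitCount(x | y);
    }
    // Assumption of this sketch: two empty sets are treated as identical.
    return union == 0 ? 1.0 : (double) intersection / union;
  }

  public static void main(String[] args) {
    long[] a = { 0b0111L }; // bits 0, 1, 2
    long[] b = { 0b1110L }; // bits 1, 2, 3
    System.out.println(cardinality(a)); // prints 3
    System.out.println(jaccard(a, b)); // prints 0.5
  }
}
```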
-
hammingDistance
public int hammingDistance(BitVector v2)
Compute the Hamming distance of two bit vectors.
- Parameters:
v2 - Second bit vector
- Returns:
- Hamming distance (number of differing bits)
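The number of differing bits is the population count of the XOR of the two bit sets. A minimal sketch on raw long[] words follows; HammingSketch and hamming are hypothetical names, and padding the shorter array with zero words is an assumption of the example.

```java
// Hypothetical helper, not part of ELKI.
final class HammingSketch {
  /** Number of positions at which the two bit sets differ. */
  static int hamming(long[] a, long[] b) {
    int distance = 0;
    int n = Math.max(a.length, b.length);
    for (int i = 0; i < n; i++) {
      // Assumption: missing trailing words count as all-zero.
      long x = i < a.length ? a[i] : 0L;
      long y = i < b.length ? b[i] : 0L;
      // XOR leaves a 1 exactly where the inputs differ.
      distance += Long.bitCount(x ^ y);
    }
    return distance;
  }

  public static void main(String[] args) {
    long[] a = { 0b0111L }; // bits 0, 1, 2
    long[] b = { 0b1110L }; // bits 1, 2, 3
    System.out.println(hamming(a, b)); // prints 2
  }
}
```

For the same two inputs the union has 4 bits and the intersection 2, so the distance also equals union size minus intersection size.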
-
intersectionSize
public int intersectionSize(BitVector v2)
Compute the vector intersection size.
- Parameters:
v2 - Second bit vector
- Returns:
- Intersection size (number of bits in both)
-
unionSize
public int unionSize(BitVector v2)
Compute the vector union size.
- Parameters:
v2 - Second bit vector
- Returns:
- Union size (number of bits set in either)
-
intersect
public boolean intersect(BitVector v2)
Compute whether two vectors intersect.
- Parameters:
v2 - Second bit vector
- Returns:
true if they intersect in at least one bit.
-
andOnto
public void andOnto(long[] v)
Combine onto v using the AND operation, i.e. v &= this.
- Parameters:
v - Existing bit set of same length.
-
orOnto
public void orOnto(long[] v)
Combine onto v using the OR operation, i.e. v |= this.
- Parameters:
v - Existing bit set of same length.
-
xorOnto
public void xorOnto(long[] v)
Combine onto v using the XOR operation, i.e. v ^= this.
- Parameters:
v - Existing bit set of same length.
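The three "onto" operations modify the caller's array in place rather than allocating a result. The sketch below mimics that pattern on plain long[] arrays; BitCombineSketch and its methods are hypothetical stand-ins in which the second argument plays the role of this vector's bits, and rejecting arrays of different length is an assumption based on the "same length" requirement above.

```java
// Hypothetical helper, not part of ELKI.
final class BitCombineSketch {
  private static void checkSameLength(long[] v, long[] self) {
    if (v.length != self.length) {
      throw new IllegalArgumentException("bit sets must have the same length");
    }
  }

  /** v &= self, written into v. */
  static void andOnto(long[] v, long[] self) {
    checkSameLength(v, self);
    for (int i = 0; i < v.length; i++) {
      v[i] &= self[i];
    }
  }

  /** v |= self, written into v. */
  static void orOnto(long[] v, long[] self) {
    checkSameLength(v, self);
    for (int i = 0; i < v.length; i++) {
      v[i] |= self[i];
    }
  }

  /** v ^= self, written into v. */
  static void xorOnto(long[] v, long[] self) {
    checkSameLength(v, self);
    for (int i = 0; i < v.length; i++) {
      v[i] ^= self[i];
    }
  }

  public static void main(String[] args) {
    long[] self = { 0b1100L };
    long[] v = { 0b1010L };
    andOnto(v, self);
    System.out.println(Long.toBinaryString(v[0])); // prints 1000
  }
}
```

Reusing one accumulator array across many vectors avoids a fresh allocation per combination, which is presumably why the operations write into the argument.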
-
toString
public java.lang.String toString()
Returns a String representation of this BitVector. The representation is suitable to be parsed by BitVectorLabelParser.
- Specified by:
toString in interface FeatureVector<java.lang.Number>
- Overrides:
toString in class java.lang.Object
- Returns:
- a String representation of the FeatureVector
-
equals
public boolean equals(java.lang.Object obj)
Indicates whether some other object is "equal to" this BitVector. This BitVector is equal to the given object if the object is a BitVector of the same dimensionality and with identical bits set.
- Overrides:
equals in class java.lang.Object
-
hashCode
public int hashCode()
- Overrides:
hashCode in class java.lang.Object
-
-