TY - GEN
T1 - Evaluation of Jensen-Shannon distance over sparse data
AU - Connor, Richard
AU - Cardillo, Franco Alberto
AU - Moss, Robert
AU - Rabitti, Fausto
PY - 2013/10/30
Y1 - 2013/10/30
N2 - Jensen-Shannon divergence is a symmetrised, smoothed version of Küllback-Leibler. It has been shown to be the square of a proper distance metric, and has other properties which make it an excellent choice for many high-dimensional spaces in ℝ*. The metric as defined is however expensive to evaluate. In sparse spaces over many dimensions the Intrinsic Dimensionality of the metric space is typically very high, making similarity-based indexing ineffectual. Exhaustive searching over large data collections may be infeasible. Using a property that allows the distance to be evaluated from only those dimensions which are non-zero in both arguments, and through the identification of a threshold function, we show that the cost of the function can be dramatically reduced.
AB - Jensen-Shannon divergence is a symmetrised, smoothed version of Küllback-Leibler. It has been shown to be the square of a proper distance metric, and has other properties which make it an excellent choice for many high-dimensional spaces in ℝ*. The metric as defined is however expensive to evaluate. In sparse spaces over many dimensions the Intrinsic Dimensionality of the metric space is typically very high, making similarity-based indexing ineffectual. Exhaustive searching over large data collections may be infeasible. Using a property that allows the distance to be evaluated from only those dimensions which are non-zero in both arguments, and through the identification of a threshold function, we show that the cost of the function can be dramatically reduced.
UR - http://www.scopus.com/inward/record.url?scp=84886443337&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-41062-8_16
DO - 10.1007/978-3-642-41062-8_16
M3 - Conference contribution
AN - SCOPUS:84886443337
SN - 9783642410611
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 163
EP - 168
BT - Similarity Search and Applications - 6th International Conference, SISAP 2013, Proceedings
T2 - 6th International Conference on Similarity Search and Applications, SISAP 2013
Y2 - 2 October 2013 through 4 October 2013
ER -