Evaluation of Jensen-Shannon distance over sparse data

Richard Connor, Franco Alberto Cardillo, Robert Moss, Fausto Rabitti

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Jensen-Shannon divergence is a symmetrised, smoothed version of Kullback-Leibler divergence. It has been shown to be the square of a proper distance metric, and has other properties which make it an excellent choice for many high-dimensional spaces in ℝ*. The metric as defined is, however, expensive to evaluate. In sparse spaces over many dimensions, the intrinsic dimensionality of the metric space is typically very high, making similarity-based indexing ineffectual, and exhaustive searching over large data collections may be infeasible. Using a property that allows the distance to be evaluated from only those dimensions which are non-zero in both arguments, and through the identification of a threshold function, we show that the cost of the function can be dramatically reduced.
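
The property mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it does not attempt the threshold function, which the abstract does not specify. It assumes base-2 logarithms and sparse vectors stored as dictionaries of strictly positive values that each sum to 1; the function names are hypothetical. Under those assumptions the standard definition of the divergence rearranges to 1 - ½ Σ [(p+q) log₂(p+q) - p log₂ p - q log₂ q], where the sum runs only over dimensions that are non-zero in both arguments. A dense reference version is included for comparison.

```python
import math


def js_distance_sparse(p, q):
    """Jensen-Shannon distance from shared non-zero dimensions only.

    Assumes p and q are dicts {dimension: value} with strictly positive
    values that each sum to 1, and that logarithms are base 2.
    """
    # Iterate over the smaller vector and probe the larger one.
    if len(p) > len(q):
        p, q = q, p
    acc = 0.0
    for dim, pv in p.items():
        qv = q.get(dim)
        if qv:  # dimension is non-zero in both arguments
            s = pv + qv
            acc += s * math.log2(s) - pv * math.log2(pv) - qv * math.log2(qv)
    divergence = 1.0 - 0.5 * acc
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(divergence, 0.0))


def js_distance_dense(p, q):
    """Reference version: textbook definition over every dimension."""
    divergence = 0.0
    for dim in set(p) | set(q):
        pv, qv = p.get(dim, 0.0), q.get(dim, 0.0)
        m = 0.5 * (pv + qv)
        if pv > 0.0:
            divergence += 0.5 * pv * math.log2(pv / m)
        if qv > 0.0:
            divergence += 0.5 * qv * math.log2(qv / m)
    return math.sqrt(max(divergence, 0.0))


a = {0: 0.5, 1: 0.5}
b = {1: 0.5, 2: 0.5}
same = math.isclose(js_distance_sparse(a, b), js_distance_dense(a, b))
```

The sparse version touches only the intersection of the two supports, so for vectors with few shared dimensions it performs far fewer logarithm evaluations than the dense loop.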

Original language: English
Title of host publication: Similarity Search and Applications - 6th International Conference, SISAP 2013, Proceedings
Pages: 163-168
Number of pages: 6
Publication status: Published - 30 Oct 2013
Event: 6th International Conference on Similarity Search and Applications, SISAP 2013 - A Coruna, Spain
Duration: 2 Oct 2013 → 4 Oct 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8199 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 6th International Conference on Similarity Search and Applications, SISAP 2013
Country/Territory: Spain
City: A Coruna
Period: 2/10/13 → 4/10/13
