TY - GEN
T1 - High dimensional search using polyhedral query
AU - Connor, Richard
AU - MacKenzie-Leigh, Stewart
AU - Moss, Robert
PY - 2014/1/1
Y1 - 2014/1/1
N2 - It is well known that, as the dimensionality of a metric space increases, metric search techniques become less effective and the cost of indexing mechanisms becomes greater than the saving they give. This is due to the so-called curse of dimensionality. One effect of increasing dimensionality is that the ratio of unit hypersphere to unit hypercube volume decreases rapidly, making the solution to a similarity query (the query ball, or hypersphere) ever more difficult to identify by using metric invariants such as triangle inequality. In this paper we take a different approach, by identifying points within a query polyhedron rather than a ball. We show how this can be achieved by constructing a surrogate metric space, such that a query ball in the surrogate space corresponds to a polyhedron in the original space. If the polyhedron contains the ball, the overall cost of the query is likely to be increased in high dimensions; however, we show that shrinking the polyhedron can capture a surprisingly high proportion of the points within the ball, whilst at the same time giving a more efficient, and more scalable, search. We show results which confirm our underlying hypothesis. In some cases we can retrieve significant volumes of query results from spaces which are otherwise intractable.
AB - It is well known that, as the dimensionality of a metric space increases, metric search techniques become less effective and the cost of indexing mechanisms becomes greater than the saving they give. This is due to the so-called curse of dimensionality. One effect of increasing dimensionality is that the ratio of unit hypersphere to unit hypercube volume decreases rapidly, making the solution to a similarity query (the query ball, or hypersphere) ever more difficult to identify by using metric invariants such as triangle inequality. In this paper we take a different approach, by identifying points within a query polyhedron rather than a ball. We show how this can be achieved by constructing a surrogate metric space, such that a query ball in the surrogate space corresponds to a polyhedron in the original space. If the polyhedron contains the ball, the overall cost of the query is likely to be increased in high dimensions; however, we show that shrinking the polyhedron can capture a surprisingly high proportion of the points within the ball, whilst at the same time giving a more efficient, and more scalable, search. We show results which confirm our underlying hypothesis. In some cases we can retrieve significant volumes of query results from spaces which are otherwise intractable.
UR - http://www.scopus.com/inward/record.url?scp=84911167198&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-11988-5_16
DO - 10.1007/978-3-319-11988-5_16
M3 - Conference contribution
AN - SCOPUS:84911167198
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 176
EP - 188
BT - Similarity Search and Applications - 7th International Conference, SISAP 2014, Proceedings
A2 - Traina, Agma Juci Machado
A2 - Traina, Caetano
A2 - Cordeiro, Robson Leonardo Ferreira
PB - Springer-Verlag
T2 - 7th International Conference on Similarity Search and Applications, SISAP 2014
Y2 - 29 October 2014 through 31 October 2014
ER -