Query filtering using two-dimensional local embeddings

Lucia Vadicamo, Richard Connor*, Edgar Chávez

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


In high-dimensional datasets, exact indexes are ineffective for proximity queries, and a sequential scan over the entire dataset is unavoidable. Accepting this, we present a new approach employing two-dimensional embeddings. Each database element is mapped to the XY plane using the four-point property. The caveat is that the mapping is local: each object is mapped using its own mapping, which may differ from that used for other objects.

The idea is that each element of the data is associated with a pair of reference objects that is well-suited to filtering out that particular object when it is not relevant to a query. This maximises the probability of excluding that object from a search. At query time, the query is compared with a pool of reference objects, which allows it to be mapped onto all the planes used by data objects. Then, for each query/object pair, a lower bound of the actual distance is obtained. The technique can be applied to any metric space that possesses the four-point property, which includes spaces under the Euclidean, Cosine, Triangular, Jensen–Shannon, and Quadratic Form distances.
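The filtering step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes Euclidean distance (which has the four-point property), places the two reference objects of each pair on the X axis, and takes the assignment of a pair to each data object as an input (`pair_of`), since the selection strategy is not given here. All function and parameter names are hypothetical.

```python
import math


def project(d1, d2, delta):
    """Map an object onto the plane defined by two reference objects.

    d1, d2: distances from the object to the first and second reference object.
    delta:  distance between the two reference objects (must be > 0).
    The first reference object sits at (0, 0), the second at (delta, 0).
    """
    x = (d1 * d1 - d2 * d2 + delta * delta) / (2.0 * delta)
    y = math.sqrt(max(0.0, d1 * d1 - x * x))  # clamp guards against rounding
    return (x, y)


def build_index(data, pivots, pair_of):
    """Store, for each object, its pivot pair and its 2D coordinates.

    pair_of[i] is an (a, b) pair of indices into `pivots` for data[i].
    How that pair is chosen is outside the scope of this sketch.
    """
    index = []
    for obj, (a, b) in zip(data, pair_of):
        delta = math.dist(pivots[a], pivots[b])
        xy = project(math.dist(obj, pivots[a]), math.dist(obj, pivots[b]), delta)
        index.append(((a, b), xy))
    return index


def range_filter(query, pivots, index, threshold):
    """Return indices of objects that the lower bound cannot exclude."""
    # One distance computation per reference object in the pool.
    q_dists = [math.dist(query, p) for p in pivots]
    q_proj = {}  # query coordinates, cached per pivot pair
    survivors = []
    for i, ((a, b), o_xy) in enumerate(index):
        if (a, b) not in q_proj:
            delta = math.dist(pivots[a], pivots[b])
            q_proj[(a, b)] = project(q_dists[a], q_dists[b], delta)
        q_xy = q_proj[(a, b)]
        # Planar distance is a lower bound on the true distance
        # in any space with the four-point property.
        lower_bound = math.hypot(q_xy[0] - o_xy[0], q_xy[1] - o_xy[1])
        if lower_bound <= threshold:
            survivors.append(i)
    return survivors
```

Under these assumptions each indexed object needs only a pair identifier and two coordinates. The objects returned by `range_filter` are candidates only and still have to be checked against the query with the original distance.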

Our experiments show that for all the datasets tested, of varying dimensionality, our approach can filter more objects than a standard metric indexing approach. For low-dimensional data this does not make a good search mechanism in its own right, as it does not scale with the size of the data: its cost is linear with respect to the data size. However, we also show that it can be added as a post-filter to other mechanisms, increasing efficiency with little extra cost in space or time. For high-dimensional data, we show related approximate techniques which, we believe, give the best known compromise for speeding up the essential sequential scan. The potential uses of our filtering technique include pure GPU searching, taking advantage of the tiny memory footprint of the mapping.
Original language: English
Article number: 101808
Number of pages: 13
Journal: Information Systems
Early online date: 18 Jun 2021
Publication status: Published - 1 Nov 2021


Keywords

  • Metric search
  • Extreme pivoting
  • Supermetric space
  • Four-point property
  • Pivot based index

