Spectral Partitioning and Nearest Neighbor Search
In the Approximate Nearest Neighbor Search (NNS) problem, the goal is to preprocess a given set of points P in d-dimensional space into a data structure so that, given a query point q, we can quickly find a point p in P that is approximately closest to q according to some distance measure. NNS is an important primitive in many data analysis tasks and has numerous applications in computational geometry, databases, computer vision, and machine learning, among others. In this talk I will focus on the high-dimensional regime, in which space or query time exponential in the dimension is unacceptable.

I will describe a way to apply spectral partitioning of graphs to the NNS problem, yielding a generic reduction from nonlinear spectral gaps of metric spaces to data-dependent Locality-Sensitive Hashing, and with it a new approach to NNS data structures in high-dimensional spaces. Using this reduction, we obtain NNS data structures in the cell-probe model for any d-dimensional norm with O(log d) approximation, near-linear space, and sublinear query complexity. We also obtain improved NNS data structures for classical norms, such as the Lp norms and the Schatten-p matrix norms. Finally, combining this reduction with a smooth map between spheres of normed spaces due to Daher, we obtain the first efficient high-dimensional NNS data structure whose approximation is sub-polynomial in the dimension d.
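To make the NNS interface and the role of Locality-Sensitive Hashing concrete, here is a minimal sketch in Python. It uses classical data-independent random-hyperplane hashing for points on the unit sphere, not the data-dependent scheme from the talk; the class name, parameters, and data below are illustrative assumptions, not part of the work being presented.

```python
# Minimal sketch of the NNS interface via classical random-hyperplane LSH.
# This is the simple data-INdependent scheme; the talk's data-dependent
# hashing is substantially more involved. All names here are illustrative.
import numpy as np

class HyperplaneLSH:
    """Bucket points on the unit sphere by the sign pattern of k random projections."""

    def __init__(self, points, k=8, num_tables=10, seed=0):
        rng = np.random.default_rng(seed)
        self.points = points                       # (n, d) array of data points
        d = points.shape[1]
        # Each hash table uses k independent random hyperplanes through the origin;
        # nearby points agree on most signs, so they tend to share a bucket.
        self.planes = rng.standard_normal((num_tables, k, d))
        self.tables = []
        for planes in self.planes:
            table = {}
            codes = points @ planes.T > 0          # (n, k) sign pattern per point
            for i, code in enumerate(codes):
                table.setdefault(code.tobytes(), []).append(i)
            self.tables.append(table)

    def query(self, q):
        """Return the index of the closest point among candidates colliding with q."""
        candidates = set()
        for planes, table in zip(self.planes, self.tables):
            code = (planes @ q > 0).tobytes()
            candidates.update(table.get(code, []))
        if not candidates:                         # no collision: fall back to brute force
            candidates = range(len(self.points))
        return min(candidates, key=lambda i: np.linalg.norm(self.points[i] - q))

# Usage: 10,000 random points on the unit sphere in R^128, queried near point 42.
rng = np.random.default_rng(1)
P = rng.standard_normal((10_000, 128))
P /= np.linalg.norm(P, axis=1, keepdims=True)
q = P[42] + 0.1 * rng.standard_normal(128)
q /= np.linalg.norm(q)
index = HyperplaneLSH(P)
print(index.query(q))                              # likely 42: an approximate nearest neighbor
```

The sublinear query time comes from examining only the points colliding with q in some table, rather than all n points; the design tension, which the talk's spectral approach addresses for general norms, is choosing hash families where near points collide often and far points rarely do.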
Based on joint work with Alexandr Andoni, Assaf Naor, Ilya Razenshteyn, and Erik Waingarten.