The k-NN plugin builds on the Non-Metric Space Library (NMSLIB), a lightweight open source library that implements approximate k-NN search using Hierarchical Navigable Small World (HNSW) graphs. NMSLIB is a highly efficient k-NN implementation that has consistently outperformed most other solutions in the published ANN-Benchmarks results. It can also be extended with new search methods and distance functions.
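To make concrete what a k-NN query returns, here is a minimal sketch of an exact, brute-force search in plain Python. It is not NMSLIB and not HNSW: it scans every vector, which is the cost that an approximate graph-based index is designed to avoid. The function names and the sample points are hypothetical, chosen only for illustration.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(
    query: Sequence[float],
    dataset: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[float, int]]:
    """Return the k nearest (distance, index) pairs, closest first.

    This is a full scan over the dataset (exact search), shown only as a
    baseline; an HNSW index trades some accuracy for far fewer comparisons.
    """
    scored = ((euclidean(query, vec), idx) for idx, vec in enumerate(dataset))
    return heapq.nsmallest(k, scored)


# Hypothetical sample data.
points = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [2.0, 2.0]]
neighbors = exact_knn([1.2, 1.0], points, k=2)
print([idx for _, idx in neighbors])
```

An approximate index aims to return the same neighbors most of the time while comparing the query against only a small fraction of the stored vectors.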
The plugin extends the Lucene codec with a separate file format for storing and retrieving k-NN indices, which enables efficient k-NN search on Elasticsearch. Vectors are stored in Elasticsearch fields using a new data type, knn_vector, which holds a single list of up to 10,000 floats.
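As a rough sketch of how a knn_vector field might be declared and queried, the Python below builds the JSON request bodies without sending them anywhere. The request shapes (the `index.knn` setting, the `dimension` mapping parameter, and the `knn` query clause) reflect the plugin's request format as we understand it and should be checked against the plugin version in use. The field name `my_vector` and the helper function names are hypothetical.

```python
import json
from typing import Any, Dict, Sequence

# Upper bound on vector length, taken from the text above.
MAX_DIMENSION = 10_000


def build_index_body(field: str, dimension: int) -> Dict[str, Any]:
    """Body for creating an index with one knn_vector field (assumed shape)."""
    if not 1 <= dimension <= MAX_DIMENSION:
        raise ValueError(f"dimension must be between 1 and {MAX_DIMENSION}")
    return {
        "settings": {"index": {"knn": True}},  # assumed setting name
        "mappings": {
            "properties": {
                field: {"type": "knn_vector", "dimension": dimension}
            }
        },
    }


def build_knn_query(field: str, vector: Sequence[float], k: int) -> Dict[str, Any]:
    """Body for a search that asks for the k nearest neighbors of `vector`."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return {
        "size": k,
        "query": {"knn": {field: {"vector": list(vector), "k": k}}},
    }


# "my_vector" is a hypothetical field name.
print(json.dumps(build_index_body("my_vector", 2), indent=2))
print(json.dumps(build_knn_query("my_vector", [2.0, 3.0], k=2), indent=2))
```

The bodies would typically be sent with any HTTP or Elasticsearch client: the first when creating the index, the second to the index's `_search` endpoint.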
k-NN functionality integrates with other Elasticsearch features, so users can combine k-NN with Elasticsearch’s existing search capabilities, such as aggregations and filtering, to narrow the result set and improve search precision.
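One way to combine k-NN with filtering is sketched below: a `knn` query clause paired with Elasticsearch's standard `post_filter`. This is an illustrative request body only, built in Python and not sent to a cluster. The field names `my_vector` and `price`, the helper name, and the `knn` clause shape are assumptions to verify against the plugin version in use.

```python
import json
from typing import Any, Dict, Sequence


def build_filtered_knn_query(
    vector_field: str,
    vector: Sequence[float],
    k: int,
    range_field: str,
    gte: float,
    lte: float,
) -> Dict[str, Any]:
    """k-NN search whose hits are then narrowed by a range post_filter."""
    if gte > lte:
        raise ValueError("gte must not exceed lte")
    return {
        "size": k,
        # Assumed shape of the plugin's knn query clause.
        "query": {"knn": {vector_field: {"vector": list(vector), "k": k}}},
        # Standard Elasticsearch post_filter, applied to the hits afterwards.
        "post_filter": {"range": {range_field: {"gte": gte, "lte": lte}}},
    }


# "my_vector" and "price" are hypothetical field names.
body = build_filtered_knn_query("my_vector", [2.0, 3.0], 2, "price", 6, 10)
print(json.dumps(body, indent=2))
```

Because a post filter runs after the nearest neighbors have been collected, a request like this can return fewer than k hits when some neighbors fall outside the filter.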