Efficient High-dimensional Vector Search

Picture of mengxuan-zhang.md Mengxuan Zhang

8 Jan 2026

The embedding of unstructured data, e.g., text, image, audio, as high-dimensional vectors, is an emerging and popular way to represent, manage and utilize the data from diverse sources. The storage of such high-dimensional vectors is called vector databases, which serve as indispensable external knowledge repositories by providing AI models with contextually relevant information, i.e., Retrieval-Augmented Generation (RAG). To facilitate efficient information retrieval, the Approximate Nearest Neighbor (ANN) search returns semantically similar vectors for a given query vector with the help of indexes. Nevertheless, the high dimensionality of vector datasets brings the curse of dimensionality, which poses great challenges to ANN index performance. 

View the Facebook AI Similarity Search (Faiss) introduction on GitHub and Pinecone, for more preliminary knowledge of vector databases.

Below are several research projects to improve the ANN search performance.

Project 1: Effective Graph Construction for Approximate Nearest Neighbor Search

One type of index is a similarity graph, where similar vectors are retrieved through graph traversal. The quality of the similarity graph is critical since it determines and would largely affect the search performance, i.e., efficiency and accuracy. To address this concern, this project will focus on designing a novel, effective graph construction to improve ANN search performance.

Requirements: C++ implementation skill, basic understanding of Graph Theory 

Related works on similarity graph construction and graph-based ANN search:

  • Fu, Cong, et al. “Fast approximate nearest neighbor search with the navigating spreading-out graph.” PVLDB 2019.
  • Wang, Mengzhao, et al. “A comprehensive survey and experimental comparison of graph-based approximate nearest neighbor search.” VLDB 2021.
  • Malkov, Yu A., and Dmitry A. Yashunin. “Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs.” TPAMI 2018.
  • Dong, Wei, Charikar Moses, and Kai Li. “Efficient k-nearest neighbor graph construction for generic similarity measures.” WWW 2011.

Project 2: Speeding Up ANN Search through Query-Space Aware Caching 

In real-life applications, the submitted queries tend to be highly skewed, with a long-tail and head-heavy distribution. Based on this phenomenon, we can identify the frequent query space and cache their query results, which can be reused in the subsequent query answering. Therefore, this project focuses on speeding up ANN query answering through caching.

Requirements: C++ implementation skill, foundation in data structures and algorithms

Related works on cache-based query speed-up:

  • Li, Lingli, Zhanyu He, and Zhuo Zhang. “ANN-Cache: Accelerating Approximate Nearest Neighbor Search via Caching.” Authorea Preprints (2025).
  • Zhou, Yijie, et al. “GoVector: An I/O-Efficient Caching Strategy for High-Dimensional Vector Nearest Neighbor Search.” arXiv preprint arXiv:2508.15694 (2025).
  • Zeng, Ximu, et al. “LIRA: A Learning-based Query-aware Partition Framework for Large-scale ANN Search.” Proceedings of the ACM on Web Conference 2025. 2025.

Project 3: GPU-accelerated ANN search

Real-life applications (e.g., OpenAI, Copilot, Spotify, Amazon, Meta) relying on vector DB have high demands on query efficiency, i.e., hundreds of thousands of queries need to be processed per second on million- or billion-scale datasets. To meet the real-time query response, this project will explore how to accelerate ANN search with the help of GPUs because of their strong parallel computation. 

Requirements: C++ and GPU programming skills

Related works on GPU-accelerated computation and ANN search:

  • Groh, Fabian, et al. “Ggnn: Graph-based gpu nearest neighbor search.” IEEE Transactions on Big Data 9.1 (2022): 267-279.
  • Zhao, Weijie, Shulong Tan, and Ping Li. “Song: Approximate nearest neighbor search on gpu.” ICDE 2020.
  • Ootomo, Hiroyuki, et al. “Cagra: Highly parallel graph construction and approximate nearest neighbor search for gpus.” ICDE 2024.

Each of these projects is conducted independently and research-focused. At least one (high-quality) paper can be/ is expected to be produced if properly conducted for each project.

If you are interested in one of them and meet the requirements listed, please read the related papers first and contact Mengxuan, specifying your interested project title in your email.

arrow-left bars magnifying-glass xmark