Vector Search and RAG with ApertureDB

ApertureDB can store and manage multimodal data, including text, documents, images, videos, blobs, clips, and frames, together with their associated embeddings. With support for indexing embeddings of any dimension, and for performing K-nearest neighbor (KNN) search and classification, ApertureDB provides all the necessary capabilities of a vector database.

| ApertureDB Term | Equivalent Elsewhere |
| --- | --- |
| DescriptorSet | collection, index, vector store |
| Descriptor | embedding, feature vector |

Descriptor Sets, Collections, Vector Search Space, or Vector Index

A DescriptorSet (also called a collection, vector search space, or vector index) is created to store embeddings. It is a group of descriptors of a specified dimension, all produced by the same feature-extraction algorithm. For instance, we can create a DescriptorSet, insert multiple descriptors obtained using OpenFace (128 dimensions), and then index and run matching operations over those descriptors. This set defines the search space for our embeddings.

The engine (e.g. HNSW) and distance metric (e.g. cosine) are assigned to the set when it is created. All embeddings added to the set are then indexed using that engine, and KNN search is based on the specified distance metric. ApertureDB allows descriptor sets to be created with multiple engines and distance metrics, so users can choose KNN criteria on the fly.
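As a minimal sketch, a DescriptorSet can be created with ApertureDB's JSON query language using the `AddDescriptorSet` command. The set name `my_embeddings` is an illustrative assumption; the commented-out connection calls show where a live `db.query` would execute the request.

```python
def make_add_descriptor_set(name: str, dimensions: int,
                            engine: str = "HNSW", metric: str = "CS") -> list:
    """Build an AddDescriptorSet query; the engine and distance metric
    are fixed when the set is created."""
    return [{
        "AddDescriptorSet": {
            "name": name,
            "dimensions": dimensions,
            "engine": engine,   # e.g. HNSW
            "metric": metric,   # e.g. CS (cosine similarity)
        }
    }]

# A 128-dimensional set, matching the OpenFace example above:
query = make_add_descriptor_set("my_embeddings", 128)

# With a live connection (not shown here), this would be executed as:
#   from aperturedb.CommonLibrary import create_connector
#   db = create_connector()
#   response, _ = db.query(query)
```

Because the engine and metric are bound at creation time, choosing them up front (or creating the set with multiple engines/metrics) is what later lets you pick KNN criteria on the fly.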

Descriptors or Embeddings

Unimodal or multimodal embeddings are always added to a DescriptorSet, along with any metadata properties and labels associated with them.

The descriptor's values must be provided as a blob in the array of blobs. The blob is a binary array of 32-bit floating-point values, so its size in bytes must be dimensions * 4.
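A minimal sketch of packing a vector into that blob format and pairing it with an `AddDescriptor` command (the set name, label, and property are illustrative assumptions):

```python
import struct

def descriptor_blob(values):
    """Pack a sequence of floats into a binary array of 32-bit floats."""
    return struct.pack(f"{len(values)}f", *values)

dimensions = 128
vector = [0.0] * dimensions
blob = descriptor_blob(vector)
assert len(blob) == dimensions * 4  # the size rule stated above

# The blob is sent alongside an AddDescriptor command:
add_query = [{
    "AddDescriptor": {
        "set": "my_embeddings",           # assumed set name
        "label": "example",
        "properties": {"source": "demo"},
    }
}]
# db.query(add_query, [blob])  # requires a live connection
```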

Quickstart Notebook

Vector Search Quickstart — create a DescriptorSet, add vectors, run KNN search, and use the Python SDK Descriptors wrapper. The quickstart also links to notebooks showing vector search support for various data types, such as images, videos, documents, and audio.
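The KNN search step from the quickstart can be sketched with a `FindDescriptor` query, where the query vector travels as a blob. The set name and result fields are assumptions for illustration:

```python
import struct

# Pack a 128-dimensional query vector as 32-bit floats:
query_vector = [0.1] * 128
query_blob = struct.pack("128f", *query_vector)

# Ask for the 5 nearest descriptors, returning labels and distances:
knn_query = [{
    "FindDescriptor": {
        "set": "my_embeddings",     # assumed set name
        "k_neighbors": 5,
        "distances": True,          # include distance values in results
        "results": {"list": ["label"]},
    }
}]
# response, _ = db.query(knn_query, [query_blob])  # requires a live connection
```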

Application Examples

Some examples of ApertureDB being used as a vector database:

Building RAG Pipelines

MMR (Maximal Marginal Relevance) reranking is implemented in the Python SDK via Descriptors.find_similar_mmr and is utilized in the LangChain and LlamaIndex integrations.
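For intuition, the scoring rule behind MMR trades relevance to the query against redundancy with already-selected results. The following is a plain-Python illustration of that rule, not the SDK's `find_similar_mmr` implementation:

```python
def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

def mmr(query, candidates, k, lambda_mult=0.3):
    """Greedily select k candidate indices by Maximal Marginal Relevance:
    score = lambda * relevance - (1 - lambda) * redundancy."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query, candidates[i])
            redundancy = max((cosine(candidates[i], candidates[j])
                              for j in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Two near-duplicate relevant vectors plus one distinct vector:
q = [1.0, 0.0]
cands = [[1.0, 0.0], [0.99, 0.01], [0.0, 1.0]]
picks = mmr(q, cands, k=2)  # favors the distinct vector over the near-duplicate
```

With a low `lambda_mult`, the second pick skips the near-duplicate in favor of the more diverse vector, which is why MMR reranking reduces redundancy in retrieved RAG context.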

Graph capabilities let you scope retrieval to a subgraph — for example, restricting responses based on user access permissions for secure RAG or building real-world RAG applications.
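One way to sketch scoped retrieval is to add metadata constraints to the KNN query so only permitted descriptors are searched. The property name `allowed_group`, its value, and the set name are illustrative assumptions:

```python
import struct

query_blob = struct.pack("128f", *([0.2] * 128))

# KNN search restricted to descriptors this user is allowed to see:
scoped_query = [{
    "FindDescriptor": {
        "set": "my_embeddings",
        "k_neighbors": 5,
        "constraints": {
            "allowed_group": ["==", "engineering"],  # assumed access-control property
        },
        "results": {"list": ["label", "allowed_group"]},
    }
}]
# response, _ = db.query(scoped_query, [query_blob])  # requires a live connection
```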

See the Building RAG Pipelines guide for a full pipeline walkthrough.

ApertureDB vs. Other Vector Databases

ApertureDB delivers 2–10x higher KNN throughput compared to Pinecone, Weaviate, Qdrant, Chroma, Lance, and Milvus across embedding sizes from 96 to 4096 dimensions, with sub-10ms query latency for typical RAG workloads. Pricing does not vary by dimension count, number of embeddings, or query volume.

See the full benchmarks page, or request the performance whitepaper (access required).

Demos

How ApertureDB powers real-world multimodal AI applications:

- Face similarity search combining metadata filters and KNN search on the CelebA dataset.
- Expose ApertureDB as a tool for AI agents via the Model Context Protocol: the MCP Server workflow makes ApertureDB available to MCP-compatible agents (Claude, Cursor, etc.) without manually writing retrieval code.
- Crawl a website and build a RAG question-answering agent using the ApertureDB Workflows UI.
- Extract text from PDFs, chunk, embed, and store as descriptors in ApertureDB, with no code required.
- Generate and search video frame embeddings using the ApertureDB Workflows UI.
- Run face detection and similarity search using the ApertureDB Workflows UI.

For more videos, visit the ApertureData YouTube channel.

References