Vector Search and RAG with ApertureDB
ApertureDB can store and manage multimodal data, including text, documents, images, videos, blobs, clips, frames, and their associated embeddings. With support for indexing embeddings of any dimensionality, and for performing K-nearest neighbor (KNN) search and classification, ApertureDB offers everything needed from a vector database.
| ApertureDB Term | Equivalent Elsewhere |
|---|---|
| DescriptorSet | collection, index, vector store |
| Descriptor | embedding, feature vector |
Descriptor Sets, Collections, Vector Search Space, or Vector Index
A DescriptorSet, also called a collection, vector search space, or vector index, is created to store embeddings. It is a group of descriptors of a specified dimensionality, all produced by the same feature-extraction algorithm. For instance, we can create a DescriptorSet, insert multiple descriptors obtained using OpenFace (128 dimensions), and then index and perform matching operations over those descriptors. This set defines the search space for our embeddings.
The engine (e.g. HNSW) and distance metrics (e.g. cosine) are assigned to the set when creating it. All the embeddings added to the set are then indexed using the engine and KNN is based on the specified distance metric. ApertureDB allows descriptor sets to be created with multiple engines and distance metrics so users can choose KNN criteria on the fly.
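As a sketch of what set creation looks like, here is an `AddDescriptorSet` query expressed in ApertureDB's JSON query language. The set name `openface_128` is hypothetical, and the exact engine and metric identifiers accepted may vary by server version, so consult the AQL reference for the values your deployment supports.

```python
# Hypothetical set name; dimensions must match every descriptor added later.
add_set = [{
    "AddDescriptorSet": {
        "name": "openface_128",
        "dimensions": 128,
        "engine": "HNSW",   # index engine, fixed at creation time
        "metric": "CS",     # cosine similarity (assumed identifier)
    }
}]

# With a connected client, this would be executed as:
# responses, blobs = client.query(add_set)
```

Because the engine and metric are bound to the set at creation, any later KNN search against this set uses them by default.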
Descriptors or Embeddings
Unimodal or multimodal embeddings are always added to a DescriptorSet, along with any metadata properties and labels associated with them.
A blob must be provided in the array of blobs, containing the descriptor's values. The blob is a binary array of 32-bit floating point values, so its size in bytes must be dimensions * 4.
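The size rule above is easy to satisfy with NumPy: serialize a float32 vector and attach it as the query's blob. The set name `openface_128` and the label are illustrative placeholders.

```python
import numpy as np

dimensions = 128
embedding = np.ones(dimensions, dtype=np.float32)  # placeholder vector

# The descriptor blob is the raw float32 buffer: dimensions * 4 bytes.
blob = embedding.tobytes()
assert len(blob) == dimensions * 4

add_descriptor = [{
    "AddDescriptor": {
        "set": "openface_128",   # hypothetical set name
        "label": "person_a",     # optional label stored with the descriptor
    }
}]

# Executed with the blob attached, e.g.:
# responses, _ = client.query(add_descriptor, [blob])
```

Note that a float64 array would double the blob size and violate the size rule, so the `astype`/`dtype` step is not optional.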
Vector Search Quickstart — create a DescriptorSet, add vectors, run KNN search, and use the Python SDK Descriptors wrapper. The quickstart also links to notebooks showing vector search support for various data types, such as images, videos, documents, and audio.
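The KNN step of the quickstart can be sketched as a `FindDescriptor` query, with the query vector passed as a blob alongside the JSON command. Set name and result fields are assumptions; verify the exact option names against the AQL Descriptor reference.

```python
import numpy as np

query_vec = np.zeros(128, dtype=np.float32)  # placeholder query embedding

knn = [{
    "FindDescriptor": {
        "set": "openface_128",   # hypothetical set name
        "k_neighbors": 5,        # return the 5 nearest descriptors
        "distances": True,       # include distance to the query vector
        "results": {"list": ["_label", "_distance"]},
    }
}]

# Requires a running ApertureDB instance:
# responses, _ = client.query(knn, [query_vec.tobytes()])
```

The distance metric itself is not specified here; it comes from the engine and metric bound to the set when it was created.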
Application Examples
Some examples of ApertureDB being used as a vector database:
- Finding faces in images using multimodal embeddings
- Building agents using ApertureDB vector search
- Video semantic search
- GraphRAG with Gemini and ApertureDB
Building RAG Pipelines
MMR (Maximal Marginal Relevance) reranking is implemented in the Python SDK via Descriptors.find_similar_mmr and is used in the LangChain and LlamaIndex integrations.
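`find_similar_mmr` needs a running database, but the reranking idea can be shown standalone: greedy MMR trades off relevance to the query against redundancy with results already selected. This is a minimal sketch of the technique itself, not the SDK's implementation; the function name and the `lam` trade-off parameter are illustrative.

```python
import numpy as np

def mmr_rerank(query, candidates, k=3, lam=0.5):
    """Greedy Maximal Marginal Relevance over unit-normalized row vectors.

    lam=1.0 ranks purely by relevance; lam=0.0 purely by diversity.
    Returns the indices of the k selected candidates, in selection order.
    """
    sims_q = candidates @ query  # cosine relevance of each candidate
    selected, remaining = [], list(range(len(candidates)))
    while remaining and len(selected) < k:
        if not selected:
            best = max(remaining, key=lambda i: sims_q[i])
        else:
            chosen = candidates[selected]
            best = max(
                remaining,
                key=lambda i: lam * sims_q[i]
                - (1 - lam) * float(np.max(chosen @ candidates[i])),
            )
        selected.append(best)
        remaining.remove(best)
    return selected
```

Given two near-duplicate top hits, MMR picks one of them and then prefers a more distinct third vector over the duplicate, which is why it helps diversify RAG context windows.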
Graph capabilities let you scope retrieval to a subgraph — for example, restricting responses based on user access permissions for secure RAG or building real-world RAG applications.
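Scoping retrieval by metadata can be sketched by attaching constraints to the search, so only descriptors matching a property (here a hypothetical `team` property used as an access-control tag) are candidates for KNN. The constraint syntax shown follows ApertureDB's general constraints pattern; confirm the supported operators in the AQL reference.

```python
# Hypothetical set and property names; the "==" operator form follows
# ApertureDB's general constraints convention.
scoped_knn = [{
    "FindDescriptor": {
        "set": "docs_embeddings",
        "k_neighbors": 5,
        "constraints": {
            "team": ["==", "analytics"],  # only this team's documents
        },
        "results": {"list": ["_label"]},
    }
}]

# Executed with the query embedding as a blob:
# responses, _ = client.query(scoped_knn, [query_vec.tobytes()])
```

Filtering at query time like this avoids post-filtering the KNN results, which could otherwise leak fewer than k authorized documents.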
See the Building RAG Pipelines guide for a full pipeline walkthrough.
ApertureDB vs. Other Vector Databases
ApertureDB delivers 2–10x higher KNN throughput compared to Pinecone, Weaviate, Qdrant, Chroma, Lance, and Milvus across embedding sizes from 96 to 4096 dimensions, with sub-10ms query latency for typical RAG workloads. Pricing does not vary by dimension count, number of embeddings, or query volume.
See the full benchmarks page, or request the performance whitepaper (access required).
Demos
The MCP Server workflow exposes ApertureDB directly to MCP-compatible agents (Claude, Cursor, etc.) without writing retrieval code manually.
For more videos, visit the ApertureData YouTube channel.
References
- Descriptors Python SDK — find_similar, find_similar_mmr
- AQL Descriptor Reference
- More VectorDB and RAG Resources — videos, demos, and community examples