Vector Search Quickstart

This notebook shows the core vector search workflow in ApertureDB:

1. Create a DescriptorSet (vector index)
2. Add Descriptors (embeddings with metadata)
3. Run KNN search to find similar items

For real embeddings and runnable end-to-end examples, jump straight to the notebooks linked at the bottom.

Connect to ApertureDB

Option A: ApertureDB Cloud (recommended)
Sign up for a free 30-day trial, get your key from Connect > Generate API Key, and add it to a .env file in this directory:

APERTUREDB_KEY=your_key_here

Option B: Community Edition (local Docker)
Run this in a terminal before starting the notebook:

docker run -d --name aperturedb \
  -p 55555:55555 -e ADB_MASTER_KEY=admin -e ADB_FORCE_SSL=false \
  aperturedata/aperturedb-community

See client configuration options for all connection methods and server setup options for deployment choices.

%pip install --upgrade --quiet aperturedb python-dotenv

# Option A: ApertureDB Cloud
from dotenv import load_dotenv
load_dotenv()  # loads APERTUREDB_KEY from .env into the environment

True
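
The True above only says that load_dotenv() set at least one variable, not that the key itself is present. A minimal sketch of an explicit check follows; `key_is_configured` is a hypothetical helper introduced here for illustration, not part of aperturedb or python-dotenv.

```python
import os

def key_is_configured(env=os.environ, name="APERTUREDB_KEY"):
    """Hypothetical helper: True if the variable is present and not blank."""
    return bool(env.get(name, "").strip())

# Report the problem without ever printing the key itself.
if not key_is_configured():
    print("APERTUREDB_KEY is not set; check the .env file or use Option B")
```

Checking for presence rather than echoing the value keeps the key out of the saved notebook output.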

# Option B: Community Edition (local Docker)
# !adb config create localdb --active \
#     --host localhost --port 55555 \
#     --username admin --password admin \
#     --no-use-ssl --no-interactive

from aperturedb.CommonLibrary import create_connector

client = create_connector()
response, _ = client.query([{"GetStatus": {}}])
client.print_last_response()

[
    {
        "GetStatus": {
            "info": "OK",
            "status": 0,
            "system": "ApertureDB",
            "version": "0.19.6"
        }
    }
]

Create a Vector Index

A DescriptorSet is a named, indexed collection of vectors. All vectors in a set must have the same number of dimensions.

SET_NAME = "recipe_search"

client.query([{
    "AddDescriptorSet": {
        "name":       SET_NAME,
        "dimensions": 4,           # use 384, 512, 1024, etc. for real models
        "engine":     "FaissFlat", # exact search; use HNSW for large-scale ANN
        "metric":     "CS",        # cosine similarity; or "L2" for Euclidean
    }
}])
client.print_last_response()

[
    {
        "AddDescriptorSet": {
            "status": 0
        }
    }
]

Add Vectors

Each Descriptor is a float32 vector plus optional metadata properties. The vector is passed as a binary blob.

import numpy as np

dishes = [
    {"name": "Butter Chicken",  "cuisine": "Indian",    "vec": [0.9, 0.1, 0.8, 0.2]},
    {"name": "Rajma Chawal",    "cuisine": "Indian",    "vec": [0.8, 0.2, 0.9, 0.1]},
    {"name": "Ramen",           "cuisine": "Japanese",  "vec": [0.1, 0.9, 0.2, 0.8]},
    {"name": "Sushi",           "cuisine": "Japanese",  "vec": [0.2, 0.8, 0.1, 0.9]},
    {"name": "Focaccia",        "cuisine": "Italian",   "vec": [0.5, 0.5, 0.6, 0.4]},
]

for dish in dishes:
    vec = np.array(dish["vec"], dtype="float32")
    client.query([{
        "AddDescriptor": {
            "set": SET_NAME,
            "properties": {"name": dish["name"], "cuisine": dish["cuisine"]},
        }
    }], [vec.tobytes()])

print(f"Added {len(dishes)} descriptors")

Added 5 descriptors
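
Because the vector travels as raw bytes, a wrong length or dtype is easy to miss on the client side. The sketch below shows one way to validate a vector before sending it; `to_descriptor_blob` is a hypothetical helper, not part of the aperturedb SDK, and it assumes the 4-dimensional set created earlier.

```python
import numpy as np

def to_descriptor_blob(vec, dimensions):
    """Hypothetical helper: validate a vector and serialize it as float32 bytes."""
    arr = np.asarray(vec, dtype=np.float32)
    # The set accepts only 1-D vectors of exactly `dimensions` elements.
    if arr.ndim != 1 or arr.shape[0] != dimensions:
        raise ValueError(
            f"expected a 1-D vector of length {dimensions}, got shape {arr.shape}"
        )
    # NaN or inf values would make similarity scores meaningless.
    if not np.all(np.isfinite(arr)):
        raise ValueError("vector contains NaN or infinite values")
    return arr.tobytes()

blob = to_descriptor_blob([0.9, 0.1, 0.8, 0.2], dimensions=4)
print(len(blob))  # 4 float32 values at 4 bytes each
```

The returned bytes could be passed in place of `vec.tobytes()` in the loop above.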

KNN Search

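FindDescriptor takes a query vector and returns the k nearest neighbors by the set's distance metric. With the CS metric used here, _distance is a cosine similarity, so higher values mean closer matches.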

query_vec = np.array([0.85, 0.15, 0.85, 0.15], dtype="float32")  # close to Indian dishes

response, _ = client.query([{
    "FindDescriptor": {
        "set":         SET_NAME,
        "k_neighbors": 3,
        "distances":   True,
        "results":     {"all_properties": True},
    }
}], [query_vec.tobytes()])

client.print_last_response()

[
    {
        "FindDescriptor": {
            "entities": [
                {
                    "_distance": 0.9966610670089722,
                    "_set_name": "recipe_search",
                    "_uniqueid": "3.192.488740",
                    "cuisine": "Indian",
                    "name": "Butter Chicken"
                },
                {
                    "_distance": 0.9966610670089722,
                    "_set_name": "recipe_search",
                    "_uniqueid": "3.193.488760",
                    "cuisine": "Indian",
                    "name": "Rajma Chawal"
                },
                {
                    "_distance": 0.867941677570343,
                    "_set_name": "recipe_search",
                    "_uniqueid": "3.196.488820",
                    "cuisine": "Italian",
                    "name": "Focaccia"
                }
            ],
            "returned": 3,
            "status": 0
        }
    }
]
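
As a sanity check, the `_distance` values in the response can be reproduced locally as cosine similarities between the query and the stored vectors. This sketch uses only NumPy and the vectors defined in this notebook; the function name `cosine_similarity` is introduced here for illustration.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = [0.85, 0.15, 0.85, 0.15]
vectors = {
    "Butter Chicken": [0.9, 0.1, 0.8, 0.2],
    "Rajma Chawal":   [0.8, 0.2, 0.9, 0.1],
    "Focaccia":       [0.5, 0.5, 0.6, 0.4],
}

scores = {name: cosine_similarity(query, vec) for name, vec in vectors.items()}
for name, score in scores.items():
    print(f"{name:<16} {score:.4f}")
```

The two Indian dishes should come out near 0.9967 and Focaccia near 0.8679, agreeing with the server response up to float32 rounding.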

Python SDK: `Descriptors` Wrapper

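The Descriptors class in the Python SDK wraps the query language and adds reranking with Maximal Marginal Relevance (MMR), which trades similarity to the query against redundancy among the results.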

from aperturedb.Descriptors import Descriptors

descriptors = Descriptors(client)

# Basic similarity search — distances available
descriptors.find_similar(
    set=SET_NAME,
    vector=query_vec,
    k_neighbors=3,
    distances=True,
)
print("find_similar:")
for r in descriptors.response:
    print(f"  {r['name']:<20} distance={r['_distance']:.4f}")

print()

# MMR: diversify results (avoids near-duplicates)
# Note: find_similar_mmr uses blobs internally for reranking;
# _distance is not available in the output.
descriptors.find_similar_mmr(
    set=SET_NAME,
    vector=query_vec,
    k_neighbors=3,
    fetch_k=5,
    lambda_mult=0.5,   # 0.0 = max diversity, 1.0 = similarity only
)
print("find_similar_mmr (diversified):")
for r in descriptors.response:
    print(f"  {r['name']:<20} cuisine={r['cuisine']}")

find_similar:
  Butter Chicken       distance=0.9967
  Rajma Chawal         distance=0.9967
  Focaccia             distance=0.8679

find_similar_mmr (diversified):
  Butter Chicken       cuisine=Indian
  Rajma Chawal         cuisine=Indian
  Focaccia             cuisine=Italian
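
To make the role of `lambda_mult` concrete, here is a minimal sketch of the textbook greedy MMR procedure over cosine similarity. It is not taken from the SDK source: `mmr_select` is a hypothetical function, and the SDK's internal ordering and tie-breaking may differ, so its output need not match the cell above row for row.

```python
import numpy as np

def mmr_select(query, candidates, k, lambda_mult=0.5):
    """Hypothetical greedy MMR; returns indices into `candidates`."""
    c = np.asarray(candidates, dtype=np.float64)
    if k <= 0 or len(c) == 0:
        return []
    q = np.asarray(query, dtype=np.float64)
    q = q / np.linalg.norm(q)
    c = c / np.linalg.norm(c, axis=1, keepdims=True)
    relevance = c @ q    # cosine similarity of each candidate to the query
    pairwise = c @ c.T   # cosine similarity between candidates
    selected = [int(np.argmax(relevance))]  # start with the best match
    while len(selected) < min(k, len(c)):
        # Redundancy: how close each candidate is to anything already picked.
        redundancy = pairwise[:, selected].max(axis=1)
        scores = lambda_mult * relevance - (1.0 - lambda_mult) * redundancy
        scores[selected] = -np.inf  # never pick the same item twice
        selected.append(int(np.argmax(scores)))
    return selected

names = ["Butter Chicken", "Rajma Chawal", "Ramen", "Sushi", "Focaccia"]
vectors = [
    [0.9, 0.1, 0.8, 0.2],
    [0.8, 0.2, 0.9, 0.1],
    [0.1, 0.9, 0.2, 0.8],
    [0.2, 0.8, 0.1, 0.9],
    [0.5, 0.5, 0.6, 0.4],
]
picked = mmr_select([0.85, 0.15, 0.85, 0.15], vectors, k=3, lambda_mult=0.5)
print([names[i] for i in picked])
```

With `lambda_mult=1.0` the redundancy term vanishes and the selection reduces to plain similarity ranking; lowering it penalizes candidates that resemble items already chosen.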

Cleanup

client.query([{"DeleteDescriptorSet": {"with_name": SET_NAME}}])
client.print_last_response()

[
    {
        "DeleteDescriptorSet": {
            "count": 1,
            "status": 0
        }
    }
]

Next Steps

Replace the synthetic vectors above with real embeddings from your data:

Data type	Notebook
Text / documents	Recipe Text Search — sentence-transformers on Cookbook dish descriptions
PDF	Work with PDFs — chunk, embed, and search a PDF blob
Images	Image Vector Search — CLIP embeddings on dish images, text-to-image search
Video frames	Video Vector Search — CLIP frame embeddings, text-to-frame search
Audio	Audio Vector Search — audio embedding and search
Bulk loading	Bulk Embeddings — ParallelLoader for large-scale ingestion
Hybrid search	Hybrid Search — combine KNN with metadata filters
