Filtering, Aggregation, and Facets

Hybrid search combines KNN vector similarity with metadata filters in a single query. ApertureDB applies the constraints server-side during the search, not as a post-filter, so every returned result is semantically relevant and also satisfies the filter conditions.

Runnable Notebook

Hybrid Search — seven patterns on the Cookbook dataset: KNN with filters, sort, count, average, and group by (facets)

How It Works

Add a constraints block to FindDescriptor. The engine evaluates the constraint during KNN traversal — not after — so k_neighbors results are drawn from the filtered subset:

q = [{
    "FindDescriptor": {
        "set":         "recipe_search",
        "k_neighbors": 5,
        "constraints": {
            "cuisine": ["==", "Indian"],   # server-side filter
        },
        "distances": True,
        "results":   {"all_properties": True},
    }
}]

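# client is a connected ApertureDB client; query_embedding is the query
# vector as a NumPy array, sent to the server as raw bytes
response, _ = client.query(q, [query_embedding.tobytes()])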
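
To make the difference between filtering during the search and post-filtering concrete, here is a small local sketch in plain Python. It does not call ApertureDB: the toy records, the `euclidean` helper, and both function names are invented for illustration, and a brute-force scan stands in for whatever index the engine actually uses.

```python
import math

# Toy records: (embedding, properties). All values are invented for illustration.
RECIPES = [
    ([0.0, 0.1], {"name": "carbonara", "cuisine": "Italian"}),
    ([0.1, 0.0], {"name": "risotto", "cuisine": "Italian"}),
    ([0.2, 0.2], {"name": "dal", "cuisine": "Indian"}),
    ([0.9, 0.9], {"name": "biryani", "cuisine": "Indian"}),
    ([1.0, 0.8], {"name": "korma", "cuisine": "Indian"}),
]

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_then_filter(query, k, cuisine):
    """Post-filter: take the k nearest first, then drop non-matching rows."""
    nearest = sorted(RECIPES, key=lambda r: euclidean(query, r[0]))[:k]
    return [p["name"] for _, p in nearest if p["cuisine"] == cuisine]

def filter_then_knn(query, k, cuisine):
    """Filtered search: the k nearest are drawn from matching rows only."""
    matching = [r for r in RECIPES if r[1]["cuisine"] == cuisine]
    nearest = sorted(matching, key=lambda r: euclidean(query, r[0]))[:k]
    return [p["name"] for _, p in nearest]

query = [0.0, 0.0]
post = knn_then_filter(query, 3, "Indian")
pre = filter_then_knn(query, 3, "Indian")
print(post)
print(pre)
```

With this toy data the post-filter variant returns a single recipe, because two of the three nearest neighbors are Italian, while the filtered variant still returns three Indian recipes. That is the behavior the section describes for `constraints` combined with `k_neighbors`.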

Constraint Operators

Operator	Example	Meaning
`==`	`["==", "Indian"]`	exact match
`!=`	`["!=", "Italian"]`	not equal
`>`	`[">", 2023]`	greater than
`>=`	`[">=", 2023]`	greater than or equal
`<`	`["<", 600]`	less than
`<=`	`["<=", 600]`	less than or equal
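
As a rough local illustration of what each operator means, the table can be mirrored with Python's standard `operator` module. The `OPS` mapping and the `check` helper are hypothetical names used only here; this sketch says nothing about how the server handles types or missing properties.

```python
import operator

# Maps each constraint operator from the table to its Python equivalent.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}

def check(value, constraint):
    """Evaluate one [operator, operand] pair against a property value."""
    op, operand = constraint
    return OPS[op](value, operand)

results = [
    check("Indian", ["==", "Indian"]),
    check(450, ["<", 600]),
    check(2022, [">=", 2023]),
]
print(results)
```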

Multi-Condition Filter

Combine multiple constraints in the same block — all conditions must be satisfied:

q = [{
    "FindDescriptor": {
        "set":         "recipe_search",
        "k_neighbors": 5,
        "constraints": {
            "cuisine":  ["==", "Indian"],
            "calories": ["<", 600],
        },
        "distances": True,
        "results":   {"all_properties": True},
    }
}]
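
When several searches share a base filter, it can help to assemble the query from reusable pieces. The helper below is a minimal sketch: `build_filtered_query` is a hypothetical name, not part of any ApertureDB client library, and it only produces the same dictionary shape shown above.

```python
def build_filtered_query(set_name, k, constraints, all_properties=True):
    """Return a one-command query list for a filtered KNN search."""
    return [{
        "FindDescriptor": {
            "set": set_name,
            "k_neighbors": k,
            # Copy so the caller's dictionary is never shared with the query.
            "constraints": dict(constraints),
            "distances": True,
            "results": {"all_properties": all_properties},
        }
    }]

# A shared base filter, extended with a second condition for this search.
base = {"cuisine": ["==", "Indian"]}
q = build_filtered_query("recipe_search", 5, {**base, "calories": ["<", 600]})
print(q[0]["FindDescriptor"]["constraints"])
```

The resulting `q` can be passed to `client.query` together with the query embedding bytes, as in the first example.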

Sort

sort is a top-level command parameter (not inside results). It orders the returned entities by a property after the KNN search:

q = [{
    "FindDescriptor": {
        "set":         "recipe_search",
        "k_neighbors": 10,
        "sort":        "calories",   # sort top-10 KNN results ascending
        "distances":   True,
        "results":     {"all_properties": True},
    }
}]
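
Because the ordering is applied after the KNN search, `sort` reorders the k nearest results and does not change which entities are selected. The local sketch below illustrates that ordering of steps; the rows, the `distance` field name, and the `knn_then_sort` helper are all invented for illustration.

```python
# Invented rows standing in for a larger pool of candidate recipes.
candidates = [
    {"name": "dal", "distance": 0.10, "calories": 520},
    {"name": "saag", "distance": 0.15, "calories": 310},
    {"name": "korma", "distance": 0.30, "calories": 640},
    {"name": "raita", "distance": 0.90, "calories": 90},
]

def knn_then_sort(rows, k, key):
    """Pick the k nearest rows first, then order only those by `key`."""
    top_k = sorted(rows, key=lambda r: r["distance"])[:k]
    return sorted(top_k, key=lambda r: r[key])

result = [r["name"] for r in knn_then_sort(candidates, 3, "calories")]
print(result)
```

Here `raita` has the fewest calories but is not among the three nearest rows, so it never appears in the sorted output.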

Count, Average, Min, Max

These aggregates go inside results. The request keys are count, average, min, and max. In the response, the aggregate keys are prefixed with an underscore and average is shortened: _avg, _min, _max. count stays count.

# Count only — no entities returned
response, _ = client.query([{
    "FindDescriptor": {
        "set":         "recipe_search",
        "constraints": {"cuisine": ["==", "Indian"]},
        "results":     {"count": True},
    }
}])
print(response[0]["FindDescriptor"]["count"])

# Average + min + max
response, _ = client.query([{
    "FindDescriptor": {
        "set":         "recipe_search",
        "constraints": {"cuisine": ["==", "Indian"]},
        "results":     {"average": "calories", "min": "calories", "max": "calories"},
    }
}])
fd = response[0]["FindDescriptor"]
print(fd["_avg"]["calories"], fd["_min"]["calories"], fd["_max"]["calories"])

Aggregates also work combined with k_neighbors — they aggregate over the top-k KNN result set.
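
The underscore-prefixed response keys can be flattened into request-style names with a small helper. This is a sketch under stated assumptions: `summarize` is a hypothetical helper, and `sample_response` is hand-written to match the shape described above, with invented numbers rather than output from a real query.

```python
# Hand-written sample shaped like the response described above; the numbers
# are invented and did not come from a real query.
sample_response = [{
    "FindDescriptor": {
        "_avg": {"calories": 450.0},
        "_min": {"calories": 120},
        "_max": {"calories": 890},
    }
}]

def summarize(response, prop, command="FindDescriptor"):
    """Collect the aggregate values for one property into a flat dict."""
    body = response[0][command]
    return {
        name: body[key][prop]
        for name, key in (("average", "_avg"), ("min", "_min"), ("max", "_max"))
        if key in body  # skip aggregates that were not requested
    }

stats = summarize(sample_response, "calories")
print(stats)
```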

Group By (Facets)

group returns per-value counts without returning individual entities. The response contains a groups list; each entry carries the grouped property value, a _group_count, and, when requested, _group_avg, _group_min, and _group_max:

# Count per cuisine
response, _ = client.query([{
    "FindDescriptor": {
        "set":     "recipe_search",
        "results": {"group": ["cuisine"]},
    }
}])
for g in response[0]["FindDescriptor"]["groups"]:
    print(g["cuisine"], g["_group_count"])

# Average calories per cuisine
response, _ = client.query([{
    "FindDescriptor": {
        "set":     "recipe_search",
        "results": {"group": ["cuisine"], "average": "calories"},
    }
}])
for g in response[0]["FindDescriptor"]["groups"]:
    print(g["cuisine"], g["_group_avg"]["calories"])
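
For display as facets, the groups list is often easier to work with as a lookup table keyed by the grouped value. The sketch below assumes the response shape used in the examples above; `facet_table` is a hypothetical helper and `sample_response` is hand-written with invented numbers.

```python
# Hand-written sample shaped like the grouped response above; values invented.
sample_response = [{
    "FindDescriptor": {
        "groups": [
            {"cuisine": "Indian", "_group_count": 12,
             "_group_avg": {"calories": 480.0}},
            {"cuisine": "Italian", "_group_count": 8,
             "_group_avg": {"calories": 610.0}},
        ]
    }
}]

def facet_table(response, prop, avg_of=None, command="FindDescriptor"):
    """Map each grouped value to its count and, optionally, its average."""
    table = {}
    for g in response[0][command]["groups"]:
        row = {"count": g["_group_count"]}
        if avg_of is not None:
            row["average"] = g["_group_avg"][avg_of]
        table[g[prop]] = row
    return table

facets = facet_table(sample_response, "cuisine", avg_of="calories")
print(facets["Indian"])
```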

What's Next

Hybrid Search notebook — all patterns runnable on the Cookbook dataset
Building RAG Pipelines — use hybrid search as the retrieval step
FindDescriptor reference — full parameter list
results reference — count, average, sum, min, max, group (for entity/image/video commands)
