Filtering, Aggregation, and Facets

Hybrid search combines KNN vector similarity with metadata filters in a single query. ApertureDB applies the constraints server-side during the search, not as a post-filter, so every result is both semantically relevant and consistent with the filter.

Runnable Notebook

Hybrid Search — seven patterns on the Cookbook dataset: KNN with filters, sort, count, average, and group by (facets)


How It Works

Add a constraints block to FindDescriptor. The engine evaluates the constraint during KNN traversal — not after — so k_neighbors results are drawn from the filtered subset:

q = [{
    "FindDescriptor": {
        "set": "recipe_search",
        "k_neighbors": 5,
        "constraints": {
            "cuisine": ["==", "Indian"],  # server-side filter
        },
        "distances": True,
        "results": {"all_properties": True},
    }
}]

response, _ = client.query(q, [query_embedding.tobytes()])
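The difference between filtering during the search and filtering afterwards can be seen with a toy brute-force search in plain Python. This is only an illustrative sketch: the data, the 2-D embeddings, and both function names are made up, and it says nothing about how ApertureDB implements the traversal internally.

```python
import math

# Toy data: (name, cuisine, 2-D embedding). All values are made up.
RECIPES = [
    ("carbonara", "Italian", (0.9, 0.1)),
    ("lasagna",   "Italian", (0.8, 0.2)),
    ("risotto",   "Italian", (0.7, 0.3)),
    ("dal",       "Indian",  (0.2, 0.8)),
    ("biryani",   "Indian",  (0.1, 0.9)),
]

def knn_filter_during_search(query, k, cuisine):
    """Restrict to matching items first, then take the k nearest among them."""
    candidates = [r for r in RECIPES if r[1] == cuisine]
    candidates.sort(key=lambda r: math.dist(query, r[2]))
    return [r[0] for r in candidates[:k]]

def knn_post_filter(query, k, cuisine):
    """Take the k nearest overall, then drop the ones that do not match."""
    nearest = sorted(RECIPES, key=lambda r: math.dist(query, r[2]))[:k]
    return [r[0] for r in nearest if r[1] == cuisine]

query = (1.0, 0.0)  # sits closest to the Italian recipes
during = knn_filter_during_search(query, k=2, cuisine="Indian")
after = knn_post_filter(query, k=2, cuisine="Indian")
print(during)  # ['dal', 'biryani']
print(after)   # []
```

With a post-filter, the two nearest neighbours are both Italian, so nothing survives the filter. Drawing the neighbours from the filtered subset still yields two results.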

Constraint Operators

Operator   Example              Meaning
==         ["==", "Indian"]     exact match
!=         ["!=", "Italian"]    not equal
>          [">", 2023]          greater than
>=         [">=", 2023]         greater than or equal
<          ["<", 600]           less than
<=         ["<=", 600]          less than or equal
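To sanity-check a constraint against sample values before sending it, the operators in the table can be mirrored locally. The `holds` function below is a hypothetical client-side helper, not part of the ApertureDB API, and it only approximates the server's comparison semantics with Python's own operators.

```python
import operator

# Maps the operator strings from the table to Python comparisons.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}

def holds(value, constraint):
    """Return True if `value` satisfies a two-element [operator, operand] constraint."""
    op, operand = constraint
    if op not in OPS:
        raise ValueError(f"unsupported operator: {op!r}")
    return OPS[op](value, operand)

print(holds("Indian", ["==", "Indian"]))  # True
print(holds(750, ["<", 600]))             # False
```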

Multi-Condition Filter

Combine multiple constraints in the same block — all conditions must be satisfied:

q = [{
    "FindDescriptor": {
        "set": "recipe_search",
        "k_neighbors": 5,
        "constraints": {
            "cuisine": ["==", "Indian"],
            "calories": ["<", 600],
        },
        "distances": True,
        "results": {"all_properties": True},
    }
}]
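When filters come from user input, it can help to assemble the constraints block programmatically. The sketch below uses a hypothetical helper, `build_knn_query`, which is not part of the ApertureDB client; it only builds the same dictionary shape as the query above. Because a Python dict holds one entry per key, the helper rejects a second condition on the same property instead of silently overwriting the first.

```python
def build_knn_query(set_name, k, filters):
    """Build a FindDescriptor query from (property, operator, value) triples.

    Hypothetical helper for illustration; it only assembles the dict.
    """
    constraints = {}
    for prop, op, value in filters:
        if prop in constraints:
            raise ValueError(f"duplicate constraint on {prop!r}")
        constraints[prop] = [op, value]
    return [{
        "FindDescriptor": {
            "set": set_name,
            "k_neighbors": k,
            "constraints": constraints,
            "distances": True,
            "results": {"all_properties": True},
        }
    }]

q = build_knn_query(
    "recipe_search",
    5,
    [("cuisine", "==", "Indian"), ("calories", "<", 600)],
)
print(q[0]["FindDescriptor"]["constraints"])
```

The resulting `q` can be passed to `client.query` together with the query embedding bytes, as in the single-filter example.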

Sort

sort is a top-level command parameter (not inside results). It orders the returned entities by a property after the KNN search:

q = [{
    "FindDescriptor": {
        "set": "recipe_search",
        "k_neighbors": 10,
        "sort": "calories",  # sort top-10 KNN results ascending
        "distances": True,
        "results": {"all_properties": True},
    }
}]
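Because sorting happens after the KNN search, it reorders the k nearest results but does not change which entities are selected. A toy sketch in plain Python, with made-up rows and a hypothetical `knn_then_sort` function standing in for the server, illustrates the order of operations:

```python
import math

# Made-up rows: (name, calories, 2-D embedding).
ROWS = [
    ("salad",   150, (0.5, 0.5)),
    ("dal",     400, (0.1, 0.0)),
    ("biryani", 700, (0.0, 0.1)),
    ("korma",   550, (0.2, 0.2)),
]

def knn_then_sort(query, k):
    """Pick the k nearest rows by distance, then order them by calories ascending."""
    nearest = sorted(ROWS, key=lambda r: math.dist(query, r[2]))[:k]
    return [r[0] for r in sorted(nearest, key=lambda r: r[1])]

result = knn_then_sort((0.0, 0.0), k=3)
print(result)  # ['dal', 'korma', 'biryani']
```

The salad has the fewest calories but is not among the three nearest neighbours, so it never appears in the sorted output.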

Count, Average, Min, Max

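These go inside results. The request keys are average, min, and max; the corresponding response keys are prefixed with _ and become _avg, _min, and _max (note that average is shortened to _avg). count keeps the same name in the response.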

# Count only: no entities returned
response, _ = client.query([{
    "FindDescriptor": {
        "set": "recipe_search",
        "constraints": {"cuisine": ["==", "Indian"]},
        "results": {"count": True},
    }
}])
print(response[0]["FindDescriptor"]["count"])

# Average + min + max
response, _ = client.query([{
    "FindDescriptor": {
        "set": "recipe_search",
        "constraints": {"cuisine": ["==", "Indian"]},
        "results": {"average": "calories", "min": "calories", "max": "calories"},
    }
}])
fd = response[0]["FindDescriptor"]
print(fd["_avg"]["calories"], fd["_min"]["calories"], fd["_max"]["calories"])

Aggregates also work combined with k_neighbors — they aggregate over the top-k KNN result set.
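Since the request and response key names differ, a small wrapper can keep that mapping in one place. The `read_aggregates` helper below is hypothetical, and the mock response only imitates the key layout described in this section; its numbers are made up rather than taken from the Cookbook dataset.

```python
def read_aggregates(response, prop):
    """Pull average, min, and max for one property out of a FindDescriptor response."""
    fd = response[0]["FindDescriptor"]
    return {
        "average": fd["_avg"][prop],
        "min": fd["_min"][prop],
        "max": fd["_max"][prop],
    }

# Mock response using the _avg/_min/_max keys described above; values are invented.
mock_response = [{
    "FindDescriptor": {
        "_avg": {"calories": 480.0},
        "_min": {"calories": 210},
        "_max": {"calories": 890},
    }
}]

stats = read_aggregates(mock_response, "calories")
print(stats)
```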


Group By (Facets)

group returns per-value counts without returning individual entities. The response lists them under groups; each entry carries _group_count and, when the matching aggregate is requested, _group_avg, _group_min, or _group_max:

# Count per cuisine
response, _ = client.query([{
    "FindDescriptor": {
        "set": "recipe_search",
        "results": {"group": ["cuisine"]},
    }
}])
for g in response[0]["FindDescriptor"]["groups"]:
    print(g["cuisine"], g["_group_count"])

# Average calories per cuisine
response, _ = client.query([{
    "FindDescriptor": {
        "set": "recipe_search",
        "results": {"group": ["cuisine"], "average": "calories"},
    }
}])
for g in response[0]["FindDescriptor"]["groups"]:
    print(g["cuisine"], g["_group_avg"]["calories"])
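For a facet sidebar, the groups list is often easier to consume as a mapping from value to count. A minimal sketch, assuming the response layout shown above: `facet_counts` is a hypothetical helper, and the mock response contains invented counts.

```python
def facet_counts(response, prop):
    """Turn a grouped FindDescriptor response into {value: count}, largest first."""
    groups = response[0]["FindDescriptor"]["groups"]
    counts = {g[prop]: g["_group_count"] for g in groups}
    return dict(sorted(counts.items(), key=lambda kv: kv[1], reverse=True))

# Mock response imitating the groups layout; counts are made up.
mock_response = [{
    "FindDescriptor": {
        "groups": [
            {"cuisine": "Italian", "_group_count": 7},
            {"cuisine": "Indian", "_group_count": 12},
            {"cuisine": "Thai", "_group_count": 3},
        ]
    }
}]

facets = facet_counts(mock_response, "cuisine")
print(facets)  # {'Indian': 12, 'Italian': 7, 'Thai': 3}
```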

What's Next