Filtering, Aggregation, and Facets
Hybrid search combines KNN vector similarity with metadata filters in a single query. ApertureDB applies constraints server-side during the search, not as a post-filter, so the results are both semantically relevant and consistent with the filter condition.
Hybrid Search — seven patterns on the Cookbook dataset: KNN with filters, sort, count, average, and group by (facets)
How It Works
Add a constraints block to FindDescriptor. The engine evaluates the constraint during KNN traversal — not after — so k_neighbors results are drawn from the filtered subset:
    q = [{
        "FindDescriptor": {
            "set": "recipe_search",
            "k_neighbors": 5,
            "constraints": {
                "cuisine": ["==", "Indian"],  # server-side filter
            },
            "distances": True,
            "results": {"all_properties": True},
        }
    }]
    response, _ = client.query(q, [query_embedding.tobytes()])
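To make the difference between filtering during the search and post-filtering concrete, here is a small pure-Python sketch. It is not how ApertureDB traverses its index; it only models the outcome. The toy corpus, the 2-D embeddings, and the helper names (`post_filter`, `filtered_knn`) are all made up for illustration.

```python
import math

# Toy corpus: (name, cuisine, 2-D embedding). All values are invented.
RECIPES = [
    ("pasta",   "Italian", (0.9, 0.1)),
    ("risotto", "Italian", (0.8, 0.2)),
    ("pizza",   "Italian", (0.7, 0.3)),
    ("dal",     "Indian",  (0.6, 0.4)),
    ("biryani", "Indian",  (0.2, 0.8)),
    ("samosa",  "Indian",  (0.1, 0.9)),
]

def knn(query, rows, k):
    """Return the k rows whose embedding is closest to query (Euclidean distance)."""
    return sorted(rows, key=lambda r: math.dist(query, r[2]))[:k]

def post_filter(query, k, cuisine):
    # KNN over everything first, then drop non-matching rows.
    # This can return fewer than k results, or none at all.
    return [r[0] for r in knn(query, RECIPES, k) if r[1] == cuisine]

def filtered_knn(query, k, cuisine):
    # Restrict the candidates first, then take the k nearest among them.
    candidates = [r for r in RECIPES if r[1] == cuisine]
    return [r[0] for r in knn(query, candidates, k)]

query = (1.0, 0.0)
print(post_filter(query, 3, "Indian"))   # []
print(filtered_knn(query, 3, "Indian"))  # ['dal', 'biryani', 'samosa']
```

With post-filtering, the three nearest recipes are all Italian, so nothing survives the filter. Drawing the neighbors from the filtered subset still returns three Indian recipes, which is the behavior described above for k_neighbors with constraints.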
Constraint Operators
| Operator | Example | Meaning |
|---|---|---|
| == | ["==", "Indian"] | exact match |
| != | ["!=", "Italian"] | not equal |
| > | [">", 2023] | greater than |
| >= | [">=", 2023] | greater than or equal |
| < | ["<", 600] | less than |
| <= | ["<=", 600] | less than or equal |
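The operator semantics in the table can be mirrored locally, which is handy for unit-testing filter definitions before sending them to the server. The sketch below is a client-side illustration only; the `matches` helper and the sample entity are hypothetical, and treating a missing property as a non-match is an assumption of this sketch, not documented server behavior.

```python
import operator

# Map each constraint operator from the table to a Python comparison.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}

def matches(entity, constraints):
    """Return True if entity satisfies every [op, value] constraint."""
    for prop, (op, value) in constraints.items():
        # Assumption of this sketch: a missing property never matches.
        if prop not in entity or not OPS[op](entity[prop], value):
            return False
    return True

recipe = {"name": "dal", "cuisine": "Indian", "calories": 450}  # hypothetical entity
print(matches(recipe, {"cuisine": ["==", "Indian"]}))  # True
print(matches(recipe, {"calories": [">=", 600]}))      # False
```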
Multi-Condition Filter
Combine multiple constraints in the same block — all conditions must be satisfied:
    q = [{
        "FindDescriptor": {
            "set": "recipe_search",
            "k_neighbors": 5,
            "constraints": {
                "cuisine": ["==", "Indian"],
                "calories": ["<", 600],
            },
            "distances": True,
            "results": {"all_properties": True},
        }
    }]
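When filters come from user input, it can be convenient to assemble the query from a dictionary instead of writing it out by hand. The following is a minimal sketch; `build_hybrid_query` is a hypothetical helper, not part of the ApertureDB client, and it only uses the parameters shown in the examples above.

```python
def build_hybrid_query(set_name, k, constraints=None):
    """Build a FindDescriptor query; constraints maps property -> [op, value]."""
    command = {
        "set": set_name,
        "k_neighbors": k,
        "distances": True,
        "results": {"all_properties": True},
    }
    if constraints:
        # Copy so later changes to the caller's dict do not alter the query.
        command["constraints"] = dict(constraints)
    return [{"FindDescriptor": command}]

q = build_hybrid_query(
    "recipe_search",
    5,
    {"cuisine": ["==", "Indian"], "calories": ["<", 600]},
)
print(q[0]["FindDescriptor"]["constraints"])
```

The resulting list can be passed to client.query together with the embedding bytes, in the same way as the hand-written query.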
Sort
sort is a top-level command parameter (not inside results). It orders the returned entities by a property after the KNN search:
    q = [{
        "FindDescriptor": {
            "set": "recipe_search",
            "k_neighbors": 10,
            "sort": "calories",  # sort top-10 KNN results ascending
            "distances": True,
            "results": {"all_properties": True},
        }
    }]
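Because the sort runs after the KNN search, it changes the order of the returned entities but not which entities are selected. The local sketch below illustrates that point with a hand-written result list; the entity shape, including the `_distance` key, and the `sort_by` helper are assumptions made for this example.

```python
# Hypothetical top-3 KNN results, already ranked by distance.
knn_results = [
    {"name": "dal", "calories": 450, "_distance": 0.12},
    {"name": "biryani", "calories": 720, "_distance": 0.19},
    {"name": "raita", "calories": 120, "_distance": 0.31},
]

def sort_by(entities, prop):
    """Reorder entities ascending by prop; the set of entities is unchanged."""
    return sorted(entities, key=lambda e: e[prop])

by_calories = sort_by(knn_results, "calories")
print([e["name"] for e in by_calories])  # ['raita', 'dal', 'biryani']
```

The nearest neighbor (dal) is no longer first after sorting, so keep distances in the results if the similarity ranking still matters downstream.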
Count, Average, Min, Max
These aggregates go inside results. The request keys are average, min, and max; the corresponding response keys are _avg, _min, and _max. count keeps the same name in both the request and the response.
    # Count only — no entities returned
    response, _ = client.query([{
        "FindDescriptor": {
            "set": "recipe_search",
            "constraints": {"cuisine": ["==", "Indian"]},
            "results": {"count": True},
        }
    }])
    print(response[0]["FindDescriptor"]["count"])

    # Average + min + max
    response, _ = client.query([{
        "FindDescriptor": {
            "set": "recipe_search",
            "constraints": {"cuisine": ["==", "Indian"]},
            "results": {"average": "calories", "min": "calories", "max": "calories"},
        }
    }])
    fd = response[0]["FindDescriptor"]
    print(fd["_avg"]["calories"], fd["_min"]["calories"], fd["_max"]["calories"])
Aggregates can also be combined with k_neighbors, in which case they are computed over the top-k KNN result set.
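Since request and response use different key names, a small adapter can hide the mapping from the rest of an application. This sketch runs against a mock response shaped like the aggregate example above; the numbers are invented and `read_aggregates` is a hypothetical helper.

```python
# Mock response following the key names described above; values are made up.
mock_response = [{
    "FindDescriptor": {
        "_avg": {"calories": 512.5},
        "_min": {"calories": 180},
        "_max": {"calories": 940},
    }
}]

# Request key -> response key, as described in this section.
KEY_MAP = {"average": "_avg", "min": "_min", "max": "_max"}

def read_aggregates(response, prop):
    """Collect whichever aggregates are present for prop, keyed by request name."""
    fd = response[0]["FindDescriptor"]
    out = {}
    for request_key, response_key in KEY_MAP.items():
        if response_key in fd and prop in fd[response_key]:
            out[request_key] = fd[response_key][prop]
    return out

print(read_aggregates(mock_response, "calories"))
# {'average': 512.5, 'min': 180, 'max': 940}
```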
Group By (Facets)
group returns per-value counts without returning individual entities. The response contains a groups list; each entry carries the grouped property value, a _group_count, and optionally _group_avg, _group_min, and _group_max:
    # Count per cuisine
    response, _ = client.query([{
        "FindDescriptor": {
            "set": "recipe_search",
            "results": {"group": ["cuisine"]},
        }
    }])
    for g in response[0]["FindDescriptor"]["groups"]:
        print(g["cuisine"], g["_group_count"])

    # Average calories per cuisine
    response, _ = client.query([{
        "FindDescriptor": {
            "set": "recipe_search",
            "results": {"group": ["cuisine"], "average": "calories"},
        }
    }])
    for g in response[0]["FindDescriptor"]["groups"]:
        print(g["cuisine"], g["_group_avg"]["calories"])
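For a facet sidebar, the groups list is usually turned into value and count pairs ordered by count. The sketch below does this on a mock response that follows the shape described above; the cuisines and numbers are invented and `facet_counts` is a hypothetical helper.

```python
# Mock grouped response following the shape described above; values are made up.
mock_response = [{
    "FindDescriptor": {
        "groups": [
            {"cuisine": "Italian", "_group_count": 17},
            {"cuisine": "Indian", "_group_count": 42},
            {"cuisine": "Thai", "_group_count": 8},
        ]
    }
}]

def facet_counts(response, prop):
    """Return (value, count) pairs for prop, largest group first."""
    groups = response[0]["FindDescriptor"]["groups"]
    pairs = [(g[prop], g["_group_count"]) for g in groups]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

for value, count in facet_counts(mock_response, "cuisine"):
    print(f"{value}: {count}")
```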
What's Next
- Hybrid Search notebook — all patterns runnable on the Cookbook dataset
- Building RAG Pipelines — use hybrid search as the retrieval step
- FindDescriptor reference — full parameter list
- results reference — count, average, sum, min, max, group (for entity/image/video commands)