ApertureDB Benchmarks

We constantly run internal benchmarks, both on public datasets like COCO and YFCC100M and on data and workloads that model our customers' use cases. We also periodically run stress tests as part of our test flights to probe certain limits, though in most cases it is cloud costs, rather than ApertureDB itself, that cause us to pause.

We continue to run more benchmarks, and this page will be updated as we are able to publish more results. For a detailed report, please write to team@aperturedata.io.

Query Performance and Scaling

  • ApertureDB server replicas can be scaled to handle a million queries per second on GCP or AWS. Payload size naturally affects per-query cost, but this evaluation focused on stress testing how throughput scales with the number of replicas.
  • As noted in this case study, Badger Tech already gets over 13K vector search queries per second on a search space of millions of embeddings, with over 90% accuracy, making ApertureDB one of the highest-performing vector databases that is enterprise-ready and can be deployed in-house.
  • Using a biotech customer's benchmark, our queries ran within their SLA over 1.3 billion metadata entities and as many relationships, filtering across more than 300 million images.
  • For k-NN classification over a few million image embeddings, we see sub-7 ms query response times as measured at the server (removing any network effects), and sub-10 ms server-side response times for our RAG chatbot. A sketch of such a k-NN query appears after this list.
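
To make the query shapes above concrete, here is a minimal sketch of a k-NN descriptor search using the `aperturedb` Python client and its JSON query interface. The host, credentials, descriptor set name, embedding dimension, and the `label` property are illustrative assumptions, not values from our benchmarks.

```python
# Minimal sketch of a k-NN search over stored embeddings, assuming the
# `aperturedb` Python client. Connection details, the descriptor set name,
# the dimension, and the `label` property are hypothetical placeholders.
import numpy as np
from aperturedb import Connector

db = Connector.Connector(host="localhost", user="admin", password="admin")

# The query embedding travels as a binary blob alongside the JSON query;
# 512 dimensions is an assumption for illustration.
query_vector = np.random.rand(512).astype(np.float32)

query = [{
    "FindDescriptor": {
        "set": "image_embeddings",   # hypothetical descriptor set
        "k_neighbors": 10,           # return the 10 nearest embeddings
        "distances": True,           # include the distance to each neighbor
        "results": {"list": ["label"]}
    }
}]

response, blobs = db.query(query, [query_vector.tobytes()])
print(response)
```

Classification then amounts to a majority vote over the returned neighbors' labels, done client-side.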

Data Ingestion

  • We can ingest 750 high-res images per second at the current scale, which means we can load the YFCC100M dataset, with its 100M images and videos, in about 37 hours.
  • For CIFAR100, we can ingest the 60K images in less than 3 minutes.
  • For the COCO dataset, we ingest at 820+ images/sec, including both the images and all the metadata associated with them. We can ingest the entire dataset (120K images) in less than 3 minutes, and all extra metadata, segmentation masks, and embeddings (a total of 250K images, 50K embeddings, and millions of bounding boxes, about 30 GB of data) in less than 25 minutes. A sketch of bulk ingestion with our Python loaders appears below.
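
As a rough illustration of how such bulk loads are driven, here is a sketch using the CSV parsers and parallel loader shipped with the `aperturedb` Python package; the CSV path, batch size, and thread count are placeholders rather than the settings used in the runs above.

```python
# Minimal sketch of bulk image ingestion, assuming the ImageDataCSV parser
# and ParallelLoader from the `aperturedb` Python package. The CSV path,
# batch size, and thread count are illustrative placeholders, not tuned values.
from aperturedb import Connector, ImageDataCSV, ParallelLoader

db = Connector.Connector(host="localhost", user="admin", password="admin")

# Each CSV row names an image file plus the metadata properties to attach.
images = ImageDataCSV.ImageDataCSV("coco_images.csv")

loader = ParallelLoader.ParallelLoader(db)
loader.ingest(images, batchsize=100, numthreads=8, stats=True)
```

Batched, multi-threaded loading of this kind is what makes ingestion rates in the hundreds of images per second feasible; single-record inserts would be bound by round-trip latency.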

We have not yet established the upper limits of ApertureDB's performance; we are just getting started!