Migrate From Any Database Type

ApertureDB is a graph-vector database purpose-built for multimodal AI. It is the unifying backend for the entire AI/ML pipeline. Today, the only complete alternative is a painful do-it-yourself (DIY) integration of separate tools.

We have summarized the AI use cases that we solve for our customers today. Since the challenges we address are most prominent in multimodal AI use cases, we use the data layer requirements of those use cases to guide the comparisons below.

Comparison with Traditional Databases

Today’s databases need a lot of additional help to achieve what ApertureDB does. We have already discussed migrating from a relational database in great detail, and we summarize an example in the table below. The table also includes other popular NoSQL database types, to give a more complete picture when comparing ApertureDB with various databases.

| Requirements | MongoDB (document db) | Neo4j (graph db) | SQL (relational db) | ApertureDB |
| --- | --- | --- | --- | --- |
| Store and process unstructured types like images, videos | Stores binary files up to 100M but loses "image" or "video" recognition. No native processing or visualizing | No storage or recognition for processing of unstructured types. Needs a DIY layer as shown earlier for anything complex like processing or visualizing | No storage or recognition for processing of unstructured types. Needs a DIY layer as shown earlier for anything complex | Natively recognizes images and videos and supports pre-processing, e.g. intervals or frame rate changes. Also supports other types as blobs |
| Store and search regions of interest (annotations, clips) | JSON can be directly stored and searched but is harder to connect with unstructured types | Annotations could be modeled and relationships established to image or video nodes, but no pixel retrieval in regions of interest or IoU | Annotations could be modeled, though the schema is less flexible, and foreign keys established to images or videos, but no pixel retrieval in regions of interest or IoU | Natively supports regions of interest and their relations to image / video objects. Easy to search, fetch RoI pixels, and compute IoU |
| Store and search metadata | JSON storage and search | Property graph model for complex searches | Relational model, which makes connected data searches harder to write and usually longer to execute, and the schema is harder to evolve | Property graph model for complex searches |
| AI/ML support | Not designed for it; needs complex, one-off DIY scripts | Has some support with Neo4j data science, but it's more for numerical quantities | Support varies, but building video training sets, connecting to annotations, and visualizing video need lots of DIY | AI/ML framework connector, annotation connector, batching, and other AI tools support |
| Vector search | Not designed for it, but indexes are being overlaid | Has some graph vector search support now in the DS suite | Not designed for it, but indexes are being overlaid | In-depth support |
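
The table mentions IoU (intersection over union) for regions of interest without defining it. As a minimal sketch of the measure itself, assuming axis-aligned boxes in pixel coordinates, the snippet below computes the overlap ratio of two regions. The `Box` type and `iou` function are hypothetical names chosen for illustration and are not part of any database API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (hypothetical helper type)."""
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    @property
    def area(self) -> float:
        return self.width * self.height


def iou(a: Box, b: Box) -> float:
    """Return the intersection over union of two boxes, a value in [0, 1]."""
    # Overlap along each axis; clamp at zero when the boxes are disjoint.
    overlap_w = max(0.0, min(a.x + a.width, b.x + b.width) - max(a.x, b.x))
    overlap_h = max(0.0, min(a.y + a.height, b.y + b.height) - max(a.y, b.y))
    intersection = overlap_w * overlap_h
    union = a.area + b.area - intersection
    return intersection / union if union > 0 else 0.0


labeled = Box(x=0, y=0, width=10, height=10)
predicted = Box(x=5, y=5, width=10, height=10)
print(iou(labeled, predicted))  # 25 / 175, roughly 0.143
```

A value of 1.0 means the two regions coincide and 0.0 means they do not overlap, which is why IoU is a common way to compare a predicted region against a labeled one.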

Key-value and time-series databases provide a smaller subset of capabilities than the database categories chosen above.

Another significant difference, purely from a database perspective, is that we offer data consistency for queries that involve both metadata and data. In the DIY systems that data engineers are forced to build today, they have to enforce this manually.
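
To illustrate what manual enforcement looks like in a DIY system, here is a minimal sketch, assuming two separate stores: one for the data and one for the metadata. Plain dictionaries stand in for an object store and a metadata database, and `add_image` and `validate` are hypothetical helpers, not ApertureDB calls. The application code has to undo the first write itself when the second one fails.

```python
from typing import Dict


def validate(properties: Dict[str, str]) -> None:
    """Hypothetical metadata check: every image must carry a label."""
    if "label" not in properties:
        raise ValueError("metadata must include a 'label'")


def add_image(
    blob_store: Dict[str, bytes],
    metadata_store: Dict[str, Dict[str, str]],
    image_id: str,
    pixels: bytes,
    properties: Dict[str, str],
) -> None:
    """Write data and metadata together, rolling back by hand on failure."""
    if image_id in blob_store or image_id in metadata_store:
        raise KeyError(f"{image_id} already exists")
    blob_store[image_id] = pixels  # step 1: write the data
    try:
        validate(properties)
        metadata_store[image_id] = dict(properties)  # step 2: write the metadata
    except Exception:
        # Compensating action: remove the blob so the stores stay in sync.
        del blob_store[image_id]
        raise


blob_store: Dict[str, bytes] = {}
metadata_store: Dict[str, Dict[str, str]] = {}

add_image(blob_store, metadata_store, "img-1", b"\x89PNG", {"label": "cat"})

try:
    add_image(blob_store, metadata_store, "img-2", b"\x89PNG", {})
except ValueError:
    pass  # rejected; the rollback already removed the blob

print(sorted(blob_store), sorted(metadata_store))
```

Even this sketch is incomplete: a crash between the two writes would still leave an orphaned blob, which is the kind of gap a database with transactions across data and metadata is meant to close.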

Comparison with Vector Databases

We also compare ApertureDB with a couple of popular vector databases. Some features are common across all three:

  • Variety of exact / approximate search engines with different distance metrics
  • Embeddings from a variety of multimodal data for similarity searches
  • Support for CRUD operations, at least over vectors and attached metadata attributes
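
As a rough illustration of the first and third items, the sketch below runs an exact (brute-force) nearest-neighbor search with a pluggable distance metric and an optional filter on the attached metadata. It is pure Python with made-up sample data; the function and item names are hypothetical and it does not reflect how any of the three databases implement search internally.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]
Item = Tuple[str, Vector, Dict[str, str]]  # (id, embedding, metadata)


def l2_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Vector, b: Vector) -> float:
    """One minus cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm > 0 else 1.0


def knn(
    query: Vector,
    items: List[Item],
    k: int,
    metric: Callable[[Vector, Vector], float] = l2_distance,
    where: Optional[Callable[[Dict[str, str]], bool]] = None,
) -> List[Tuple[str, float]]:
    """Return the k closest items as (id, distance), nearest first."""
    candidates = [
        (item_id, metric(query, vector))
        for item_id, vector, metadata in items
        if where is None or where(metadata)  # metadata filter before ranking
    ]
    return sorted(candidates, key=lambda pair: pair[1])[:k]


items: List[Item] = [
    ("img-1", [1.0, 0.0], {"label": "cat"}),
    ("img-2", [0.9, 0.1], {"label": "dog"}),
    ("img-3", [0.0, 1.0], {"label": "cat"}),
]

print(knn([1.0, 0.0], items, k=2))
print(knn([1.0, 0.0], items, k=2, metric=cosine_distance,
          where=lambda m: m["label"] == "cat"))
```

Exact search compares the query against every stored vector, so its cost grows linearly with the collection size; approximate engines trade some accuracy for speed on large collections.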

However, since we take a holistic approach to data management, what we offer goes beyond a vector database (ApertureDB is an enterprise-ready vector database and more). The table below shows how we differ:

| Requirements | Pinecone | Milvus, etc. | ApertureDB |
| --- | --- | --- | --- |
| Support for rich metadata for rich filters and access to corresponding multimodal data | Simple attributes attached to vector, but have to correlate ID with source data outside of Pinecone | Columns of SQL-like attributes attached to vector, but have to correlate ID with source data outside of DB | Graph metadata, multimodal data, and embeddings behind one unified data layer |
| Access model | Fully SaaS | Open source, cloud hosted, installs in VPC | Community edition, cloud hosted, installs in VPC |
| ACID transaction support | Only over supported vector data | Only over supported vector data | ACID compliant across multimodal data including embeddings and graph metadata |
| Performance | Data and use case dependent | Data and use case dependent | 2-4X higher throughput for customer use case. Scales to larger dimensions |
| Seamless integration with overall ML / analytics pipelines | Just built for vector search, now Langchain-style integrations | Just built for vector search, now Langchain-style integrations | Integrates with data in various storage systems, labeling tools, ML training / inference frameworks, and analytics frontends |

Comparison Across AI Pipelines

AI teams today expect their data layer to manage different modalities of data, prepare data easily for AI/ML workloads, simplify dataset management, manage annotations, track model information, and support searching and visualizing data using multimodal searches.

Sadly, their current option for meeting each of those requirements is a manually integrated (DIY) solution, in which they have to bring together cloud stores, databases, labels in various formats, finicky (vision) processing libraries, and vector databases in order to transform multimodal data input into meaningful AI or analytics output. Such DIY systems consume too much of an AI team's time (sometimes 6-9 months), take long to install, are hard to debug, and are painful to maintain. AI teams are highly skilled and hard to hire or retain, and the last thing we want is to have them maintain infrastructure code.

The right database needs to not only understand the complexity of multimodal data management but also understand AI requirements, so that it is easy for AI teams to adopt and deploy in production. That’s what we have built with ApertureDB.

Given that our work lies at the intersection of databases and machine learning, we also have a comparison chart for ML tools.

For an example of how to manage multimodal data in ApertureDB, check out our Cookbook examples.