LlamaIndex

LlamaIndex is a simple, flexible framework for building knowledge assistants using LLMs connected to your enterprise data. It provides end-to-end tooling to ship a context-augmented AI agent to production. Within this ecosystem, ApertureDB acts as an advanced search and retrieval backend, so an ApertureDB vector store can be used in pipelines such as Retrieval-Augmented Generation (RAG).

ApertureDB is a hybrid vector store and graph database. Currently, LlamaIndex supports the vector store functionality of ApertureDB (contributed by the AIMon team). This means you can use ApertureDB as a provider for LlamaIndex's vector store, letting you store and retrieve vectors in ApertureDB through LlamaIndex's API.

In the future, we plan to add support for ApertureDB's graph database functionality to LlamaIndex. This will allow you to store and query graphs in ApertureDB using LlamaIndex's API.

This is a work in progress and will be contributed upstream to the LlamaIndex repository soon. If you are using this integration, please let us know at team@aperturedata.io.

Vector Store

ApertureDB integrates with LlamaIndex through the ApertureDB Vector Store, a Python package that provides an interface for storing and retrieving vectors in ApertureDB.

For an example of using ApertureDB in LlamaIndex, see the AIMon doc chatbot code. You can also find the source code for the ApertureDB Vector Store on GitHub.

Example code using the ApertureDB Vector Store


!mkdir data
!cd data && wget https://vldb.org/pvldb/vol14/p3240-remis.pdf && cd -

from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
)
from llama_index.vector_stores.ApertureDB import ApertureDBVectorStore

from google.colab import userdata
import os

os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")

adb_client = ApertureDBVectorStore(dimensions=1536)
storage_context = StorageContext.from_defaults(vector_store=adb_client)

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

query_engine = index.as_query_engine()
query_strs = [
    "Who all created VDMS. Give me a list.",
    "How many images were ingested in VDMS for its scale test?",
    "What are distinguishing features of VDMS?",
]
for qs in query_strs:
    response = query_engine.query(qs)
    print(f"{qs=}\n")
    print(response)

Here's a link to a complete working example.

Graph database

It is possible to use a lot of ApertureDB's functionality through LlamaIndex, but the full power of ApertureDB is only available through the ApertureDB API. For example, you can use LlamaIndex to store and retrieve vectors from ApertureDB, and then use the ApertureDB API to query the graph database.
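To illustrate this hybrid approach, the sketch below builds a query in ApertureDB's JSON query language to inspect the descriptor set that backs a LlamaIndex vector store. The set name is a placeholder, and the connection snippet (commented out) assumes the `aperturedb` Python client and a running ApertureDB instance; consult the ApertureDB query API reference for the exact command parameters.

```python
# Hedged sketch: querying ApertureDB directly, alongside vectors managed
# by LlamaIndex. The set name is a placeholder; a reachable ApertureDB
# instance and the `aperturedb` Python client are assumed.

def build_descriptor_set_query(set_name):
    """Build a JSON query that looks up a descriptor set by name."""
    return [
        {
            "FindDescriptorSet": {
                "with_name": set_name,  # name of the vector store's set
                "results": {"all_properties": True},
            }
        }
    ]

query = build_descriptor_set_query("llamaindex")

# With a live server, the query would be executed roughly like this:
# from aperturedb.Connector import Connector
# db = Connector(host="localhost", user="admin", password="admin")
# response, blobs = db.query(query)
# print(response)
```

The same `db.query(...)` call accepts any command in ApertureDB's query language, so graph commands such as FindEntity can be mixed into the same list once graph data is present.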

As noted above, support for ApertureDB's graph database functionality in LlamaIndex is planned, which will make it easier to use ApertureDB as a graph database in LlamaIndex applications.

Implementation details

Those attempting a hybrid approach should note a few details of how LlamaIndex vector stores and documents are represented internally in ApertureDB:

  • The LlamaIndex vector store corresponds to a DescriptorSet in ApertureDB.
  • Documents with embeddings correspond to Descriptors.
  • The document id field is stored in the uniqueid property.
  • The document text field is stored in the text property.
  • Metadata properties are stored as properties with an lc_ prefix.
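Given this mapping, a document written by LlamaIndex can be located directly with ApertureDB's FindDescriptor command by filtering on the uniqueid property. A hedged sketch (the set name and document id are placeholders, and the property names follow the list above):

```python
# Sketch: locating a LlamaIndex document stored as an ApertureDB Descriptor,
# using the property mapping described above. The set name and document id
# are placeholders; executing the query requires a running ApertureDB instance.

def find_document_query(set_name, doc_id):
    """Build a FindDescriptor query matching a LlamaIndex document id."""
    return [
        {
            "FindDescriptor": {
                "set": set_name,  # LlamaIndex vector store == DescriptorSet
                "constraints": {
                    # the document id is stored in the `uniqueid` property
                    "uniqueid": ["==", doc_id],
                },
                # `text` holds the document text; lc_-prefixed keys hold metadata
                "results": {"all_properties": True},
            }
        }
    ]

query = find_document_query("llamaindex", "some-document-id")
```

Because the mapping is plain properties on Descriptors, the same constraints can filter on the text property or on any lc_-prefixed metadata key.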