# LlamaIndex
LlamaIndex is a simple, flexible framework for building knowledge assistants using LLMs connected to your enterprise data. It provides end-to-end tooling to ship a context-augmented AI agent to production. ApertureDB fits into this ecosystem as an advanced search and retrieval provider, which makes it suitable as the vector store in pipelines such as RAG (Retrieval-Augmented Generation).

ApertureDB is a hybrid vector store and graph database. Currently, LlamaIndex supports the vector store functionality of ApertureDB (contributed by the AIMon team): you can use ApertureDB as a provider for LlamaIndex's vector store, storing and retrieving vectors through LlamaIndex's API.
In the future, we plan to add support for ApertureDB's graph database functionality to LlamaIndex. This will allow you to store and query graphs in ApertureDB using LlamaIndex's API.
This is a work in progress and will be contributed upstream to the LlamaIndex repository soon. If you are using this integration, please let us know at team@aperturedata.io.
## Vector Store
ApertureDB integrates with LlamaIndex through the ApertureDB Vector Store, a Python package that provides an interface for storing and retrieving vectors in ApertureDB.

For an example of using ApertureDB with LlamaIndex, see the AIMon doc chatbot code. You can also find the source code for the ApertureDB Vector Store on GitHub.
### Example code using the ApertureDB Vector Store

First download a sample document (a VDMS paper) to index:

```shell
!mkdir data
!cd data && wget https://vldb.org/pvldb/vol14/p3240-remis.pdf && cd -
```

Then build an index over it, backed by an ApertureDB vector store, and query it:

```python
import os

from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
)
from llama_index.vector_stores.ApertureDB import ApertureDBVectorStore
from google.colab import userdata

os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")

# Create a vector store backed by ApertureDB; 1536 matches the dimensionality
# of OpenAI's default embedding model.
adb_client = ApertureDBVectorStore(dimensions=1536)
storage_context = StorageContext.from_defaults(vector_store=adb_client)

# Load the downloaded PDF and build an index on top of the vector store.
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Ask a few questions about the indexed paper.
query_engine = index.as_query_engine()
query_strs = [
    "Who all created VDMS? Give me a list.",
    "How many images were ingested in VDMS for its scale test?",
    "What are the distinguishing features of VDMS?",
]
for qs in query_strs:
    response = query_engine.query(qs)
    print(f"{qs=}\r\n")
    print(response)
```
Here's a link to a complete working example.
## Graph database
It is possible to use much of ApertureDB's functionality through LlamaIndex, but the full power of ApertureDB is only available through the ApertureDB API. For example, you can use LlamaIndex to store and retrieve vectors in ApertureDB, and then use the ApertureDB API to query the graph database directly.
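As a minimal sketch of that hybrid approach: once LlamaIndex has stored embeddings, the same data can be inspected through ApertureDB's native JSON query language. The set name `llamaindex_store` below is a placeholder, not something the integration guarantees; substitute the name of the DescriptorSet your vector store actually created.

```python
# Sketch: inspecting data stored by LlamaIndex via ApertureDB's native
# JSON query language. The set name is a placeholder for illustration.

def build_descriptor_query(set_name: str, limit: int = 5) -> list:
    """Build a native ApertureDB query that inspects a descriptor set."""
    return [
        # Look up the descriptor set itself.
        {"FindDescriptorSet": {"with_name": set_name}},
        # List descriptors stored in it, with all of their properties.
        {
            "FindDescriptor": {
                "set": set_name,
                "results": {"all_properties": True, "limit": limit},
            }
        },
    ]

query = build_descriptor_query("llamaindex_store")

# Executing this requires a running ApertureDB instance, e.g.:
# from aperturedb.Connector import Connector
# db = Connector(host="localhost", user="admin", password="password")
# response, blobs = db.query(query)
```

This pattern keeps LlamaIndex in charge of ingestion and retrieval while leaving the native API free for graph-style queries that LlamaIndex does not yet expose.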
In the future, we plan to add support for ApertureDB's graph database functionality to LlamaIndex as well, making it easier to use ApertureDB as a graph database in LlamaIndex applications.
## Implementation details
Those attempting a hybrid approach should note a few details of how LlamaIndex vector stores and documents are represented internally in ApertureDB:

- The LlamaIndex vector store corresponds to a `DescriptorSet` in ApertureDB.
- Documents with embeddings correspond to `Descriptor`s.
- The document `id` field is stored in the `uniqueid` property.
- The document `text` field is stored in the `text` property.
- Metadata properties are stored as properties with a `lc_` prefix.
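The mapping above can be illustrated with a small helper. This is a hypothetical function written for this document, not part of the integration; it simply shows what the property dict on a `Descriptor` would look like for a given document.

```python
# Hypothetical illustration of the document-to-Descriptor property mapping
# described above; not an API provided by the integration itself.

def to_aperturedb_properties(doc_id: str, text: str, metadata: dict) -> dict:
    """Map a document's fields to ApertureDB Descriptor properties."""
    props = {
        "uniqueid": doc_id,  # document id -> uniqueid property
        "text": text,        # document text -> text property
    }
    for key, value in metadata.items():
        props[f"lc_{key}"] = value  # metadata keys get an lc_ prefix
    return props

props = to_aperturedb_properties("node-1", "hello", {"author": "alice"})
print(props)
# {'uniqueid': 'node-1', 'text': 'hello', 'lc_author': 'alice'}
```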