Work with Text, PDFs, Audio, Blobs...

A blob is how ApertureDB stores any object, such as a text file, audio file, PDF, or another modality that ApertureDB does not yet recognize natively. The query language lets you directly add, find, update, and delete blobs.

Connect to ApertureDB

Option A: ApertureDB Cloud (recommended)
Sign up for a free 30-day trial. Get your key from Connect > Generate API Key, then add it to a .env file in this directory:

APERTUREDB_KEY=your_key_here

Option B: Community Edition (local Docker)
Run this in a terminal before starting the notebook:

docker run -d --name aperturedb \
    -p 55555:55555 -e ADB_MASTER_KEY=admin -e ADB_FORCE_SSL=false \
    aperturedata/aperturedb-community

%pip install --upgrade --quiet aperturedb python-dotenv

# Option A: ApertureDB Cloud
from dotenv import load_dotenv
load_dotenv() # loads APERTUREDB_KEY from .env into the environment

# Option B: Community Edition (local Docker)
# !adb config create localdb --active \
# --host localhost --port 55555 \
# --username admin --password admin \
# --no-use-ssl --no-interactive

from aperturedb.CommonLibrary import create_connector

client = create_connector()
response, _ = client.query([{"GetStatus": {}}])
client.print_last_response()
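
If the connection fails under Option A, one quick thing to rule out is that the key never reached the environment. The following is a minimal sketch, assuming the key is exposed through the APERTUREDB_KEY variable as described above; the helper name `key_is_configured` is hypothetical and not part of the SDK.

```python
import os


def key_is_configured(env=None):
    """Return True if APERTUREDB_KEY is present and non-empty.

    `env` defaults to os.environ; pass a plain dict to check other mappings.
    This helper is illustrative only, not an aperturedb function.
    """
    env = os.environ if env is None else env
    return bool(env.get("APERTUREDB_KEY", "").strip())


if not key_is_configured():
    print("APERTUREDB_KEY is not set; check the .env file or use Option B.")
```

Run it after `load_dotenv()`, since that call is what copies the value from .env into the environment.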

Create or Add a Recipe to ApertureDB

Let's say we want to add a recipe text file (if the text were small, it could instead be stored as a property on an entity). One way to add new text to the database is through the query language.

For bulk additions, we recommend using the Python SDK loaders.

# Download the sample file
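# Use the raw file URL; the github.com/.../blob/... URL returns an HTML page, not the file
! mkdir -p data; cd data; wget https://raw.githubusercontent.com/aperture-data/Cookbook/e333f6c59070b9165033d9ddd5af852a6b9624ba/notebooks/simple/data/baked_potato.txt; cd -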
query = [{
    "AddBlob": {
        # Notice the missing "class" property since we are adding it as a blob (represented as _Blob in ApertureDB)
        "properties": {
            "document_type": "text", # use "pdf" for PDFs so the ApertureDB UI can render them inline
            "type": "text", # since blobs can be of different types like audio, pdf, etc, we can make that explicit
            "name": "baked_potato",
            "id": 55,
            "category": "sides",
            "cuisine": "American",
            "caption": "Special baked potatoes"
        },
        "if_not_found": { # avoid adding twice
            "id": ["==", 55]
        }
    }
}]

# Read the text file as a binary blob
with open("data/baked_potato.txt", "rb") as fd:
    array = [fd.read()]

response, blobs = client.query(query, array)

client.print_last_response()
[
{
"AddBlob": {
"status": 0
}
}
]
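
The same AddBlob pattern applies to other file types. Below is a minimal sketch of a helper that builds the command and reads the file in one step. The function name `build_add_blob` and its parameters are hypothetical (not part of the SDK), and it assumes, as in the query above, that `id` is the property used to avoid duplicates.

```python
from pathlib import Path


def build_add_blob(path, blob_id, document_type="text", **extra):
    """Build an AddBlob query and its blob list for a single file.

    Illustrative helper, not an aperturedb function. `extra` holds any
    additional properties, e.g. category="sides".
    """
    path = Path(path)
    properties = {
        "document_type": document_type,  # e.g. "pdf" for PDFs
        "name": path.stem,               # file name without extension
        "id": blob_id,
        **extra,
    }
    query = [{
        "AddBlob": {
            "properties": properties,
            "if_not_found": {"id": ["==", blob_id]},  # avoid adding twice
        }
    }]
    return query, [path.read_bytes()]


# Usage, with the client created earlier:
# query, array = build_add_blob("data/baked_potato.txt", 55, category="sides")
# response, _ = client.query(query, array)
```

This is meant for a handful of files; it is not a substitute for the SDK loaders mentioned above.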

Query blob by its metadata attributes

Verify that the blob was added to the database and read back all of its property values.

from aperturedb import NotebookHelpers as nh   # Our helper package for image displays and other utilities

query = [{
    "FindBlob": {
        "constraints": {
            "name": [">=", "baked"],
            "type": ["==", "text"]
        },
        "blobs": True, # This is set to False by default
        "results": {
            "all_properties": True
        }
    }
}]

response, blobs = client.query(query)

client.print_last_response()
num_blobs = response[0]["FindBlob"]["returned"]
for count in range(num_blobs):
    print(blobs[count].decode())
[
{
"FindBlob": {
"blobs_start": 0,
"entities": [
{
"_blob_index": 0,
"_uniqueid": "11.29.224580",
"caption": "Special baked potatoes",
"category": "sides",
"cuisine": "American",
"id": 55,
"name": "baked_potato",
"type": "text"
}
],
"returned": 1,
"status": 0
}
}
]
Perfect Baked Potato
Cook Time: 1 hour hr
Serves 4
Bake a potato perfectly every time! With the olive oil & sea salt coating, it'll come out of the oven with crispy skin and fluffy insides that are delicious with your favorite toppings.
Equipment

Baking Sheets (I use these nonstick ones from USA Pan Bakeware!)
Parchment Paper (this one doesn't burn)

Ingredients

4 medium russet potatoes
Extra-virgin olive oil
Sea salt

Instructions

Preheat the oven to 425°F and line a baking sheet with parchment paper.
Use a fork to poke a few holes into the potatoes. Place on the baking sheet, rub with olive oil, and sprinkle liberally with sea salt all over. Bake 45 to 60 minutes, or until the potato is fork-tender and the skin is crisp.
Slice open each potato, fluff the insides, and serve with desired toppings.
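
When a query returns several blobs, it helps to keep each blob next to its metadata. The sketch below pairs every returned entity with its data using the `_blob_index` field visible in the response above. It assumes that `_blob_index` indexes directly into the returned blob list, as that output suggests; the function name is hypothetical.

```python
def pair_entities_with_blobs(response, blobs, command="FindBlob"):
    """Return a list of (entity, blob_bytes) tuples for one command result.

    Illustrative helper, not an aperturedb function. Entities without a
    `_blob_index` (e.g. when "blobs" was False) are paired with None.
    """
    result = response[0][command]
    pairs = []
    for entity in result.get("entities", []):
        index = entity.get("_blob_index")
        data = blobs[index] if index is not None else None
        pairs.append((entity, data))
    return pairs


# Usage with the response and blobs from the query above:
# for entity, data in pair_entities_with_blobs(response, blobs):
#     print(entity["name"], len(data), "bytes")
```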

Update properties of the blob already in ApertureDB

Use UpdateBlob when an attribute needs a new value or when your application needs a new attribute on existing blobs.

query = [{
    "UpdateBlob": {
        "properties": {
            "contributor": "Gavin" # property will get added if missing or the value will be updated
        },
        "constraints": {
            "name": ["==", "baked_potato"]
        },
    }
}]

response, blobs = client.query(query)

client.print_last_response()
[
{
"UpdateBlob": {
"count": 1,
"status": 0
}
}
]
query = [{
    "FindBlob": {
        "constraints": {
            "name": ["==", "baked_potato"]
        },
        "results": {
            "all_properties": True
        }
    }
}]

response, blobs = client.query(query)

client.print_last_response()
[
{
"FindBlob": {
"entities": [
{
"_uniqueid": "11.29.224580",
"caption": "Special baked potatoes",
"category": "sides",
"contributor": "Gavin",
"cuisine": "American",
"id": 55,
"name": "baked_potato",
"type": "text"
}
],
"returned": 1,
"status": 0
}
}
]
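
Every response shown so far reports `"status": 0` for a successful command. In a script, rather than reading the printed output, the status can be checked programmatically. This is a minimal sketch with a hypothetical helper name; treating a bare dict response as a query-level error is an assumption, not something the outputs above confirm.

```python
def failed_commands(response):
    """Return (command_name, body) pairs whose status is non-zero.

    Illustrative helper, not an aperturedb function. Expects the list
    shape printed above, e.g. [{"UpdateBlob": {"count": 1, "status": 0}}].
    """
    if isinstance(response, dict):
        # Assumption: a query-level error may arrive as one dict, not a list.
        return [] if response.get("status", 0) == 0 else [("query", response)]
    failures = []
    for result in response:
        for command, body in result.items():
            if body.get("status", 0) != 0:
                failures.append((command, body))
    return failures


# Usage after any client.query(...) call:
# if failed_commands(response):
#     raise RuntimeError(f"Query failed: {failed_commands(response)}")
```

For UpdateBlob it is also worth looking at `count`: a successful status with a count of 0 would suggest that the constraints matched no blob.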

Delete the blob we no longer need

query = [{
    "DeleteBlob": {
        "constraints": {
            "name": ["==", "baked_potato"] # if this matches multiple blobs, they will all be deleted
        }
    }
}]

response, blobs = client.query(query)

client.print_last_response()
[
{
"DeleteBlob": {
"count": 1,
"status": 0
}
}
]
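
Because DeleteBlob removes every blob that matches the constraints, it can be worth counting the matches first. The sketch below runs a FindBlob with the same constraints and only deletes when the number of matches is within a limit. The helper name and the `run_query` parameter are hypothetical; `run_query` is expected to behave like `client.query`, returning a `(response, blobs)` pair.

```python
def delete_if_at_most(run_query, constraints, limit=1):
    """Delete matching blobs only if between 1 and `limit` blobs match.

    Illustrative helper, not an aperturedb function. Returns the number
    of blobs deleted (0 if nothing was deleted). The find and the delete
    are two separate queries, so this check is not atomic.
    """
    find = [{
        "FindBlob": {
            "constraints": constraints,
            "results": {"all_properties": True},
        }
    }]
    response, _ = run_query(find)
    matched = response[0]["FindBlob"]["returned"]
    if matched == 0 or matched > limit:
        return 0

    delete = [{"DeleteBlob": {"constraints": constraints}}]
    response, _ = run_query(delete)
    return response[0]["DeleteBlob"]["count"]


# Usage with the client created earlier:
# deleted = delete_if_at_most(client.query, {"name": ["==", "baked_potato"]})
```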

Verify deletion

We can verify that the blob is no longer in the database.

query = [{
    "FindBlob": {
        "constraints": {
            "name": ["==", "baked_potato"]
        },
        "results": {
            "all_properties": True
        }
    }
}]

response, blobs = client.query(query)

client.print_last_response()
[
{
"FindBlob": {
"returned": 0,
"status": 0
}
}
]

What's next?