
Ingest Datasets - Movies

This workflow ingests a movies dataset into ApertureDB, including movie metadata, posters, and related information. It is built from the TMDB 5000 dataset. It provides an easy way to get started with ApertureDB using connected data about movies, professionals, production companies, genres, and more. It also demonstrates how to work with structured metadata alongside multimedia content such as images and videos.

Prerequisites

Before running this workflow, you should have an ApertureDB instance set up and accessible. No additional data needs to be pre-loaded into the database, as this workflow will populate it with movie information.

Configuring

Creating and deleting workflows
For general information on creating workflows in ApertureDB Cloud see Creating and Deleting Workflows.

This ingestion process is opinionated: it stores the movie information in a fixed schema. Its details are described on the GitHub page.

See the GitHub repository for more information
For details on what this workflow does, its parameters, and how to run it outside of ApertureDB Cloud, see the dataset-ingestion-movies documentation in GitHub.

Running in ApertureDB Cloud

These instructions assume a standard cloud setup; for general information on creating workflows in ApertureDB Cloud see Creating and Deleting Workflows.

This is the view you will see when you go to instantiate the Movies Dataset Ingestion workflow:

  1. Click the blue button at the bottom.

Once you have selected your options, click "Submit". Your workflow will be created and will start running.

Running in Docker

All options available for the workflow are documented in the workflow README. Database access configuration is explained in Common Parameters.

Results

Once the workflow has started (see Managing Workflow for information on workflow states), you can view results as they appear.

Open the Web UI for your instance.

You will see Entities and Images being added to your database as the workflow processes the movie dataset.

  1. Movie count will increase as movies are ingested
  2. Image count will increase if poster images are included
  3. Switch to Image Search to see the movie posters
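
The same counts can also be checked programmatically. The sketch below only builds a count query in the ApertureDB JSON query language and reads the counts back from a response of the matching shape; it does not connect to a database. The class name `Movie` is an assumption made for illustration; the actual class names are defined by the schema described in the GitHub repository.

```python
import json

# Hypothetical class name; the real one comes from the workflow's schema.
MOVIE_CLASS = "Movie"


def build_count_query(movie_class: str = MOVIE_CLASS) -> list:
    """Build a two-command query: count movie entities, then count images."""
    return [
        {"FindEntity": {"with_class": movie_class, "results": {"count": True}}},
        {"FindImage": {"results": {"count": True}}},
    ]


def extract_counts(response: list) -> dict:
    """Read the counts from a response with one entry per command, in order."""
    return {
        "movies": response[0]["FindEntity"]["count"],
        "images": response[1]["FindImage"]["count"],
    }


if __name__ == "__main__":
    # Print the JSON that would be sent to the database.
    print(json.dumps(build_count_query(), indent=2))
```

Running the query a few times while the workflow is active should show both counts growing, mirroring what the Web UI displays.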

After switching to the "Image Search" tab, you can query and view movie entities with their associated metadata and images.

  1. Click Run to search entities
  2. View the movie poster at a larger size

The ingested movie data includes structured metadata that can be queried and searched, making it easy to find movies by genre, year, rating, or other attributes.
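
As an illustration of such a query, here is a minimal sketch that builds a `FindEntity` command filtering movies by year and rating. The class name and the property names (`title`, `year`, `vote_average`) are assumptions chosen for the example, not the workflow's confirmed schema; check the GitHub repository for the real names, and the query language reference for the exact parameter shapes.

```python
import json

# All names below are hypothetical placeholders for the real schema.
MOVIE_CLASS = "Movie"
TITLE_PROP = "title"
YEAR_PROP = "year"
RATING_PROP = "vote_average"


def build_movie_query(min_year: int, min_rating: float, limit: int = 5) -> list:
    """Build a query for movies released in or after min_year
    with a rating of at least min_rating."""
    return [
        {
            "FindEntity": {
                "with_class": MOVIE_CLASS,
                # Each constraint is [operator, value]; all must hold.
                "constraints": {
                    YEAR_PROP: [">=", min_year],
                    RATING_PROP: [">=", min_rating],
                },
                "limit": limit,
                # Properties to return for each matching entity.
                "results": {"list": [TITLE_PROP, YEAR_PROP, RATING_PROP]},
            }
        }
    ]


if __name__ == "__main__":
    print(json.dumps(build_movie_query(2010, 7.5), indent=2))
```

With the `aperturedb` Python package, a query like this is typically passed to a connector's `query` method, which returns the JSON response together with any blobs. Filtering by genre would likely need a second command that follows the connection between movies and genres, since genres are part of the connected data rather than a plain property.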

Troubleshooting

  • No entities appearing in the Web UI

    Check that the workflow has started running and is not in an error state. If the workflow status shows "Running" but no entities appear, wait a few minutes; ingestion can take time to process and insert the data.