
Generate Embeddings Workflow

This workflow adds embeddings for images in ApertureDB using a pre-trained model. You can then use those embeddings to search for similar images or to classify images. It is an easy way to add embeddings to your images and to see how they can be used with real data.
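To make the idea of searching by embedding concrete, here is a minimal sketch of ranking items by cosine similarity. It is an illustration only: the file names, the three-dimensional vectors, and the helper names `cosine_similarity` and `most_similar` are all hypothetical, and this is not how ApertureDB itself stores or searches embeddings.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, embeddings, k=3):
    """Return the k (name, score) pairs most similar to the query vector."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real models produce vectors with hundreds of dimensions.
toy_embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.8, 0.3, 0.1],
    "car.jpg": [0.0, 0.2, 0.9],
}

if __name__ == "__main__":
    for name, score in most_similar([1.0, 0.0, 0.0], toy_embeddings, k=2):
        print(f"{name}: {score:.3f}")
```

In practice the query vector would come from the same model that produced the stored embeddings, so that the two are comparable.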

The workflow can also add embeddings for PDF documents. This works slightly differently from images: the text is first extracted from the PDF, and the full text is then segmented into shorter texts. This lets you find the sections of a PDF document that are similar to some input text.
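The segmentation step described above can be sketched as follows. This page does not say how the workflow actually splits text, so the word-based windows, the overlap, and the names `segment_text`, `max_words`, and `overlap` are assumptions made purely for illustration.

```python
def segment_text(text, max_words=50, overlap=10):
    """Split text into segments of at most max_words words.

    Consecutive segments share `overlap` words so that a sentence cut at a
    boundary still appears whole in at least one segment. This is a
    hypothetical strategy, not the workflow's documented behavior.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")

    words = text.split()
    step = max_words - overlap
    segments = []
    for start in range(0, len(words), step):
        segments.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            # This window already reaches the end of the text.
            break
    return segments


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(12))
    for segment in segment_text(sample, max_words=5, overlap=2):
        print(segment)
```

Each segment would then be embedded separately, which is what makes it possible to match an input text against a specific part of a document rather than the document as a whole.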

For more detailed information about what this workflow does, see the embeddings-extraction documentation on GitHub.

Creating the workflow

For general information on creating workflows in ApertureDB Cloud, see Creating and Deleting Workflows.

Configure your workflow by selecting:

  • Which instance to use. If you only have one instance, there will be no options to select.
  • The model to use to generate embeddings. Currently, only one model is available, but more may be added in the future.
  • Whether to generate embeddings for images.
  • Whether to generate embeddings for PDF documents.

Setup Your Workflow

Once you have filled in the fields, click "Submit". Your workflow will be created and will start running.

See the results

On the "My Instances" page, click "Connect" for the instance you used, and you will see an option to go to the Web UI for that instance. As the workflow runs, the number of descriptors (ApertureDB's term for stored embeddings) in the database increases. Click the refresh button to update the count.