ParallelLoader
ParallelLoader Objects
class ParallelLoader(ParallelQuery.ParallelQuery)
Parallel and Batch Loader for ApertureDB
Takes a dataset (a collection of homogeneous objects) or a derived class, and inserts its elements into the database efficiently by splitting them into batches and passing the batches to multiple workers.
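The batch-and-worker strategy can be sketched as follows. This is a simplified, self-contained stand-in to illustrate the idea, not the loader's actual implementation; the function name and batch-processing callback are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def ingest_in_batches(items, process_batch, batchsize=2, numthreads=4):
    """Split items into fixed-size batches and hand each batch to a worker.

    Simplified stand-in for the splitting ParallelLoader performs on a
    dataset before dispatching real insert queries to its workers.
    """
    batches = [items[i:i + batchsize] for i in range(0, len(items), batchsize)]
    with ThreadPoolExecutor(max_workers=numthreads) as pool:
        # pool.map preserves batch order while running workers concurrently.
        return list(pool.map(process_batch, batches))

# Example: "insert" each batch by just counting its elements.
counts = ingest_in_batches(list(range(10)), len, batchsize=3)
print(counts)  # [3, 3, 3, 1]
```

Because the dataset is homogeneous, every batch can be processed by the same worker logic, which is what makes this decomposition safe.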
get_entity_indexes
def get_entity_indexes(schema: dict) -> dict
Returns a dictionary of indexes for entities' properties.
Arguments:
schema
dict - The schema dictionary to get indexes from.
Returns:
dict
- A dictionary of entity indexes.
get_connection_indexes
def get_connection_indexes(schema: dict) -> dict
Returns a dictionary of indexes for connections' properties.
Arguments:
schema
dict - The schema dictionary to get indexes from.
Returns:
dict
- A dictionary of connection indexes.
query_setup
def query_setup(generator: Subscriptable) -> None
Runs the setup for the loader, which includes creating indexes. Currently, it only creates indexes for properties that are also used as constraints.
Will only run when the argument generator has a get_indices method that returns a dictionary of the form:
{
"entity": {
"class_name": ["property_name"]
},
}
or
{
"connection": {
"class_name": ["property_name"]
},
}
Arguments:
generator
Subscriptable - The Subscriptable object that is being ingested
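A generator that opts into index creation only needs to expose a get_indices method returning one of the dictionary shapes above. The sketch below uses a hypothetical stand-in class so it runs on its own; a real generator would derive from aperturedb's Subscriptable, and the class and property names are illustrative.

```python
class PersonGenerator:
    """Hypothetical generator-style class exposing get_indices.

    A real loader generator would derive from aperturedb's Subscriptable;
    only the pieces relevant to query_setup are sketched here.
    """

    def __init__(self, records):
        self.records = records

    def __len__(self):
        return len(self.records)

    def __getitem__(self, i):
        # A real generator would return the query (and blobs) for one record.
        return self.records[i]

    def get_indices(self):
        # Ask query_setup to create an index on Person.id, a property
        # that is also used as a constraint during ingestion.
        return {"entity": {"Person": ["id"]}}

gen = PersonGenerator([{"id": 1}, {"id": 2}])
print(gen.get_indices())  # {'entity': {'Person': ['id']}}
```

The "connection" form works the same way, keyed by connection class instead of entity class.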
ingest
def ingest(generator: Subscriptable,
batchsize: int = 1,
numthreads: int = 4,
stats: bool = False) -> None
Method to ingest data into the database
Arguments:
generator
Subscriptable - The list of data, or a class derived from Subscriptable, to be ingested.
batchsize
int, optional - The size of each batch. Defaults to 1.
numthreads
int, optional - The number of workers to create. Defaults to 4.
stats
bool, optional - Whether to display statistics in real time. Defaults to False.
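A typical ingest call follows the signature above. To keep this sketch self-contained (a real call needs a database connection), a stand-in loader that merely records the call is used here; with a live connection you would construct aperturedb's ParallelLoader instead.

```python
class FakeParallelLoader:
    """Stand-in that records the ingest call instead of touching a database.

    It mirrors the documented ingest signature so the call site reads the
    same as it would with a real ParallelLoader.
    """

    def __init__(self):
        self.calls = []

    def ingest(self, generator, batchsize=1, numthreads=4, stats=False):
        self.calls.append({
            "n_items": len(generator),
            "batchsize": batchsize,
            "numthreads": numthreads,
            "stats": stats,
        })

loader = FakeParallelLoader()
records = [{"id": i} for i in range(100)]
loader.ingest(records, batchsize=10, numthreads=4, stats=True)
print(loader.calls[0]["batchsize"])  # 10
```

Larger batch sizes reduce per-query overhead at the cost of bigger individual requests; the best value depends on object size and server configuration.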