
Loaders

Census

CensusLoader

Bases: BaseLoader

Class for transforming and loading Census data to warehouse.

__init__(connection, pipeline_version)

Transform and load Census data into SOI Warehouse.

Parameters:

Name Type Description Default
`connection` Connection

The sqlalchemy connection and engine to SOI Warehouse, either Dev_v01xx or Prod_v01xx.

required
`pipeline_version`

AssetId to track data loads by run/asset id. Passed from SoFlow.

required

create_census(table)

Method to create a Staging table.

drop_census(table)

Method to remove a Staging table.

drop_census_views(table)

Method to remove a Census view.

There is a sequence of queries in soload.sql.view_removal_census that remove views in the Census schema. This method runs the query that removes the view for the given table.

Parameters:

Name Type Description Default
`table` str

The table to remove a view for (ex: 'Metrics').

required

load_census(df, table)

Method for loading each table in the Census schema.

This method populates the tables of the Census schema. Insert timestamps are checked, and if no new records have been observed since the latest pipeline run, loading is skipped and the transient class attribute new_records is set to False.

Parameters:

Name Type Description Default
`df` DataFrame

The dataframe to be loaded.

required
`table` str

The table to write to, 'Census.Metrics' or 'Census.ProgramInforItems'.

required
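The timestamp check described for load_census can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the CensusLoader implementation: the `SketchLoader` class, the `InsertTimestamp` key, and the list-of-dicts input (standing in for a DataFrame) are all hypothetical. Only the `new_records` attribute name comes from the description above.

```python
from datetime import datetime


class SketchLoader:
    """Hypothetical stand-in for a loader; not the real CensusLoader."""

    def __init__(self, last_run):
        self.last_run = last_run   # time of the latest pipeline run
        self.new_records = True    # transient flag named in the description
        self.loaded = []           # stands in for the warehouse table

    def load(self, rows):
        # Assumption: a row counts as new when its insert timestamp
        # is later than the latest pipeline run.
        fresh = [r for r in rows if r["InsertTimestamp"] > self.last_run]
        if not fresh:
            # Nothing new: skip loading and record that fact.
            self.new_records = False
            return 0
        self.new_records = True
        # Assumption: only the new rows are written.
        self.loaded.extend(fresh)
        return len(fresh)


loader = SketchLoader(last_run=datetime(2024, 1, 10))
old_rows = [{"Id": 1, "InsertTimestamp": datetime(2024, 1, 5)}]
new_rows = old_rows + [{"Id": 2, "InsertTimestamp": datetime(2024, 1, 12)}]

skipped_count = loader.load(old_rows)  # no new rows, so loading is skipped
flag_after_skip = loader.new_records
loaded_count = loader.load(new_rows)   # one row is newer than the last run
```

A caller could inspect `new_records` after the load to decide whether downstream steps need to run at all.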

load_census_views(table)

Method for transforming Census data to create a view.

There is a sequence of transform/load queries in soload.sql.view_creation_census that create views in the Census schema. This method creates a view for the given table using the matching query. If the view already exists, remove it first with drop_census_views.

Parameters:

Name Type Description Default
`table` str

The table to create a view for (ex: 'Metrics').

required
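The requirement to remove an existing view before creating it again can be illustrated with a small, self-contained sketch. It uses an in-memory SQLite database as a stand-in for the warehouse, and the SQL templates, the `recreate_view` helper, and the `<table>View` naming scheme are assumptions made for the example. They are not the queries stored in soload.sql.

```python
import sqlite3

# Hypothetical stand-ins for the view removal and view creation queries.
DROP_VIEW_SQL = "DROP VIEW IF EXISTS {view}"
CREATE_VIEW_SQL = (
    "CREATE VIEW {view} AS "
    "SELECT Id, Value * 2 AS DoubledValue FROM {table}"
)


def recreate_view(conn, table):
    """Drop the view for `table` if present, then create it again."""
    view = f"{table}View"  # assumed naming scheme, for illustration only
    # Table names are interpolated directly, so they must come from trusted code.
    conn.execute(DROP_VIEW_SQL.format(view=view))
    conn.execute(CREATE_VIEW_SQL.format(view=view, table=table))
    return view


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Metrics (Id INTEGER, Value INTEGER)")
conn.execute("INSERT INTO Metrics VALUES (1, 21)")

recreate_view(conn, "Metrics")
# Safe to repeat: the old view is dropped before it is created again.
view_name = recreate_view(conn, "Metrics")
rows = conn.execute(f"SELECT Id, DoubledValue FROM {view_name}").fetchall()
```

Without the drop step, the second `CREATE VIEW` would fail because the view already exists, which is why the two methods are meant to be used as a pair.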

Common

CommonLoader

Bases: BaseLoader

Class for transforming and loading Common data to warehouse.

__init__(connection)

Transform and load Common data into SOI Warehouse.

Parameters:

Name Type Description Default
`connection` Connection

The sqlalchemy connection and engine to SOI Warehouse, either Dev_v01xx or Prod_v01xx.

required
`pipeline_version`

AssetId to track data loads by run/asset id. Passed from SoFlow.

required

create_common(table)

Method to create a common table.

drop_common(table)

Method to drop a common table.

load_common(df, table)

Method for loading each table in the Common schema.

This method populates the tables of the Common schema.

Parameters:

Name Type Description Default
`df` DataFrame

The dataframe to be loaded.

required
`table` str

The table to write to.

required

GMS

GMSLoader

Bases: BaseLoader

Class for transforming and loading GMS data to warehouse.

__init__(connection, pipeline_version, msi)

Transform and load GMS data into SOI Warehouse.

Parameters:

Name Type Description Default
`connection` Connection

The sqlalchemy connection and engine to SOI Warehouse, either Dev_v01xx or Prod_v01xx.

required
`pipeline_version`

AssetId to track data loads by run/asset id. Passed from SoFlow.

required
`msi`

True or False. Whether MSI is used to connect to the data warehouse.

required

clear_source(source_dict, table)

Method to clear specified source from existing GMS tables.

create_gms(table)

Method to create a GMS table.

drop_gms(table)

Method to remove a GMS table.

load_base_gms(df, table)

Method for loading each table in the GMS schema.

This method populates the tables of the GMS schema and must run before transforming and loading entities and relationships to Core. Insert timestamps are checked, and if no new records have been observed since the latest pipeline run, loading is skipped and the transient class attribute new_records is set to False.

Parameters:

Name Type Description Default
`df` DataFrame

The dataframe to be loaded.

required
`table` str

The table to write to, e.g. 'GMS.Entries'.

required

OpenMRS

OpenMRSLoader

Bases: BaseLoader

Class for transforming and loading OpenMRS data to warehouse.

__init__(connection, pipeline_version)

Transform and load OpenMRS data into SOI Warehouse.

Parameters:

Name Type Description Default
`connection` Connection

The sqlalchemy connection and engine to SOI Warehouse, either Dev_v01xx or Prod_v01xx.

required
`pipeline_version`

AssetId to track data loads by run/asset id. Passed from SoFlow.

required

Attributes:

Name Type Description
`last_updated` datetime

The last pipeline run time in the warehouse. Used for CDC.

`row_count` int

The incrementer for rows loaded.

`skipped` int

The incrementer for skipped rows.

`emoji_dict` dict

Discipline-specific emojis.

create_has(table)

Method to create a HAS table.

drop_has(table)

Method to drop a HAS table.

drop_haspublic_views(table)

Method to remove a HASPublic view.

There is a sequence of queries in soload.sql.view_removal_haspublic that remove views in the HASPublic schema. This method runs the query that removes the view for the given table. The table must already exist for this to work.

Parameters:

Name Type Description Default
`table` str

The table to remove a view for (ex: 'SHEApp').

required

drop_views(table)

Method to remove a HAS view.

There is a sequence of queries in soload.sql.view_removal_has that remove views in the HAS schema. This method runs the query that removes the view for the given table. The table must already exist for this to work.

Parameters:

Name Type Description Default
`table` str

The table to remove a view for (ex: 'FitFeet').

required

load_base_has(df, table)

Method for loading each table in the HAS schema.

This method populates the tables of the HAS schema and must run before transforming and loading entities and relationships to Core. Insert timestamps are checked, and if no new records have been observed since the latest pipeline run, loading is skipped and the transient class attribute new_records is set to False.

Parameters:

Name Type Description Default
`df` DataFrame

The dataframe to be loaded.

required
`table` str

The table to write to, e.g. 'HAS.FitFeet'.

required

load_haspublic_views(table)

Method for transforming HASPublic data to create a view.

There is a sequence of transform/load queries in soload.sql.view_creation_haspublic that create views in the HASPublic schema with additional columns and calculations. This method creates a view for the given table using the matching query. If the view already exists, remove it first with drop_haspublic_views.

Parameters:

Name Type Description Default
`table` str

The table to create a view for (ex: 'SHEApp').

required

load_stored_procedure(table)

Method to join the HAS views, Common tables, and Core through a stored procedure that produces a single connected table. This creates HASPublic.[DisciplineName].

Parameters:

Name Type Description Default
`table` str

The table to create a view for (ex: 'FitFeet').

required

load_views(table)

Method for transforming HAS data to create a view.

There is a sequence of transform/load queries in soload.sql.view_creation_has that create views in the HAS schema with additional columns and calculations. This method creates a view for the given table using the matching query. If the view already exists, remove it first with drop_views.

Parameters:

Name Type Description Default
`table` str

The table to create a view for (ex: 'FitFeet').

required

SHE

SHELoader

Bases: BaseLoader

Class for transforming and loading SHE data to warehouse.

__init__(connection, pipeline_version)

Transform and load SHE data into SOI Warehouse.

Parameters:

Name Type Description Default
`connection` Connection

The sqlalchemy connection and engine to SOI Warehouse, either Dev_v01xx or Prod_v01xx.

required

create_she()

Method to create a SHE table.

drop_she()

Method to remove a SHE table.

The table must already exist for this to work.

load_she(df, table)

Method for loading each table in the SHE schema.

This method populates the tables of the SHE schema.

Parameters:

Name Type Description Default
`df` DataFrame

The dataframe to be loaded.

required
`table` str

The table to write to.

required

SportPartnership

SportPartnershipLoader

Class for transforming and loading Sport Partnership data to warehouse.

__init__(connection, pipeline_version)

Transform and load Sport Partnership data into SOI Warehouse.

Parameters:

Name Type Description Default
`connection` Connection

The sqlalchemy connection and engine to SOI Warehouse, either Dev_v01xx or Prod_v01xx.

required

create_sport_partnership()

Method to create a SportPartnership table.

drop_sport_partnership()

Method to remove the SportPartnership table.

load_sport_partnership(df, table)

Method for loading the SportPartnership table in the Sport schema.

Parameters:

Name Type Description Default
`df` DataFrame

The dataframe to be loaded.

required
`table` str

The table to write to.

required