Loaders
Census
CensusLoader
Bases: BaseLoader
Class for transforming and loading Census data to warehouse.
__init__(connection, pipeline_version)
Transform and load Census data into SOI Warehouse.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`connection` | Connection | The sqlalchemy connection and engine to SOI Warehouse, either | required |
`pipeline_version` | | AssetId to track data loads by run/asset id. Passed from SoFlow. | required |
create_census(table)
Method to create a Staging table.
drop_census(table)
Method to remove a Staging table.
drop_census_views(table)
Method to remove a Census view.
There is a sequence of queries in `soload.sql.view_removal_census` to remove views in the Census schema. This function will run a query to remove a view.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`table` | str | The table to remove a view for (ex: 'Metrics'). | required |
load_census(df, table)
Method for loading each table in the Census schema.
This method populates the tables of the Census schema. Insert timestamps are checked and, if no new records have been observed since the latest pipeline run, loading is skipped and the transient class attribute `new_records` is set to False.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`df` | DataFrame | The dataframe to be loaded. | required |
`table` | str | The table to write to, 'Census.Metrics' or 'Census.ProgramInforItems'. | required |
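The timestamp check described for `load_census` can be pictured with a minimal sketch. This is not the CensusLoader implementation: `should_load` and the sample timestamps are hypothetical, and the real method works on a DataFrame rather than a plain list.

```python
from datetime import datetime


def should_load(insert_timestamps, last_run):
    """Return True if any record was inserted after the last pipeline run.

    Hypothetical helper: it mirrors the documented behaviour in which
    loading is skipped and new_records is set to False when nothing is new.
    """
    return any(ts > last_run for ts in insert_timestamps)


last_run = datetime(2024, 1, 10)
stale = [datetime(2024, 1, 5), datetime(2024, 1, 9)]
fresh = stale + [datetime(2024, 1, 12)]

new_records_stale = should_load(stale, last_run)  # nothing newer: skip loading
new_records_fresh = should_load(fresh, last_run)  # one newer record: load
```

A caller could read the resulting flag after the load step to decide whether downstream work, such as rebuilding views, is needed at all.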
load_census_views(table)
Method for transforming from Census data to create a view.
There is a sequence of transform/load queries in `soload.sql.view_creation_census` to create views in the Census schema. This function will create a view for the given table using that query.
Note that if the view already exists, you must first remove it using `drop_census_views`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`table` | str | The table to create a view for (ex: 'Metrics'). | required |
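The need to drop a view before recreating it can be demonstrated with a small, self-contained sketch. It uses an in-memory SQLite database as a stand-in for the warehouse, and the table and view names are made up; the actual queries in `soload.sql` are not shown here and may differ.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (id INTEGER, value REAL)")
conn.execute("CREATE VIEW metrics_view AS SELECT id, value FROM metrics")

# Creating the same view again fails because it already exists.
try:
    conn.execute("CREATE VIEW metrics_view AS SELECT id FROM metrics")
    recreated_without_drop = True
except sqlite3.OperationalError:
    recreated_without_drop = False

# Dropping first (the role drop_census_views plays) lets the create succeed.
conn.execute("DROP VIEW metrics_view")
conn.execute("CREATE VIEW metrics_view AS SELECT id FROM metrics")

# Column names of the recreated view, read from the cursor metadata.
cursor = conn.execute("SELECT * FROM metrics_view")
view_columns = [col[0] for col in cursor.description]
```

With the loader, the equivalent order would be a call to `drop_census_views(table)` followed by `load_census_views(table)`.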
Common
CommonLoader
Bases: BaseLoader
Class for transforming and loading Common data to warehouse.
__init__(connection)
Transform and load Common data into SOI Warehouse.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`connection` | Connection | The sqlalchemy connection and engine to SOI Warehouse, either | required |
`pipeline_version` | | AssetId to track data loads by run/asset id. Passed from SoFlow. | required |
create_common(table)
Method to create a common table.
drop_common(table)
Method to drop a common table.
load_common(df, table)
Method for loading each table in the Common schema.
This method populates the tables of the Common schema.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`df` | DataFrame | The dataframe to be loaded. | required |
`table` | str | The table to write to. | required |
GMS
GMSLoader
Bases: BaseLoader
Class for transforming and loading GMS data to warehouse.
__init__(connection, pipeline_version, msi)
Transform and load GMS data into SOI Warehouse.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`connection` | Connection | The sqlalchemy connection and engine to SOI Warehouse, either | required |
`pipeline_version` | | AssetId to track data loads by run/asset id. Passed from SoFlow. | required |
`msi` | | Boolean (T/F): whether msi is used to connect to the data warehouse (dwh). | required |
clear_source(source_dict, table)
Method to clear specified source from existing GMS tables.
create_gms(table)
Method to create a GMS table.
drop_gms(table)
Method to remove a GMS table.
load_base_gms(df, table)
Method for loading each table in the GMS schema.
This method populates the tables of the GMS schema and must run before transforming and loading entities and relationships to Core. Insert timestamps are checked and, if no new records have been observed since the latest pipeline run, loading is skipped and the transient class attribute `new_records` is set to False.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`df` | DataFrame | The dataframe to be loaded. | required |
`table` | str | The table to write to, e.g. 'GMS.Entries'. | required |
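The `table` argument is given as a schema-qualified name such as 'GMS.Entries'. As a sketch of how such a name could be separated into its two parts, the helper below is hypothetical and is not taken from GMSLoader.

```python
def split_qualified(table):
    """Split 'Schema.Table' into (schema, table). Hypothetical helper."""
    schema, sep, name = table.partition(".")
    if not sep or not schema or not name:
        raise ValueError(f"expected 'Schema.Table', got {table!r}")
    return schema, name


schema, name = split_qualified("GMS.Entries")
```

With pandas, the two parts would map onto the `schema` and `name` arguments of `DataFrame.to_sql`; whether the loader writes data that way is an assumption, not something stated on this page.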
OpenMRS
OpenMRSLoader
Bases: BaseLoader
Class for transforming and loading OpenMRS data to warehouse.
__init__(connection, pipeline_version)
Transform and load OpenMRS data into SOI Warehouse.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`connection` | Connection | The sqlalchemy connection and engine to SOI Warehouse, either | required |
`pipeline_version` | | AssetId to track data loads by run/asset id. Passed from SoFlow. | required |
Attributes:
Name | Type | Description |
---|---|---|
`last_updated` | datetime | The last pipeline run time in the warehouse. Used for CDC. |
`row_count` | int | The incrementer for rows loaded. |
`skipped` | int | The incrementer for skipped rows. |
`emoji_dict` | dict | Discipline-specific emojis because emojis. |
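How `last_updated`, `row_count` and `skipped` might work together during a change-data-capture pass can be sketched as follows. `LoadCounters` and `observe` are hypothetical names; the sketch assumes a row counts as loaded when it changed after `last_updated`, which this page does not spell out.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class LoadCounters:
    """Hypothetical stand-in for the bookkeeping attributes listed above."""

    last_updated: datetime
    row_count: int = 0
    skipped: int = 0

    def observe(self, modified_at):
        """Count a row as loaded if it changed after last_updated, else skipped."""
        if modified_at > self.last_updated:
            self.row_count += 1
            return True
        self.skipped += 1
        return False


counters = LoadCounters(last_updated=datetime(2024, 3, 1))
for ts in [datetime(2024, 2, 20), datetime(2024, 3, 2), datetime(2024, 3, 5)]:
    counters.observe(ts)
```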
create_has(table)
Method to create a HAS table.
drop_has(table)
Method to drop a HAS table.
drop_haspublic_views(table)
Method to remove a HASPublic view.
There is a sequence of queries in `soload.sql.view_removal_haspublic` to remove views in the HASPublic schema. This function will run a query to remove a view from table. The table must already exist for this to work.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`table` | str | The table to remove a view for (ex: 'SHEApp'). | required |
drop_views(table)
Method to remove a HAS view.
There is a sequence of queries in `soload.sql.view_removal_has` to remove views in the HAS schema. This function will run a query to remove a view from table. The table must already exist for this to work.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`table` | str | The table to remove a view for (ex: 'FitFeet'). | required |
load_base_has(df, table)
Method for loading each table in the HAS schema.
This method populates the tables of the HAS schema and must run before transforming and loading entities and relationships to Core. Insert timestamps are checked and, if no new records have been observed since the latest pipeline run, loading is skipped and the transient class attribute `new_records` is set to False.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`df` | DataFrame | The dataframe to be loaded. | required |
`table` | str | The table to write to, e.g. 'HAS.FitFeet'. | required |
load_haspublic_views(table)
Method for transforming from HASPublic data to create a view.
There is a sequence of transform/load queries in `soload.sql.view_creation_haspublic` to create views in the HASPublic schema with additional columns and calculations. This function will create a view for the given table using that query.
Note that if the view already exists, you must first remove it using `drop_haspublic_views`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`table` | str | The table to create a view for (ex: 'SHEApp'). | required |
load_stored_procedure(table)
Method to connect the HAS views, Common tables and Core in a stored procedure that produces a single connected table. This will create HASPublic.[DisciplineName].
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`table` | str | The table to create a view for (ex: 'FitFeet'). | required |
load_views(table)
Method for transforming from HAS data to create a view.
There is a sequence of transform/load queries in `soload.sql.view_creation_has` to create views in the HAS schema with additional columns and calculations. This function will create a view for the given table using that query.
Note that if the view already exists, you must first remove it using `drop_views`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`table` | str | The table to create a view for (ex: 'FitFeet'). | required |
SHE
SHELoader
Bases: BaseLoader
Class for transforming and loading SHE data to warehouse.
__init__(connection, pipeline_version)
Transform and load SHE data into SOI Warehouse.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`connection` | Connection | The sqlalchemy connection and engine to SOI Warehouse, either | required |
create_she()
Method to create a SHE table.
drop_she()
Method to remove a SHE table.
The table must already exist for this to work.
load_she(df, table)
Method for loading each table in the SHE schema.
This method populates the tables of the SHE schema.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`df` | DataFrame | The dataframe to be loaded. | required |
`table` | str | The table to write to. | required |
SportPartnership
SportPartnershipLoader
Class for transforming and loading Sport Partnership data to warehouse.
__init__(connection, pipeline_version)
Transform and load Sport Partnership data into SOI Warehouse.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`connection` | Connection | The sqlalchemy connection and engine to SOI Warehouse, either | required |
create_sport_partnership()
Method to create a SportPartnership table.
drop_sport_partnership()
Method to remove the SportPartnership table.
load_sport_partnership(df, table)
Method for loading the SportPartnership table in the Sport schema.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`df` | DataFrame | The dataframe to be loaded. | required |
`table` | str | The table to write to. | required |