Pipelines

SolerPerson

SolerPersonPipeline

Class for resolving duplicates and/or linking records.

Parameters

bulk_person()

Method for creating Core.ResolvedPerson from scratch.

This method selects all records in ADAPT.PersonClean that are not yet in Core.ResolvedPerson. It resolves them in bulk, one program at a time, staging each program's results in its Soler.temp_resolve_* table. Once every program has been resolved, the staged results for all programs are inserted into core.
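The bulk flow described above (anti-join, resolve per program into a staging area, load everything at the end) can be illustrated with a minimal in-memory sketch. It is not the pipeline's actual implementation: `bulk_resolve_sketch`, `resolve_group`, the record fields (`id`, `program`, `name`, `dob`), and the exact-match-on-normalized-name rule are all assumptions made for illustration, and plain lists and dicts stand in for the database tables.

```python
from collections import defaultdict


def resolve_group(records):
    """Toy resolver (assumption): records sharing a normalized
    (name, dob) pair receive the id of the first record seen."""
    clusters = {}
    resolved = []
    for rec in records:
        key = (rec["name"].strip().lower(), rec["dob"])
        resolved_id = clusters.setdefault(key, rec["id"])
        resolved.append(
            {"id": rec["id"], "program": rec["program"], "resolved_id": resolved_id}
        )
    return resolved


def bulk_resolve_sketch(person_clean, core_resolved):
    """Resolve every unresolved record per program, then load all at once."""
    # Anti-join: keep only records that are not already in the resolved set.
    already = {row["id"] for row in core_resolved}
    by_program = defaultdict(list)
    for rec in person_clean:
        if rec["id"] not in already:
            by_program[rec["program"]].append(rec)

    # Stage results per program (stands in for the temp_resolve_* tables).
    staging = {
        program: resolve_group(records) for program, records in by_program.items()
    }

    # Load into "core" only after every program has been staged.
    for rows in staging.values():
        core_resolved.extend(rows)
    return core_resolved
```

The key design point is the final loop: nothing reaches the resolved table until every program has finished staging, so a failure midway leaves core untouched.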

run(spawn_personclean=False, spawn_personresolved=False, load_to_core=True, load_personclean=False, update_personresolved=False, bulk_personresolved=False)

Main method for resolving person records.

stream_person()

Method for updating Core.ResolvedPerson.

This method gets all records contained in ADAPT.PersonClean that are not in Core.ResolvedPerson.

It resolves records individually, inserting into core as they are completed by program.
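By contrast with a bulk load, a streaming update writes each record to the resolved table as soon as it has been resolved. The sketch below shows that pattern in memory only. The function name `stream_resolve_sketch`, the record fields, and the exact-match rule are hypothetical and are not taken from the pipeline's source.

```python
def stream_resolve_sketch(person_clean, core_resolved):
    """Resolve unresolved records one at a time, inserting each immediately.

    Returns the number of records inserted.
    """
    already = {row["id"] for row in core_resolved}
    index = {}  # (program, normalized name, dob) -> resolved_id (assumption)
    inserted = 0
    for rec in person_clean:
        if rec["id"] in already:
            continue  # already present in the resolved table
        key = (rec["program"], rec["name"].strip().lower(), rec["dob"])
        resolved_id = index.setdefault(key, rec["id"])
        # Insert right away instead of staging the result for a later load.
        core_resolved.append(
            {"id": rec["id"], "program": rec["program"], "resolved_id": resolved_id}
        )
        already.add(rec["id"])
        inserted += 1
    return inserted
```

Because each record is written as it completes, an interrupted run leaves the earlier inserts in place, and a rerun skips them through the `already` check.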

SolerProgram

SolerProgramPipeline

Class for resolving duplicates and/or linking records.

Parameters

resolve_programs()

This function resolves programs in three steps:

1. Resolves organizationids that already exist in core.programresolved.
2. Attempts to match in SQL using Levenshtein distance and resolve to existing programs.
3. Extracts the remaining records into Python and conducts a fuzzy search against the SOIPrograms table.
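The three-step cascade can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the pipeline's code: the real second step runs in SQL, whereas here a hand-written `levenshtein` function stands in for it; the third step uses the standard library's `difflib.get_close_matches` as a stand-in for whatever fuzzy search the pipeline uses; and `resolve_program_sketch`, its arguments, and the thresholds `max_distance` and `cutoff` are hypothetical.

```python
import difflib


def levenshtein(a, b):
    """Edit distance between two strings (classic dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            current.append(
                min(
                    previous[j] + 1,               # deletion
                    current[j - 1] + 1,            # insertion
                    previous[j - 1] + (ca != cb),  # substitution
                )
            )
        previous = current
    return previous[-1]


def resolve_program_sketch(org_id, name, known_by_org, resolved_names,
                           soi_names, max_distance=2, cutoff=0.8):
    """Return (method, match) using the first step that succeeds."""
    # Step 1: the organization id is already resolved.
    if org_id in known_by_org:
        return ("organizationid", known_by_org[org_id])

    # Step 2: closest existing program name within a small edit distance.
    lowered = name.lower()
    best = min(resolved_names,
               key=lambda cand: levenshtein(lowered, cand.lower()),
               default=None)
    if best is not None and levenshtein(lowered, best.lower()) <= max_distance:
        return ("levenshtein", best)

    # Step 3: fuzzy search against a reference list of program names.
    matches = difflib.get_close_matches(name, soi_names, n=1, cutoff=cutoff)
    if matches:
        return ("fuzzy", matches[0])
    return ("unmatched", None)
```

Ordering the steps from cheapest and most certain (exact id lookup) to most expensive and least certain (fuzzy search) means the costlier comparisons only run on records the earlier steps could not place.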

run(spawn_programresolved=False, load_to_core=True, load_soi_programs=False)

Main method to run SolerProgramPipeline.