Pipelines
SolerPerson
SolerPersonPipeline
Class for resolving duplicates and/or linking records.
Parameters
bulk_person()
Method for creating Core.ResolvedPerson from scratch.
This method gets all records in ADAPT.PersonClean that are not yet in Core.ResolvedPerson. It resolves them in bulk, one program at a time, and inserts the results into core from their Soler.temp_resolve_* tables only after every program has been resolved.
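The bulk flow described above can be sketched in plain Python. This is an illustrative sketch only, not the pipeline's implementation: the record layout (`id`, `program`, `name`), the function names, and the name-based `resolve_program_sketch` rule are all assumptions, and an in-memory dict stands in for the Soler.temp_resolve_* tables.

```python
from collections import defaultdict


def resolve_program_sketch(records):
    """Hypothetical resolver: keep the first record per normalized name."""
    kept = {}
    for rec in records:
        key = " ".join(rec["name"].lower().split())
        kept.setdefault(key, rec)
    return list(kept.values())


def bulk_resolve_sketch(person_clean, core):
    """Stage every program first, then load all staged results into core."""
    # Anti-join: keep only records whose id is not already in core.
    resolved_ids = {rec["id"] for rec in core}
    pending = [rec for rec in person_clean if rec["id"] not in resolved_ids]

    # Group the pending records by program.
    by_program = defaultdict(list)
    for rec in pending:
        by_program[rec["program"]].append(rec)

    # Resolve each program into its own staging area (stand-in for temp tables).
    staging = {prog: resolve_program_sketch(recs) for prog, recs in by_program.items()}

    # Core is modified only after every program has been resolved.
    for prog in sorted(staging):
        core.extend(staging[prog])
    return core
```

Staging everything before the final load means core is never left holding only some of the programs if resolution fails partway through, at the cost of seeing no results until the whole run completes.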
run(spawn_personclean=False, spawn_personresolved=False, load_to_core=True, load_personclean=False, update_personresolved=False, bulk_personresolved=False)
Main method for resolving person records.
stream_person()
Method for updating Core.ResolvedPerson.
This method gets all records contained in ADAPT.PersonClean that are not in Core.ResolvedPerson.
It resolves records individually, inserting into core as they are completed by program.
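For contrast with a bulk load, the streaming behaviour can be sketched as a generator that writes each record to core as soon as it is resolved. This is a hedged illustration: the record fields, the function name, and the rule that two records in the same program with the same normalized name are duplicates are assumptions, not the pipeline's actual logic.

```python
def stream_resolve_sketch(person_clean, core):
    """Resolve records one at a time, appending to core immediately."""
    seen_ids = {rec["id"] for rec in core}
    # Hypothetical identity key: (program, normalized name).
    seen_keys = {(rec["program"], " ".join(rec["name"].lower().split())) for rec in core}

    for rec in person_clean:
        if rec["id"] in seen_ids:
            continue  # already present in core
        seen_ids.add(rec["id"])

        key = (rec["program"], " ".join(rec["name"].lower().split()))
        if key in seen_keys:
            continue  # duplicate of a person already in core

        seen_keys.add(key)
        core.append(rec)  # inserted right away, no staging step
        yield rec["id"]
```

Because the generator appends before it yields, a caller that stops early still leaves core with every record resolved up to that point.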
SolerProgram
SolerProgramPipeline
Class for resolving duplicates and/or linking records.
Parameters
resolve_programs()
This function resolves programs in 3 steps:
1. Resolves organizationids that already exist in core.programresolved.
2. Attempts to match in SQL using Levenshtein distance and resolve to existing programs.
3. Extracts into Python and conducts a fuzzy search against the SOIPrograms table.
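The three-step cascade can be sketched in pure Python as follows. This is an assumption-laden illustration rather than the pipeline's code: the function names, record fields, and thresholds (`max_edits`, `min_ratio`) are hypothetical, step 2 runs in Python here instead of SQL, and `difflib.SequenceMatcher` stands in for whatever fuzzy matcher the pipeline actually uses.

```python
from difflib import SequenceMatcher


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def resolve_program_sketch(record, resolved_by_org, soi_names,
                           max_edits=2, min_ratio=0.8):
    """Return (matched_name, method), trying the cheapest match first."""
    # Step 1: exact hit on an organization id that is already resolved.
    org = record["organizationid"]
    if org in resolved_by_org:
        return resolved_by_org[org], "organizationid"

    name = record["name"].lower()

    # Step 2: small edit distance to an existing resolved program name.
    for existing in resolved_by_org.values():
        if levenshtein(name, existing.lower()) <= max_edits:
            return existing, "levenshtein"

    # Step 3: best fuzzy match against the reference program names.
    def ratio(candidate):
        return SequenceMatcher(None, name, candidate.lower()).ratio()

    best = max(soi_names, key=ratio, default=None)
    if best is not None and ratio(best) >= min_ratio:
        return best, "fuzzy"
    return None, "unmatched"
```

Ordering the steps from exact to fuzzy keeps the expensive comparison limited to records that the cheaper checks could not place.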
run(spawn_programresolved=False, load_to_core=True, load_soi_programs=False)
Main method to run SolerProgramPipeline.