archaeo_super_prompt.modeling.train
source module archaeo_super_prompt.modeling.train
DAGs to train the FieldExtractor models.
Classes
- ExtractionDAGParts: A decomposition of the general DAG into parts, so that the training, inference, and evaluation modes can each handle them separately.
Functions
- get_training_dag: Return the most advanced pre-processing DAG for the model.
- train_from_scratch: Return the most advanced DAG model, fitted on the data.
- get_fitted_model: Return the most advanced DAG model, mock-fitted on the data.
source class ExtractionDAGParts()
Bases: NamedTuple
A decomposition of the general DAG into parts, so that the training, inference, and evaluation modes can each handle them separately.
source get_training_dag() → ExtractionDAGParts
Return the most advanced pre-processing DAG for the model.
All its estimators and transformers are initialized with specific parameters.
Returns
- ExtractionDAGParts: The DAG split into three parts:
  - The part of the complete DAG that produces the pre-processed data.
  - The field extractors, each related to its parent node, so that special training or evaluation operations can be applied to them, or so that they can be bound to the pre-processing DAG.
  - The final union component, which completes the DAG in inference mode.
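The decomposition described above can be pictured as a plain NamedTuple. The sketch below is only an illustration: the class name `DAGPartsSketch`, the field names (`preprocessing`, `extractors`, `final_union`) and the string stand-ins are assumptions, not the actual definition of ExtractionDAGParts.

```python
from typing import Any, NamedTuple


class DAGPartsSketch(NamedTuple):
    """Hypothetical stand-in for ExtractionDAGParts (field names are assumed)."""

    preprocessing: Any  # sub-DAG that produces the pre-processed data
    extractors: list[tuple[str, Any]]  # (parent node name, field extractor) pairs
    final_union: Any  # component that merges extractor outputs in inference mode


# Strings stand in for real pipeline components.
parts = DAGPartsSketch(
    preprocessing="preprocessing-dag",
    extractors=[("chunks", "date-extractor"), ("chunks", "place-extractor")],
    final_union="union",
)

# A NamedTuple unpacks positionally, so each mode can take only what it needs.
preprocessing, extractors, final_union = parts
parent_names = [parent for parent, _ in extractors]
```

Keeping the extractors separate from the pre-processing part is what would let a training or evaluation routine touch them without rebuilding the rest of the DAG.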
source train_from_scratch(training_input: PDFPathDataset, ds: MagohDataset) → ExtractionDAGParts
Return the most advanced DAG model, fitted on the data.
Each FieldExtractor model is trained.
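As a rough picture of per-extractor training, the sketch below fits several stand-in extractors on the same inputs. `ToyExtractor`, `fit_all`, and the scikit-learn-style `fit` signature are assumptions made for illustration; they are not the FieldExtractor API.

```python
class ToyExtractor:
    """Hypothetical stand-in for a FieldExtractor with a scikit-learn-style fit."""

    def __init__(self, field: str) -> None:
        self.field = field
        self.is_fitted = False
        self.n_examples = 0

    def fit(self, inputs: list[str], targets: dict[str, list[str]]) -> "ToyExtractor":
        # A real extractor would optimise its model here; this one only records
        # how many target values exist for its own field.
        self.n_examples = len(targets[self.field])
        self.is_fitted = True
        return self


def fit_all(extractors, inputs, targets):
    """Fit every extractor on the same inputs and return them in order."""
    return [extractor.fit(inputs, targets) for extractor in extractors]


# Made-up example data: two documents and one target value per field and document.
inputs = ["report-1.pdf", "report-2.pdf"]
targets = {"date": ["1998", "2004"], "place": ["Pisa", "Siena"]}
fitted = fit_all([ToyExtractor("date"), ToyExtractor("place")], inputs, targets)
```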
source get_fitted_model(training_input: PDFPathDataset, ds: MagohDataset)
Return the most advanced DAG model, mock-fitted on the data.
The FieldExtractor models are expected to be already fitted, from the dspy models saved under the get_model_store_dir() path.
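The idea of reusing saved models instead of retraining can be sketched as follows. Everything here is hypothetical: the one-JSON-file-per-field layout, the `load_saved_state` helper, and the temporary directory standing in for the get_model_store_dir() path. The real storage format of the dspy models is not described in this page.

```python
import json
import tempfile
from pathlib import Path


def load_saved_state(store_dir: Path, field: str) -> dict:
    """Load a previously saved state for one field, failing loudly if absent."""
    path = store_dir / f"{field}.json"  # assumed file layout, for illustration
    if not path.exists():
        raise FileNotFoundError(f"no saved model for field {field!r} in {store_dir}")
    return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    store_dir = Path(tmp)  # stands in for the get_model_store_dir() path
    saved = {"prompt": "Find the date."}
    (store_dir / "date.json").write_text(json.dumps(saved), encoding="utf-8")

    # A saved field loads without any training step.
    state = load_saved_state(store_dir, "date")

    # A field that was never saved is reported instead of silently skipped.
    try:
        load_saved_state(store_dir, "place")
        missing_detected = False
    except FileNotFoundError:
        missing_detected = True
```

Failing on a missing file, rather than falling back to an unfitted model, is a design choice of this sketch and keeps a mock fit from hiding an empty model store.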