archaeo_super_prompt.modeling.chunk_selector
module archaeo_super_prompt.modeling.chunk_selector
Utilities for selecting the boundary pages of interest (first and last pages) in documents.
Functions
- select_incipit: Select only the chunks of the first pages of the document.
- select_end_pages: Select only the chunks of the last pages of the document.
select_incipit(chunkDataset: PDFChunkPerInterventionDataset)
Select only the chunks of the first pages of the document.
select_end_pages(chunkDataset: PDFChunkPerInterventionDataset)
Select only the chunks of the last pages of the document.
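
The following is a minimal sketch of the idea behind both selectors, not the module's actual implementation. It assumes each chunk carries a document identifier and a 0-based page index. The `Chunk` class, the helper names `select_first_pages` and `select_last_pages`, and the `n_pages` parameter are all hypothetical stand-ins: the real functions take a `PDFChunkPerInterventionDataset`, whose schema and page cut-offs are not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in for one row of the chunk dataset."""
    document_id: str
    page: int  # assumed 0-based page index within the document
    text: str


def select_first_pages(chunks: list[Chunk], n_pages: int = 2) -> list[Chunk]:
    """Keep chunks located on the first `n_pages` pages of each document."""
    return [c for c in chunks if c.page < n_pages]


def select_last_pages(chunks: list[Chunk], n_pages: int = 2) -> list[Chunk]:
    """Keep chunks located on the last `n_pages` pages of each document."""
    # Highest page index seen per document, used as the end-of-document marker.
    last_page: dict[str, int] = {}
    for c in chunks:
        last_page[c.document_id] = max(last_page.get(c.document_id, 0), c.page)
    return [c for c in chunks if c.page > last_page[c.document_id] - n_pages]


# A five-page document with one chunk per page.
chunks = [Chunk("doc-a", page, f"text of page {page}") for page in range(5)]
first = select_first_pages(chunks)
last = select_last_pages(chunks)
```

The last-page selection is computed per document, so documents of different lengths in the same dataset each keep their own final pages.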