archaeo_super_prompt.modeling.chunk_selector

source module archaeo_super_prompt.modeling.chunk_selector

Utilities for selecting the boundary pages of interest (first or last pages) in documents.

Functions

  • select_incipit: Select only the chunks of the first pages of the document.

  • select_end_pages: Select only the chunks of the last pages of the document.

source select_incipit(chunkDataset: PDFChunkPerInterventionDataset)

Select only the chunks of the first pages of the document.
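
The idea can be illustrated with a minimal sketch. This is not the library's implementation: the schema of `PDFChunkPerInterventionDataset` is not documented here, so the example assumes a hypothetical `Chunk` record with an `intervention_id`, a 1-based `page` number and a `text` field, plus a hypothetical `max_page` cutoff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in for one row of the chunk dataset."""

    intervention_id: int
    page: int  # assumed to be a 1-based page number
    text: str


def select_first_pages(chunks: list[Chunk], max_page: int = 2) -> list[Chunk]:
    """Keep only the chunks located on pages 1..max_page (assumed cutoff)."""
    return [c for c in chunks if c.page <= max_page]


chunks = [
    Chunk(1, 1, "title"),
    Chunk(1, 2, "summary"),
    Chunk(1, 7, "appendix"),
]
first = select_first_pages(chunks)
```

With these sample chunks, only the ones on pages 1 and 2 are kept; the real function may use a different cutoff or data structure.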

source select_end_pages(chunkDataset: PDFChunkPerInterventionDataset)

Select only the chunks of the last pages of the document.
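
Selecting the last pages needs one extra step, because the final page number differs from one document to another. The sketch below is an illustration under assumptions, not the library's code: `Chunk`, its fields and the `n_pages` parameter are hypothetical, and the "last page" of a document is taken to be the highest page number that carries at least one chunk.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in for one row of the chunk dataset."""

    intervention_id: int
    page: int  # assumed to be a 1-based page number
    text: str


def select_last_pages(chunks: list[Chunk], n_pages: int = 2) -> list[Chunk]:
    """Keep the chunks found on the last n_pages pages of each document."""
    # Highest page number seen for each intervention (document).
    last_page: dict[int, int] = defaultdict(int)
    for c in chunks:
        last_page[c.intervention_id] = max(last_page[c.intervention_id], c.page)
    # A chunk is kept when its page lies within n_pages of that last page.
    return [c for c in chunks if c.page > last_page[c.intervention_id] - n_pages]


chunks = [
    Chunk(1, 1, "title"),
    Chunk(1, 6, "conclusion"),
    Chunk(1, 7, "bibliography"),
    Chunk(2, 3, "single short report"),
]
last = select_last_pages(chunks)
```

Computing the cutoff per `intervention_id` keeps short documents from being dropped entirely when they are mixed with longer ones in the same dataset.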