archaeo_super_prompt.modeling.entity_extractor.model

Core functions for inferring and filtering named entities in chunks.

Functions

  • fetch_entities Run inference on the remote NER model to find named entities in each chunk.

  • gatherEntityChunks Gather the chunk of entity output from one text chunk.

  • postrocess_entities Return the set of entities that occurred in each chunk.

  • filter_entities For each text chunk, keep only the entities included in the given group of allowed entity types.

source fetch_entities(chunks: list[str])

Run inference on the remote NER model to find named entities in each chunk.

source gatherEntityChunks(entity_chunks: list[NerOutput], confidence_treshold: float)

Gather the chunk of entity output from one text chunk.

source postrocess_entities(entitiesPerTextChunk: list[list[NerOutput]], confidence_treshold: float)

Return the set of entities that occurred in each chunk.

Parameters

  • entitiesPerTextChunk : list[list[NerOutput]] For each chunk, a list of its retrieved entities, ordered by their occurrence in the chunk's text content.

  • confidence_treshold : float A threshold between 0 and 1, used to keep only a subset of the entities.
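
The thresholding and de-duplication described above can be sketched as follows. This is a minimal illustration, not the module's implementation: `NerOutputSketch`, its fields (`entity`, `word`, `score`), the helper name `keep_confident_unique`, and the choice of an inclusive comparison (`>=`) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NerOutputSketch:
    """Hypothetical stand-in for NerOutput; the real fields may differ."""

    entity: str   # assumed: entity type label, e.g. "LOC"
    word: str     # assumed: the matched text
    score: float  # assumed: model confidence in [0, 1]


def keep_confident_unique(
    entities_per_chunk: list[list[NerOutputSketch]],
    confidence_threshold: float,
) -> list[list[tuple[str, str]]]:
    """For each chunk, keep the distinct (entity, word) pairs whose score
    is at or above the threshold, in order of first occurrence."""
    result: list[list[tuple[str, str]]] = []
    for chunk_entities in entities_per_chunk:
        # A dict preserves insertion order, so it acts as an ordered set.
        seen: dict[tuple[str, str], None] = {}
        for ent in chunk_entities:
            if ent.score >= confidence_threshold:
                seen.setdefault((ent.entity, ent.word), None)
        result.append(list(seen))
    return result


if __name__ == "__main__":
    chunks = [
        [
            NerOutputSketch("LOC", "Roma", 0.90),
            NerOutputSketch("LOC", "Roma", 0.95),   # duplicate, collapsed
            NerOutputSketch("PER", "Rossi", 0.30),  # below threshold, dropped
        ],
        [],
    ]
    print(keep_confident_unique(chunks, 0.5))
```

The output keeps one list per input chunk, so positions still line up with the original text chunks even when a chunk has no surviving entity.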

source filter_entities(complete_entity_sets: list[list[CompleteEntity]], allowed_entities: set[NerXXLEntities]) -> list[list[CompleteEntity]]

For each text chunk, keep only the entities included in the given group of allowed entity types.
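
A per-chunk type filter of this kind might look like the sketch below. `CompleteEntitySketch`, its `entity_type` and `text` fields, and the helper name `filter_by_type` are hypothetical; the real `CompleteEntity` and `NerXXLEntities` types are not shown on this page, so plain strings stand in for the entity type labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompleteEntitySketch:
    """Hypothetical stand-in for CompleteEntity; the real fields may differ."""

    entity_type: str  # assumed: a label such as "LOC" or "PER"
    text: str         # assumed: the full entity text


def filter_by_type(
    entity_sets: list[list[CompleteEntitySketch]],
    allowed_types: set[str],
) -> list[list[CompleteEntitySketch]]:
    """Keep, chunk by chunk, only the entities whose type is allowed."""
    return [
        [ent for ent in chunk if ent.entity_type in allowed_types]
        for chunk in entity_sets
    ]


if __name__ == "__main__":
    sets = [
        [CompleteEntitySketch("LOC", "Roma"), CompleteEntitySketch("PER", "Rossi")],
        [CompleteEntitySketch("PER", "Bianchi")],
    ]
    print(filter_by_type(sets, {"LOC"}))
```

A chunk with no allowed entity yields an empty list rather than being dropped, so the result has the same length as the input.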