archaeo_super_prompt.modeling.pdf_to_text.stream_ocr_manual
source module archaeo_super_prompt.modeling.pdf_to_text.stream_ocr_manual
Better OCR model with VLLM.
Functions
-
ollama_vlm_options — Return a configuration for vlm model set with ollama.
-
vllm_vlm_options — Return a configuration for vlm model set with a vllm server (so an OpenAI compatible API).
-
converter — Return a Docling PDF converter object from an ollama vlm configuration.
-
process_documents — Convert the documents into text with Docling, using the given converter.
source ollama_vlm_options(model: str, prompt: str, response_format: Literal[ResponseFormat.HTML, ResponseFormat.MARKDOWN] = ResponseFormat.MARKDOWN, allowed_timeout: int = 60 * 3)
Return a configuration for vlm model set with ollama.
Parameters
-
model : str — the string identifier of the vllm model in ollama
-
prompt : str — a string to prompt to the vllm to contextualize its OCR task
-
response_format : Literal[ResponseFormat.HTML, ResponseFormat.MARKDOWN] — a supported response format for the vllm
-
allowed_timeout : int — the allowed time for processing one page in one document (default to 3 minutes)
source vllm_vlm_options(model: str, prompt: str, response_format: Literal[ResponseFormat.HTML, ResponseFormat.MARKDOWN] = ResponseFormat.MARKDOWN, allowed_timeout: int = 60 * 3)
Return a configuration for vlm model set with a vllm server (so an OpenAI compatible API).
Parameters
-
model : str — the string identifier of the vllm model in ollama
-
prompt : str — a string to prompt to the vllm to contextualize its OCR task
-
response_format : Literal[ResponseFormat.HTML, ResponseFormat.MARKDOWN] — a supported response format for the vllm
-
allowed_timeout : int — the allowed time for processing one page in one document (default to 3 minutes)
source converter(ollama_vlm_options: ApiVlmOptions)
Return a Docling PDF converter object from an ollama vlm configuration.
source process_documents(file_inputs: list[tuple[InterventionId, Path]], documentConvertor: DocumentConverter, incipit_only=True) → Iterator[tuple[tuple[InterventionId, Path], Iterator[tuple[PageRange, CorrectlyConvertedDocument]]]]
Convert the documents into text with Docling, using the given converter.
Returns
-
Iterator[tuple[tuple[InterventionId, Path], Iterator[tuple[PageRange, CorrectlyConvertedDocument]]]] — For each file, either a list of one docling document, if all the document can have been procesed at once, or a list of nullable docling documents for each document page. For some pages, the a null value is put when the page reading has failed.