How to read specific text from image PDF using “Select Tool”?


I have a PDF created from images. I want to select a particular portion of a page image, as the Select Tool in Adobe allows, and then extract the text from that portion using zonal OCR. Can anyone help me with this? I am new to PDF development.
Image is attached.


Zonal OCR is not supported by SumatraPDF: it goes well beyond the viewer's remit and needs one or two additional engines to “clip” the zone location and then pass the partial image to the OCR engine.

SumatraPDF could be used in such a workflow to try and hold a given page at a given point on screen (using the CLI -page, -zoom and -scroll options). However, that would only work with files whose layout never changes, e.g. a set of drawings where the title block is always lower right, or invoices / statements where the fields are at a fixed position. By definition, “if the area to be processed is different in every document zonal OCR cannot be used”.
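To make the fixed-position idea concrete, here is a minimal Python sketch of the "clip" step only. It is not anything SumatraPDF provides. It assumes the zone is measured in PDF points (72 per inch) from the top-left corner of the page and that the page has been rendered to an image at a known DPI. The names Zone and zone_to_pixel_box are made up for illustration.

```python
from typing import NamedTuple, Tuple

POINTS_PER_INCH = 72.0  # PDF user space default: 72 points per inch


class Zone(NamedTuple):
    """Hypothetical fixed zone, in points, measured from the page's top-left."""
    x: float       # distance of the zone's left edge from the page's left edge
    y: float       # distance of the zone's top edge from the page's top edge
    width: float
    height: float


def zone_to_pixel_box(zone: Zone, dpi: float) -> Tuple[int, int, int, int]:
    """Convert a zone in points to a (left, upper, right, lower) pixel box.

    The result is only valid for an image of the same page rendered at `dpi`,
    and only if the field really sits at the same place in every document.
    """
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    left = round(zone.x * dpi / POINTS_PER_INCH)
    upper = round(zone.y * dpi / POINTS_PER_INCH)
    right = round((zone.x + zone.width) * dpi / POINTS_PER_INCH)
    lower = round((zone.y + zone.height) * dpi / POINTS_PER_INCH)
    return (left, upper, right, lower)


# Example: a field 1 inch from the left and 0.5 inch from the top,
# 2 inches wide and 1 inch tall, on a page image rendered at 300 DPI.
invoice_number_zone = Zone(x=72, y=36, width=144, height=72)
box = zone_to_pixel_box(invoice_number_zone, dpi=300)
```

The resulting box has the (left, upper, right, lower) shape that an image library's crop call typically expects (Pillow's Image.crop, for instance), and the cropped image is what would then be handed to whichever OCR engine you use. If the zone moves from document to document, this approach breaks down, which is exactly the limitation quoted above.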

In such a case I would usually use other tools to “burst” the images out of the PDF (see the export PDF options in your screenshot) into a format such as mono or greyscale TIFF (depending on the OCR software), which is better suited to character processing.
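As one possible way to script that bursting step, the sketch below builds a command line for the pdfimages utility from Poppler, which can write embedded images as TIFF via its -tiff option. Choosing pdfimages is my assumption, not something the export dialog in the screenshot uses, and the helper name build_burst_command, the output folder and the "page" prefix are made up for illustration. The function only builds the command; nothing is run until you pass it to subprocess yourself.

```python
from pathlib import Path
from typing import List


def build_burst_command(pdf_path: str, out_dir: str, prefix: str = "page") -> List[str]:
    """Build a pdfimages command that extracts embedded images as TIFF.

    Assumes Poppler's pdfimages is installed and on PATH. The last argument
    is the output root: pdfimages appends its own numbering and extension.
    """
    out_root = Path(out_dir) / prefix
    return ["pdfimages", "-tiff", str(pdf_path), str(out_root)]


# Hypothetical file and folder names, for illustration only.
cmd = build_burst_command("scanned.pdf", "out")
```

To actually run it, create the output folder first and then call subprocess.run(cmd, check=True). The extracted TIFFs can then be cropped to the fixed zone and passed to the OCR engine.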