Hi,
I have quite a lot of ebooks that are scanned books with OCR text (books from the Open Library, available via archive.org). The problem is that they often render slowly in SumatraPDF, and searching is extremely slow. I’ve tried some online services to shrink the PDFs, but didn’t like the results, so I’ve decided to tinker with Ghostscript and optimize the ebooks myself. I started with a command like:
gswin64 ^
-o downsampled96.pdf ^
-sDEVICE=pdfwrite ^
-dDownsampleColorImages=true ^
-dDownsampleGrayImages=true ^
-dDownsampleMonoImages=true ^
-dColorImageResolution=96 ^
-dGrayImageResolution=96 ^
-dMonoImageResolution=96 ^
-dColorImageDownsampleThreshold=1.0 ^
-dGrayImageDownsampleThreshold=1.0 ^
-dMonoImageDownsampleThreshold=1.0 ^
"%~1"
I then tested various resolutions on a 1,100-page book.
With 72 PPI the books are harder to read, but display and search in SumatraPDF are blazing fast.
With 96 PPI they look better and are quite readable; the quality could be higher, but rendering and searching are still fast.
With 150 PPI they look quite good, rendering is still fast enough, but searching starts to feel very slow.
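In case anyone wants to reproduce the comparison, here is a rough Python sketch that builds the same Ghostscript command for each resolution. It assumes gswin64 is on the PATH; the function names and the output naming scheme are just illustrative, not part of Ghostscript.

```python
import subprocess
from pathlib import Path


def build_gs_args(src, ppi, gs="gswin64"):
    """Build the pdfwrite argument list that downsamples all images to `ppi`."""
    src = Path(src)
    # Hypothetical naming scheme: book.pdf -> book_downsampled96.pdf
    out = src.with_name(f"{src.stem}_downsampled{ppi}.pdf")
    args = [gs, "-o", str(out), "-sDEVICE=pdfwrite"]
    # Same switches as in the command above, generated per image type.
    for kind in ("Color", "Gray", "Mono"):
        args += [
            f"-dDownsample{kind}Images=true",
            f"-d{kind}ImageResolution={ppi}",
            f"-d{kind}ImageDownsampleThreshold=1.0",
        ]
    args.append(str(src))  # the input file goes last
    return args


def downsample_all(src, resolutions=(72, 96, 150)):
    """Run Ghostscript once per resolution; raises if any run fails."""
    for ppi in resolutions:
        subprocess.run(build_gs_args(src, ppi), check=True)
```

Calling downsample_all("book.pdf") would write one output file per resolution next to the input, so the results can be opened side by side in SumatraPDF.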
Do you have any suggestions for improving the quality of the output without sacrificing search speed? I’ve seen the Ghostscript options, but I’m not sure which ones I should try.