Re: Require help to convert image PDFs into text PDF documents


Luke Davis
 

If they really are image PDFs, then you will never get what you want.

If you use OCR software on them (others can suggest what OCR software is best these days; I don't use OCR packages on Windows), you can get some version of the text out of these.
For example, if I process an image PDF through KNFB Reader, or one of the network-based recognition apps, I can get somewhere between 70 percent and 95 percent accurate recognition of the contents.
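
To make those accuracy figures a bit more concrete, here is a rough sketch of how you could score OCR output against a known-good copy of the same text. This is only an illustration: the function name and the sample strings are made up, and character-level matching with Python's difflib is my assumption about what "accurate" means here, not how any OCR product reports it.

```python
from difflib import SequenceMatcher


def recognition_accuracy(reference: str, recognized: str) -> float:
    """Return a rough character-level match percentage between two texts.

    Hypothetical helper for illustration; not part of any OCR package.
    """
    # Collapse runs of whitespace so line-wrapping differences are not
    # counted as recognition errors.
    ref = " ".join(reference.split())
    rec = " ".join(recognized.split())
    if not ref and not rec:
        return 100.0
    # ratio() is 2*M/T: M is the number of matched characters, T is the
    # combined length of both strings.
    matcher = SequenceMatcher(None, ref, rec, autojunk=False)
    return 100.0 * matcher.ratio()


# Made-up sample: typical OCR confusions (m -> rn, l -> 1).
reference = "Join the mailing list to receive all messages."
recognized = "Join the rnailing 1ist to receive a11 messages."
print(f"{recognition_accuracy(reference, recognized):.1f} percent")
```

A score like this only tells you how many characters came through. It says nothing about whether headings, links, or reading order survived.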

NVDA's built-in OCR feature (NVDA+R) may be able to read these one page at a time, although it's not designed for full document recognition.

However, you won't get links, headings, and the like. Software simply isn't smart enough to figure out which pictures of text are supposed to become those formatting and connective elements.

Also, while I could be mistaken about this, I don't believe an image PDF can have links.

Lastly, this is not an NVDA problem. You probably want to take this conversation to a general Windows list or the chat subgroup.

Luke

"I have no idea what I'm supposed to do. I only know what I can do." -James T. Kirk
