Re: new addon: NVDA Advanced OCR.


On Wed, Dec 8, 2021 at 12:13 PM, Joseph Lee wrote:
For PDF files, provided that they are generated with accessibility in mind,

I haven't seen any PDF originally created as PDF that's not accessible, and fully accessible at that, with the possible exception of missing Alt Text for images.

That being said, I always presume these OCR functions are going to need to exist for a very long time, simply because there are so many image-scanned PDF files that were created long before OCR became a standard part of scanning (or even existed).

I'll tell you what I told several of my former clients who were grad students, and who were routinely handed ancient image-scanned PDFs that had been in use for years to decades:  run them through OCR, save the text layer with the file itself, then try like the dickens to get whoever maintains the archive from which the original was pulled to ditch that original and replace it with the OCRed version.

It's really not anyone's fault that inaccessible PDFs exist that were scanned in "another era."  But those documents can easily be made accessible by running OCR in a way that saves the result as part of the source file.  Digital archivists should be willing to replace inaccessible versions with accessible ones after just the slightest bit of vetting of the result.  And if they don't want to accept an OCRed version from someone else, a system needs to be in place for reporting image-scanned PDFs so that staff can OCR them permanently, and promptly.  This isn't time intensive when you're working on demand, rather than going on a search and destroy mission for every PDF that might be image scanned.

Brian - Windows 10, 64-Bit, Version 21H1, Build 19043  

The difference between a top-flight creative man and the hack is his ability to express powerful meanings indirectly.

         ~ Vance Packard

