OCRing PDFs using NVDA?


Dan Beaver
 

Hi,

Does the OCR add-on OCR PDFs? Or do I need to find another OCR package to do this?

Thanks.

Dan Beaver


 

Dan,

         I cannot speak to NVDA's OCR capabilities, but I can suggest a stand-alone alternative.  A company with the unfortunate name, these days anyway, of Tracker Software <https://tracker-software.com/> offers a free PDF viewer called PDF-XChange Viewer.  When you land on the Tracker home page, if you do an NVDA search for "free PDF" you'll land straight on the download link for PDF-XChange Viewer.

         It has OCR capability for multiple languages built in to the free version, and it has performed very well.  I have used it to process image-scanned documents up to several hundred pages in length, some with not-so-great original scans, and what it can OCR with its default settings is pretty amazing.  Once an image-scanned document that you know contains a lot of text has been opened in the program, simply press CTRL+SHIFT+C to run OCR on the file.  After the OCR is done, save the file (standard Windows File menu commands) and you have it with a permanent text layer added.

Brian


Mallard <mallard@...>
 

Hello Brian,

I've downloaded the portable version of the programme, plus all the OCR language files.

I'm OCR-ing a long document right now, but I must say the programme is not all that accessible.

For example, if I press Alt or Alt+F to go to the menus, my son sees the menus on screen, but NVDA doesn't read anything, either in braille or in voice.

By exploring the screen with NVDA+7 (numpad 7, of course), I reach the options and, if I click on them, something happens, but not always.

I can open a file, but can't see the options, for example.

If I move around with the mouse, on the other hand, things are shown and read out correctly.

Ctrl+Shift+C doesn't work here. To start OCR-ing I need to use the mouse.

Am I doing something wrong?

Using Windows 7 and latest NVDA Next snapshot.

Ciao,
Ollie

On 08/03/2016 17:29, Brian Vogel wrote:
I cannot speak to NVDA's OCR capabilities, but I can suggest a stand-alone alternative. [...]


 

Ollie,

            I've used this program primarily with JAWS.  It is not accessible in its entirety, but it is accessible for performing OCR and saving the files.  The command CTRL+SHIFT+C is listed in the NVDA keystrokes documentation as a special-use command for the program Poedit.  If NVDA does not limit that key sequence to when you are actually in Poedit, that could be your problem, in which case you'd have to press NVDA+F2 before CTRL+SHIFT+C in PDF-XChange Viewer to pass the keystroke through.

            I cannot speak to the interaction between NVDA and PDF-XChange Viewer in its entirety.  I'm using NVDA 2016.1, and when I'm in the program CTRL+SHIFT+C brings up the OCR dialog, which NVDA announces as such.  When I use ALT+F to open the File menu and the down arrow to traverse it, nothing is announced, but if I move my mouse pointer over the various menu options, they are.  At first I thought this might be a result of NVDA's default mouse preference of having mouse tracking on, but even with mouse tracking turned off I get nothing announced in the File menu via keyboard navigation.  ALT+F,S will save a copy with the OCR embedded once OCR is complete, even if NVDA does not announce it.

            The only thing I recommend screen reader users employ PDF-XChange Viewer for is the OCR function.  The only things I've taught are opening a document, triggering OCR, and saving a document, and all of those are announced in JAWS.  With NVDA the menu itself is not announced, but the Open and Save dialogs brought up via ALT+F,O and ALT+F,S are.  For reading, stick to Adobe Reader (or your own favorite if you have one).

Brian


Mallard <mallard@...>
 

Brian,

You turned on the light in my sleepy brain... lol

Of course, I should have thought of it... I have to pass the command through to the application...

In any case, my son clicked on the right icon with the mouse, and the programme performed a practically perfect OCR of a huge file - over 700 pages.

I didn't mean to try with such an enormous file, but it was the only one I had at hand that didn't have text in it, and I must say I'm impressed.

I normally use FineReader 12 to recognise image-only files, but that isn't free. This programme is free, and it's a great asset.
I'm sure we could write to the devs to see if something can be done about readability of the menus. Because if I use a mouse, the various options are read out by NVDA, so I think something could be done.

Or perhaps someone could develop an add-on?

I have no knowledge of coding, so I can't tell whether it's feasible or not, but it would be interesting to find out.

Meanwhile, thanks for pointing us to this tool.
Ciao,
Ollie


 

Ollie,

          Glad it worked for you and you're happy with the result.  I'm sure a document of that size took several hours to process.  I think the largest I've ever done is a bit over 400 pages, and most are significantly smaller than that.  When you get into large PDF files, even OCR-ed ones, you pretty much have to be able to word search them to make finding that "needle in a haystack" a reasonable task.

          I still can't explain why CTRL+SHIFT+C is not working for you without using pass-through as it works just fine for me when I'm using NVDA 2016.1 and PDF-XChange Viewer together to do an OCR that way.

          I have suggested that others contact Tracker regarding accessibility issues.  While I can and do use screen readers as part of testing things out, and can say what works or doesn't work for "routine stuff," there's nothing like a skilled screen reader user who has to do "the deep stuff" to uncover, and even to explain, accessibility issues I will never encounter.

Brian


Mallard <mallard@...>
 

Brian,

The issue with Ctrl+Shift+C was simply my fault, as I mentioned previously. It simply didn't cross my mind to try and press NVDA+F2 before the key combination...
Now it works fine here too.

The document I processed took several hours indeed, but I went off cooking and things, so I didn't mind...

I'm going to study the programme a bit more in depth, to see if there are workarounds to help others make good use of it, at least as much as possible.

If the devs are willing to install NVDA (given that it's free), we might be able to make them understand what is not working in menu viewing...
But I'm probably just too optimistic - I'm spoilt by so many great devs on the Android eyes-free list... (smile).

Ciao,
Ollie

On 09/03/2016 15:30, Brian Vogel wrote:
Glad it worked for you and you're happy with the result. [...]


 

On Wed, Mar 9, 2016 at 07:44 am, Mallard <mallard@...> wrote:
It simply didn't cross my mind to try and press NVDA+F2 before the key combination...

But, Ollie, what I'm wondering is why you would even need to do that?  I don't, so we've either got different versions of NVDA (I'm on 2016.1) that behave differently or you're using add-ons that I'm not that are having an impact, etc.   It's just a curiosity to me regarding what's actually capturing CTRL+SHIFT+C on your end ahead of PDF-XChange Viewer.  It doesn't seem to be NVDA 2016.1 with the Eloquence Synthesizer and NoBeepsSpeechMode add-ons active on my end.

These little mysteries are like fun puzzles (at least sometimes they are).

Brian