OCR: Interacting with items from recognized UI


Luke Robinett <blindgroupsluke@...>
 

Hi,

 

For inaccessible UIs I’ll often run NVDA OCR on the UI and then press enter or space bar on the UI controls I find in the recognized text such as tabs, menus and buttons. I find this only works in some programs and other times seems to be ignored. I believe the issue in some cases is that some UIs only respond when an actual mouse is clicked on a control. Is there a way to have the mouse cursor move to the UI element corresponding with its label in the OCR viewer and produce a click on that element? If not, is this something I could submit on GitHub as a feature request? It would make the already helpful OCR feature even more useful.

 

Thanks,

Luke

 


Sascha Cowley
 

You can use NVDA's normal mouse control functions to do this. See section 5.7, "Navigating with the Mouse", of the NVDA User Guide.


Luke Robinett <blindgroupsluke@...>
 

Hi Sascha,

I will review that section of the manual, but I just want to make sure it was clear what I was asking about. I know I can initiate a mouse click from the keyboard using the NVDA key and the numpad slash key. I also have the Golden Cursor add-on, so I can move the mouse cursor around, set hot spots, etc. I’m still not sure how I can get from the text readout of an OCR recognition result to the screen location of one of the controls it recognized. For example, say I take OCR of an application’s interface and, when I’m reviewing the OCR output, one of the items I come across says “options.” How can I tell NVDA to go to where it found that text on the UI and initiate a mouse click there? Thanks again.



Sascha Cowley
 

Per section 5.7 of the user guide, these are the keyboard commands:

| Name | Desktop key | Laptop key | Touch | Description |
|---|---|---|---|---|
| Left mouse button click | numpadDivide | NVDA+[ | none | Clicks the left mouse button once. The common double click can be performed by pressing this key twice in quick succession |
| Move mouse to current navigator object | NVDA+numpadDivide | NVDA+shift+m | none | Moves the mouse to the location of the current navigator object and review cursor |

You can read the OCR result as normal, then, when you are at the item you want, use these shortcuts.


Luke Robinett <blindgroupsluke@...>
 

Yes, I’m very familiar with those commands, and I’m still not sure that exactly what I’m asking about is being understood. I think the messages I’ve already written on the subject are pretty clear, and I do appreciate you taking a shot at it, but I’m not going to keep explaining it and risk muddying the waters of my original question. We will see if we get some other responses. Thanks again.



Lukasz Golonka
 

On Sat, 9 Jan 2021 17:02:00 -0800
"Luke Robinett" <blindgroupsluke@gmail.com> wrote:

Hi,

For inaccessible UIs I'll often run NVDA OCR on the UI and then press enter or space bar on the UI controls I find in the recognized text such as tabs, menus and buttons. I find this only works in some programs and other times seems to be ignored. I believe the issue in some cases is that some UIs only respond when an actual mouse is clicked on a control.

Your assertion is incorrect: NVDA already activates these controls by
moving the mouse to their screen location and performing a left click.

While I cannot be sure why it works only for some controls for you, the
most likely explanation is that Windows 10 OCR reports wrong screen
coordinates for the recognized text.

You can also try routing the mouse to the given control manually by
pressing NVDA+numpad slash, as the position of the caret inside the
recognition results corresponds to the position of the control which
contains that text on the screen.
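
To illustrate the idea of a caret position in the recognition results corresponding to a screen position, here is a minimal Python sketch. It is not NVDA's actual code: the `Word` record, `build_index` and `point_for_offset` are hypothetical names, and it simply assumes the OCR engine returns each word with a bounding box in screen pixels.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One recognized word and its bounding box in screen pixels (hypothetical layout)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def build_index(words):
    """Join the words with single spaces and record the text offset where each word starts."""
    starts, pos = [], 0
    for word in words:
        starts.append(pos)
        pos += len(word.text) + 1  # +1 for the separating space
    return " ".join(w.text for w in words), starts


def point_for_offset(words, starts, offset):
    """Return the center (x, y) of the word that contains the given text offset."""
    if not words:
        raise ValueError("no words recognized")
    # Find the last word whose start offset is <= the caret offset.
    i = max(bisect_right(starts, offset) - 1, 0)
    w = words[i]
    return w.left + w.width // 2, w.top + w.height // 2


# Made-up coordinates for two recognized words.
words = [Word("File", 10, 5, 40, 20), Word("Options", 60, 5, 70, 20)]
text, starts = build_index(words)
print(point_for_offset(words, starts, text.index("Options")))
```

If the engine reports a wrong bounding box for a word, the computed point lands in the wrong place, which would match the symptom of a click appearing to be ignored.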

--
Regards
Lukasz


Kara Goldfinch
 

Hi Luke,

You can move the mouse to where the virtual focus is by pressing NVDA+numpad slash, then you can press numpad slash to click it.

I've also run into the same issue as you where enter sometimes doesn't work, so here are a couple of things I also try.

After moving the mouse to the thing you want to click, press shift+numpad slash twice with a short pause in between. This locks and unlocks the left mouse button. This could help if the app expects the button to be held down longer than usual.

Another thing to bear in mind is that sometimes the text that OCR recognises might just be the control's label, and the actual control could be above, below or to the side of it. I've noticed this in Omnisphere, for example. The only way round this is to randomly click around the label with Golden Cursor, possibly listening for a different colour/brightness level depending on how you have it set up. I usually get tired of this pretty quickly, however.
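
The "click around the label" approach could in principle be made less random. The following is only a rough Python sketch under stated assumptions: `probe_points` is a hypothetical helper, not part of NVDA or Golden Cursor, and the 20-pixel gap and 1920x1080 screen size are arbitrary defaults. Given a label's bounding box, it lists the label's center and then one point to the right, left, below and above it.

```python
def probe_points(left, top, width, height, gap=20, screen=(1920, 1080)):
    """Return candidate click points around a label box.

    The label's own center comes first, then points to the right, left,
    below and above, each `gap` pixels beyond the box edge. Points that
    fall outside the screen are dropped.
    """
    cx, cy = left + width // 2, top + height // 2
    candidates = [
        (cx, cy),                  # the label itself
        (left + width + gap, cy),  # right of the label
        (left - gap, cy),          # left of the label
        (cx, top + height + gap),  # below the label
        (cx, top - gap),           # above the label
    ]
    max_x, max_y = screen
    return [(x, y) for x, y in candidates if 0 <= x < max_x and 0 <= y < max_y]


# A label near the top of the screen: the point above it is off-screen and is dropped.
print(probe_points(60, 5, 70, 20))
```

Each returned point would still have to be tried by hand (or saved as a hotspot), since nothing here can tell which point actually hits the control.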

I hope this helps any.


Kara


Luke Robinett <blindgroupsluke@...>
 

Hi Kara,

Thanks for your reply. I think what I wasn’t clear on is that hitting enter or space in the OCR viewer actually generates a left click on that item, something I was able to confirm. That clears up a lot of my confusion.
I think you might be right that the likely culprit is simply that the label isn’t actually part of the UI control, so clicking it doesn’t do anything. Until we get more sophisticated GUI recognition tools in NVDA, my best bet is probably just to have my wife line the mouse cursor up with the controls I need so I can capture some Golden Cursor hotspots for future use.

Thanks again to everybody who replied,
Luke
