Re: Working with not accessible interfaces. Thoughts and question.


Alexey Zhelezov
 

Dear Brian,


I am sorry I could not clearly explain what I mean; I am new to this area. Probably the best way would be a working example, but the person I am working with is asleep at the moment (he is in the US) and I do not want to publish completely untested stuff.


But I can try to explain. Let us say Windows Calculator were written without accessibility and without keyboard support. While it would be displayed exactly as it is now, it could only be operated by mouse, and NVDA would say "pane" when focusing it.

Then, using the concept, someone with sight can define the coordinates of the digit and operation buttons and of the result string. By binding the numeric and operation keys as gestures, an add-on can click the corresponding buttons. After clicking and allowing some time for the display to update, the result region is grabbed from the screen, passed to OCR and sent to speech. That way such a calculator can be perceived as somewhat accessible.
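
To make the clicking side concrete, here is a minimal sketch in the shape of an NVDA global plugin. The button coordinates and the script are made up purely for illustration, and the OCR step is left as a placeholder; this is not tested code, just the idea.

    # globalPlugins/virtualCalc.py -- illustrative prototype only
    import ctypes
    import globalPluginHandler
    import ui

    # Hypothetical screen coordinates (x, y) of the on-screen buttons.
    BUTTONS = {
        "one": (100, 300),
        "plus": (220, 300),
    }

    MOUSEEVENTF_LEFTDOWN = 0x0002
    MOUSEEVENTF_LEFTUP = 0x0004

    def clickAt(x, y):
        # Move the real mouse pointer and synthesise a left click (Win32).
        ctypes.windll.user32.SetCursorPos(x, y)
        ctypes.windll.user32.mouse_event(MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0)
        ctypes.windll.user32.mouse_event(MOUSEEVENTF_LEFTUP, 0, 0, 0, 0)

    class GlobalPlugin(globalPluginHandler.GlobalPlugin):
        def script_pressOne(self, gesture):
            clickAt(*BUTTONS["one"])
            # Placeholder: grab the result region, OCR it, then speak it.
            ui.message("1")

        __gestures = {
            "kb:NVDA+1": "pressOne",
        }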

More complex elements like list boxes, check boxes, etc. can be defined the same way.
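
One way to picture such a definition (a sketch only; the class and field names are my own invention, not part of any existing framework):

    # A hypothetical description of one virtual control.
    class VirtualControl:
        def __init__(self, name, role, rect, states=None):
            self.name = name    # label to speak, e.g. "Repeat"
            self.role = role    # "button", "checkBox", "listBox", ...
            self.rect = rect    # (left, top, width, height) on screen
            # States that can be read back from pixels, e.g. a check box
            # whose checked state is a known pixel colour.
            self.states = states or {}

    repeatBox = VirtualControl(
        name="Repeat",
        role="checkBox",
        rect=(480, 220, 16, 16),
        states={"checked": (488, 228, 0x000000)},
    )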


The concept is definitely not new. The clicking part has many incarnations, including an add-on for NVDA. The OCR add-on for NVDA can grab and describe an image. What I could not find is a framework that ties all of that together, and my primary question was whether I have simply not found it or it does not exist. Note that I know of at least one big project for JAWS, CakeTalking, which partially uses the concept, at least for recognizing some particular element states from graphics.


My answers to your questions.

The concept can work with particular software only, and so will need an add-on per application. I mean it will not recognize an arbitrary interface automatically, at least not until several experienced mathematicians and programmers have worked together for several years.

I am not aware of any existing standard for this; my question was about whether one exists. From the answers, I guess it does not.

My current prototype is an add-on. But my experience with NVDA is still almost zero, so I do not know what the best approach is to represent virtual controls: maybe virtual buffers, maybe a new handler. Controls should have persistent properties, at least the focus, and recreating them all the time could be extremely heavy on system resources. Everything I could find in NVDA so far is oriented toward underlying external objects; the controls in my concept are completely artificial in that respect. So I do not exclude that some changes in the core will be required for real integration. At the moment I just keep separate global objects and call speakMessage directly. As I have mentioned before, the add-on is not even alpha; it is a prototype.
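
Roughly what I mean by keeping the state myself (again only a sketch; VirtualFocus is my own global object, not an NVDA class, and it deliberately bypasses NVDA's object framework):

    import speech

    class VirtualFocus:
        # Persistent, purely artificial focus over the defined controls;
        # nothing here corresponds to a real window or accessible object.
        def __init__(self, controls):
            self.controls = controls
            self.index = 0

        def moveNext(self):
            self.index = (self.index + 1) % len(self.controls)
            ctrl = self.controls[self.index]
            # Announce directly instead of firing real focus events.
            speech.speakMessage("%s %s" % (ctrl.name, ctrl.role))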


I hope that clarifies at least something.


Regards,

Alexey.




From: nvda@nvda.groups.io <nvda@nvda.groups.io> on behalf of Brian's Mail list account <bglists@...>
Sent: Sunday, February 19, 2017 1:50 PM
To: nvda@nvda.groups.io
Subject: Re: [nvda] Working with not accessible interfaces. Thoughts and question.
 
You see, the reason we are all a bit confused here is that the programs
mentioned in the first post would appear to have no way to tell that the
images are controls, which is what the OCR add-on cannot do either; it can
just detect words and help you move a cursor or mouse pointer to them. If
said program does not trigger an event to allow one to recognise what the
control actually does, then you are still stuck.
 If the routine mentioned has somehow gotten around this impasse, then yes, we
need it, if it can work on other software.
Some of the software you find on many web sites looks, according to the
sighted, Windows-like, with buttons on the screen, but to the screen reader it
may just say "pane". These are the places where some kind of shortcut is
needed. If you are saying that there is a standard here which we are unaware
of and you have cracked it, then fine; but if it only works on the one suite
of software, it is better put into an NVDA add-on for that suite, rather than
offered as a part of NVDA itself. I just cannot get the concept you are
attempting to tell us about to be understandable.
 Brian

bglists@...
Sent via blueyonder.
Please address personal email to:-
briang1@..., putting 'Brian Gaff'
in the display name field.
----- Original Message -----
From: "Alexey Zhelezov" <azslow3@...>
To: <nvda@nvda.groups.io>
Sent: Sunday, February 19, 2017 10:02 AM
Subject: Re: [nvda] Working with not accessible interfaces. Thoughts and
question.


Dear Pranav,


I understand that the generic solution you describe could be useful. But it is
much more technically challenging than the simple approach I have described.

The process can be divided into the following steps: find the interface
elements in the image, recognize each element's state and, if required, OCR
it, then convert the information into an accessible element.
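
As a sketch, the steps line up like this. Every function below is a stub
standing in for a hard problem, not a real implementation:

    class AccessibleElement:
        # Minimal placeholder for whatever a screen reader would expose.
        def __init__(self, role, text, state):
            self.role, self.text, self.state = role, text, state

    def findElements(image):
        # Step 1: locate candidate interface elements in the raw image.
        # This is the hard, generic part; stubbed out here.
        return []

    def recognizeState(element):
        # Step 2a: classify the element's state (checked, selected, ...).
        return "unknown"

    def ocrElement(element):
        # Step 2b: run OCR over the element's pixels for its label/value.
        return ""

    def makeAccessible(image):
        # Step 3: turn everything into objects a screen reader can expose.
        return [AccessibleElement("unknown", ocrElement(e), recognizeState(e))
                for e in findElements(image)]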

In my case the first step is not automatic. For the music software I
mentioned, that makes sense: these interfaces normally have a fixed size and
content. In terms of Windows programming, they are resource-defined dialogs.
Apart from periodic layout changes, for example the removal of the Classic
theme in Windows 8, once the definition is done it can be reused by everyone.
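
For example, a finished definition could be shipped as a plain data file; the
format below is invented purely to show the idea of reuse:

    import json

    # Hypothetical shared definition for one fixed-size dialog.
    definition = {
        "application": "SomeMusicApp.exe",
        "dialog": "Preferences",
        "controls": [
            {"name": "OK", "role": "button", "rect": [400, 360, 60, 24]},
            {"name": "Loop", "role": "checkBox", "rect": [40, 120, 16, 16]},
        ],
    }

    with open("SomeMusicApp.Preferences.json", "w") as f:
        json.dump(definition, f, indent=2)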

What you describe assumes the first step is automatic and generic. Modern
OCRs, using very complicated mathematics, have managed to locate and
recognize text pieces and blocks, and up to some level the layout. Bringing
that up to the level of recognizing interface elements is a huge step. The
problem is that most elements are not distinguishable even for people: for
example, there is no visual difference between several lines of text, a table
with one column and a list box; the first selected item in a list box looks
the same as the header of a table; and so on. People use their own
experience, intuition and sometimes even the documentation to find that
something on the screen is an interactive element. Programs help there by
changing the mouse cursor shape and colors. Still, people skip for example
links in a web page, thinking they are just highlighted static text. Teaching
a computer to be smarter than a human is too challenging for me, at least for
a one-man project done in free time.


Reading static images is different. I think better processing of embedded
images is more of a feature request for the existing OCR add-on for NVDA.


Regards,

Alexey.


________________________________
From: nvda@nvda.groups.io <nvda@nvda.groups.io> on behalf of Pranav Lal
<pranav.lal@...>
Sent: Sunday, February 19, 2017 5:03 AM
To: nvda@nvda.groups.io
Subject: Re: [nvda] Working with not accessible interfaces. Thoughts and
question.


Dear Alexey,



If you are referring to a generic solution that would OCR an inaccessible
interface, parse it into its constituent elements (list boxes, combo boxes,
buttons) and allow NVDA to interact with it, then yes, this is something we
need. I am not a user of music software and therefore cannot comment on that,
but for argument's sake take a program like Google Earth. The menus may work,
but what about grabbing the text on the screen? Similarly, take images
embedded in Microsoft Word documents. I can extract them from a file but
cannot read the images in context.



In addition, consider mind-mapping programs such as FreeMind.



Pranav





