First, I changed the subject line because this writing has nothing to do with the “new addon” that has been hot on the press this day.
Well, I guess I’ll weigh in and you can tell me where I fall on the “accessibility spectrum”.
I read a lot of law school textbooks, which are “intensively sectioned” into chapters, subchapters, and cases and notes within each of these;
I have received both very “accessible and usable” books where all the headings are marked and I can see that sections are within one another and so on; and
I have received “bad books” where there are no headings, no subsection labels, nor case names in headings, so a “search” will only work if I know specifically what I am looking for.
However, if I want to get the lay of the land of a textbook of this nature, and it isn’t marked up “usably” then I could be screwed;
Please let me know your feedback on this;
It seems in the latter case, I have to take sometimes hours to actually put all the headers in the book in order to make a book of this nature “interactive”.
Thank you very much for allowing me to speak on this topic.
firstname.lastname@example.org <email@example.com> On Behalf Of
Sent: Wednesday, December 8, 2021 5:17 PM
Subject: Re: [nvda] new addon: NVDA Advanced OCR.
I wasn’t talking about whether there are features such as headings or if it’s just text. I have no objection to documents that read as just text. I was considering documents that have columns on the pages and where the columns aren’t properly placed. I was saying, perhaps not clearly, that I don’t know how many PDF documents have columns that cause problems. The PDF documents I’ve read are generally accessible as you are describing, but I haven’t worked with enough to generalize.
Sent: Wednesday, December 08, 2021 3:52 PM
Subject: Re: [nvda] new addon: NVDA Advanced OCR.
On Wed, Dec 8, 2021 at 03:46 PM, Gene wrote:
But without experience of a large number of PDF documents, I wouldn’t assume that.
Gene, I can say, with complete honesty, that I cannot count the number of PDF documents I've dealt with, and in the context of a screen reader. The general hierarchy of accessibility has been:
1. Image Scanned - Inaccessible unless OCRed, and if OCRed, much depends on when as far as how well that works.
2. OCR processed by something designed to do so - If it's a fairly modern OCR engine, things like columnar text are generally handled with very good flow. If it was an early OCR engine, not so much. Document will not have, to quote Mr. Moxley, "proper heading structure, table structures (with appropriately marked headers), accessible links, alt text etc." OCR engines are generally not that sophisticated, though most can detect tables these days and set them up as such.
3. Created as PDF in a PDF Editor or MS-Word: 100% basic accessibility, but not necessarily "prettified" with all of the above noted features. I've created quite a few tutorials in MS-Word that I've then saved as PDF that are one to maybe three pages long, and step by step, and I certainly never go to that level of elaboration because of what the content is and how it's to be accessed. People creating things like church bulletins, flyers, and lots of other simple documents that are often of a "read once and then done" nature are unlikely to ever do so, either.
4. Created as PDF in a PDF Editor and of significant length, and intended for publication and/or a long archival life: 100% maximally accessible with all the features Mr. Moxley noted.
The fact of the matter is that I don't disagree with him, one bit, about what needs to be done to create a maximally accessible PDF if one is creating it from scratch and it is of any significant length. What I do disagree with is that this is necessary for the vast majority of very short PDFs out there that may or may not have been created as such.
When it comes to PDFs, and particularly PDFs of unknown origin, it's completely unrealistic to call them inaccessible if they don't have the prettification. I have scanned, and with OCR scanning at the time of scan, things like owner's manuals and service manuals that are hundreds of pages long. They will not ever have all of the prettification because it's just not possible, but their text content is complete, and searchable. That's accessible, and in most instances way more than just minimally accessible.
It's way faster for me to find what I'm looking for in these scanned PDFs because they are searchable than it is to find it using the source material, as often certain bits of information are put where you really wouldn't expect to find them and there's nothing in the table of contents nor index or indices that would indicate that. But if you know the term you're looking for, you can blaze through hundreds of pages very quickly using search functionality. That's accessible whether you're doing this the sighted way or using a screen reader to do the same thing. It may not be as nice as it would be had the source material been created as PDF, but there will never come a time where every PDF started out life that way nor where whatever was used to OCR it could possibly produce something with all the features characteristic of a PDF born as PDF.
There's basic accessibility and publisher-layout-quality accessibility. They're not the same thing. We should, of course, constantly encourage the use of publisher-layout-quality with regard to accessibility where such is warranted. My 2-page flyer for next week's picnic, as a fictional example, would not be one of those times. If it's entirely readable, in the expected order, that's good enough.
The perfect should never be the enemy of the good.
Brian - Windows 10, 64-Bit, Version 21H1, Build 19043
The difference between a top-flight creative man and the hack is his ability to express powerful meanings indirectly.
~ Vance Packard