Identifying capital letters

Giles Turnbull
 

Hi all,

I am well aware of the NVDA settings for indicating capital letters with beeps, by saying "cap" before capitals, or with a pitch change. What I really want is a way to use my preferred option (beep for capitals) during read-all. The capital indications only seem to be given when reading character by character.

I'm writing up my 6,000-word MA degree dissertation and my supervisor noticed there were a few instances where I'd added some text at the start of an existing sentence and forgotten to turn the old first letter to lower case. The only way I could detect that would be by reading the 34,869 characters one by one and listening for the beep. That is not an option!

Is there any way to get the capital indication to be active during read-all?

Thanks,

Giles

Chris Mullins
 

Hi Giles

You could try editing your text in an editor that supports regular expression search, such as EdSharp. You could set a pattern that matches capital letters not preceded by a period followed by one or more space and/or line break characters. I'm sorry I don't have the know-how to give you the pattern, but I'm sure there are RE experts on the list who could provide it for you.

 

Good luck

Chris

 

From: nvda@nvda.groups.io [mailto:nvda@nvda.groups.io] On Behalf Of Giles Turnbull
Sent: 7 September 2019 10:20
To: nvda@nvda.groups.io
Subject: [nvda] Identifying capital letters

 


Luke Davis
 

Giles, unfortunately there is no such functionality in NVDA.
An add-on for this could be written, but to my knowledge at least, none has been.

As Chris suggested, however, if you have an editor which supports regular expression searches, you may be able to find (most of) the problem areas.

This regexp:

[\.\?!] +[a-z]|[a-zA-Z0-9,\+\)\]\"\'\`-] [A-Z]|^ *[a-z]

It finds every capitalization error in the following test text, with no false positives, using Notepad++ with case-sensitive searching (you must use case-sensitive searching, of course).

This is the first line, and shouldn't be found.
This Is the second line. It should be targeted because of "is".
This is the third, And should be captured like the 2nd! twice!
This fourth line shouldn't be found.
but this 5th line should.

The regexp above only works assuming certain sentence-ending characters and word boundary characters. Character sets can break it, different punctuation styles can break it, almost anything can break it. There are probably more robust ways to write it, using more agnostic word boundary characters and such. I am in a hurry, so I will leave improving it to Brian and others, but it's a start, I think.
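For anyone who finds the one-line pattern hard to parse, here is a minimal Python sketch that splits it into its three alternatives and reports which one fired on each line of the test text above. The rule names and the `find_suspects` helper are illustrative labels, not part of Luke's post, and Python's `re` module stands in for the Notepad++ search, so behaviour may differ slightly in other editors.

```python
import re

# The three alternatives of the pattern above, split out and named.
# The names are illustrative only.
RULES = [
    ("lowercase after sentence end", re.compile(r"[\.\?!] +[a-z]")),
    ("capital in mid-sentence",
     re.compile(r"""[a-zA-Z0-9,\+\)\]\"\'\`-] [A-Z]""")),
    ("lowercase at start of line", re.compile(r"^ *[a-z]")),
]

SAMPLE = """\
This is the first line, and shouldn't be found.
This Is the second line. It should be targeted because of "is".
This is the third, And should be captured like the 2nd! twice!
This fourth line shouldn't be found.
but this 5th line should."""


def find_suspects(text):
    """Return (line number, rule name, matched text) for every hit."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in RULES:
            for match in pattern.finditer(line):
                hits.append((number, name, match.group()))
    return hits


for number, name, matched in find_suspects(SAMPLE):
    print(f"line {number}: {name}: {matched!r}")
```

Python's `re` is case sensitive by default, which matches the requirement stated above. Lines 2, 3 and 5 are reported, with line 3 reported twice.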

Good luck.

Luke

Giles Turnbull
 

Given the suggestions about regular expression searches, would it be possible to write a regular expression for NVDA's speech dictionary so that a capital letter anywhere gets announced, for example, Home as "capital Home," or hOme as "h capital Ome"?

That would enable me to listen from start to end, something I'm going to need to do before submission, and hear any capitals as I go.

Thanks for the suggested regular expressions, Luke :)

Giles

 

Giles,

           As both a regular expression geek, and someone who's spent a lot of time in the halls of academe over the years, I would really suggest that you hire yourself a proofreader.   And I'm not saying this because you're blind, either.

           An experienced proofreader can pick up on all sorts of stuff, very quickly, without knowing anything much at all about the subject matter as that's not what their focus is.  If you are doing a master's thesis it is well worth the money to hire a proofreader if at all possible.

           The only way I have ever been able to proof my own material is to set it aside for some time after writing it before revisiting it, as just "knowing what I meant to put there" often results in my seeing what I had intended rather than what is actually there.   That's a luxury, time-wise, that one does not have with theses in general.

--

Brian - Windows 10 Pro, 64-Bit, Version 1903, Build 18362  

The color of truth is grey.

           ~ André Gide

 

 

 

Giles,

        Am I safe in presuming you're using MS-Word as your word processor?   If so, we should open up a topic in the Chat Subgroup, as there are some tricks you could possibly use with Word's own find-and-replace function (whether you're replacing or not) that would let you find embedded capitals mid-word or capitalized words mid-sentence.   It wouldn't be able to distinguish when proper names are involved mid-sentence, though.
--

Brian - Windows 10 Pro, 64-Bit, Version 1903, Build 18362  


Gene
 

Not as far as I know.  I suggested such an option a while ago, but it hasn't been implemented, again, as far as I know.  I don't follow new features closely, though I read announcements about new versions.
 
Gene


Giles Turnbull
 

I agree, Brian ... mind you, my dissertation is in poetry so doesn't necessarily follow the same grammar rules as prose. I do have a friendly proofreader and will be running it by her before submission, but I wanted to try to iron out those errant capitals if I could.

The regular expression search is usable, since when it flags a capital that is not preceded by a full stop, I know whether it is correct to do so :)

Giles

 

Giles,

            Yes, that regex pattern is definitely "the bomb" for finding the sort of issue you wish to find.  It took me a while to parse it apart to understand it in its entirety; I'm a bit rusty these days.   Now that Windows 10 has PowerShell, you could even use the pattern with its Select-String cmdlet, the counterpart of the Unix grep command (whose name comes from "globally search for a regular expression and print"), to list the specific lines where the issue is found so that you can easily search for those lines.   The thing I don't know is whether you can actually run it against a .docx document or if you'd need to save a plain text or rich text format version of the file for processing.

             In any case, even if there is an intermediate step, you can use this method to quickly identify the vast majority of instances of uncapitalized sentence starts, capital letters embedded inside words, or capital letters on words within sentences where they'd typically not be.
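As a rough illustration of that grep-style workflow, the sketch below prints the numbered lines of a plain-text file that match Luke's pattern. It assumes the dissertation has been saved from the word processor as UTF-8 plain text, because a .docx file is a ZIP archive of XML rather than lines of text. The `suspect_lines` helper and the example file name are hypothetical.

```python
import re
import sys

# Luke's pattern from earlier in the thread, unchanged.
PATTERN = re.compile(
    r"""[\.\?!] +[a-z]|[a-zA-Z0-9,\+\)\]\"\'\`-] [A-Z]|^ *[a-z]"""
)


def suspect_lines(path, encoding="utf-8"):
    """Yield (line number, text) for each line containing a suspect capital."""
    with open(path, encoding=encoding) as handle:
        for number, line in enumerate(handle, start=1):
            line = line.rstrip("\n")
            if PATTERN.search(line):
                yield number, line


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python find_caps.py dissertation.txt  (hypothetical names)
    for number, line in suspect_lines(sys.argv[1]):
        print(f"{number}: {line}")
```

The output is one short line per hit, which can be read with a screen reader and then looked up in the original document.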
--

Brian - Windows 10 Pro, 64-Bit, Version 1903, Build 18362  


Ricardo Leonarczyk
 

Hi Giles,

Some time ago I needed something similar to what you're asking, for a very specific situation. So I put some entries in the dictionary for doing that, as you suggested in your message.
Even though for your situation I would stick to using the regex in an editor as others have already pointed out, I'm sharing the .dic file with the regular expressions, in case it is useful for you or someone else.
The link: https://drive.google.com/open?id=1TbJPRK37giS2xAAUSiC8WWY2QKaFDf8A

As it stands, it's set up the way that worked best for me at the time, so it may need some tweaking to suit your taste. It's also not the best way to do it, but it seems to work most of the time.

The way it is now, when NVDA finds an individual uppercase letter, it first says "cap" and then speaks the letter, without breaking the word (inserting a space). When it finds a sequence of uppercase characters, it says "All cap" and then speaks the sequence of letters as a whole word.

So, for example, "Test" would become cap test, and "TEST" would be spoken as allcap test.
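The contents of the .dic file are not reproduced in this thread, so the following is only a hedged sketch of the general idea in Python: one regular expression with two capture groups, one for runs of capitals and one for single capitals, and a replacement that inserts a spoken marker. The `mark_capitals` function and the exact marker words are assumptions, and unlike Ricardo's entries this version puts a space before the marker, following the "h capital Ome" style Giles asked for.

```python
import re

# Uppercase ASCII plus the accented Latin-1 capitals. The multiplication
# sign (U+00D7) sits inside the plain range À-Þ, so it is skipped here.
UPPER = "A-ZÀ-ÖØ-Þ"

# Group 1: a run of two or more capitals. Group 2: a single capital.
CAPS = re.compile(rf"([{UPPER}]{{2,}})|([{UPPER}])")


def mark_capitals(text: str) -> str:
    """Insert 'allcap' or 'cap' before capitals so they are heard in read-all."""
    def announce(match):
        if match.group(1):
            return f" allcap {match.group(1)}"
        return f" cap {match.group(2)}"

    marked = CAPS.sub(announce, text)
    # Collapse the extra spaces (and any line breaks) introduced above.
    return " ".join(marked.split())


print(mark_capitals("Test"))   # cap Test
print(mark_capitals("TEST"))   # allcap TEST
print(mark_capitals("hOme"))   # h cap Ome
```

In an NVDA speech dictionary the same effect would need separate regular expression entries rather than a callback, so treat this as a way to experiment with the patterns outside NVDA first.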

On Sat, 7 Sep 2019 at 10:32, Giles Turnbull <giles.turnbull@...> wrote:


Giles Turnbull
 

Thanks so much for that dictionary file, Ricardo. That sounds like exactly what I was hoping for ... a means of listening to the text from start to finish and hearing when there is a capital letter :)

I'll try it out tomorrow :)

Giles

 

Ricardo,

           Might I ask what language you were using?   I, of course, recognize the a-z and A-Z sequences, but have no idea which Unicode characters are covered by the accented range, À-Þ, or its lowercase counterpart.

Yet another demonstration of how a strategically formulated, but very short, regular expression with capture groups can be used.

--

Brian - Windows 10 Pro, 64-Bit, Version 1903, Build 18362  


Ricardo Leonarczyk
 

I used À-Þ to define the range for some accented Latin characters in Unicode. It should cover most Romance languages such as Portuguese (my mother tongue) or Spanish. I'm not sure if it's the correct or standard way to do this, but it worked for my case.
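To see exactly which characters that range covers, a short Python check can list every code point from À (U+00C0) to Þ (U+00DE) and pick out any that are not uppercase letters. The variable names are illustrative only.

```python
# Every code point in the range À-Þ (U+00C0 to U+00DE), in order.
RANGE_CHARS = [chr(cp) for cp in range(ord("À"), ord("Þ") + 1)]

# Anything in the range that Python does not classify as an uppercase letter.
NOT_UPPERCASE = [ch for ch in RANGE_CHARS if not ch.isupper()]

print("".join(RANGE_CHARS))
print(len(RANGE_CHARS), NOT_UPPERCASE)   # 31 ['×']
```

The range holds 30 accented capitals plus the multiplication sign ×, so a stricter character class would be À-ÖØ-Þ. That rarely matters for ordinary prose, but it explains an occasional stray "cap" on a × symbol.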

On Mon, 9 Sep 2019 at 11:59, Brian Vogel <britechguy@...> wrote:
