OK, great regex and NVDA Dictionary Gurus


 

This is a direct spin off of an earlier topic asking about how to get two single quotes, when used in place of a double quote, announced as quote.  Since the description is that the double single quote characters always surround another string, I used that entire configuration.

I have used these two regexes, and tested same using Python regex syntax on my favorite regex testing site:

''(.*)''         Single quote single quote left paren dot asterisk right paren single quote single quote
as well as the enumerated variant: '{2}(.*)'{2}

Both of these match any string I've tried that is preceded and followed by two single quotes successively.

The replacement string I've used, with either one of those, in the default dictionary:  quote \1 quote

It does not work.  

If I have the string:
''chinchilla''
I hear chinchilla prime.

I know that synths can come into play, but I wouldn't think the synth should matter at all when my replacement should be:
quote chinchilla quote

What am I missing here?
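
For anyone who wants to check the two patterns outside NVDA, here is a minimal sketch using Python's re module. It assumes the dictionary evaluates patterns the same way plain Python does; the variable names are only for illustration.

```python
import re

# The two patterns from the post; both need two consecutive ASCII apostrophes (U+0027)
# on each side of the captured text.
PATTERNS = [r"''(.*)''", r"'{2}(.*)'{2}"]
REPLACEMENT = r"quote \1 quote"

sample = "''chinchilla''"
results = [re.sub(p, REPLACEMENT, sample) for p in PATTERNS]

for p, r in zip(PATTERNS, results):
    print(f"{p!r:20} -> {r}")   # both lines end in: quote chinchilla quote
```

Note that (.*) is greedy, so on a line with several quoted strings it captures everything from the first pair of quotes to the last pair. (.*?) would be the non-greedy alternative if that ever matters.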
--

Brian - Windows 10 Pro, 64-Bit, Version 20H2, Build 19042  

Any idiot can face a crisis. It's the day-to-day living that wears you out.

      ~ Anton Chekhov

 


Tyler Spivey
 

I've never heard NVDA say prime. That must be something with your setup, or you're not sending it the characters you think you are.

I did the following:
1. Created a new NVDA config (Run, nvda -c c:\users\tyler\nvdatest).
2. Read ''chinchilla'' with the default OneCore. It just read chinchilla.
3. Same with espeak.
4. Added the dictionary entry you tried:
Pattern: ''(.*)''; Replacement: quote \1 quote; case: off; Type: Regular expression
Read that again with espeak, and it said quote chinchilla quote as expected.

On 3/25/2021 5:02 PM, Brian Vogel wrote:
What am I missing here?


Luke Davis
 

Brian

Please confirm something.

In your regex, did you use the ASCII single quote character ('), or the actual quote character from the OP's message, which was Unicode code point U+2032? (Represented, in HTML, as &prime;)

From your message here, it seems to be the ASCII apostrophe that you used, not the prime, which if true could be your problem.
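
One way to settle which character is actually in a piece of text is to print its code points. This is a small sketch with a hypothetical helper, describe(), using Python's standard unicodedata module; paste the text in question in place of the sample strings.

```python
import unicodedata

def describe(text):
    """Return (character, code point, Unicode name) for each character in text."""
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

ascii_pair = "''"              # two ASCII apostrophes
prime_pair = "\u2032\u2032"    # two PRIME characters

for ch, code, name in describe(ascii_pair + prime_pair):
    print(code, name)
# U+0027 APOSTROPHE (twice), then U+2032 PRIME (twice)
```

The two look almost identical on screen, but a regex written with one will never match text containing the other.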

To type the prime character on Windows, use alt+2032.

Various information here:

https://www.google.com/search?ie=ISO-8859-1&hl=en&source=hp&q=html+character+entity+prime&btnG=Google+Search&iflsig=AINFCbYAAAAAYF03wc4I2yJdrT0JaT__cBeLB4jkrVec&gbv=1

Luke


 

Luke & Tyler,

Thanks to you both.  And before I start, I did use straight apostrophe/single quote, the key to the right of the semicolon on the keyboard.

I do believe I may have sorta solved my own problem, and I have probably asked this general question before:  Does NVDA dictionary processing "drop out" after a first match is made?

In earlier experiments for someone who wanted any lower or upper case letter followed by apostrophe read as {that letter} prime, I put in this Regex:
([a-zA-Z])'
with a replacement string of \1 prime.

This entry is several items above the one I asked about previously.

Now, why under OneCore David that would give the full word, prime, e.g., Gene prime, when I was using Gene, still remains a mystery.

But I'm now thinking that what I'm experiencing is a match and drop out before the regex I think should match is ever hit.
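
Whichever way the dictionary chains its entries, the earlier entry sees the text first. A quick sketch in plain Python (an assumption about how NVDA evaluates the patterns, not its actual code) shows what that entry alone does to the test string:

```python
import re

earlier_pattern = r"([a-zA-Z])'"      # the letter-then-apostrophe entry
earlier_replacement = r"\1 prime"

after_earlier = re.sub(earlier_pattern, earlier_replacement, "''chinchilla''")
print(after_earlier)                   # ''chinchilla prime'

# The later entry needs two apostrophes on BOTH sides; only one is left at the end.
later_pattern = r"''(.*)''"
print(re.search(later_pattern, after_earlier))   # None
```

The final a followed by an apostrophe is consumed by the earlier entry, which would account for hearing chinchilla prime.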
--

Brian

 


Tyler Spivey
 

On 3/25/2021 5:51 PM, Brian Vogel wrote:
I do believe I may have sorta solved my own problem, and I have probably asked this general question before:  Does NVDA dictionary processing "drop out" after a first match is made?
No. It just keeps going, with the replaced text.
So your first entry is conflicting with the second one.

To figure this out, I replaced 1 with 2, then 2 with 3. When I hit 1, it said 3.
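
The behavior described above can be modeled in a few lines. This is only a sketch: apply_entries() is a hypothetical helper, not NVDA's code, and it assumes each entry is applied in list order to the output of the previous one.

```python
import re

def apply_entries(text, entries):
    """Apply each (pattern, replacement) pair in order, feeding results forward."""
    for pattern, replacement in entries:
        text = re.sub(pattern, replacement, text)
    return text

# The 1 -> 2, 2 -> 3 experiment: the input 1 comes out as 3.
digit_entries = [(r"1", "2"), (r"2", "3")]
print(apply_entries("1", digit_entries))                             # 3

# The two entries from this thread, in both orders.
prime_entry = (r"([a-zA-Z])'", r"\1 prime")
quote_entry = (r"''(.*)''", r"quote \1 quote")
print(apply_entries("''chinchilla''", [prime_entry, quote_entry]))   # ''chinchilla prime'
print(apply_entries("''chinchilla''", [quote_entry, prime_entry]))   # quote chinchilla quote
```

With the prime entry first, the quote entry never gets a chance to match. With the quote entry first, the apostrophes are already gone by the time the prime entry runs.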


 

Tyler,

           Thanks.  Now that you mention this it is, to my embarrassment, bringing back the fact that Quentin Christensen once went through all this with me, in great detail.  He went to great lengths to do what you mention using several bizarre substitutions where what comes out at the end is utterly counterintuitive.

            I still find the fact that this process doesn't drop out upon a match being made strange, to say the least.  I would far rather have to think about ordering my regexes to get the result I want than having one doing a replacement, then having that replacement passed on to all subsequent dictionary entries (regex or not), lather, rinse, repeat.  It is just far too easy to induce a completely unexpected result and to have no idea of why.
--

Brian

 


Giles Turnbull
 

Brian, thanks for continuing to investigate this reg expression angle. Tyler, the ALT+2032 keypad doesn't produce a character that NVDA describes as "prime" ... I just tried it with the NVDA speech viewer running and, with Num Lock on, it produces something that sounds like "latin letter thorn" and with Num Lock off, it produces a soft hyphen. It does not produce a character that NVDA describes as prime.

I used to have a VBA macro used in an Excel spreadsheet that would generate the ALT codes for specific characters typed into an Excel form I'd created, but sadly I don't have that spreadsheet anymore and, since I created it in my sighted days, I didn't give any thought to being able to access it if I lost my sight!

My Belarusian friend's name is Julia Sharova and she lives in Minsk. Most of her posts on Facebook are public, so you're able to view them to see the prime character used in place of a quote mark. As I said in my latest comment on the earlier question that Brian alludes to at the start of this one, the post was originally written in Belarusian and Facebook runs it through its own translation service. I suspect the service may be running it through optical character recognition and reading the quotation mark as a pair of prime symbols, rather than my friend Julia having typed prime symbols in her original post.

I have already asked Julia a question about what characters she typed in one of her original posts, and she hasn't replied yet, so please don't ask her any questions directly. This is the link to one such example:
https:// is. gd/ 7Svlxw

I have added spaces after the slashes and the dots because I've never had much luck posting links in this forum!

Thanks all :)

Giles


 

On Fri, Mar 26, 2021 at 12:47 PM, Giles Turnbull wrote:
My Belarusian friend's name is Julia Sharova and she lives in Minsk. Most of her posts on Facebook are public
-
They do not appear to be so, at least not if you aren't a friend.  I've tried looking at all of 2021 and all of 2020 and all I can get back is an updated profile picture post.

Now, mind you, I am about as Facebook Ignorant as they come, but have looked at other pages before and more posts were visible.  I avoid Facebook (all social media, really) like the plague.
 
--

Brian

 


 

On Fri, Mar 26, 2021 at 12:47 PM, Giles Turnbull wrote:
Brian, thanks for continuing to investigate this reg expression angle
-
For others looking at the regex angle, what follows is a direct copy of a pertinent line that is shown when I "de-space" the tinyurl previously given by Giles:

Somewhere on the subway side ′′ Long Live Belarus!" from the amp.

I intentionally pasted this into Word first, and used a greatly enlarged font size in the Bodoni MT font, since some here may have residual vision and, when it's pasted in the default font used by the Groups.io web interface, it's visually impossible to see exactly what's going on.

It is clear, particularly when selecting character by character, that two prime marks are used, with a leading space before and a trailing space afterward, before the actual text: Long Live Belarus!

The punctuation afterward is an actual single character which is the double quotation mark character.

Giles, try the following regex:
′′(.*)"
with a replacement string of:
quote \1 quote

and see what that gets you.
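
For reference, here is what that entry does to the sample line in plain Python. This sketch assumes the closing mark really is the ASCII double quote (U+0022) and the opening marks are two PRIME characters (U+2032); if the live page uses different characters, the pattern will not match.

```python
import re

PRIME = "\u2032"
sample = f'Somewhere on the subway side {PRIME}{PRIME} Long Live Belarus!" from the amp.'

pattern = f'{PRIME}{PRIME}(.*)"'
result = re.sub(pattern, r"quote \1 quote", sample)
print(result)
# Somewhere on the subway side quote  Long Live Belarus! quote from the amp.
```

The doubled space after the first quote comes from the space that follows the two primes, which ends up inside the captured group.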

I can't for the life of me figure out what would actually create this strange punctuation situation.  It's definitely more work to type three characters, prime prime space, than it is to hit SHIFT along with the quote key for a double quote.  It's even weirder since the double quote is the closer.
 
--

Brian

 


Luke Davis
 

On Fri, 26 Mar 2021, Giles Turnbull wrote:

Tyler, the ALT+2032 keypad doesn't produce a character that NVDA describes as "prime"
That was me, not Tyler. And you're right, it doesn't.

Although U+2032 is its code, I should have said: alt+8242, which is its Windows alt code for some reason. However, it turns out it only works in a handful of places. MS Word, for example. In places that expect only ASCII (Notepad, Notepad++), it only writes the number 2.
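
The likely reason is simply number base: the U+ notation is hexadecimal, while the alt code is entered in decimal. A quick check in Python:

```python
PRIME_HEX = "2032"                     # the digits from U+2032
prime_decimal = int(PRIME_HEX, 16)     # convert hexadecimal to decimal

print(prime_decimal)                   # 8242
print(chr(prime_decimal) == "\u2032")  # True
```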

I was able to type prime prime (.*) prime prime
into Word (using the actual character, of course, and no spaces), and then copy and paste it into the NVDA dictionary add field.
Basically, I created the entry that Brian suggested, with "\1" as the replacement, and found that it works.

For the text from the facebook post, where there are two primes with spaces around them, I came up with this regex:

′′[^′]

I don't know if that will send correctly, so it's:

prime, prime, left square bracket, up arrow (shifted number 6), prime, right square bracket.

For the replacement text, I used "

That makes the facebook message render with a quote replacing the primes if you read by line or page, and still tells you the primes if you move by character.
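
One detail worth knowing about this pattern: [^prime] matches a real character, so the character after the two primes is replaced along with them. The sketch below shows that in plain Python, next to a negative-lookahead variant that leaves the following character alone. The lookahead version is an alternative offered here for comparison, not something tested in NVDA.

```python
import re

PRIME = "\u2032"
sample = f'Somewhere on the subway side {PRIME}{PRIME} Long Live Belarus!" from the amp.'

# Pattern as given: two primes plus one non-prime character, all replaced by ".
consuming = re.sub(f"{PRIME}{PRIME}[^{PRIME}]", '"', sample)

# Variant: two primes NOT followed by a third prime; nothing after them is consumed.
lookahead = re.sub(f"{PRIME}{PRIME}(?!{PRIME})", '"', sample)

print(consuming)   # Somewhere on the subway side "Long Live Belarus!" from the amp.
print(lookahead)   # Somewhere on the subway side " Long Live Belarus!" from the amp.
```

For the Facebook text the consumed character is a space, so the first form happens to tidy up the output.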

Here's the rundown on creating primes: https://www.webnots.com/how-to-type-prime-symbols-with-keyboard/

And here's some explanation of what primes are used for in language (read the full article for more uses): https://en.wikipedia.org/wiki/Prime_(symbol)#Use_in_linguistics

Luke


Luke Davis
 

I should clarify: if you use the second regex I provided, you shouldn't need the first one. At least I don't think so.

On Fri, 26 Mar 2021, Luke Davis wrote:

′′[^′]

I don't know if that will send correctly, so it's:
prime, prime, left square bracket, up arrow (shifted number 6), prime, right square bracket.
For the replacement text, I used "


Giles Turnbull
 

Brian, that regex works! It didn't work with the " mark after the closing parenthesis, as in your original text, but I tried deleting that and it works :)

I can't remember if I said in a comment, but my friend confirmed that she hadn't typed those prime symbols; they are apparently an aberration of Facebook's translation system.

Thanks again for your assistance :)

Giles


 

Giles,

          You're quite welcome.  I came to the conclusion late yesterday, after looking at the page at the tinyURL link you gave, that this was likely some artifact of machine translation.  There is no way in Hades that a living, breathing human being would do this, again and again, particularly when they're using "the usual" punctuation on the right side of the quotation to end it!

           I can't imagine why the closing quote would have made it not work.  If, by some bizarre chance, this regex eventually starts tripping over something else that it shouldn't, I also noticed on the example that there is a space after those two initial prime marks, which I elided into the part of the regex that captures what's in the middle.  You can reintroduce a single space after the two prime marks to make it even more contextually selective if that's ever needed.  It probably won't be, but it's worth knowing.
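
To illustrate the difference the extra space makes, here is a sketch in plain Python. It assumes the entry that now works is the pattern with the closing quote removed, that is, two primes followed by (.*), and it uses the sample line quoted earlier in the thread.

```python
import re

PRIME = "\u2032"
sample = f'Somewhere on the subway side {PRIME}{PRIME} Long Live Belarus!" from the amp.'

# Without the closing quote, (.*) runs to the end of the line in both cases.
without_space = re.sub(f"{PRIME}{PRIME}(.*)", r"quote \1 quote", sample)
with_space = re.sub(f"{PRIME}{PRIME} (.*)", r"quote \1 quote", sample)

print(without_space)
# Somewhere on the subway side quote  Long Live Belarus!" from the amp. quote
print(with_space)
# Somewhere on the subway side quote Long Live Belarus!" from the amp. quote
```

In both versions the second quote is announced at the end of the line rather than where the quotation closes, because nothing in the pattern marks the closing point any more.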
--

Brian