How to spell out Roman numerals


Janet Brandly
 

Hello all,

 

Would someone please tell me how to get NVDA to spell out Roman numerals instead of automatically speaking them as Arabic numerals? I had a quick look at the symbols pronunciation list of more than 3,000 items and couldn’t find them easily.

 

Thanks,

 

Janet


Rui Fontes
 

Normally that kind of reading is controlled by the synth and not by NVDA...


Rui Fontes




 

Janet,

         Depending on just how elaborate you want to get with this, it could be achieved in the speech dictionary using regular expression matching based on characters that can only be used in a Roman numeral.  But given how long they can get, and the rules regarding what can come before what, there's likely only a "quick and dirty" solution rather than a perfect one.

         I am presuming you want the Roman numeral (RN) seven to read as V I I or RN 900 to be read as M C M.  Is that correct and, if so, is there some expected range of the actual Roman notation as far as lowest and highest numbers involved?   
--

Brian - Windows 10 Pro, 64-Bit, Version 2004, Build 19041  

It’s hard waking up and realizing it’s not always black and white.

     ~ Kelley Boorn

 


Janet Brandly
 

Hello Brian and all,

 

I would just need NVDA to spell out Roman numerals for Arabic numbers 1 through 10.

 

Thanks,

 

Janet

 


 


 

Janet,

          Just to be crystal clear, we're talking about the following Roman numerals. I will put spaces between the characters for the purposes of this post so that they get read character by character:
I one
I I two
I I I three
I V four
V five
V I six
V I I seven
V I I I  eight  
I X  nine
X ten

You don't want or need something like X V I read for sixteen?

It's very easy to come up with a couple of short regular expressions that can handle the Roman numerals for the numbers one through ten.  It's a lot more difficult if you want them to handle any Roman numeral, e.g., 2020 written out as M M X X, or 1959 as M C M L I X.

If all you want is one through ten, I'll toss together the regexes and replacement strings and post them for you to add to your speech dictionary.  I will also presume that any of these Roman numerals will have a space before and after it unless it is located at the start or end of a line.
--

Brian - Windows 10 Pro, 64-Bit, Version 2004, Build 19041  


 


Janet Brandly
 

Hi again Brian,

 

That’s right, these are the only Roman numerals I need NVDA to speak. There would always be a space preceding the numeral; however, occasionally (perhaps 5 to 10% of the time) there would be either a letter or an Arabic number directly after the Roman numeral, without a space. I am using these numbers/numerals for things like fracture and cancer staging/grading.

 

Thanks so much for your help,

 

Janet

 


 


 

Janet,

           I believe I understand what you're saying about the character afterward, but would you mind tossing out a couple of examples?  I'm particularly interested in whether the character(s) that can come after the Roman numeral are limited to a select few.

           Also, can the Roman numerals be either upper case or lower case letters for the numeral, or strictly one or the other?
--

Brian - Windows 10 Pro, 64-Bit, Version 2004, Build 19041  


 


Gene
 

This could be made much easier, and possibly adding this ability would be useful to others in different contexts.

You can now tell the dictionary to recognize the item anywhere, as a whole word, or as a regular expression. There should be an option for something like "only exactly as written." In other words, if I wrote a Roman numeral, placed a space before and after it, and made it case sensitive, that would minimize the number of times it would be recognized by the dictionary. The system wouldn't be perfect. You would have to avoid having the capital letter I alone recognized as something to be spoken as a numeral. But the listener could know without any trouble when it is meant as a Roman numeral. I don't think there would be any other such cases. It may be that adding the "exactly as written" option wouldn't help many users, or that it would require too much work to implement to be justified, given all the other things developers might work on and the time-versus-benefit ratio. But perhaps this would be easy and would benefit many people. Maybe it would benefit people in contexts other than Roman numerals that I don't know about. Whatever the case, this might be something worth discussing.

Gene



Janet Brandly
 

Hi Brian,

 

Yes, the Roman numerals must be in upper case. Some examples of Roman numerals combined with letters and Arabic numbers are:

 

Schatzker III: Depression only of lateral tibial plateau (two types):

Schatzker IIIa: Lateral depression.

Schatzker IIIb: Central depression.  

 

Less often, Roman numerals may also be followed by lower-case letters or Arabic numbers.

 

Thanks again for your help. Maybe this could benefit others as well.

 

Janet

 

 

 

 


 


 

On Sat, Nov 7, 2020 at 04:53 PM, Gene wrote:
This could be made much easier,
-
What could be?

Everything you write after this is essentially what I proposed:  using the speech dictionary with regular expression matching to very strictly limit what is captured and substituted.

One of the beauties of regular expressions is how they can be crafted to catch only what you want.
 
--

Brian - Windows 10 Pro, 64-Bit, Version 2004, Build 19041  


 


 

Janet,

        Thanks.  I'll post what you need to use a bit later this evening.  I'll ask just one more time: are the letters after the numeral limited to a certain set?  If not, I'll look for lowercase A through lowercase Z, but if it should only be A through F (or something similar), let me know.  That's very easy to tweak.

--

Brian - Windows 10 Pro, 64-Bit, Version 2004, Build 19041  


 


Gene
 

It isn't clear to me, after experimenting with the speech dictionary, how the whole word setting works. I had originally thought that those who wanted to have something like Roman numerals spoken as standard numbers might need another choice in the speech dictionary. There are now whole word and anywhere as choices. As I thought about it, I thought that whole word should work and that no other option would be necessary. If IV were seen as a whole word and the dictionary spoke 4 when it saw IV with spaces on either side, provided you include spaces in the entry, that would not result in extraneous speaking of 4. So another choice in the list of radio buttons, as I originally suggested, wouldn't be needed.

I experimented with this, and iv by itself with spaces in the pattern field and 4 in the pronounced as field doesn't work. IV is still spoken as IV when written in this way. What does NVDA consider a whole word? When I try a word such as alive and use the whole word setting, that works. Perhaps what NVDA sees as a whole word needs to be changed.

Since most people won't know how to work with regular expressions, the ability to do this sort of thing using the whole word option might be valuable.

Gene



 

Janet,

Below is the list of 11 regular expressions (one for each numeral from one through ten, plus the alternate I I I I form of four), each followed by what you use for the replacement, that you need to enter in your speech dictionary in the order listed.  I emphasize again: in the order listed.

This is important because I believe (and am waiting for confirmation) that the speech dictionary (or any dictionary) has its entries processed in order, and on the first match the replacement is passed to the synthesizer and the processing for that "word/character cluster" stops.  If you had the entry for Roman numeral one first, it would snag Roman numerals 4, 3, and 2 incorrectly since all of them are composed of a collection of capital Is.

In addition, I am going to presume from your example that each of these sequences (a Roman numeral plus an optional letter) must have a colon after the last character of the sequence, with no space between the two.  If there is no colon then the match will not work, and that's by design, as I do not want the pronoun I to be captured as Roman numeral one.

If you want the word "Roman" or something else in front of the individual characters of the numeral before they're read out one by one then stick that in front of the first character of the replacement string.  I just went for the individual letters making up the numeral, along with the letter following it, if that letter is present.

The regular expressions all start with a backslash and end with a question mark.  The replacement strings all end with backslash one (the digit 1).  When working with the dictionary to add entries, the regular expression goes in the Pattern edit box, the replacement string in the Replacement box, and the Type radio button must be set to regular expression.

\s?IIII([a-z])?:\s?    I I I I \1
 
\s?III([a-z])?:\s?    I I I \1
 
\s?II([a-z])?:\s?    I I \1
 
\s?I([a-z])?:\s?    I \1
 
\s?IV([a-z])?:\s?    I V \1
 
\s?VIII([a-z])?:\s?    V I I I \1
 
\s?VII([a-z])?:\s?    V I I \1
 
\s?VI([a-z])?:\s?    V I \1
 
\s?V([a-z])?:\s?    V \1
 
\s?IX([a-z])?:\s?    I X \1
 
\s?X([a-z])?:\s?    X \1
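For anyone who wants to try the entries above outside NVDA first, here is a minimal Python sketch. It assumes the dictionary runs each entry, in list order, as a plain case-sensitive regular-expression substitution over the whole text. That is an assumption for illustration only, not something confirmed in this thread, and the names ENTRIES and apply_entries are made up for the sketch.

```python
import re

# The patterns and replacements as posted above, in the order posted.
ENTRIES = [
    (r"\s?IIII([a-z])?:\s?", r"I I I I \1"),
    (r"\s?III([a-z])?:\s?", r"I I I \1"),
    (r"\s?II([a-z])?:\s?", r"I I \1"),
    (r"\s?I([a-z])?:\s?", r"I \1"),
    (r"\s?IV([a-z])?:\s?", r"I V \1"),
    (r"\s?VIII([a-z])?:\s?", r"V I I I \1"),
    (r"\s?VII([a-z])?:\s?", r"V I I \1"),
    (r"\s?VI([a-z])?:\s?", r"V I \1"),
    (r"\s?V([a-z])?:\s?", r"V \1"),
    (r"\s?IX([a-z])?:\s?", r"I X \1"),
    (r"\s?X([a-z])?:\s?", r"X \1"),
]


def apply_entries(text, entries=ENTRIES):
    """Apply every (pattern, replacement) pair in order, case-sensitively.

    Assumption: this imitates a dictionary that runs each entry over the
    whole text in sequence. It is not NVDA's actual code.
    """
    for pattern, replacement in entries:
        # On Python 3.5+ a group that took no part in the match (no
        # trailing letter) is substituted as an empty string.
        text = re.sub(pattern, replacement, text)
    return text


if __name__ == "__main__":
    for sample in (
        "Schatzker III: Depression only",
        "Schatzker IIIa: Lateral depression.",
        "I went home.",
    ):
        print(repr(apply_entries(sample)))
```

Two things show up under these assumptions. The whitespace and colon matched by a pattern are consumed, so the spelled-out numeral runs up against the neighboring words; a space at each end of the replacement string is one way to keep them apart. Also, because nothing anchors the start of a match, the I, II, and III entries catch the tail of VI:, VII:, and VIII: before the V entries are reached; moving the V entries to the top of the list, or using \b in place of the leading \s?, avoids that in this sketch.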


--

Brian - Windows 10 Pro, 64-Bit, Version 2004, Build 19041  


 


 

Janet,

           A quick addendum, I don't know how the letter A following a Roman numeral will end up being pronounced, as that's based on the synthesizer, so you may get ah or you may get A.  I just can't be sure.  I think letters B through Z are more likely to be read as the character itself.
--

Brian - Windows 10 Pro, 64-Bit, Version 2004, Build 19041  


 


 

Janet,

          A second quick addendum, I just realized that what I've given so far may not work for cancer staging, as I presume that would be the word "stage" followed by the Roman numerals one through four, depending on which stage.

          We can create four more patterns specific to the word stage preceding the Roman numerals one through four.  Luckily, you won't have to worry about reordering, as the prior matches all require a colon, and would all fail for the Roman numerals one through four that don't have a colon immediately following.
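The four stage patterns were not posted in this thread, so the following Python sketch is only one possible shape for them. Everything in it is hypothetical: the patterns, the names STAGE_ENTRIES and spell_stage, and the assumption that the word is written "stage" or "Stage" followed by a single space.

```python
import re

# Hypothetical entries, not taken from the thread: the word "stage" (with
# either capitalisation of the S), one space, a numeral from I to IV, and
# an optional trailing lowercase letter or digit.
STAGE_ENTRIES = [
    (r"\b([Ss]tage) IV([a-z0-9])?\b", r"\1 I V \2"),
    (r"\b([Ss]tage) III([a-z0-9])?\b", r"\1 I I I \2"),
    (r"\b([Ss]tage) II([a-z0-9])?\b", r"\1 I I \2"),
    (r"\b([Ss]tage) I([a-z0-9])?\b", r"\1 I \2"),
]


def spell_stage(text):
    """Apply the hypothetical stage entries in order, case-sensitively."""
    for pattern, replacement in STAGE_ENTRIES:
        text = re.sub(pattern, replacement, text)
    return text


if __name__ == "__main__":
    for sample in ("Stage IIIa disease", "stage IV.", "backstage IV"):
        print(repr(spell_stage(sample)))
```

The trailing \b is what keeps the entry for I from firing on II or III, so the order of these four matters less than it does for the colon-based entries. The output can contain doubled spaces, which should make no difference to speech.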
--

Brian - Windows 10 Pro, 64-Bit, Version 2004, Build 19041  


 


Luke Davis
 

Brian

I'm not receiving Janet's messages for some reason, so I'm not sure of every detail of her requirement for this, but I am left with a question.

What is the \s? at each end doing?
I mean obviously it is looking for zero or one space characters, but why?

If you can have zero space characters, that means you can have any character there, including a space character, since the space matching is un-anchored.
In fact, it is the same as \s*, for the same reason. (Or, possibly even the same as .*)

So I think the expression should work identically with or without the "\s?", although I could understand a "\b".

What am I missing?
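The question can be checked with a short Python sketch. It compares one of the posted patterns with and without the \s? at each end, using ordinary unanchored searching; the helper name matched_text and the sample strings are made up for the illustration.

```python
import re

WITH_WS = re.compile(r"\s?III([a-z])?:\s?")
WITHOUT_WS = re.compile(r"III([a-z])?:")


def matched_text(regex, text):
    """Return the exact substring the regex matches, or None if no match."""
    match = regex.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    samples = (
        "Schatzker IIIa: Lateral",     # single spaces around the numeral
        "Schatzker   III:   Central",  # several spaces on each side
        "SchatzkerIII:Central",        # no whitespace at all
    )
    for sample in samples:
        print(repr(matched_text(WITH_WS, sample)),
              repr(matched_text(WITHOUT_WS, sample)))
```

Both versions match in all three samples, including the one with several spaces. The only difference is that the \s? version also consumes up to one whitespace character on each side, which matters for what the replacement overwrites rather than for whether the entry fires.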

Luke


 

My thinking is that there can be no whitespace after the colon, or an instance of a single whitespace character, but not multiple whitespace characters.  Definitely not the same as .* at all.

I agree that one could probably use \b, but I was thinking "whitespace" and used whitespace matching.  And remember whitespace is not just a space, but includes space, tab stop and line break.

Also, I sometimes change my mind about what I'm going to capture, and \b is non-capturing.

There's a reason I have said, repeatedly, that I am doing "quick and dirty" to get the result I'm looking for.  It's entirely possible, nay, probable, that certain of my regexes could be expressed more elegantly.  If it works on the tests I'm running, as I expect it to, it's "good enough."
--

Brian - Windows 10 Pro, 64-Bit, Version 2004, Build 19041  


 


Luke Davis
 

An alternative to Brian's method, might be something longish like the following. Although again I don't fully understand the issue, not having gotten Janet's messages, so it might fail the use case after all.

I spent about an hour trying to figure out some more elegant way of doing this, and couldn't come up with anything shorter than the below. Brian's method is probably easier to understand, although this cuts and pastes as a single entry, so I guess it has that going for it. :)

The idea below is to match, at the start of any word, any RN between one and nine characters long, and additionally to match one optional subsequent non RN character, and a required final colon. That was what I understood from Brian's messages anyway.

Match type: regular expression
Case sensitive: yes
Pattern:

\b([MCLXVI])([MCLXVI])?([MCLXVI])?([MCLXVI])?([MCLXVI])?([MCLXVI])?([MCLXVI])?([MCLXVI])?([MCLXVI])?([a-zA-Z])?(?=:)

Replacement:

\1 \2 \3 \4 \5 \6 \7 \8 \9 \10

I tested a version of this in a temporary dictionary, and it appeared to work.

The weird construct for the colon at the end is because it's punctuation. I don't know when NVDA applies punctuation processing to this chain of dictionaries, so I thought it better to make sure the colon was there but let it actually be processed by the normal rules, using a lookahead. I did not test that part in the temp dictionary, as I only just thought of it. If this fails, try replacing "(?=:)" with just ":", and put a colon at the end of the replacement string as well.
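The single-entry pattern and replacement above can be tried in plain Python as follows. This is a sketch that assumes the entry behaves like an ordinary case-sensitive re.sub call; the name spell_numeral and the space-collapsing step are additions for readability, not part of the entry.

```python
import re

# Pattern and replacement copied from the entry above.
PATTERN = re.compile(
    r"\b([MCLXVI])([MCLXVI])?([MCLXVI])?([MCLXVI])?([MCLXVI])?"
    r"([MCLXVI])?([MCLXVI])?([MCLXVI])?([MCLXVI])?([a-zA-Z])?(?=:)"
)
REPLACEMENT = r"\1 \2 \3 \4 \5 \6 \7 \8 \9 \10"


def spell_numeral(text):
    """Space out a leading Roman numeral that is followed by a colon."""
    spaced = PATTERN.sub(REPLACEMENT, text)
    # Groups that did not take part in the match are substituted as empty
    # strings (Python 3.5+), which leaves runs of spaces behind. They are
    # collapsed here only to make the printed result easier to read.
    return re.sub(r" {2,}", " ", spaced)


if __name__ == "__main__":
    for sample in ("Schatzker IIIa: Lateral depression.",
                   "Schatzker III: Depression"):
        print(repr(spell_numeral(sample)))
```

Because the colon sits inside a lookahead, it is checked but not consumed, so it stays in the text for normal punctuation handling. The character class as posted does not include D (500), which makes no difference for the numerals one through ten.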

Luke


Gene
 

I think I figured it out. When using the whole word setting, if I don't include a space before and after the numeral, it works. I made an entry iv and in the pronounced as field I placed 4. I didn't make it case sensitive because I wanted to test what the dictionary would do in general.

When it saw iv in a word such as exclusive, it read the word properly. When it saw iv just as letters, whether they were at the beginning of a line with a space after, in a sentence with a space before and after, or at the end with a period after, the dictionary read iv as 4. This may be of considerable value for those who don't know how to work with regular expressions and want to make Roman numeral pronunciation rules that work properly. The only thing I can think of that shouldn't be placed in the dictionary is a single I with 1 in the pronounced as field. You would constantly hear I spoken as one, as in "One went to the store."
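The behavior described above is what one would see if a whole word entry were treated as the literal pattern wrapped in word-boundary assertions. The Python sketch below models it that way; this is an assumption about how the setting works, not something confirmed in this thread, and whole_word_sub is a made-up name.

```python
import re


def whole_word_sub(pattern, replacement, text, case_sensitive=False):
    """Replace `pattern` only where it stands alone as a word.

    Assumption: "whole word" is modelled as the escaped pattern placed
    between two word-boundary assertions.
    """
    flags = 0 if case_sensitive else re.IGNORECASE
    regex = re.compile(r"\b" + re.escape(pattern) + r"\b", flags)
    # A function is used so the replacement is inserted literally.
    return regex.sub(lambda match: replacement, text)


if __name__ == "__main__":
    print(whole_word_sub("iv", "4", "An exclusive offer"))
    print(whole_word_sub("iv", "4", "iv is here, stage IV."))
    print(whole_word_sub("I", "1", "I went to the store"))
```

In this model an entry padded with spaces cannot match a numeral at the very start of a line or directly before punctuation, so leaving the spaces out is the safer choice. The last call shows why a bare I entry is a bad idea.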

Gene



Sean Randall
 

The problem with this approach is that letters other than I would lose their meaning, such as x, v, c and so on.

Whether NVDA should pronounce III as "3" or "I I I" is a matter for the synthesizer and the user, though; surely we shouldn't prescribe that level of detail for users in the general case.
