Pronunciation of phone numbers


George McCoy
 

I have a couple of synthesizers that do not pronounce phone numbers as individual digits. Are there settings or speech dictionary entries that would correct this problem?


Thanks,

George


 

George,

            Do you use a consistent format for your phone numbers?   For example, are all written with the area code in parentheses, (814) 536-2250, or all written as area code hyphen exchange hyphen last 4 digits, 814-536-2250, or some other format that's consistent?

            It is possible to achieve what you're looking for using regular expression matching, but it's helpful to know how you format phone numbers so that the regex can be constructed correctly.

--

Brian - Windows 10 Pro, 64-Bit, Version 2004, Build 19041  

It’s hard waking up and realizing it’s not always black and white.

     ~ Kelley Boorn

 


 
George,

         If you use either of the formats I previously mentioned, whether or not the area code is enclosed in parentheses, you can use the instructions that follow.  I will give the regular expression (regex) first and the replacement string right after it, then the step-by-step instructions, which apply to any regular expression and replacement.  Begin copying the regex at the opening backslash and stop at the end of the line.  For the replacement, if you do not want NVDA to say "area code" before the first three digits, omit those two words and copy from the first backslash to the end of the line.

regex: \(?(\d)(\d)(\d)\)?[ -](\d)(\d)(\d)-(\d)(\d)(\d)(\d)
 
replacement: area code \1 \2 \3 \4 \5 \6 \7 \8 \9 \10
 
Adding a Regular Expression match to the Default Dictionary
1. Hit NVDA+N, P, D, D (NVDA Main Menu, Preferences, Speech dictionaries, Default Dictionary).
2. Hit ALT+A to activate the Add button, or navigate to it and activate it.
3. In the dialog that appears, paste or enter a regular expression in the Pattern edit box.
4. Paste or enter the replacement you want to hear in the Replacement edit box.  If you're using a capturing regular expression, this may be a sequence of backreferences: a backslash followed by the number of each capture group you're using, such as \1 \2 \3.
5. The Comment edit box should either have a comment that helps you to remember the purpose of this dictionary entry, which is preferable, or be left blank.
6. Leave the Case Sensitive checkbox unchecked.
7. The Type radio button should be set to Regular Expression.
8. Activate the OK button.
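
If you want to try the pattern and replacement before adding them to a dictionary, a minimal Python sketch follows. It assumes NVDA applies dictionary entries the same way Python's re.sub does, which has only been checked here against plain Python, not inside NVDA. The function name speak is made up for illustration.

```python
import re

# Pattern and replacement copied from the post above.
PATTERN = re.compile(r"\(?(\d)(\d)(\d)\)?[ -](\d)(\d)(\d)-(\d)(\d)(\d)(\d)")
REPLACEMENT = r"area code \1 \2 \3 \4 \5 \6 \7 \8 \9 \10"


def speak(text):
    """Return the text as it would look after the substitution (hypothetical helper)."""
    return PATTERN.sub(REPLACEMENT, text)


print(speak("Call 814-536-2250 today"))
# Call area code 8 1 4 5 3 6 2 2 5 0 today
print(speak("Call (814) 536-2250 today"))
# Call area code 8 1 4 5 3 6 2 2 5 0 today
```

Each digit sits in its own capture group, so the replacement can put a space between every pair of digits, which is what pushes the synthesizer to read them one at a time.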

Brian

 


George McCoy
 

Thanks, Brian.

I thought regular expressions might do the trick, but I don't know much about them. The format I usually use is xxx-xxx-xxxx.


George


George McCoy
 

Many thanks, Brian. I'll give it a try.



 


George McCoy
 

Brian,


This regex works for phone numbers formatted with dashes, with or without parentheses, but fails when it encounters spaces.

If I replace the dashes in the pattern string with spaces, it works with spaces, but, of course, not with dashes.


George
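
The behaviour described above follows from the pattern itself: the first separator is the character class [ -], which accepts a space or a dash, while the second separator is a literal dash only. A small sketch, checked against Python's re module rather than inside NVDA, with sample numbers chosen for illustration:

```python
import re

# The pattern from Brian's post, unchanged.
BRIAN = re.compile(r"\(?(\d)(\d)(\d)\)?[ -](\d)(\d)(\d)-(\d)(\d)(\d)(\d)")

samples = [
    "814-536-2250",    # dashes only
    "(814) 536-2250",  # space after the area code, dash before the last four
    "(814) 536 2250",  # spaces only, with parentheses
    "814 536 2250",    # spaces only
]

# Map each sample to whether the pattern finds a match in it.
results = {s: bool(BRIAN.search(s)) for s in samples}
for sample, matched in results.items():
    print(sample, "->", "matches" if matched else "no match")
```

The two space-only samples report no match, because nothing in them can satisfy the literal dash before the last four digits.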


 


 

George,

Give me a short list of examples that cover the range of formats you need for this to work with and I can tweak the regex.  Trying to explain how is not something I'm inclined to do, at least not on-group, as it's really off-topic if you get hot and heavy into regular expression syntax.

Brian

 


George McCoy
 

Sure thing, Brian. I wasn't looking for a short course on regular expressions. :) I'd like to handle phone numbers where the area code, exchange and local sections are separated with dashes or spaces. The area code may or may not be enclosed in parentheses.

###-###-####

(###)-###-####

(###) ### ####

### ### ####

I don't expect to handle a phone number containing both spaces and dashes as separators.


I really appreciate your help on this. I'm sure it will be of general benefit to Espeak users.


George


 


Luke Davis
 

On Sat, 26 Sep 2020, Brian Vogel wrote:

George,

Give me a short list of examples that cover the range of formats you need for this to work with and I can tweak the regex.  Trying to explain how is not something I'm inclined to do, at least not on-group, as it's really off-topic if you get hot and heavy into regular expression syntax.

Why not just use something like this? Untested, but it should catch periods, dashes, and spaces.
The replacement uses commas for pausing.

Regex: \(?(\d)(\d)(\d)\)?[ \.-](\d)(\d)(\d)[ \.-](\d)(\d)(\d)(\d)

Replacement: \1 \2 \3, \4 \5 \6, \7 \8 \9 \10
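
Since the expression above is marked untested, here is a quick check of it against the four formats George listed, plus a period-separated one. This is a sketch run against Python's re module only; the sample numbers are made up, and behaviour inside NVDA is assumed to be the same.

```python
import re

# Pattern and replacement copied from the post above.
V1 = re.compile(r"\(?(\d)(\d)(\d)\)?[ \.-](\d)(\d)(\d)[ \.-](\d)(\d)(\d)(\d)")
REPL = r"\1 \2 \3, \4 \5 \6, \7 \8 \9 \10"

formats = [
    "814-536-2250",
    "(814)-536-2250",
    "(814) 536 2250",
    "814 536 2250",
    "814.536.2250",
]

# Apply the dictionary-style substitution to every sample.
spoken = [V1.sub(REPL, f) for f in formats]
for original, result in zip(formats, spoken):
    print(original, "->", result)
# Every line ends with: 8 1 4, 5 3 6, 2 2 5 0
```

Because both separators now use the same character class, a number may mix separators (for example a space, then a dash) and still match.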

Luke


Luke Davis
 

On Sun, 27 Sep 2020, Luke Davis wrote:

Regex: \(?(\d)(\d)(\d)\)?[ \.-](\d)(\d)(\d)[ \.-](\d)(\d)(\d)(\d)

I should note that the one case this regex won't handle is:

(###)###-####

I suspect we would have to get creative to manage that one. Along the lines of:

\(?(\d)(\d)(\d)(?:\)?[ \.-]|\))(\d)(\d)(\d)[ \.-](\d)(\d)(\d)(\d)\D

I added a \D at the end to hopefully make sure we're not grabbing something that isn't a phone number.
The former expressions would have munged things like:

123-456-789090009876

(Like a serial number or tracking number or similar.)
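
A sketch comparing the two versions in plain Python follows. One thing it shows is that \D has to match an actual character: in Python's re module that character becomes part of the match, so it is replaced along with the digits, and a number at the very end of the text has nothing left for \D to match. The V2_LOOKAHEAD pattern is a hypothetical variant, not something from this thread, that swaps \D for the negative lookahead (?!\d). Whether NVDA behaves identically has not been checked here.

```python
import re

REPL = r"\1 \2 \3, \4 \5 \6, \7 \8 \9 \10"

# Version one and version two, copied from the posts above.
V1 = re.compile(r"\(?(\d)(\d)(\d)\)?[ \.-](\d)(\d)(\d)[ \.-](\d)(\d)(\d)(\d)")
V2 = re.compile(r"\(?(\d)(\d)(\d)(?:\)?[ \.-]|\))(\d)(\d)(\d)[ \.-](\d)(\d)(\d)(\d)\D")

# Hypothetical variant (an assumption, not from the thread): the lookahead
# checks that no digit follows without consuming the next character, and it
# also succeeds at the very end of the text.
V2_LOOKAHEAD = re.compile(
    r"\(?(\d)(\d)(\d)(?:\)?[ \.-]|\))(\d)(\d)(\d)[ \.-](\d)(\d)(\d)(\d)(?!\d)"
)

tracking = "123-456-789090009876"
sentence = "Call (814)536-2250 now"

print(V1.sub(REPL, tracking))            # 1 2 3, 4 5 6, 7 8 9 090009876
print(V2.sub(REPL, tracking))            # 123-456-789090009876 (left alone)
print(V2.sub(REPL, sentence))            # Call 8 1 4, 5 3 6, 2 2 5 0now
print(V2.sub(REPL, "814-536-2250"))      # 814-536-2250 (nothing for \D to match)
print(V2_LOOKAHEAD.sub(REPL, sentence))  # Call 8 1 4, 5 3 6, 2 2 5 0 now
```

If the swallowed trailing character matters for a given synthesizer, the lookahead form is one way to keep the same protection against long digit strings.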

Luke


 

On Sun, Sep 27, 2020 at 12:01 AM, Luke Davis wrote:
Why not just use something like this?
-
Well, your example is exactly what my tweak would have been, so, sure.  All I wanted was a set of examples of exactly what formats could be expected, which would lead to the [ \.-] character class for the two separation regions.  Although I must say I've never seen anyone use a period as the separator in a phone number, but just because I haven't seen it doesn't mean it doesn't exist, just that it's not common in my little corner of the world.
 
Brian

 


Luke Davis
 

On Sat, 26 Sep 2020, Brian Vogel wrote:

would lead to the [ \.-] character class for the two separation regions.  Although I must say I've never seen anyone use a period as the separator in a phone number, but just because I haven't seen it doesn't mean it doesn't exist, just that it's not common in my little corner of the world.

I don't know what the reasoning for it is, but I have seen it in some places, usually done by North American businesses. I always assumed it was somehow related to fonts on business cards or the like, or maybe just a desire to be different.

I've seen it a lot in the whois database as well.

Luke


 

Luke,

           Just to be clear, I am really not, ever, trying to handle every possible odd exception condition that might come up.  If what I craft gets 99.5% of all commonly used formats, I'm fine.  If someone makes a typo like leaving no space between the closing parenthesis of an area code and the exchange, or were to skip a hyphen between the area code and the exchange, I'm more than happy to have that read badly, as it should clue in the listener that something's off about the formatting.

            I've had to write hellishly complicated regexes in the past that needed to handle a multitude of normal and exception conditions, and they exhaust me when they get to that extent, particularly when "that extent" is really unlikely to be necessary.

            But George should have his solution at this point, that's for sure!

Brian

 


George McCoy
 

Thanks to Brian and Luke for their help on this. Luke, yours works perfectly for me in all cases.


George





 

George,

           You're quite welcome.  I'm curious whether you used Luke's "version one" that did not have the backslash capital D at the end, or his "version two" that did?  Also, did you include the commas in your replacement string, and are they giving an adequate pause between the components of the phone number?

Brian

 


 

On Sun, Sep 27, 2020 at 12:41 AM, Luke Davis wrote this regular expression:
\(?(\d)(\d)(\d)(?:\)?[ \.-]|\))(\d)(\d)(\d)[ \.-](\d)(\d)(\d)(\d)\D
-
By the way, that's very, very elegant for the purpose.  I never want you to think that my comments on the group are meant to be critical/disparaging in any way.  I'm just generally going for "quick and dirty for the common cases" while you are being incredibly thorough, and the latter is more complicated and requires great skill.

I'd actually forgotten about non-capturing groups with alternatives.  A master stroke!

Brian

 


George McCoy
 

I used version 2 that ends in (\D). I left the commas in the replacement string. They provide enough of a pause to make the phone number much easier to listen to.



 


Luke Davis
 

On Sun, 27 Sep 2020, Brian Vogel wrote:

On Sun, Sep 27, 2020 at 12:41 AM, Luke Davis wrote this regular expression:
\(?(\d)(\d)(\d)(?:\)?[ \.-]|\))(\d)(\d)(\d)[ \.-](\d)(\d)(\d)(\d)\D
-
By the way, that's very, very elegant for the purpose.  I never want you to think that my comments on the group are meant to be critical/disparaging in any way.  I'm just generally going for "quick and dirty for the common cases" while you are being incredibly thorough, and the latter is more complicated and requires great skill.

I'd actually forgotten about non-capturing groups with alternatives.  A master stroke!

Thank you! Regular expressions are amazing tools, and I delight in building one that gets the job done while eliminating as many false positives as possible.

Although ones like this are really hard to read! :)

Luke


 

On Tue, Sep 29, 2020 at 06:23 PM, Luke Davis wrote:
Regular expressions are amazing tools
-
Indeed they are.  The sad thing is that they really are not easy to learn or understand.  Once you do, they seem far more "natural," but even then you'll find yourself "missing something" or "catching something you don't want" unless you test extensively ahead of deployment.

I would love to know the mind that came up with regular expressions. They're the closest thing I know of to "human filtering": you build them by thinking, very carefully, about all the alternatives, and sometimes those alternatives are huge in number, in variety of form, or both.
 
Brian