Re: New version of TesseractOCR add-on


Rui Fontes
 

Hello!


Since version 2022.06.27, the add-on has used the 32-bit build of Tesseract 5.1...

So this error is very strange...


By the way, you can update to version 2022.07.13, which has already been released...

The changes are:

- Corrected the threading for the update routine;
- Updated Turkish translation;
- Small code corrections...


Best regards,

Rui Fontes
NVDA Portuguese team



At 02:23 on 14/07/2022, mk360 wrote:

Hi,

About the version: it is 2022.7, but this happened when I installed the version that uses Tesseract 5.1, because I am running it on a 32-bit system.

Note that English worked when I installed that version, but Spanish does not.

On 13/07/2022 at 21:14, Rui Fontes wrote:
Hello!


Which version of TesseractOCR?

Is your system 32-bit or 64-bit?


Best regards,

Rui Fontes
NVDA Portuguese team



At 00:49 on 14/07/2022, mk360 wrote:
Hi,

Two problems here:

If I set Spanish as my preferred language, it gives an error and never displays ocr.txt. This is the log when using Control+Windows+R:

ERROR - stderr (19:42:00.880) - Thread-22 (5292):
Exception in thread Thread-22:
Traceback (most recent call last):
  File "threading.pyc", line 926, in _bootstrap_inner
  File "threading.pyc", line 870, in run
  File "C:\Users\usuario\AppData\Roaming\nvda\addons\tesseractOCR\globalPlugins\tesseractOCR\__init__.py", line 176, in _doRoutines
    self.convertPDFToPNG()
  File "C:\Users\usuario\AppData\Roaming\nvda\addons\tesseractOCR\globalPlugins\tesseractOCR\__init__.py", line 113, in convertPDFToPNG
    self.backgroundProcessing(command)
  File "C:\Users\usuario\AppData\Roaming\nvda\addons\tesseractOCR\globalPlugins\tesseractOCR\__init__.py", line 103, in backgroundProcessing
    p = subprocess.Popen(self.command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, startupinfo=si)
  File "subprocess.pyc", line 800, in __init__
  File "subprocess.pyc", line 1207, in _execute_child
OSError: [WinError 216] Esta versión de %1 no es compatible con la versión de Windows que está ejecutando. Compruebe la información de sistema del equipo para consultar si necesita una versión x86 (32 bits) o x64 (64 bits) del programa, y después póngase en contacto con el editor del software
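
WinError 216 is what Windows reports when an executable was built for a different CPU architecture than the running system, for example a 64-bit program launched on 32-bit Windows. The log does not say which executable the add-on tried to start, so the following is only a minimal sketch for checking a candidate file by hand. The helper name `pe_machine` is hypothetical and is not part of the add-on; it reads the machine field from the file's PE header.

```python
import struct

# Machine field values defined by the PE/COFF format.
PE_MACHINE_NAMES = {
    0x014C: "x86 (32-bit)",
    0x8664: "x64 (64-bit)",
    0xAA64: "ARM64",
}


def pe_machine(path):
    """Return a label for the CPU architecture a Windows .exe or .dll targets."""
    with open(path, "rb") as f:
        dos_header = f.read(64)
        if len(dos_header) < 64 or dos_header[:2] != b"MZ":
            raise ValueError("not a PE file: missing MZ header")
        # The offset of the PE signature is stored at 0x3C in the DOS header.
        (pe_offset,) = struct.unpack_from("<I", dos_header, 0x3C)
        f.seek(pe_offset)
        # 4-byte signature followed by the 2-byte machine field.
        pe_start = f.read(6)
    if len(pe_start) < 6 or pe_start[:4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    (machine,) = struct.unpack_from("<H", pe_start, 4)
    return PE_MACHINE_NAMES.get(machine, "unknown (0x%04X)" % machine)
```

Calling `pe_machine` on each executable under the add-on's folder would show whether any of them reports "x64 (64-bit)", which could not run on a 32-bit system.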


Also, sometimes when I start NVDA it never speaks and I need to restart it. Here is the log from when it finally starts with voice feedback:

NVDA initialized
ERROR - unhandled exception (19:46:35.521) - MainThread (2396):
Traceback (most recent call last):
  File "wx\core.pyc", line 3407, in <lambda>
  File "C:\Users\usuario\AppData\Roaming\nvda\addons\tesseractOCR\globalPlugins\tesseractOCR\update.py", line 80, in upgradeVerify
    r = urllib.request.urlopen(p).read()
  File "urllib\request.pyc", line 222, in urlopen
  File "urllib\request.pyc", line 525, in open
  File "urllib\request.pyc", line 543, in _open
  File "urllib\request.pyc", line 503, in _call_chain
  File "urllib\request.pyc", line 1393, in https_open
  File "urllib\request.pyc", line 1350, in do_open
  File "http\client.pyc", line 1277, in request
  File "http\client.pyc", line 1323, in _send_request
  File "http\client.pyc", line 1272, in endheaders
  File "http\client.pyc", line 1032, in _send_output
  File "http\client.pyc", line 972, in send
  File "http\client.pyc", line 1439, in connect
  File "http\client.pyc", line 944, in connect
  File "socket.pyc", line 707, in create_connection
  File "socket.pyc", line 752, in getaddrinfo
LookupError: unknown encoding: idna
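
The traceback above shows the update check calling `urllib.request.urlopen` from a wx callback on the main thread, so any failure in the request surfaces as an unhandled exception during startup. One way to isolate such a failure, sketched here under assumptions, is to run the request in a daemon thread and treat network or codec lookup errors as "no update information". The function `check_for_update` and its parameters are hypothetical and are not the add-on's actual `update.py` code.

```python
import threading
import urllib.request


def check_for_update(url, on_result, opener=urllib.request.urlopen, timeout=10):
    """Fetch url in a background thread and pass the bytes (or None) to on_result.

    opener is injectable so the function can be exercised without a network.
    """
    def worker():
        try:
            with opener(url, timeout=timeout) as response:
                data = response.read()
        except (OSError, LookupError, ValueError):
            # URLError and socket errors are OSError subclasses;
            # "unknown encoding: idna" is a LookupError.
            data = None
        on_result(data)

    thread = threading.Thread(target=worker, name="update-check", daemon=True)
    thread.start()
    return thread
```

In a wx application the callback would normally hand its result back to the GUI thread, for example through `wx.CallAfter`, before touching any window.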

On 13/07/2022 at 7:26, Rui Fontes wrote:
Hello!


From 2022.06 to 2022.06.27:

- Updated Tesseract from version 5.0 Alpha (64-bit) to 5.1 (32-bit);
- Added several more recognition languages;
- Introduced the option to select a second language to be used in OCR of documents with multiple languages and a button to forget it;
- Introduced a new document type, "With auto-orientation", that allows the OCR engine to rotate the image as necessary;
- Introduced beeps to signal the add-on is working;
- Corrected code so that the download languages combo box is always populated;
- Corrected a problem with controlTypes roles that prevented compatibility with NVDA 2020.4;
- Added Russian translation.


From 2022.06.27 to 2022.07:

- Allow using any number of recognition languages;
- Complete code rewrite, including:
    - Split into several modules to make the code clearer;
    - Stopped using batch files;
    - Allow recognizing files on the Desktop;
- Added translations into Spanish, French, Russian and Ukrainian.
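
As background for the multi-language items above: the Tesseract command line accepts several recognition languages joined with `+` in its `-l` option, and `--psm 1` selects automatic page segmentation with orientation and script detection. The sketch below builds such a command as an argument list. The function name `build_tesseract_command` is hypothetical, and mapping the "With auto-orientation" document type to `--psm 1` is an assumption, not something stated in this thread.

```python
def build_tesseract_command(image_path, output_base, languages,
                            auto_orient=False, exe="tesseract"):
    """Return an argument list for a Tesseract run over one image.

    languages is a sequence of Tesseract language codes such as ["spa", "eng"].
    """
    if not languages:
        raise ValueError("at least one language code is required")
    command = [exe, image_path, output_base, "-l", "+".join(languages)]
    if auto_orient:
        # Assumption: page segmentation mode 1 (automatic, with OSD) stands in
        # for the add-on's auto-orientation document type.
        command += ["--psm", "1"]
    return command


# Example: Spanish first, English as the second language.
print(build_tesseract_command("page.png", "ocr", ["spa", "eng"]))
# prints ['tesseract', 'page.png', 'ocr', '-l', 'spa+eng']
```

Passing a list like this to `subprocess.Popen` avoids shell quoting problems with paths that contain spaces, which is one reason to prefer it over batch files.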


Best regards,

Rui Fontes
NVDA Portuguese team


At 07:12 on 13/07/2022, Brian's Mail list account via groups.io wrote:
What is the difference between the old and new ones?
Brian
