Re: New version of TesseractOCR add-on

Rui Fontes


Which version of tesseractOCR are you using?

Is your system 32-bit or 64-bit?
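
If you are not sure, a minimal sketch like the one below, run from any Python 3 prompt, should tell you. The function names are only illustrative and are not part of the add-on, and it assumes Windows sets its usual PROCESSOR_ARCHITECTURE / PROCESSOR_ARCHITEW6432 environment variables:

```python
import os
import struct


def interpreter_bits() -> int:
    """Return the bitness (32 or 64) of the running Python interpreter."""
    # The size of a C pointer in bytes, times 8, gives the process bitness.
    return struct.calcsize("P") * 8


def windows_is_64bit(env=os.environ) -> bool:
    """Best-effort check of whether Windows itself is 64-bit.

    A 32-bit process on 64-bit Windows sees PROCESSOR_ARCHITEW6432;
    a native process only sees PROCESSOR_ARCHITECTURE.
    """
    arch = env.get("PROCESSOR_ARCHITEW6432") or env.get("PROCESSOR_ARCHITECTURE", "")
    return arch.upper() in ("AMD64", "ARM64", "IA64")


if __name__ == "__main__":
    print("Python interpreter:", interpreter_bits(), "bit")
    print("Windows is 64-bit:", windows_is_64bit())
```

Note that the interpreter can be 32-bit while Windows is 64-bit, so the two values may differ. A 64-bit executable cannot run on 32-bit Windows, which is one common cause of WinError 216.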


Rui Fontes
NVDA Portuguese team

On 14/07/2022 at 00:49, mk360 wrote:


Two problems here:

If I set Spanish as my preferred language, it gives an error and never displays ocr.txt. This is the log when using Control+Windows+R:

ERROR - stderr (19:42:00.880) - Thread-22 (5292):
Exception in thread Thread-22:
Traceback (most recent call last):
  File "threading.pyc", line 926, in _bootstrap_inner
  File "threading.pyc", line 870, in run
  File "C:\Users\usuario\AppData\Roaming\nvda\addons\tesseractOCR\globalPlugins\tesseractOCR\", line 176, in _doRoutines
  File "C:\Users\usuario\AppData\Roaming\nvda\addons\tesseractOCR\globalPlugins\tesseractOCR\", line 113, in convertPDFToPNG
  File "C:\Users\usuario\AppData\Roaming\nvda\addons\tesseractOCR\globalPlugins\tesseractOCR\", line 103, in backgroundProcessing
    p = subprocess.Popen(self.command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, startupinfo=si)
  File "subprocess.pyc", line 800, in __init__
  File "subprocess.pyc", line 1207, in _execute_child
OSError: [WinError 216] This version of %1 is not compatible with the version of Windows you're running. Check your computer's system information to see whether you need an x86 (32-bit) or x64 (64-bit) version of the program, and then contact the software publisher

Also, sometimes when I start NVDA it never speaks and I need to restart it. Here is the log from when it finally starts with voice feedback:

NVDA initialized
ERROR - unhandled exception (19:46:35.521) - MainThread (2396):
Traceback (most recent call last):
  File "wx\core.pyc", line 3407, in <lambda>
  File "C:\Users\usuario\AppData\Roaming\nvda\addons\tesseractOCR\globalPlugins\tesseractOCR\", line 80, in upgradeVerify
    r = urllib.request.urlopen(p).read()
  File "urllib\request.pyc", line 222, in urlopen
  File "urllib\request.pyc", line 525, in open
  File "urllib\request.pyc", line 543, in _open
  File "urllib\request.pyc", line 503, in _call_chain
  File "urllib\request.pyc", line 1393, in https_open
  File "urllib\request.pyc", line 1350, in do_open
  File "http\client.pyc", line 1277, in request
  File "http\client.pyc", line 1323, in _send_request
  File "http\client.pyc", line 1272, in endheaders
  File "http\client.pyc", line 1032, in _send_output
  File "http\client.pyc", line 972, in send
  File "http\client.pyc", line 1439, in connect
  File "http\client.pyc", line 944, in connect
  File "socket.pyc", line 707, in create_connection
  File "socket.pyc", line 752, in getaddrinfo
LookupError: unknown encoding: idna

On 13/07/2022 at 7:26, Rui Fontes wrote:

From 2022.06 to 2022.06.27:

- Updated Tesseract from version 5.0 Alpha (64-bit) to 5.1 (32-bit);
- Added several more recognition languages;
- Introduced the option to select a second language to be used in OCR of documents with multiple languages and a button to forget it;
- Introduced a new document type, "With auto-orientation", that allows the OCR engine to rotate the image as necessary;
- Introduced beeps to signal the add-on is working;
- Corrected the code so that the download languages combobox no longer stays unpopulated;
- Corrected a problem with controlTypes roles preventing compatibility with NVDA 2020.4;
- Added Russian translation.

From 2022.06.27 to 2022.07:

- Allow using any number of recognition languages;
- Complete code rewrite, including:
    - Split into several modules to make the code clearer;
    - Stopped using batch files;
    - Allow recognizing files on the Desktop;
- Added translations into Spanish, French, Russian and Ukrainian.

Best regards,

Rui Fontes
NVDA Portuguese team

On 13/07/2022 at 07:12, Brian's Mail list account via wrote:
What is the difference between the old and new ones?

