On 27.05.2018 at 18:44, Vlad Dragomir wrote:
I need to find a way to transform image pdf files into text. These are mostly books that were scanned a while ago, but I haven't had the time and patience to find a solution. Now I really need to do something about this, for professional reasons.
Try robobraille.org. This is a Danish service for converting documents to MP3, Braille and text. I don't know whether you need to recognize Russian letters or the like, but that is also possible with this browser app. The app uses tesseract as its engine for recognizing the text in the document.
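If you would rather run tesseract on your own machine, here is a rough sketch in Python of how that could look. It is only an illustration under some assumptions: tesseract and pdftoppm (from Poppler) are installed and on your PATH, the language data for "eng" and "rus" is present, and the function names and the "page" file prefix are mine, not part of either tool. tesseract does not read PDF files directly, so each page is first rendered to a PNG image; several languages are passed at once by joining their codes with "+".

```python
import subprocess
from pathlib import Path


def pdftoppm_command(pdf_path, prefix, dpi=300):
    """Build the pdftoppm call that renders each PDF page to <prefix>-NN.png."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(prefix)]


def tesseract_command(image_path, languages):
    """Build the tesseract call for one page image.

    The output base has no extension; tesseract appends ".txt" itself.
    Several languages are joined with "+", e.g. "eng+rus".
    """
    out_base = str(Path(image_path).with_suffix(""))
    return ["tesseract", str(image_path), out_base, "-l", "+".join(languages)]


def ocr_pdf(pdf_path, work_dir, languages=("eng", "rus")):
    """Render a scanned PDF to images, OCR every page, and return the text."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    prefix = work_dir / "page"

    # Step 1: one PNG per page (pdftoppm zero-pads the page numbers).
    subprocess.run(pdftoppm_command(pdf_path, prefix), check=True)

    # Step 2: OCR each page image and collect the resulting text files.
    pages = []
    for image in sorted(work_dir.glob("page-*.png")):
        subprocess.run(tesseract_command(image, languages), check=True)
        pages.append(image.with_suffix(".txt").read_text(encoding="utf-8"))
    return "\n".join(pages)
```

Calling ocr_pdf("book.pdf", "ocr_out", languages=("eng", "rus")) would then return the recognized text of the whole book as one string; "book.pdf" and "ocr_out" are placeholder names. This produces plain text only, so layout and formatting are not retained.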
I found this Windows 10 app called KNFB Reader, which seems to do exactly that. However, since this is a rather expensive app, I'd like to ask those who have already used it a few things if I may:
1. How does this app deal with multi-lingual documents? Is it possible to choose two or more recognition languages for the same book? Most of my books are manuals. I am a language teacher and it is very important that both languages used in a book be accurately detected.
2. Is the formatting retained, at least in part?
3. Are there any alternatives to this application? It seems to be the only accessible solution, but I might be wrong.
I would be very grateful if anyone could help me with this.