[Bug 461177] [NEW] tesseract 2.03 generates empty file
neuromancer
neuromancer at devonlinux.net
Mon Oct 26 15:25:14 UTC 2009
Public bug reported:
Karmic Koala 9.10 beta - tesseract 2.03
Installed tesseract-ocr, tesseract-ocr-eng and tesseract-ocr-ita.
I opened an image containing some numbers and other information in GIMP, cropped it to a selection, and saved the result as a TIFF file.
The image is very clean and high-contrast (black text on a white background), but when I ran
tesseract inputimage.tif outputfile
the generated outputfile.txt was empty (no text, 1 byte in size).
After a bit of searching, I found a solution here: http://groups.google.com/group/tesseract-ocr/browse_thread/thread/2434f09ed180c092/e5ed41969097c708?lnk=gst&q=screenshot#e5ed41969097c708
Just run:
convert inputimage.tif inputimage_tmp.pbm
convert inputimage_tmp.pbm inputimage_ok.tif
This problem occurs because TIFF images sometimes contain an alpha (transparency) channel that blocks text recognition. PBM cannot store an alpha channel, so converting to PBM and back drops it.
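For several images, the same round trip can be scripted. The following is only a sketch: flatten_cmds is a hypothetical helper name, the _tmp.pbm and _ok.tif suffixes simply follow the commands above, and it assumes file names ending in .tif with no spaces. It prints the commands instead of running them, so they can be checked first.

```shell
#!/bin/sh
# Hypothetical helper (not part of tesseract or ImageMagick): print the
# workaround commands for one TIFF file. Nothing is executed here.
# Assumes the name ends in .tif and contains no spaces.
flatten_cmds() {
    in=$1
    base=${in%.tif}    # strip the .tif suffix
    # Round trip through PBM to drop the alpha channel.
    printf 'convert %s %s\n' "$in" "${base}_tmp.pbm"
    printf 'convert %s %s\n' "${base}_tmp.pbm" "${base}_ok.tif"
    # Run OCR on the cleaned copy; tesseract appends .txt to the output name.
    printf 'tesseract %s %s\n' "${base}_ok.tif" "${base}"
}
```

To actually run the printed commands, pipe them to a shell, e.g. `flatten_cmds inputimage.tif | sh`.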
According to this page, http://code.google.com/p/tesseract-ocr/issues/detail?id=160, the new version, 2.04, fixes this
problem, so the only thing left to do is to package the new version.
** Affects: tesseract (Ubuntu)
Importance: Undecided
Status: New
--
tesseract 2.03 generates empty file
https://bugs.launchpad.net/bugs/461177
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
--
ubuntu-bugs mailing list
ubuntu-bugs at lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs