Sep 6, 2006 by Navneet Kaushal

Google has recently re-released the Tesseract OCR software to the open source community. OCR, or optical character recognition, is a technique for converting printed or other physical text into machine-readable digital text. Physical text is passé. With OCR software you can now store the bulk of your older paper documents in digital form.

Google has also noted that it is not the original developer of the software. Tesseract was originally developed at Hewlett-Packard Laboratories between 1985 and 1995. Unfortunately, HP then got out of the OCR business, and the software sat unused until Google's recent re-release.

The Tesseract OCR software currently supports only one language, English. It does not include a page layout analysis module, but it is said to be far more accurate than any other open source OCR package available.

Navneet Kaushal
Navneet Kaushal is the Editor-in-Chief of PageTraffic Buzz. A leading search strategist, Navneet helps clients maintain an edge in search engines and online media. Navneet's expertise has established PageTraffic Buzz as one of the well-known digital marketing blogs.