Google Releases Tesseract OCR Software

Sep 6, 2006 | 3,155 views | by Navneet Kaushal

Google has recently re-released the Tesseract OCR software to the open source community. OCR, or optical character recognition, is a technique for converting printed text into machine-readable digital text. Physical text is passé. With OCR software, you can now store the bulk of your older paper documents in digital formats.

Google has also pointed out that it is not the original developer of the software. Tesseract was originally developed at the Hewlett Packard Laboratories between 1985 and 1995. HP then got out of the OCR software business, and Tesseract sat unused until Google's recent re-release.

The Tesseract OCR software currently supports only one language: English. It does not include a page layout analysis module, but it is said to be far more accurate than any other open source OCR package available.

Navneet Kaushal is the founder and CEO of PageTraffic, an SEO Agency in India with offices in Chicago, Mumbai and London. A leading search strategist, Navneet helps clients maintain an edge in search engines and the online media. Navneet's expertise has established PageTraffic as one of the most awarded and successful search marketing agencies.