Sep 6, 2006 by Navneet Kaushal

Google has recently re-released the Tesseract OCR software to the open source community. OCR, or optical character recognition, is a technique for converting printed or scanned text into machine-readable digital text. Text that exists only on paper is becoming passé: with OCR software, you can store the bulk of your older paper documents in digital formats.

Google has also noted that it is not the original developer of the software. Tesseract was originally developed at Hewlett-Packard Laboratories between 1985 and 1995. HP later exited the OCR software business, and Tesseract sat unused until Google's recent re-release.

The Tesseract OCR software currently supports only one language, English. It does not include a page layout analysis module, but it is said to be far more accurate than any other open source OCR package available.

Navneet Kaushal
Navneet Kaushal is the founder and CEO of PageTraffic, an SEO Agency in India with offices in Chicago, Mumbai and London. A leading search strategist, Navneet helps clients maintain an edge in search engines and the online media. Navneet's expertise has established PageTraffic as one of the most awarded and successful search marketing agencies.