Verifying Text from an Image Using an Optical Character Recognition Tool

Modified on Tue, 12 Nov, 2024 at 3:19 PM

TABLE OF CONTENTS

1. Optical Character Recognition Overview

2. Prerequisites

3. Testing Data

Available for the Python Selenium, Java WinAppDriver, and Java Selenium frameworks.


1. Optical Character Recognition Overview


Optical Character Recognition (OCR) is the process that converts an image of text into a machine-readable text format. For example, if you scan a receipt, your computer saves the scanned receipt as an image file. You cannot use a text editor to edit, search, or count the words in the image file. However, you can use OCR to convert the image into a text document with its contents stored as text data.


In this topic, the Python-tesseract tool is used to convert an image from a report generated during web automation into text.
Python-tesseract is an optical character recognition (OCR) tool for Python. That is, it recognizes and reads the text embedded in images.
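
As a rough illustration of what "reading the text embedded in an image" looks like in practice, the sketch below passes an image to pytesseract and tidies the result. The function names and the image path you pass in are hypothetical and not part of the original article. It assumes that pytesseract (which installs Pillow as a dependency) and the Tesseract engine are already installed, as described in the sections that follow.

```python
def normalize(text):
    """Collapse line breaks and repeated spaces that OCR output often contains."""
    return " ".join(text.split())


def read_text_from_image(image_path):
    """Return the text that Tesseract recognizes in the image at image_path."""
    # Imported inside the function so that normalize() can be used
    # even on a machine where the OCR packages are not installed yet.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as image:
        raw_text = pytesseract.image_to_string(image)
    return normalize(raw_text)
```

Calling read_text_from_image("report.png"), where report.png is a placeholder file name, would return the recognized text as a single line. Normalizing whitespace is a design choice here: it makes later comparisons less sensitive to how Tesseract breaks lines.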

For the Java WinAppDriver and Java Selenium frameworks, Tess4J installation is required. Install Tess4J by including it as a dependency in your Java project. Tess4J can be obtained from the official Tess4J GitHub repository. Follow the instructions provided in the README to add Tess4J to your project.


2. Prerequisites

  • You must install Python version 3.11.x. To install this version, click here.
  • Selenium must be installed. If it is not, run the following command in the command prompt:
    pip install selenium

  • To install the Tesseract tool, download tesseract-ocr-w64-setup-5.3.3.20231005.exe, run it, and follow the on-screen prompts to complete the installation.
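
The prerequisites above can be checked from Python before you continue. The following is a minimal sketch, not an official check: the function names are hypothetical, and it assumes that the Tesseract executable is discoverable on the system PATH, which may not be the case after a default Windows installation.

```python
import shutil
import sys
from importlib import metadata


def package_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_prerequisites():
    """Collect the state of each prerequisite listed above."""
    return {
        # True when the running interpreter is a 3.11.x release.
        "python_3_11": sys.version_info[:2] == (3, 11),
        # Version string, or None if Selenium is not installed.
        "selenium": package_version("selenium"),
        # Full path to the executable, or None if it is not on PATH.
        "tesseract": shutil.which("tesseract"),
    }


for name, value in check_prerequisites().items():
    print(f"{name}: {value}")
```

If the tesseract entry is None even though the installer completed, the executable is probably not on PATH. In that case, pytesseract can be pointed at it directly by setting pytesseract.pytesseract.tesseract_cmd to the full path of tesseract.exe. C:\Program Files\Tesseract-OCR\tesseract.exe is a commonly used location, but treat it as an assumption and confirm it on your machine.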

3. Testing Data

Perform the following steps:

  1. Run the following commands in the command prompt to install pytesseract and opencv-python:
    pip install pytesseract
    pip install opencv-python

    pip install pytesseract: This command uses the pip package manager to install the pytesseract package, which is a Python wrapper for Google's Tesseract-OCR Engine. This package allows Python to interact with Tesseract-OCR, enabling text recognition from images.
    pip install opencv-python: This command uses pip to install the opencv-python package, which is a Python library for computer vision and image processing. OpenCV (Open Source Computer Vision Library) provides tools and functions to manipulate images and perform various computer vision tasks.

  2. To verify the installation, run the following command to list all the installed packages.
    pip list


  3. The output file contains the verified text from an image and displays the image content.
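
The article does not show the script that produces the output file, so the following is only a sketch of how the steps above might fit together. It assumes that pytesseract and opencv-python were installed in step 1. The function names, the default output file name ocr_output.txt, and the preprocessing choice (grayscale plus Otsu thresholding) are illustrative assumptions, not the article's own method.

```python
def contains_expected(ocr_text, expected):
    """Check for the expected text, ignoring case and whitespace differences."""
    def squash(value):
        return " ".join(value.split()).lower()

    return squash(expected) in squash(ocr_text)


def verify_image_text(image_path, expected, output_path="ocr_output.txt"):
    """Run OCR on an image, save the recognized text, and report a match."""
    # Imported here so contains_expected() works without the OCR packages.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        # cv2.imread returns None instead of raising when the file is unreadable.
        raise FileNotFoundError(f"Could not read image: {image_path}")

    # Grayscale plus Otsu thresholding often gives Tesseract a cleaner input.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    text = pytesseract.image_to_string(binary)

    # Write the recognized text to the output file for later inspection.
    with open(output_path, "w", encoding="utf-8") as output_file:
        output_file.write(text)

    return contains_expected(text, expected)
```

In a Selenium test, the image would typically be a screenshot of the generated report saved to disk first. The result of verify_image_text could then be used in an assertion, for example to confirm that a hypothetical report title appears in the image.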
