How to Use Optical Character Recognition with Doctr

Apr 17, 2022 | Educational

Optical Character Recognition (OCR) turns the text in an image into machine-readable strings. Doctr, an open-source library that runs on TensorFlow 2 or PyTorch, makes OCR available in a few lines of Python. In this guide, we will walk you through how to use Doctr for text detection (finding where the text is) and text recognition (reading what it says).

What You Need

  • Python installed on your computer.
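  • Doctr library: You can install it using pip (the package is published on PyPI as python-doctr), together with either TensorFlow or PyTorch as the backend.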
  • An image file containing text that you want to recognize.

Getting Started

To begin, you’ll want to load your image into the Doctr framework. Below are the steps to accomplish that.

Example Usage

Here’s the complete code to set up and run the OCR process:

from doctr.io import DocumentFile
from doctr.models import ocr_predictor, from_hub

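# Path to the image you want to read (placeholder: replace with your own file)
image_path = 'path/to/your/image.jpg'

# Load your document; from_images accepts a list of image paths
img = DocumentFile.from_images([image_path])

# Load the pre-trained model from the hub
# ('mindee/my-model' is a placeholder repo id in '<user>/<model>' form)
model = from_hub('mindee/my-model')

# Set up the predictor
# If the hub model is a recognition model, pass it as reco_arch
predictor = ocr_predictor(det_arch='db_mobilenet_v3_large',
                          reco_arch=model,
                          pretrained=True)

# If the hub model is a detection model, pass it as det_arch instead
# predictor = ocr_predictor(det_arch=model,
#                           reco_arch='crnn_mobilenet_v3_small',
#                           pretrained=True)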

# Get your predictions
res = predictor(img)
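
Once the predictor has run, the result can be converted to plain Python data and filtered. The following is a minimal sketch, assuming the nested layout that Doctr's res.export() produces (pages, then blocks, then lines, then words, each word carrying a 'value' and a 'confidence'). The helper name words_above and the sample dictionary are illustrative and not part of Doctr:

```python
from typing import Any, Dict, List


def words_above(exported: Dict[str, Any], min_conf: float = 0.5) -> List[str]:
    """Collect recognized words whose confidence is at least min_conf.

    Assumes the nested layout of an exported Doctr result:
    pages -> blocks -> lines -> words, each word with 'value' and 'confidence'.
    """
    kept: List[str] = []
    for page in exported.get("pages", []):
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                for word in line.get("words", []):
                    if word.get("confidence", 0.0) >= min_conf:
                        kept.append(word["value"])
    return kept


# Hand-written stand-in for res.export(), so the sketch runs without a model.
sample = {
    "pages": [
        {
            "blocks": [
                {
                    "lines": [
                        {
                            "words": [
                                {"value": "Hello", "confidence": 0.98},
                                {"value": "world", "confidence": 0.91},
                            ]
                        },
                        {"words": [{"value": "x9", "confidence": 0.12}]},
                    ]
                }
            ]
        }
    ]
}

print(words_above(sample, min_conf=0.5))  # ['Hello', 'world']
```

With a real result, the call would be words_above(res.export()). If you only need the full text as a single string, res.render() returns it directly.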

Breaking Down the Code

Think of this process as making a smoothie:

  • First, you gather your ingredients (load the image using DocumentFile.from_images).
  • Next, you select the appropriate recipe (choose the right model using from_hub).
  • Then, you blend everything together (initialize the predictor). The predictor has two stages: detection finds where the text sits in the image, and recognition reads the characters inside each detected region. The hub model can fill either role, depending on which kind of model it is.
  • Finally, you taste it (run predictor(img) to see your results).

Troubleshooting

If your OCR predictions aren’t coming out as expected, consider the following troubleshooting tips:

  • Ensure that your image_path is correct.
  • Double-check that the hub model is passed to the matching argument: a recognition model goes in reco_arch, a detection model in det_arch.
  • Make sure that a backend, either TensorFlow or PyTorch, is installed in your environment. You need only one of them.
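
The first and last checks above can be automated before any model is loaded. This is a rough sketch using only the Python standard library; the function name preflight and the wording of its messages are hypothetical, and it only checks that the packages can be found, not that they work together:

```python
import importlib.util
from pathlib import Path
from typing import List


def preflight(image_path: str) -> List[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems: List[str] = []

    # The image file must exist before DocumentFile.from_images can read it.
    if not Path(image_path).is_file():
        problems.append(f"image not found: {image_path}")

    # find_spec returns None when a top-level package cannot be located.
    if importlib.util.find_spec("doctr") is None:
        problems.append("doctr is not importable (try: pip install python-doctr)")

    # One backend is enough; report only when both are missing.
    backends = [
        name
        for name in ("torch", "tensorflow")
        if importlib.util.find_spec(name) is not None
    ]
    if not backends:
        problems.append("neither torch nor tensorflow is installed")

    return problems


# Placeholder path: replace with your own file.
for problem in preflight("path/to/your/image.jpg"):
    print(problem)
```

An empty result does not guarantee good predictions, but it rules out the most common setup mistakes.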

Conclusion

Utilizing OCR with Doctr can significantly enhance your text extraction capabilities, making it easier to digitize and analyze written information. Remember to refer to the documentation for more advanced features and options!
