Detecting text in images is a key step in applications such as document processing, automated surveys, and data extraction. CRAFT (Character Region Awareness for Text Detection) is a text detector that locates text areas by scoring individual character regions and the affinities between neighboring characters. In this post, we’ll walk through installation, basic usage, and some advanced techniques for getting more out of CRAFT.
Overview of CRAFT
CRAFT is a PyTorch-based text detection model. It predicts a character region score and an affinity score across the image, thresholds those scores, and then obtains each text bounding box by fitting a minimum bounding rectangle around the resulting connected area.
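To make the idea concrete, here is a simplified sketch of that post-processing step in plain Python. It is not the library's implementation: it returns axis-aligned boxes rather than rotated minimum-area rectangles, and the function name boxes_from_scores and the exact way the three thresholds are applied are assumptions made for illustration.

```python
from collections import deque


def boxes_from_scores(region, affinity, low_text=0.4, link_threshold=0.4,
                      text_threshold=0.7):
    """Group confident pixels into components and return their bounding boxes.

    `region` and `affinity` are 2D lists of scores in [0, 1] with equal shape.
    Each box is (x_min, y_min, x_max, y_max). Illustrative sketch only.
    """
    height, width = len(region), len(region[0])
    # A pixel joins the mask if it looks like a character or like a link
    # between two characters.
    mask = [[region[y][x] >= low_text or affinity[y][x] >= link_threshold
             for x in range(width)] for y in range(height)]
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Flood-fill one 4-connected component starting at (x0, y0).
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            xs, ys, peak = [], [], 0.0
            while queue:
                x, y = queue.popleft()
                xs.append(x)
                ys.append(y)
                peak = max(peak, region[y][x])
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Keep the component only if some pixel is a confident character.
            if peak >= text_threshold:
                boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes


# Two characters (0.9 and 0.8) joined by an affinity pixel, plus a weak blob.
region = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.0, 0.8, 0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
affinity = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.6, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
word_boxes = boxes_from_scores(region, affinity)
print(word_boxes)  # [(1, 1, 3, 1)]
```

The affinity pixel is what merges the two characters into a single word box, while the isolated 0.5 pixel is discarded because it never reaches the higher text threshold.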

Getting Started with CRAFT
Installation
Begin by installing CRAFT via pip. Open your command line interface and run:
pip install craft-text-detector
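If you want to confirm that the install worked before going further, a quick check along these lines can help. This is a minimal sketch using the standard library's importlib.metadata (Python 3.8 or newer); the helper name installed_version is our own and is not part of CRAFT.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("craft-text-detector")
if version is None:
    print("craft-text-detector is not installed in this environment")
else:
    print("craft-text-detector", version)
```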
Basic Usage
Once installed, you can start detecting text in images! Here’s a quick guide:
from craft_text_detector import Craft
# Set image path and output directory
image = 'figures/idcard.png' # Path to the image file
output_dir = 'outputs' # Directory to save results
# Create a CRAFT instance
craft = Craft(output_dir=output_dir, crop_type='poly', cuda=False)
# Apply CRAFT text detection
prediction_result = craft.detect_text(image)
# Unload the models to free RAM/GPU memory
craft.unload_craftnet_model()
craft.unload_refinenet_model()
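Once you have a prediction, you will usually want to do something with the detected regions. The sketch below assumes each detected box is a sequence of (x, y) corner points, which is how the 'boxes' entry of the result is used later in this post; the helper polygon_bounds is a hypothetical name of ours, and the sample coordinates are made up for illustration.

```python
def polygon_bounds(points):
    """Return (x_min, y_min, x_max, y_max) for a sequence of (x, y) points."""
    xs = [float(p[0]) for p in points]
    ys = [float(p[1]) for p in points]
    return min(xs), min(ys), max(xs), max(ys)


# Made-up quadrilateral standing in for one detected text region.
sample_box = [(10, 20), (110, 22), (108, 60), (9, 58)]
bounds = polygon_bounds(sample_box)
print(bounds)  # (9.0, 20.0, 110.0, 60.0)

# With a real result, the same helper could be applied per region, e.g.:
# for box in prediction_result['boxes']:
#     print(polygon_bounds(box))
```

Axis-aligned bounds like these are convenient for sorting regions into reading order or for filtering out very small detections.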
Analogy: Think of CRAFT as a Detective
Imagine you’re a detective analyzing a scene (the image). CRAFT acts like a keen investigator, examining each letter (character region) and its relationship with nearby letters (character affinities). After this investigation, it draws a tight outline around every group of letters that forms a word (a bounding box), so that no text is left out.
Advanced Usage
If you’re looking to expand what you can do with CRAFT, take a look at the advanced usage options:
from craft_text_detector import (
    read_image,
    load_craftnet_model,
    load_refinenet_model,
    get_prediction,
    export_detected_regions,
    export_extra_results,
    empty_cuda_cache,
)
# Set image path and output directory
image = 'figures/idcard.png'
output_dir = 'outputs'
# Read the image
image = read_image(image)
# Load models
refine_net = load_refinenet_model(cuda=True)
craft_net = load_craftnet_model(cuda=True)
# Perform prediction
prediction_result = get_prediction(
    image=image,
    craft_net=craft_net,
    refine_net=refine_net,
    text_threshold=0.7,
    link_threshold=0.4,
    low_text=0.4,
    cuda=True,
    long_size=1280,
)
# Export detected text regions
exported_file_paths = export_detected_regions(
    image=image,
    regions=prediction_result['boxes'],
    output_dir=output_dir,
    rectify=True,
)
# Export additional results
export_extra_results(
    image=image,
    regions=prediction_result['boxes'],
    heatmaps=prediction_result['heatmaps'],
    output_dir=output_dir,
)
# Clean up GPU cache
empty_cuda_cache()
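The example above hardcodes cuda=True, which fails on machines without a working GPU setup. One option is to decide the flag at runtime. The sketch below relies on PyTorch's torch.cuda.is_available(); the helper name cuda_available is our own, and falling back to False when PyTorch cannot be imported is an assumption made so the snippet runs anywhere.

```python
def cuda_available():
    """Return True only if PyTorch is importable and reports a usable GPU."""
    try:
        import torch
    except ImportError:
        # No PyTorch in this environment, so CUDA cannot be used.
        return False
    return bool(torch.cuda.is_available())


use_cuda = cuda_available()
print("Running on", "GPU" if use_cuda else "CPU")
```

The resulting use_cuda value can then be passed wherever the example above writes cuda=True.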
Troubleshooting Common Issues
While CRAFT is relatively easy to implement, you may encounter some challenges along the way. Here are a few troubleshooting tips:
- Model Not Loading: Ensure that your CUDA environment is properly configured or set cuda=False if you’re using the CPU.
- No Text Detected: Adjust the text_threshold and link_threshold parameters. Sometimes lowering these values helps in detecting more text.
- Image Quality: Make sure the input image is clear and well-lit. Text detection can struggle with blurry images.
- Error Messages: Search the documentation and the project’s GitHub issues page for the exact error text. Other users have often reported the same problem along with a fix.
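The threshold tip can be automated by retrying detection with progressively lower values until something is found. The following is a rough sketch: detect_with_fallback, fake_detect, and the particular threshold pairs are all hypothetical and chosen for illustration. A stand-in detector is used so the snippet runs without the model.

```python
def detect_with_fallback(detect, settings):
    """Try each (text_threshold, link_threshold) pair until boxes are found.

    `detect` is any callable that accepts the two thresholds as keyword
    arguments and returns a sequence of boxes.
    """
    for text_threshold, link_threshold in settings:
        boxes = detect(text_threshold=text_threshold,
                       link_threshold=link_threshold)
        if len(boxes) > 0:
            return boxes, (text_threshold, link_threshold)
    return [], None


def fake_detect(text_threshold, link_threshold):
    """Stand-in detector: pretends text only appears at text_threshold <= 0.5."""
    return [[(0, 0), (10, 0), (10, 5), (0, 5)]] if text_threshold <= 0.5 else []


# Strictest settings first, then gradually more permissive ones.
SETTINGS = [(0.7, 0.4), (0.6, 0.3), (0.5, 0.3)]
boxes, used = detect_with_fallback(fake_detect, SETTINGS)
print(len(boxes), used)  # 1 (0.5, 0.3)
```

In real use, detect could be a small wrapper around get_prediction that forwards the two thresholds and returns the 'boxes' entry of the result. Keep in mind that lower thresholds also tend to admit more false positives, so it is worth stopping at the first setting that works.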
Final Thoughts
Now that you have an overview of how to use CRAFT effectively, it’s time to dive in and see what text detection wonders await you!