Welcome to your complete guide to using the Pix2Text Mathematical Formula Recognition (MFR) model. This tool converts images of mathematical formulas into LaTeX text. Built on the TrOCR architecture developed by Microsoft, the Pix2Text MFR model performs well on both printed and handwritten mathematical formulas. Read on to learn how to get the most out of it.
Model Overview
The Pix2Text MFR model takes an image of a mathematical formula as input and outputs the corresponding LaTeX. It is useful for researchers, students, and educators who want to digitize mathematical content easily. Here's what you need to know:
- Purpose: Converts images of mathematical formulas into LaTeX text.
- Limitation: The model is specifically trained on mathematical formula images; hence it may falter when handling other types of images.
How to Use Pix2Text
This section walks through three ways to use the Pix2Text model in your projects.
Method 1: Direct Use without Pix2Text Installation
This method lets you use the model without installing Pix2Text. It is best suited to pure formula images, that is, images that contain only a formula. Here's a simple analogy: think of it as using a calculator to solve a math problem without building the calculator yourself!
```python
#! pip install "transformers>=4.37.0" pillow "optimum[onnxruntime]"
from PIL import Image
from transformers import TrOCRProcessor
from optimum.onnxruntime import ORTModelForVision2Seq

# Load the image processor and the ONNX model from the Hugging Face Hub
processor = TrOCRProcessor.from_pretrained('breezedeus/pix2text-mfr')
model = ORTModelForVision2Seq.from_pretrained('breezedeus/pix2text-mfr', use_cache=False)

# Paths to the formula images to recognize
image_fps = [
    'examples/example.jpg',
    'examples/42.png',
    'examples/0000186.png',
]
images = [Image.open(fp).convert('RGB') for fp in image_fps]

# Preprocess the images, generate token IDs, and decode them into LaTeX strings
pixel_values = processor(images=images, return_tensors='pt').pixel_values
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)
print(f'Generated IDs: {generated_ids}, Generated text: {generated_text}')
```
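Once `generated_text` holds the recognized LaTeX strings, a common next step is to embed them in a Markdown document. The sketch below shows one way to do that. `to_display_math` is a hypothetical helper, not part of Pix2Text or transformers, and it assumes the model returns bare LaTeX without surrounding math delimiters.

```python
def to_display_math(latex_strings):
    """Wrap each non-empty LaTeX string in $$ ... $$ display-math delimiters.

    Hypothetical helper for illustration; assumes the inputs are bare LaTeX
    strings with no delimiters of their own.
    """
    blocks = []
    for s in latex_strings:
        s = s.strip()
        if not s:
            continue  # skip empty predictions
        blocks.append(f"$$\n{s}\n$$")
    return "\n\n".join(blocks)


# Hand-written strings standing in for the model's generated_text
sample = [r"\frac{a}{b}", "   ", r"x^{2} + y^{2} = z^{2}"]
print(to_display_math(sample))
```

The resulting text can be written to a `.md` file and viewed in any Markdown renderer that supports `$$` display math.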
Method 2: Installation of Pix2Text
This method requires installing Pix2Text, which enables recognition of both pure formula images and mixed images that combine text and formulas. It is similar to installing an app on your phone to unlock additional functionality!
```bash
pip install pix2text==1.1
```

```python
from pix2text import Pix2Text, merge_line_texts

image_fps = [
    'examples/example.jpg',
    'examples/42.png',
    'examples/0000186.png',
]
p2t = Pix2Text.from_config()
outs = p2t.recognize_formula(image_fps)  # recognize pure formula images
outs2 = p2t.recognize('examples/mixed.jpg', file_type='text_formula', return_text=True, save_analysis_res='mixed-out.jpg')  # recognize mixed images
print(outs2)
```
Method 3: Using the Notebook
If you would rather not write the code yourself, you can try the Pix2Text model through a ready-made notebook. Simply follow this link to access the [Pix2Text Notebook](https://github.com/breezedeus/Pix2Text/blob/main/pix2text_v1_1.ipynb).
Performance Insights
The effectiveness of the Pix2Text model is gauged by its Character Error Rate (CER) on a variety of mathematical formula images, where a lower CER means better recognition. It's like a race in which we compare which runner finishes fastest; here, the models are compared on CER to see which performs better under different conditions.

In this comparison, the Pix2Text V1.0 model is reported to outperform its predecessors significantly across diverse mathematical formula images.
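To make the metric concrete, here is a minimal sketch of CER using its standard definition: the Levenshtein (edit) distance between the predicted and reference strings, divided by the length of the reference. The source does not say how its CER figures were computed, so the character-level comparison on raw LaTeX strings and the function names `levenshtein` and `cer` are assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def cer(prediction: str, reference: str) -> float:
    """Character Error Rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(prediction, reference) / len(reference)


# One wrong character in an 11-character reference
print(cer(r"\frac{a}{d}", r"\frac{a}{b}"))
```

A CER of 0 means the prediction matches the reference exactly. Values can exceed 1 when the prediction is much longer than the reference.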
Troubleshooting Tips
If you encounter issues while using the Pix2Text model, consider the following troubleshooting ideas:
- Ensure you are using image files that contain clear mathematical formulas; blurred or low-resolution images may not yield good results.
- Verify that the versions of the installed libraries match those specified in the documentation.
- If you are using a Jupyter notebook, check for kernel restarts or resource limits (such as memory) that may interrupt execution.
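For the library check above, a quick way to see what is installed is to query package metadata from Python. This is a minimal sketch using only the standard library. `installed_versions` is a hypothetical helper, and the package list simply mirrors the `pip install` commands in this guide.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is missing from this environment
    return versions


# Packages named in the install commands of this guide
for name, version in installed_versions(
    ["transformers", "pillow", "optimum", "onnxruntime", "pix2text"]
).items():
    print(f"{name}: {version or 'not installed'}")
```

Running this inside the same notebook kernel or virtual environment that executes the recognition code shows whether the expected versions are the ones actually in use.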