Transforming Math into Markdown: A Guide to Using Texify

Jun 7, 2022 | Data Science

Texify is an OCR (optical character recognition) model that converts images or PDFs containing mathematical content into Markdown and LaTeX that MathJax can render. It replaces tedious manual transcription of equations with automated conversion. This guide walks through installation, command-line usage, troubleshooting, and the interactive Streamlit app.

What You Will Need

  • Python version 3.9 or higher
  • PyTorch (CPU or GPU version)

Step-by-Step Installation

To get Texify up and running, follow the straightforward steps outlined below:

  • Ensure you have Python and PyTorch installed. If you’re unsure about your PyTorch installation, consult the official PyTorch installation guide.
  • Open your command line interface and run the following command:
    • pip install texify
  • The first time you run Texify, the model weights are downloaded automatically.
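
Before installing, it can help to confirm that the prerequisites listed above are in place. The following is a minimal sketch that uses only the Python standard library. The function name check_prerequisites is our own and is not part of Texify, and the check only tests whether a torch package can be found, not whether it works with your GPU.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in this guide


def check_prerequisites(min_python=MIN_PYTHON):
    """Report whether the interpreter and PyTorch look ready for Texify."""
    return {
        # Compare only (major, minor) against the required minimum.
        "python_ok": sys.version_info[:2] >= tuple(min_python),
        # find_spec returns None when the package cannot be located.
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

If either entry reports "no", fix that first and then run pip install texify.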

Usage Instructions

Knowing how to properly deploy Texify will ensure you have a seamless experience. Here’s how to get started:

  • Before running Texify, review the options in texify/settings.py. You can override settings with environment variables if needed.
  • Your device is detected automatically. To set it explicitly, use the TORCH_DEVICE environment variable:
    • TORCH_DEVICE=cuda for an NVIDIA GPU
    • TORCH_DEVICE=mps for Apple Silicon (MPS)
  • To convert equations into markdown, run the command below, replacing path_to_folder_or_file with your actual file path. Here --max limits how many images are converted and --json_path sets the file the results are written to:
    • texify path_to_folder_or_file --max 8 --json_path results.json
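
If you want to launch the same command from a script, the steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not Texify's own API: the helper names build_texify_command, build_env, and run_texify are hypothetical, and the sketch only wraps the texify command and the TORCH_DEVICE variable described in this guide.

```python
import os
import subprocess


def build_texify_command(input_path, max_images=None, json_path=None):
    """Assemble the texify CLI call from this guide as an argument list."""
    cmd = ["texify", str(input_path)]
    if max_images is not None:
        cmd += ["--max", str(max_images)]
    if json_path is not None:
        cmd += ["--json_path", str(json_path)]
    return cmd


def build_env(device=None, base_env=None):
    """Copy the environment and optionally pin TORCH_DEVICE ("cuda" or "mps")."""
    env = dict(os.environ if base_env is None else base_env)
    if device is not None:
        env["TORCH_DEVICE"] = device
    return env


def run_texify(input_path, device=None, max_images=8, json_path="results.json"):
    """Run texify as a subprocess; requires the texify executable on PATH."""
    cmd = build_texify_command(input_path, max_images, json_path)
    # check=True raises CalledProcessError if texify exits with an error.
    return subprocess.run(cmd, env=build_env(device), check=True)
```

For example, run_texify("path_to_folder_or_file", device="cuda") would be equivalent to setting TORCH_DEVICE=cuda and running the command shown above. Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.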

Code Explanation: Understanding Texify

Think of Texify as a skilled artist painting a vase. Given an image of a math expression (the vase), Texify traces every character and symbol (each curve and line) and produces markdown/LaTeX (the finished painting) that MathJax can render. And just as an artist adjusts brush strokes for clarity, Texify depends on precise input: how tightly the image is cropped around the equation strongly affects the result.

Troubleshooting Tips

Even the best tools may encounter hiccups. Should you face challenges while using Texify, consider the following troubleshooting steps:

  • If the results are not accurate, experiment with the size of the box you draw around the equation. A slightly larger or smaller selection can yield better results.
  • Texify is sensitive to cropping, so try shifting your selection slightly, or split one large box into several smaller ones.
  • In some instances, KaTeX may not render an equation (indicated by a red error), but the LaTeX will still be valid. Feel free to copy it and render it in another environment.
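
When you crop images yourself rather than drawing boxes by hand, the "try a slightly larger or smaller box" advice can be applied systematically. The sketch below is our own illustration and not part of Texify: the function box_variants and its default padding values are hypothetical, and it only computes candidate boxes without performing any OCR.

```python
def box_variants(box, image_size, pads=(-10, 0, 10, 20)):
    """Return copies of a (left, top, right, bottom) box grown or shrunk by each pad.

    Boxes are clamped to the image bounds; collapsed or duplicate boxes are dropped.
    """
    left, top, right, bottom = box
    width, height = image_size
    variants = []
    for pad in pads:
        candidate = (
            max(0, left - pad),
            max(0, top - pad),
            min(width, right + pad),
            min(height, bottom + pad),
        )
        # Skip boxes that collapsed to zero or negative area after shrinking.
        if candidate[0] >= candidate[2] or candidate[1] >= candidate[3]:
            continue
        if candidate not in variants:
            variants.append(candidate)
    return variants
```

Each tuple follows the (left, upper, right, lower) order that Pillow's Image.crop accepts, so the resulting crops could be saved to a folder and passed to texify to compare which selection transcribes best.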


Interactive Conversion with Streamlit

For a more hands-on experience, Texify includes a Streamlit app that enables you to interactively select equations from images or PDF files. To set it up:

  • Install the required packages:
    • pip install streamlit streamlit-drawable-canvas
  • Run the app using the command:
    • streamlit run texify_gui.py

Final Thoughts

Texify is a robust OCR model and a practical tool for researchers, educators, and students who need to convert mathematical content into usable formats. Remember that slight adjustments to the input can yield much better output, much like tuning a musical instrument.
