How to Use the LILT-EN-FUNSD Model

Nov 26, 2022 | Educational

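The LILT-EN-FUNSD model is a LiLT (Language-Independent Layout Transformer) checkpoint fine-tuned for form understanding: given the words on a scanned form and their bounding boxes, it labels each token as part of a header, a question, an answer, or none of these. In this guide, we will walk you through loading the model, running inference, and drawing the predictions onto the page.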

Model Overview

The LILT-EN-FUNSD model is a fine-tuned version of the lilt-roberta-en-base model on the FUNSD-layoutLMv3 dataset. On the evaluation set, it reports the following metrics:

  • Overall Precision: 0.8797
  • Overall Recall: 0.9006
  • Overall F1 Score: 0.8900
  • Overall Accuracy: 0.8204
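
As a quick sanity check, the reported F1 score can be reproduced from the precision and recall above. This is a minimal sketch, assuming F1 here is the usual harmonic mean of the two; the function name `f1_score` is only illustrative.

```python
# Reported evaluation metrics, copied from the list above.
precision = 0.8797
recall = 0.9006


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


f1 = f1_score(precision, recall)
print(f"F1 = {f1:.4f}")  # prints "F1 = 0.8900"
```

The result matches the listed F1 score to four decimal places, so the three figures are consistent with each other.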

Setting Up the Model

To use the LILT-EN-FUNSD model, start by loading the model and its processor from the Hugging Face Hub:

python
from transformers import LiltForTokenClassification, LayoutLMv3Processor
from PIL import Image, ImageDraw, ImageFont
import torch

# Load model and processor from Hugging Face Hub
model = LiltForTokenClassification.from_pretrained("philschmid/lilt-en-funsd")
processor = LayoutLMv3Processor.from_pretrained("philschmid/lilt-en-funsd")

Understanding the Code

Imagine creating a superhero team of your own, with each member representing a specific function. In this scenario, the LILT-EN-FUNSD model is akin to your lead superhero, adept at understanding and interpreting documents. The LayoutLMv3Processor acts as the sidekick, preparing the data for the superhero to engage with: in this example it receives only the page image, runs OCR on it, and turns the recognized words and their bounding boxes into model inputs.

The lines of code above load both the superhero (the model) and the sidekick (the processor) from a shared public repository (the Hugging Face Hub), allowing them to work together to label the parts of a document.

Running Inference

The helper below draws each token's bounding box and predicted label onto the image, skipping tokens labeled "O" (outside any entity):

python
def draw_boxes(image, boxes, predictions):
    # NOTE: unnormalize_box and label2color are helpers that must be defined before this function is called
    width, height = image.size
    # Scale the model's box coordinates back to pixel coordinates
    unnormalized_boxes = [unnormalize_box(box, width, height) for box in boxes]
    draw = ImageDraw.Draw(image)  # draws in place on the input image
    font = ImageFont.load_default()
    for prediction, box in zip(predictions, unnormalized_boxes):
        # Skip tokens that are outside any entity
        if prediction == "O":
            continue
        draw.rectangle(box, outline=label2color[prediction])
        draw.text((box[0] + 10, box[1] - 10), text=prediction, fill=label2color[prediction], font=font)
    return image
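
The draw_boxes function calls two names that are not defined in this guide: unnormalize_box and label2color. The sketch below shows one plausible way to define them. It assumes that box coordinates arrive on the 0 to 1000 grid commonly used by LayoutLM-style processors and that the labels are the FUNSD BIO tags; the colors are arbitrary choices, not part of the model.

```python
def unnormalize_box(bbox, width, height):
    """Scale a box from the assumed 0-1000 grid to pixel coordinates."""
    return [
        width * (bbox[0] / 1000),   # left
        height * (bbox[1] / 1000),  # top
        width * (bbox[2] / 1000),   # right
        height * (bbox[3] / 1000),  # bottom
    ]


# ASSUMPTION: FUNSD-style BIO labels; "O" is omitted because draw_boxes skips it.
# The colors are illustrative only.
label2color = {
    "B-HEADER": "blue",
    "I-HEADER": "blue",
    "B-QUESTION": "red",
    "I-QUESTION": "red",
    "B-ANSWER": "green",
    "I-ANSWER": "green",
}
```

These definitions need to exist before draw_boxes is called, though not necessarily before it is defined. If the label names differ, model.config.id2label shows the ones the checkpoint actually uses.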

Example of Inference Function

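To see the model in action and apply it to an image, use the run_inference function. It drops pixel_values from the processor output, because LiLT works from text and layout only, and then maps each predicted class id to its label name: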

python
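def run_inference(image, model=model, processor=processor, output_image=True):
    # Create the model input (token ids and bounding boxes from OCR)
    encoding = processor(image, return_tensors="pt")
    # LiLT does not take image pixels, so drop them
    del encoding["pixel_values"]
    # Run the forward pass without tracking gradients
    with torch.no_grad():
        outputs = model(**encoding)
    predictions = outputs.logits.argmax(-1).squeeze().tolist()
    # Map class ids to label names
    labels = [model.config.id2label[prediction] for prediction in predictions]
    if output_image:
        return draw_boxes(image, encoding["bbox"][0].tolist(), labels)
    else:
        return labels

# `dataset` is assumed to be a FUNSD-style dataset with an "image" column, loaded beforehand
run_inference(dataset["test"][34]["image"])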
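
When output_image is False, run_inference returns a flat list of label strings, one per token. The sketch below shows one way such a list might be summarized. The helper name count_entity_tokens and the sample list are made up for illustration, and the code assumes BIO-style labels such as "B-QUESTION".

```python
from collections import Counter


def count_entity_tokens(labels):
    """Count tokens per entity type, ignoring "O" and the B-/I- prefix."""
    counts = Counter()
    for label in labels:
        if label == "O":
            continue
        # "B-QUESTION" and "I-QUESTION" both count toward "QUESTION"
        counts[label.split("-", 1)[1]] += 1
    return dict(counts)


# Made-up sample; in practice this would come from
# run_inference(image, output_image=False).
sample_labels = [
    "O", "B-HEADER", "I-HEADER",
    "B-QUESTION", "I-QUESTION", "I-QUESTION",
    "B-ANSWER", "O",
]
print(count_entity_tokens(sample_labels))
# prints {'HEADER': 2, 'QUESTION': 3, 'ANSWER': 1}
```

Note that these are token counts, not word or entity counts: a single word can be split into several tokens, and each token gets its own label.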

Training Procedure and Hyperparameters

The model was trained with the following hyperparameters:

  • Learning Rate: 5e-05
  • Training Batch Size: 8
  • Optimizer: Adam
  • Training Steps: 2500
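
To put the step count in perspective, the sketch below estimates how many passes over the training data 2500 steps correspond to. The key names follow the parameter names of transformers.TrainingArguments, but nothing is passed to the library here. The training-set size of 149 forms is an assumption based on the standard FUNSD train split, as are single-device training and no gradient accumulation.

```python
import math

# Values from the list above; key names mirror transformers.TrainingArguments.
hyperparameters = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "max_steps": 2500,
}

# ASSUMPTION: 149 training forms (standard FUNSD train split),
# one device, no gradient accumulation.
num_train_examples = 149

steps_per_epoch = math.ceil(
    num_train_examples / hyperparameters["per_device_train_batch_size"]
)
epochs = hyperparameters["max_steps"] / steps_per_epoch
print(steps_per_epoch, round(epochs, 2))  # prints "19 131.58"
```

Under these assumptions the model sees each training form roughly 130 times, which is worth keeping in mind when adapting the recipe to a larger dataset.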

Troubleshooting Ideas

If you encounter issues while running the model or have specific questions, consider the following troubleshooting tips:

  • Ensure you have the latest versions of the necessary libraries (Transformers, PyTorch, Datasets, and Tokenizers).
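  • Check your PyTorch setup (the code above uses PyTorch, not TensorFlow), as misconfigured environments can lead to errors. Because the processor runs OCR on the image, Tesseract and the pytesseract package also need to be installed.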
  • Rerun your inference to see if it may have been a one-off error.
  • For any setup, installation, or coding questions, feel free to consult the community or reach out.

Final Thoughts

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
