In this blog, we’ll explore a Vision Transformer (ViT) fine-tuned on the Kvasir v2 dataset for colonoscopy image classification. The model reaches an overall accuracy of 0.93 on its held-out evaluation set, making it a useful tool for analyzing endoscopic images. Let’s dive into how to get started!
Live Demo
Before we jump into the code, you can try the model yourself: the hosted inference widget on its Hugging Face model page lets you drag and drop an image and see the predicted class instantly.
Training and Metrics
The model produced the following per-class metrics on a held-out evaluation set of 480 images:
```
                        precision  recall  f1-score  support

    dyed-lifted-polyps       0.95    0.93      0.94       60
dyed-resection-margins       0.97    0.95      0.96       64
           esophagitis       0.93    0.79      0.85       67
          normal-cecum       1.00    0.98      0.99       54
        normal-pylorus       0.95    1.00      0.97       57
         normal-z-line       0.82    0.93      0.87       67
                polyps       0.92    0.92      0.92       52
    ulcerative-colitis       0.93    0.95      0.94       59

              accuracy                         0.93      480
             macro avg       0.93    0.93      0.93      480
          weighted avg       0.93    0.93      0.93      480
```
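A table like the one above is the standard scikit-learn classification report. As a minimal sketch of how such a report is produced, assuming you have collected ground-truth and predicted labels for an evaluation split (the tiny label lists below are purely illustrative, not the real Kvasir v2 results):

```python
from sklearn.metrics import classification_report

# The eight Kvasir v2 classes the model distinguishes
class_names = [
    "dyed-lifted-polyps", "dyed-resection-margins", "esophagitis",
    "normal-cecum", "normal-pylorus", "normal-z-line",
    "polyps", "ulcerative-colitis",
]

# Hypothetical labels for a handful of images; in practice these come
# from running the classifier over the full evaluation set.
y_true = ["polyps", "polyps", "normal-cecum", "esophagitis"]
y_pred = ["polyps", "normal-cecum", "normal-cecum", "esophagitis"]

# Prints per-class precision/recall/F1 plus accuracy and averaged rows
print(classification_report(y_true, y_pred, labels=class_names, zero_division=0))
```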
How to Use the Model
Here’s a step-by-step guide on how to implement the model in your own projects!
- First, import the necessary libraries:

```python
# ViT building blocks from Transformers, plus the HugsVision inference wrapper
from transformers import ViTFeatureExtractor, ViTForImageClassification
from hugsvision.inference.VisionClassifierInference import VisionClassifierInference
```

- Next, define the model path and initialize the classifier:

```python
# Fine-tuned checkpoint hosted on the Hugging Face Hub
path = "mrm8488/vit-base-patch16-224_finetuned-kvasirv2-colonoscopy"

classifier = VisionClassifierInference(
    feature_extractor=ViTFeatureExtractor.from_pretrained(path),
    model=ViTForImageClassification.from_pretrained(path),
)
```

- Finally, provide the image path and classify the image:

```python
img = "Your image path"  # e.g. a JPEG frame from a colonoscopy exam
label = classifier.predict(img_path=img)
print("Predicted class:", label)
```
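If you would rather not add HugsVision as a dependency, the same checkpoint can be queried through the standard Transformers `pipeline` API. This is a sketch under that assumption; the model identifier is the same one used above:

```python
from transformers import pipeline

# Build an image-classification pipeline straight from the Hub checkpoint
clf = pipeline(
    "image-classification",
    model="mrm8488/vit-base-patch16-224_finetuned-kvasirv2-colonoscopy",
)

# Returns a list of {"label": ..., "score": ...} dicts, highest score first
preds = clf("Your image path")
print(preds[0]["label"], round(preds[0]["score"], 3))
```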
Code Analogy
Imagine your code as a chef preparing a gourmet meal. The imports act like the ingredients you gather at the market: fresh vegetables (ViTFeatureExtractor) and a perfectly marinated cut of meat (ViTForImageClassification). Initializing the classifier is the preparation, washing, chopping, and sautéing those ingredients into a working dish. Finally, the image is the meal itself, and predicting its class is akin to tasting the dish to confirm it’s delicious. Just as a chef refines a recipe, you can fine-tune the model further to improve accuracy!
Troubleshooting
If you encounter any issues while using this model, here are some common troubleshooting steps:
- Ensure that you have compatible versions of the Transformers and HugsVision libraries installed.
- Check if the image path is correct and that the image format is supported.
- Review the console for any error messages, which can provide clues on what went wrong.
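The first two checks above can be automated before you ever call the classifier. Below is a small helper for doing so; `check_image` is a hypothetical utility written for this post, not part of HugsVision or Transformers:

```python
from pathlib import Path
from PIL import Image

def check_image(path_str: str) -> None:
    """Sanity-check an image file before passing it to the classifier."""
    path = Path(path_str)
    if not path.is_file():
        raise FileNotFoundError(f"No file at: {path}")
    with Image.open(path) as img:
        img.verify()  # raises if the file is corrupt or not a supported image
    print(f"{path.name}: OK")

# Usage: check_image("path/to/colonoscopy_frame.jpg")
```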
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.