In the realm of image classification, utilizing advanced models can significantly enhance your project’s output. The WD SwinV2 Tagger v3 by SmilingWolf, now compatible with the 🤗 Transformers library, allows you to classify images intelligently. In this guide, we will walk you through the setup and usage of this powerful tool.
Step 1: Installation
To get started, you need to install the Transformers library. Open your terminal and run the following command:
pip install transformers
Step 2: Setting Up the Pipeline
Once you have the library installed, it’s time to set up your image classification pipeline. Below is a simple analogy to help understand what a pipeline does:
Imagine you are at a restaurant. The pipeline is like the entire dining experience: you place an order (inputting your image), the kitchen prepares the meal (processing the image using the model), and finally, the waiter brings it to your table (displaying the classification results).
Here’s how you can set it up in your code:
from transformers import pipeline

pipe = pipeline(
    "image-classification",
    model="p1atdev/wd-swinv2-tagger-v3-hf",
    trust_remote_code=True,
)

print(pipe("sample.webp", top_k=15))
This code snippet initializes the image classification pipeline with the WD SwinV2 Tagger model. You can then classify your images and obtain the highest-scoring tags (here, the top 15, as set by top_k).
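The pipeline returns a list of dictionaries, each with a label and a score, so you can iterate over the results directly. A minimal sketch (the exact tags depend on your image):

for entry in pipe("sample.webp", top_k=15):
    # Each entry looks like {"label": "...", "score": 0.xx}
    print(f"{entry['label']}: {entry['score']:.3f}")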
Step 3: Using the AutoModel
If you need more control over the model’s output, you can use the AutoModel feature. This method allows for more detailed interactions with the model, similar to a chef showing you how each ingredient in your dish contributes to the overall flavor.
Below is how you can implement this:
from PIL import Image
import torch
from transformers import AutoImageProcessor, AutoModelForImageClassification

MODEL_NAME = "p1atdev/wd-swinv2-tagger-v3-hf"

model = AutoModelForImageClassification.from_pretrained(MODEL_NAME)
processor = AutoImageProcessor.from_pretrained(MODEL_NAME, trust_remote_code=True)

image = Image.open("sample.webp")
inputs = processor.preprocess(image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs.to(model.device, model.dtype))

# This is a multi-label model, so apply a sigmoid to each logit rather than a softmax
logits = torch.sigmoid(outputs.logits[0])  # take the logits for the first image

# Map indices to tag names, then keep only scores above a 35% threshold
results = {model.config.id2label[i]: logit.float() for i, logit in enumerate(logits)}
results = {k: v for k, v in sorted(results.items(), key=lambda item: item[1], reverse=True) if v > 0.35}

print(results)  # rating tags and character tags are also included
This snippet uses the model directly to analyze the image and returns a dictionary of tags and scores that exceed the defined threshold.
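If you have a GPU available, you can move the model there before running inference. A minimal sketch, assuming a CUDA device; the rest of the snippet above stays the same:

# Move the model to GPU when one is available; otherwise stay on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
# inputs.to(model.device, model.dtype) in the snippet above will then
# place the input tensors on the same device automatically.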
Step 4: Optimization with 🤗 Optimum
If you’re looking to enhance performance, you can utilize the 🤗 Optimum integration, which makes the model faster and lighter. It’s like upgrading your kitchen appliances for better efficiency in the cooking process. Here’s how to install and implement it:
pip install "optimum[onnxruntime]"
from optimum.pipelines import pipeline

pipe = pipeline(
    "image-classification",
    model="p1atdev/wd-swinv2-tagger-v3-hf",
    trust_remote_code=True,
)

print(pipe("sample.webp", top_k=15))
With this setup, you can expect improved inference speed, though accuracy may differ slightly from the original PyTorch model.
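To verify the speedup on your own hardware, you can time the pipeline with the standard library. This is a rough sketch rather than a rigorous benchmark, and the numbers will vary by machine:

import time

pipe("sample.webp", top_k=15)  # warm-up run so one-time setup costs are excluded
start = time.perf_counter()
for _ in range(10):
    pipe("sample.webp", top_k=15)
print(f"average latency: {(time.perf_counter() - start) / 10:.3f}s")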
Understanding the Labels
The tags returned by the model fall into the following categories (a sketch for splitting them apart follows the list):
- Rating tags: indicate the overall content rating of the image (e.g., rating:general, rating:sensitive).
- Character tags: denote specific characters appearing in the image (e.g., character:frieren, character:hatsune miku).
- General tags: all remaining tags, which describe the general content of the image.
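Since the category is encoded in the tag prefix, you can split the results dictionary from Step 3 into groups with plain string checks. A minimal sketch:

# Partition the `results` dict from Step 3 by tag prefix.
rating_tags = {k: v for k, v in results.items() if k.startswith("rating:")}
character_tags = {k: v for k, v in results.items() if k.startswith("character:")}
general_tags = {
    k: v
    for k, v in results.items()
    if not k.startswith(("rating:", "character:"))  # everything else is a general tag
}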
Troubleshooting
If you encounter issues during installation or execution, here are some troubleshooting tips:
- Ensure that you have the latest version of Python and the Transformers library installed.
- Double-check that all model names and paths are correctly specified.
- If you experience runtime errors related to memory, consider using a machine with more RAM or downscaling your input images before inference (see the sketch after this list).
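For the last point, downscaling oversized images before inference is often enough; the processor resizes to the model’s expected input anyway, so shrinking a huge photo first mainly saves memory. A minimal sketch with Pillow (the 1024-pixel cap is an arbitrary choice):

from PIL import Image

image = Image.open("sample.webp")
image.thumbnail((1024, 1024))  # shrinks in place, preserving aspect ratio
# pass `image` to the processor or pipeline as before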
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

