tf_mixnet_s.in1k is a MixNet image classification model trained on ImageNet-1k. This blog will walk you through setting it up and using it for image classification, feature map extraction, and generating image embeddings.
Model Overview
The MixNet model is designed for image classification and also works as a feature backbone for downstream tasks. Here’s a brief overview:
- Model Type: Image classification feature backbone
- Parameters: 4.1 million
- GMACs: 0.3
- Activations: 6.3 million
- Image Size: 224 x 224
For more detailed information on the architecture, you can explore the MixConv: Mixed Depthwise Convolutional Kernels paper.
Using the tf_mixnet_s.in1k Model
1. Image Classification
To classify images, follow these steps:
from urllib.request import urlopen
from PIL import Image
import timm
import torch
img = Image.open(urlopen("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png"))
model = timm.create_model("tf_mixnet_s.in1k", pretrained=True)
model = model.eval()
# Get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)
output = model(transforms(img).unsqueeze(0)) # Unsqueeze single image into a batch of 1
top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
In this code:
- We first retrieve an image from a URL, akin to a chef pulling ingredients from the pantry.
- Next, we create the model, similar to assembling our kitchen tools ready for cooking.
- We then apply transformations to our image, akin to prepping our ingredients before cooking.
- Finally, we classify the image and retrieve the top five predictions, much like presenting the best dishes from a buffet.
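To make the last line of the snippet concrete, here is a plain-Python sketch (no torch required) of what the softmax and top-k selection do. The logit values below are made up for illustration, not real model output:

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    exps = [math.exp(x - max(logits)) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def topk(values, k):
    """Return the k largest values and their indices, like torch.topk."""
    indexed = sorted(enumerate(values), key=lambda p: p[1], reverse=True)[:k]
    indices = [i for i, _ in indexed]
    top_values = [v for _, v in indexed]
    return top_values, indices

# Made-up logits for a 6-class toy example (a real model emits 1000 logits for ImageNet-1k)
logits = [0.5, 2.1, -1.0, 3.3, 0.0, 1.2]
probs = softmax(logits)
top3_probs, top3_idx = topk(probs, k=3)
print(top3_idx)  # class indices ordered by probability, highest first
```

With real output, `top5_class_indices` maps into the 1000 ImageNet-1k class labels in the same way.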
2. Feature Map Extraction
Feature maps help in visualizing the various aspects the model learns from the images. Here’s how to extract them:
from urllib.request import urlopen
from PIL import Image
import timm
img = Image.open(urlopen("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png"))
model = timm.create_model("tf_mixnet_s.in1k", pretrained=True, features_only=True)
model = model.eval()
# Get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)
output = model(transforms(img).unsqueeze(0)) # Unsqueeze single image into a batch of 1
for o in output:
    print(o.shape)
Similar to how chefs check the consistency and presentation of each dish, we examine the output shapes of each feature map generated by the model:
- Each feature map corresponds to the output of a stage in the backbone; later stages have smaller spatial dimensions and more channels, reflecting the increasingly abstract features the model has learned.
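As a rough sketch of why the printed shapes shrink: timm `features_only` models typically return maps at spatial reductions such as 2, 4, 8, 16, and 32 relative to the input (the exact set for a given model can be queried from `model.feature_info`; the values below are assumed for illustration). For a 224 x 224 input, the expected spatial sizes are:

```python
# Assumed reductions for illustration; check model.feature_info for the real values.
input_size = 224
reductions = [2, 4, 8, 16, 32]

shapes = [(input_size // r, input_size // r) for r in reductions]
for r, (h, w) in zip(reductions, shapes):
    print(f"reduction {r:>2}: {h} x {w}")
```

So a map at reduction 32 is only 7 x 7, which is why the last entries printed by the loop above are the smallest.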
3. Image Embeddings
Image embeddings, useful for various downstream tasks, can be obtained as follows:
from urllib.request import urlopen
from PIL import Image
import timm
img = Image.open(urlopen("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png"))
model = timm.create_model("tf_mixnet_s.in1k", pretrained=True, num_classes=0) # Remove classifier nn.Linear
model = model.eval()
# Get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)
output = model(transforms(img).unsqueeze(0))  # Output shaped as (batch_size, num_features)

# Or equivalently (without needing to set num_classes=0):
output = model.forward_features(transforms(img).unsqueeze(0))  # Unpooled output
output = model.forward_head(output, pre_logits=True)  # Output shaped as (1, num_features)
Extracting embeddings is like taking a snapshot of the finished dish to showcase its flavors. The output tensor provides a compact representation of the image, ready for use in various applications.
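A common downstream use of embeddings is measuring how similar two images are. Here is a minimal, torch-free sketch of cosine similarity; the 4-dimensional vectors are toy stand-ins for the real embedding vectors this model produces:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings standing in for real model outputs (hypothetical values)
emb_cat_1 = [0.9, 0.1, 0.3, 0.0]
emb_cat_2 = [0.8, 0.2, 0.4, 0.1]
emb_car   = [0.0, 0.9, 0.0, 0.8]

print(cosine_similarity(emb_cat_1, emb_cat_2))  # near 1: similar images
print(cosine_similarity(emb_cat_1, emb_car))    # much lower: dissimilar images
```

With real embeddings you would first convert the output tensor to a list (e.g. `output[0].tolist()`), or use `torch.nn.functional.cosine_similarity` directly on the tensors.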
Troubleshooting
If you encounter any issues while setting up or using the model, here are some troubleshooting tips:
- Issue with image URL: Ensure the URL is accessible and contains a valid image format.
- Model not loading: Check for proper installation of the timm library and that your Python version is compatible.
- Runtime errors: Verify the input tensor shape matches the model’s requirements; the model-specific transforms resize images to 224 x 224 pixels for you.
- Unclear outputs: Simplify your model calls and check intermediate values to understand where the issue might be occurring.
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Model Comparison
Interested in comparing the performance of this model with others? You can explore dataset and runtime metrics through the timm model results.
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

