The world of image classification is continuously evolving, and the ConvNeXt Nano model is a prime example of how far we’ve come. Developed by Ross Wightman and trained on the renowned ImageNet-1k dataset, this model stands out due to its efficiency and performance. In this guide, we’ll explore how to use the ConvNeXt Nano for image classification, feature map extraction, and image embeddings.
Model Overview
- Model Type: Image classification feature backbone
- Parameters: 15.6M
- GMACs: 2.5
- Image Size: Train = 224 x 224, Test = 288 x 288
- Papers: A ConvNet for the 2020s
- Dataset: ImageNet-1k
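As a quick sanity check, the sketch below loads the checkpoint and counts its trainable parameters, which should land near the 15.6M figure listed above; it also prints the resolved data config, whose input_size field reflects the 224 x 224 training resolution. This is only an illustrative verification, not part of the official model card.
import timm
# Instantiate the pretrained checkpoint (weights download on first use)
model = timm.create_model('convnext_nano.d1h_in1k', pretrained=True)
# Count trainable parameters; expect roughly 15.6M
num_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {num_params / 1e6:.1f}M")
# The resolved data config describes the expected preprocessing (input_size, mean, std, ...)
data_config = timm.data.resolve_model_data_config(model)
print(data_config['input_size'])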
Getting Started with Image Classification
To get started with image classification using ConvNeXt Nano, follow these steps:
- Import the necessary libraries and load your image.
- Initialize the ConvNeXt model.
- Apply data transformations.
- Run the model and obtain classification probabilities.
Sample Code for Image Classification
from urllib.request import urlopen
from PIL import Image
import timm
import torch
# Load the image
img = Image.open(urlopen("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png"))
# Initialize the model
model = timm.create_model('convnext_nano.d1h_in1k', pretrained=True)
model = model.eval()
# Data transformation
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)
# Make predictions
output = model(transforms(img).unsqueeze(0)) # Unsqueeze to create a batch of 1
top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
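To turn the top-5 class indices into human-readable labels, one common approach, continuing from the snippet above, is to download an ImageNet-1k class list and index into it. The label file URL below is an assumption (it is the class list used in the PyTorch Hub examples), so substitute your own label source if it is unavailable.
# Assumed label source: the ImageNet-1k class list from the PyTorch Hub examples
label_url = "https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt"
class_names = urlopen(label_url).read().decode("utf-8").splitlines()
# Pair each probability with its class name
for prob, idx in zip(top5_probabilities[0], top5_class_indices[0]):
    print(f"{class_names[int(idx)]}: {prob.item():.2f}%")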
Feature Map Extraction
Feature map extraction is similar to peeling an onion to find hidden layers beneath its surface. Here’s how to extract the model’s feature maps:
- Load the model with the feature extraction mode enabled.
- Transform the input image.
- Extract the feature maps from the model’s layers.
Sample Code for Feature Map Extraction
from urllib.request import urlopen
from PIL import Image
import timm
# Load the image
img = Image.open(urlopen("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png"))
# Initialize model for feature extraction
model = timm.create_model('convnext_nano.d1h_in1k', pretrained=True, features_only=True)
model = model.eval()
# Data transformation
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)
# Extract features
output = model(transforms(img).unsqueeze(0)) # Unsqueeze to create a batch of 1
# Output shapes
for o in output:
    print(o.shape)
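If you only need some of the stages, timm's features_only mode also accepts an out_indices argument to select which feature maps are returned. The minimal sketch below reuses img and transforms from the snippet above and keeps only the two deepest stages.
# Keep only the deeper stages; out_indices selects which feature maps are returned
model = timm.create_model(
    'convnext_nano.d1h_in1k',
    pretrained=True,
    features_only=True,
    out_indices=(2, 3),  # last two stages of the backbone
)
model = model.eval()
# feature_info reports the channel count and reduction factor of each selected stage
print(model.feature_info.channels(), model.feature_info.reduction())
output = model(transforms(img).unsqueeze(0))
print([o.shape for o in output])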
Generating Image Embeddings
Image embeddings are like compressing a whole library into just a few key insights. Here’s a quick way to obtain embeddings:
- Load the model with the classifier removed.
- Transform your image.
- Run the model and retrieve the embeddings.
Sample Code for Image Embeddings
from urllib.request import urlopen
from PIL import Image
import timm
# Load the image
img = Image.open(urlopen("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png"))
# Initialize model for embeddings
model = timm.create_model('convnext_nano.d1h_in1k', pretrained=True, num_classes=0) # Remove classifier
model = model.eval()
# Data transformation
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)
# Generate embeddings
output = model(transforms(img).unsqueeze(0)) # Output shape is (batch_size, num_features)
output = model.forward_features(transforms(img).unsqueeze(0)) # Unpooled
output = model.forward_head(output, pre_logits=True) # Get shape (1, num_features)
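A typical use for these embeddings is measuring image similarity. The sketch below compares the embedding of img with that of a second, hypothetical image img2 (assumed to be another PIL image loaded the same way) using cosine similarity.
import torch.nn.functional as F
# Embed both images; each embedding has shape (1, num_features)
emb1 = model(transforms(img).unsqueeze(0))
emb2 = model(transforms(img2).unsqueeze(0))  # img2 is a hypothetical second PIL image
# Cosine similarity near 1.0 indicates visually similar content
similarity = F.cosine_similarity(emb1, emb2)
print(similarity.item())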
Troubleshooting Tips
If you encounter issues while using the ConvNeXt Nano model, consider the following solutions:
- Import Errors: Ensure that all necessary libraries (like timm and torch) are properly installed.
- Image Not Loading: Verify the image URL and make sure it is accessible.
- Incompatible Input Size: Use the transform created from the model's data config, which resizes inputs to the resolution this checkpoint expects (224×224 for training, 288×288 at test time); see the sketch below for how to inspect it.
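When debugging preprocessing or input-size problems, it helps to print the resolved data config and the shape of the transformed tensor. The sketch below reuses model, transforms, and img from the earlier snippets.
# The resolved data config describes the preprocessing the model expects
data_config = timm.data.resolve_model_data_config(model)
print(data_config)  # should include input_size, interpolation, mean, std, crop settings
# Confirm the transformed tensor has the expected (batch, channels, height, width) shape
x = transforms(img).unsqueeze(0)
print(x.shape)  # e.g. torch.Size([1, 3, 224, 224]) for this checkpoint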
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
Using the ConvNeXt Nano, you can efficiently classify images, extract valuable feature maps, and generate insightful embeddings. As you explore its capabilities, you’ll find that it delivers both performance and ease of use, much like having a sharp, efficient tool at your disposal.
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

