In this post, we walk step by step through training an image classifier on the cats_vs_dogs dataset from Hugging Face, using the ResNet18 model from the torchvision library. Note that the labels used here mark whether an image has been flipped (1) or not (0), so the classifier learns to detect flipped images rather than to tell cats from dogs. With this setup, the model reaches about 98.4% accuracy on all reported metrics. Let’s dive in!
General Information
- Used Dataset: cats_vs_dogs
- Labeling Logic: Flipped images are labeled as 1, and unflipped ones as 0.
- Model: ResNet18 from torchvision
- Number of Classes: 2 (0 = No flip, 1 = Flipped image)
- Train-Test Split: 70-30
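The labeling logic and the 70-30 split can be sketched in a few lines. This is a minimal illustration, not the original pipeline: the source does not say whether images were flipped horizontally or vertically, or how often, so the horizontal flip, the 50% flip probability, and the helper names `hflip`, `make_flip_dataset`, and `split_70_30` are all assumptions. Images are represented as plain lists of pixel rows to keep the example dependency-free.

```python
import random


def hflip(image):
    """Mirror an image left-to-right; `image` is a list of pixel rows."""
    return [row[::-1] for row in image]


def make_flip_dataset(images, flip_prob=0.5, seed=0):
    """Return (image, label) pairs: label 1 if the image was flipped, else 0.

    flip_prob=0.5 is an assumed value, not one given in the post.
    """
    rng = random.Random(seed)
    samples = []
    for image in images:
        if rng.random() < flip_prob:
            samples.append((hflip(image), 1))  # flipped -> label 1
        else:
            samples.append((image, 0))         # untouched -> label 0
    return samples


def split_70_30(samples, train_frac=0.7, seed=0):
    """Shuffle the samples and cut them into train and test parts."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]
```

In a real pipeline the same idea would be applied to decoded image tensors, with the flip decision made once per image so that the label always matches what the model sees.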
Sample Images and Labels
[Image: sample images from the dataset with their labels]

Specific Information About the Dataset
During training, the following broken or corrupted files were removed:
- .\kagglecatsanddogs_3367a\PetImages\Cat\666.jpg
- .\kagglecatsanddogs_3367a\PetImages\Cat\10404.jpg
- .\kagglecatsanddogs_3367a\PetImages\Dog\11702.jpg
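Broken files like these can often be caught before training starts. The sketch below is one cheap way to do it and is not necessarily how the files above were found: it only checks that each `.jpg` file is non-empty and begins with the JPEG start-of-image marker. The helper names `looks_like_jpeg` and `split_valid_and_broken` are hypothetical.

```python
from pathlib import Path

JPEG_SOI = b"\xff\xd8"  # every JPEG file starts with this start-of-image marker


def looks_like_jpeg(path):
    """Cheap sanity check: the file is non-empty and starts with the JPEG marker."""
    with open(path, "rb") as f:
        return f.read(2) == JPEG_SOI


def split_valid_and_broken(root):
    """Walk `root` recursively and sort .jpg files into (valid, broken) lists."""
    valid, broken = [], []
    for path in sorted(Path(root).rglob("*.jpg")):
        (valid if looks_like_jpeg(path) else broken).append(path)
    return valid, broken
```

A header check will miss files that start correctly but are truncated later on, so fully decoding each image with an imaging library is the more thorough option when startup time allows.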
Training Information
Here’s the setup for training the model:
- Total Epochs: 5
- Pretrained: True (ImageNet weights; all layers are trainable)
- Image Size: 224 x 224
- Batch Size: 128
- Optimizer: SGD
- Learning Rate: 0.001 (constant throughout training)
- Momentum: 0.9
- Loss: CrossEntropy Loss
Results
The trained model achieved the following metrics:
- Accuracy: 98.4266%
- F1 Score: 98.4271%
- Recall: 98.4261%
- Precision: 98.4265%
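All four metrics can be derived from a confusion matrix. The source does not say how precision, recall, and F1 were averaged; because the reported F1 is slightly higher than both precision and recall, which a single-class harmonic mean cannot be, the sketch below assumes macro-averaging over the two classes. That is an inference, not something the post states, and the function names are hypothetical.

```python
def per_class_metrics(cm, cls):
    """Precision, recall, F1 for one class; cm[t][p] counts true t, predicted p."""
    tp = cm[cls][cls]
    fp = sum(cm[t][cls] for t in range(len(cm))) - tp  # predicted cls, wrongly
    fn = sum(cm[cls]) - tp                             # true cls, missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_metrics(cm):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    per_class = [per_class_metrics(cm, c) for c in range(n)]
    return {
        "accuracy": sum(cm[i][i] for i in range(n)) / total,
        "precision": sum(p for p, _, _ in per_class) / n,
        "recall": sum(r for _, r, _ in per_class) / n,
        "f1": sum(f for _, _, f in per_class) / n,
    }
```

For example, `macro_metrics([[8, 2], [1, 9]])` evaluates a made-up matrix with 8 unflipped and 9 flipped images classified correctly; these counts are illustrative and are not the article's results.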
Confusion Matrix
[Image: confusion matrix of the model’s predictions]

Misclassified Images
Some images were misclassified, and we can learn from these errors.
[Image: examples of misclassified images]

Possible Improvements
How can we take our model performance from great to even better? Here are a few suggestions:
- Improve the quality of occluded and partially visible images in the dataset.
- Hyperparameter tuning could improve results. For example, a cyclical learning rate could help the model escape poor local minima.
- Augmentation techniques like CutMix could help with images whose pose differs from those seen in the training data.
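To make the cyclical learning rate suggestion concrete, here is a minimal sketch of the triangular policy, in which the rate ramps linearly from a lower bound to an upper bound and back again. The bounds and cycle length are placeholder values that have not been tuned for this task, and the function name `triangular_lr` is hypothetical.

```python
def triangular_lr(step, base_lr=1e-4, max_lr=1e-2, step_size=200):
    """Learning rate at a given optimizer step under a triangular cycle.

    The rate rises from base_lr to max_lr over `step_size` steps,
    then falls back to base_lr over the next `step_size` steps.
    """
    cycle_pos = step % (2 * step_size)
    if cycle_pos <= step_size:
        frac = cycle_pos / step_size       # rising half of the cycle
    else:
        frac = 2 - cycle_pos / step_size   # falling half of the cycle
    return base_lr + (max_lr - base_lr) * frac
```

In PyTorch, `torch.optim.lr_scheduler.CyclicLR` provides this kind of schedule for an existing optimizer, so a hand-written function like this one is mainly useful for plotting or reasoning about the schedule before training.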
Troubleshooting
If you encounter any issues throughout your training process, here are some troubleshooting ideas:
- Ensure your dataset is correctly formatted and not corrupted. Double-check your image paths.
- Monitor memory usage during training; consider lowering the batch size if you run out of memory.
- Experiment with different optimizers and learning rates if your model fails to converge.
Code Analogy
Think of model training as a chef preparing a complex dish. The images and labels are the ingredients, and the hyperparameters, loss function, and optimizer are the recipe. Just as a chef carefully selects, measures, and combines ingredients, you combine data and training choices to create a well-trained image classifier.
