Welcome to the world of Automated Machine Learning (AutoML) where we can efficiently design high-performance architectures for tasks such as image classification and language modeling using DARTS (Differentiable Architecture Search). This blog will walk you through the process of setting up DARTS, running architecture searches, and evaluating results. Strap in as we explore this innovative method!
Understanding DARTS
DARTS employs a continuous relaxation and gradient descent approach to search for optimal neural network architectures. Imagine you’re a chef in a kitchen filled with various ingredients (network components). Instead of randomly concocting recipes, DARTS allows you to trial combinations iteratively, adjusting your recipe based on taste until you’ve crafted the perfect dish (optimal architecture) — all on a single GPU!
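To make the relaxation idea concrete, here is a toy sketch (not the actual DARTS implementation, which uses PyTorch modules and learned architecture parameters): each edge's output is modeled as a softmax-weighted mixture of all candidate operations, so the architecture choice becomes differentiable.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Stand-ins for candidate operations on one edge (e.g. conv, pooling, identity).
ops = [lambda x: x * 2.0, lambda x: x * 0.5, lambda x: x]

def mixed_op(x, alpha):
    # DARTS-style continuous relaxation: instead of picking one op,
    # take a softmax-weighted sum over all candidates; gradients then
    # flow into the architecture parameters alpha.
    weights = softmax(alpha)
    return sum(w * op(x) for w, op in zip(weights, ops))

print(mixed_op(2.0, [0.0, 0.0, 0.0]))  # equal weights: average of 4.0, 1.0, 2.0
```

As training sharpens `alpha`, the mixture concentrates on one operation, which is the op kept in the final discrete architecture.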
Requirements
- Python: 3.5.5
- PyTorch: 0.3.1
- torchvision: 0.2.0
Note: PyTorch 0.4 is not supported and may lead to out-of-memory (OOM) issues.
Getting Started with Datasets
To utilize DARTS, you’ll need datasets for image classification and language modeling. Here’s how to acquire them:
- Download PTB and WT2 datasets: Instructions are available here.
- CIFAR-10 can be automatically downloaded using torchvision.
- ImageNet must be manually downloaded (preferably to an SSD) following the instructions here.
Using Pretrained Models
To take a shortcut and evaluate pretrained DARTS models, execute the following commands:
CIFAR-10
cd cnn
python test.py --auxiliary --model_path cifar10_model.pt
Expected result: 2.63% test error rate with 3.3M model parameters.
PTB
cd rnn
python test.py --model_path ptb_model.pt
Expected result: 55.68 test perplexity with 23M model parameters.
ImageNet
cd cnn
python test_imagenet.py --auxiliary --model_path imagenet_model.pt
Expected result: 26.7% top-1 error and 8.7% top-5 error with 4.7M model parameters.
Architecture Search
To carry out architecture searches using the second-order approximation, execute:
cd cnn
python train_search.py --unrolled
cd rnn
python train_search.py --unrolled
The validation error reported during search does not indicate final performance. It’s crucial to run multiple trials with different random seeds, since different seeds can converge to different local minima.
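To see why seeds matter, here is a toy illustration (unrelated to the DARTS code itself): gradient descent on a function with two minima lands in different basins depending on the random initialization, just as architecture search runs with different seeds can settle into different local optima.

```python
import random

def grad(x):
    # Derivative of f(x) = (x**2 - 1)**2, which has two minima, at x = -1 and x = +1.
    return 4 * x * (x**2 - 1)

def descend(seed, steps=200, lr=0.05):
    # Different seeds give different starting points, and hence
    # possibly different final minima.
    rng = random.Random(seed)
    x = rng.uniform(-2, 2)
    for _ in range(steps):
        x -= lr * grad(x)
    return round(x, 3)

print({descend(s) for s in range(8)})  # each seed lands in one basin or the other
```

This is the practical reason the DARTS authors recommend several search trials before committing to one architecture.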
Architecture Evaluation
After identifying the best architectures, you can evaluate their performance from scratch:
cd cnn
python train.py --auxiliary --cutout # CIFAR-10
cd rnn
python train.py # PTB
cd rnn
python train.py --data ../data/wikitext-2 --dropouth 0.15 --emsize 700 --nhidlast 700 --nhid 700 --wdecay 5e-7 # WT2
cd cnn
python train_imagenet.py --auxiliary # ImageNet
Note: CIFAR-10 results may vary slightly because cuDNN’s backward kernels are non-deterministic. Across multiple runs, expect a mean test error of 2.76 ± 0.09%.
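If you want tighter run-to-run reproducibility, PyTorch exposes cuDNN flags that trade speed for determinism (this reduces, but may not fully eliminate, the variance noted above):

```python
import torch

# Force deterministic cuDNN kernels and disable the autotuner,
# which otherwise picks algorithms non-deterministically.
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

# Fix the RNG seed as well, so weight init and data shuffling repeat.
torch.manual_seed(0)
```

Expect some slowdown with these settings, since cuDNN can no longer choose the fastest (non-deterministic) algorithm for each layer.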
Visualization of Learned Cells
To visualize your learned architectures, you’ll need the graphviz package. Run:
python visualize.py DARTS
You can replace “DARTS” with any custom architecture defined in genotypes.py.
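For reference, genotypes.py describes an architecture as a named tuple of (operation, input-node) pairs for the normal and reduction cells. A custom entry looks roughly like the sketch below (the field values here are illustrative, not a searched architecture):

```python
from collections import namedtuple

# Mirrors the structure used in DARTS's genotypes.py.
Genotype = namedtuple('Genotype', 'normal normal_concat reduce reduce_concat')

my_arch = Genotype(
    normal=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1)],   # (op, input-node index) pairs
    normal_concat=[2],                                    # nodes concatenated as cell output
    reduce=[('max_pool_3x3', 0), ('skip_connect', 1)],
    reduce_concat=[2],
)

print(my_arch.normal[0])  # ('sep_conv_3x3', 0)
```

Adding a tuple like this to genotypes.py lets you pass its name to visualize.py (and to the training scripts via their architecture argument).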
Troubleshooting
If you encounter issues during implementation, consider the following:
- Ensure that all packages are correctly installed with the specified versions.
- Check your dataset paths and ensure they are correctly set up for the models.
- If you face out-of-memory errors, consider reducing your batch size or the model size.
Conclusion
With this guide, you should now have a solid foundation to start experimenting with DARTS for architecture search. Enjoy discovering the optimal configurations for your neural networks, and happy coding!