Introduction to Baal
Baal is an active learning library that supports both industrial applications and research use cases. Full documentation is available on the project website, and the paper describing the concepts and mechanisms behind Baal is available on arXiv. The project documentation also points to introductory material on Baal and Bayesian active learning.
Baal was originally developed at ElementAI and is now maintained independently.
Installation and Requirements
Baal requires Python 3.8 or later. You can install it with pip:
pip install baal
We use Poetry as our package manager. To install Baal from source, run the following from a clone of the repository:
poetry install
What is Active Learning?
Active learning is a form of machine learning in which the learning algorithm can interactively query a user or another information source to obtain the desired outputs for new data points. Labelling effort is therefore spent on the examples the model is expected to benefit from most. To learn more, refer to our tutorial.
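The query loop described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not Baal's API: the `oracle` function stands in for the human annotator, the "model" is a one-dimensional threshold, and the query rule is simple uncertainty sampling.

```python
import random


def oracle(x):
    """Hypothetical labeller standing in for the human annotator."""
    return int(x >= 0.6)


def fit_threshold(labelled):
    """Toy model: midpoint between the largest negative and smallest positive."""
    negatives = [x for x, y in labelled if y == 0]
    positives = [x for x, y in labelled if y == 1]
    lo = max(negatives) if negatives else 0.0
    hi = min(positives) if positives else 1.0
    return (lo + hi) / 2


def active_learning(pool, n_rounds, seed=0):
    rng = random.Random(seed)
    pool = list(pool)
    # Start from a single randomly chosen labelled point.
    first = pool.pop(rng.randrange(len(pool)))
    labelled = [(first, oracle(first))]
    for _ in range(n_rounds):
        if not pool:
            break
        threshold = fit_threshold(labelled)
        # Uncertainty sampling: query the point closest to the decision boundary.
        query = min(pool, key=lambda x: abs(x - threshold))
        pool.remove(query)
        labelled.append((query, oracle(query)))
    return fit_threshold(labelled), labelled


pool = [i / 100 for i in range(101)]
threshold, labelled = active_learning(pool, n_rounds=10)
```

With only 11 labels out of 101 candidates, the learned threshold sits between the two pool points that straddle the true boundary, because every query is spent near the current decision boundary rather than on points the model already classifies confidently.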
The Baal Framework
The Baal framework comprises four pivotal components for accomplishing active learning:
- ActiveLearningDataset
- Heuristics
- ModelWrapper
- ActiveLearningLoop
Imagine Baal as a concert orchestra where each section represents an individual method of active learning. Just like a conductor ensures that every section plays its part harmoniously, the Baal framework brings together active learning methods like Monte-Carlo Dropout, Deep ensembles, and Semi-supervised learning to perform complex tasks more efficiently.
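To make the Heuristics component more concrete, here is a minimal sketch of the BALD score (the heuristic used in the example script), written in plain Python rather than with Baal's own classes. It assumes each pool item comes with several probability vectors from stochastic forward passes, and computes the entropy of the mean prediction minus the mean entropy of the individual predictions. The function names are illustrative only.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def bald_score(mc_probs):
    """BALD score for one item.

    mc_probs: list of T probability vectors, one per stochastic forward pass.
    """
    n_passes = len(mc_probs)
    n_classes = len(mc_probs[0])
    mean_probs = [sum(p[c] for p in mc_probs) / n_passes for c in range(n_classes)]
    expected_entropy = sum(entropy(p) for p in mc_probs) / n_passes
    return entropy(mean_probs) - expected_entropy


def rank_pool(pool_predictions):
    """Return pool indices sorted from most to least informative."""
    scores = [bald_score(p) for p in pool_predictions]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


agreeing = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]     # passes agree
disagreeing = [[0.9, 0.1], [0.1, 0.9], [0.9, 0.1]]  # passes disagree
ranking = rank_pool([agreeing, disagreeing])
```

The score is close to zero when all passes agree and grows when they disagree, so the second item is ranked first for labelling.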
Starting with Baal
To get started, wrap your dataset in the ActiveLearningDataset class to ensure the data is split into training and pool sets. The pool set contains unlabelled data awaiting your intervention.
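The split between the training set and the pool can be pictured as a boolean mask over the underlying dataset. The class below is a hypothetical stand-in written for illustration (`ToyActiveDataset` and its attributes are not Baal's implementation); it only shows how labelling an item moves it from the pool to the training set.

```python
import random


class ToyActiveDataset:
    """Hypothetical stand-in for the labelled/pool split; not Baal's class."""

    def __init__(self, items, seed=0):
        self.items = list(items)
        self.is_labelled = [False] * len(self.items)
        self._rng = random.Random(seed)

    @property
    def pool_indices(self):
        """Positions in `items` that are still unlabelled."""
        return [i for i, done in enumerate(self.is_labelled) if not done]

    @property
    def train(self):
        return [x for x, done in zip(self.items, self.is_labelled) if done]

    @property
    def pool(self):
        return [self.items[i] for i in self.pool_indices]

    def label_randomly(self, n):
        """Label n random pool items, e.g. to bootstrap the first model."""
        for i in self._rng.sample(self.pool_indices, n):
            self.is_labelled[i] = True

    def label(self, pool_positions):
        """Label items given their positions within the current pool."""
        current = self.pool_indices  # snapshot before the mask changes
        for pos in pool_positions:
            self.is_labelled[current[pos]] = True


dataset = ToyActiveDataset(range(10))
dataset.label_randomly(3)  # initial labelled set
dataset.label([0])         # label the first item of the remaining pool
```

Indexing queries relative to the current pool, as `label` does here, matches how a heuristic ranks items: it only ever sees the unlabelled subset.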
Use the ModelWrapper to train and evaluate your model through an interface similar to Keras. If your model is not yet ready for active learning, we also provide modules such as MCDropoutModule to adapt it.
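The idea behind Monte-Carlo Dropout is to keep dropout active at prediction time and run several stochastic forward passes, so that the spread of the predictions reflects model uncertainty. The sketch below illustrates this with a tiny linear classifier in plain Python; the model, weights, and function names are assumptions for the example and do not reflect how MCDropoutModule is implemented.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stochastic_forward(x, weights, drop_prob, rng):
    """One forward pass of a linear classifier with dropout on the inputs."""
    keep = 1.0 - drop_prob
    # Inverted dropout: zero each feature with probability drop_prob,
    # and rescale the surviving features by 1 / keep.
    dropped = [xi / keep if rng.random() < keep else 0.0 for xi in x]
    logits = [sum(w * xi for w, xi in zip(row, dropped)) for row in weights]
    return softmax(logits)


def mc_dropout_predict(x, weights, n_passes=20, drop_prob=0.5, seed=0):
    """Return one probability vector per stochastic forward pass."""
    rng = random.Random(seed)
    return [stochastic_forward(x, weights, drop_prob, rng) for _ in range(n_passes)]


weights = [[1.0, -1.0, 0.5], [-1.0, 1.0, 0.5]]  # 2 classes, 3 features
samples = mc_dropout_predict([0.8, 0.3, 1.0], weights)
```

Each element of `samples` is a valid probability vector, and the passes differ from one another because a different dropout mask is drawn each time. A heuristic can then turn that disagreement into an uncertainty score.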
Example Script
Your script for conducting active learning experiments may resemble the following:
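# Import paths as of Baal 2.x; check them against your installed version.
from baal.active import ActiveLearningDataset
from baal.active.heuristics import BALD
from baal.bayesian.dropout import MCDropoutModule
from baal.experiments.base import ActiveLearningExperiment
from baal.modelwrapper import ModelWrapper, TrainingArgs

dataset = ActiveLearningDataset(your_dataset)
dataset.label_randomly(INITIAL_POOL)  # label some data
model = MCDropoutModule(your_model)  # keep dropout active at prediction time
wrapper = ModelWrapper(model, args=TrainingArgs(...))
experiment = ActiveLearningExperiment(
    trainer=wrapper,            # object used to train the model
    al_dataset=dataset,         # active learning dataset
    eval_dataset=test_dataset,  # evaluation dataset
    heuristic=BALD(),           # uncertainty heuristic
    query_size=100,             # items to label per round
    iterations=20,              # MC sampling iterations per item
    pool_size=None,             # optionally limit the size of the unlabelled pool
    criterion=None,             # stopping criterion for the experiment
)
metrics = experiment.start()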
Re-running Experiments
To re-run our experiments, build the Docker image and then run an experiment script inside it. Docker options such as --gpus must come before the image name:
docker build [--target base_baal] -t baal .
docker run --rm --gpus all baal python3 experiments/vgg_mcdropout_cifar10.py
Using Baal for Your Unique Experiments
Clone the repository and write your own experiment script, following the structure of experiments/vgg_mcdropout_cifar10.py. Make sure it uses the four components of the Baal framework described above.
Troubleshooting
Should you encounter any issues during installation or while running your experiments, here are some troubleshooting ideas:
- Verify that your installed Python version meets the requirement listed under Installation and Requirements.
- Ensure all dependencies are correctly installed via Poetry or pip.
- Check the compatibility of your model with the Baal framework requirements.
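The first two checks can be scripted with the standard library. This is a small sketch, assuming the minimum Python version of 3.8 stated above and that the package is distributed under the name `baal` (as in the pip command); `check_environment` is a helper written for this example.

```python
import sys
from importlib import metadata


def check_environment(min_python=(3, 8), package="baal"):
    """Report the Python version and whether `package` is installed."""
    report = {
        "python_ok": sys.version_info >= min_python,
        "python_version": ".".join(map(str, sys.version_info[:3])),
    }
    try:
        report["package_version"] = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in the current environment.
        report["package_version"] = None
    return report


report = check_environment()
print(report)
```

A `package_version` of `None` means the package is not visible to the interpreter you are running, which often indicates that pip or Poetry installed it into a different environment.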
Contributing and License
We welcome contributions! For guidelines, see CONTRIBUTING.md. For licensing terms, see the LICENSE file.