Welcome to the world of Scenic, a JAX-based codebase from Google Research that simplifies building and training attention-based models for computer vision. Whether you’re interested in classification, segmentation, or detection across image, video, and audio modalities, Scenic has something to offer.
What We Offer
Scenic is more than just a library. Here’s what you can expect:
- Boilerplate code for launching experiments, summary writing, logging, profiling, etc.
- Optimized training and evaluation loops, losses, metrics, bipartite matchers, etc.
- Input pipelines for popular vision datasets.
- Baseline models, including strong non-attentional baselines.
State-of-the-Art (SOTA) Models and Baselines in Scenic
Scenic hosts several SOTA models and has reimplemented pivotal baselines. To give you a taste, here are some noteworthy projects:
- ViViT: A Video Vision Transformer
- OmniNet: Omnidirectional Representations from Transformers
- TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?
- VUT: Versatile UI Transformer for Multi-Modal Multi-Task User Interface Modeling
- A Generative Approach for Wikipedia-Scale Visual Entity Recognition
More information can be found in the projects section.
Philosophy
Scenic aims to enable rapid prototyping of large-scale vision models. To keep the code simple to understand and extend, the project prefers:
- Forking and copy-pasting code over adding complexity or abstraction.
- Upstreaming functionality to the shared libraries only when it proves useful across many models and tasks.
Getting Started
Ready to dive in? Here’s how to get started with Scenic:
- Ensure you have Python 3.9 or later.
- Clone the code from GitHub:
    git clone https://github.com/google-research/scenic.git
    cd scenic
- Install the necessary dependencies:
    pip install .
- Run training for ViT on ImageNet:
    python scenic/main.py -- --config=scenic/projects/baselines/configs/imagenet/imagenet_vit_config.py --workdir=.
- Refer to specific projects’ README.md or requirements.txt for additional packages.
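The training command above uses the current directory as its work directory (`--workdir=.`). If you would rather keep each run in its own folder, here is a minimal sketch of a wrapper that assembles the same command with a different work directory. The helper name `build_train_command` and the `runs/vit_imagenet` path are assumptions made for illustration and are not part of Scenic; the script path, the `--` separator, and the two flags are copied from the command above.

```python
import shlex
import sys

# Config path copied from the training command above.
VIT_CONFIG = "scenic/projects/baselines/configs/imagenet/imagenet_vit_config.py"


def build_train_command(config_path, workdir, python=sys.executable):
    """Return the argument list for the training command shown above.

    Hypothetical helper for illustration; it is not part of Scenic.
    """
    return [
        python,
        "scenic/main.py",
        "--",  # separator kept exactly as in the original command
        f"--config={config_path}",
        f"--workdir={workdir}",
    ]


if __name__ == "__main__":
    # "runs/vit_imagenet" is an assumed directory name; pick any path you like.
    cmd = build_train_command(VIT_CONFIG, "runs/vit_imagenet")
    # Print a copy-pasteable shell line. To launch it from Python instead,
    # call subprocess.run(cmd, check=True) from the repository root.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids quoting problems if a path ever contains spaces.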
For a hands-on experience, check out this Colab notebook to train a simple feed-forward model using Scenic.
Understanding Scenic Component Design
Let’s visualize the design of Scenic using an analogy. Imagine Scenic as a well-organized toolbox, where each type of tool has its designated drawer:
- Library-level code: These are your essential tools – the screwdrivers and hammers that everyone uses. This part is minimal, well-tested, and provides shared functionalities like loading datasets, building models, and constructing training loops.
- Project-level code: Think of this as the specialized tools that you only take out for certain projects, like a specific type of saw for woodworking. This allows for extensive customization, from hyperparameters to training techniques.
Because Scenic is modular, a project can sit anywhere between using only the predefined tools and crafting fully bespoke solutions of its own.
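To make the toolbox analogy concrete, here is a minimal pure-Python sketch of the split. It is not Scenic's actual API: every name in it (`train`, `default_train_step`, `clipped_train_step`) is hypothetical. The "library" part is a small generic loop that any project can reuse, and the "project" part swaps in its own update rule where the default does not fit.

```python
# --- "Library-level" code: small, shared, reusable (hypothetical names) ---

def default_train_step(weight, grad, lr=0.1):
    """Plain gradient-descent update."""
    return weight - lr * grad


def train(weight, targets, step_fn=default_train_step):
    """Generic loop: pull `weight` toward each target under squared error."""
    for target in targets:
        grad = 2.0 * (weight - target)  # derivative of (weight - target)**2
        weight = step_fn(weight, grad)
    return weight


# --- "Project-level" code: a customization owned by one project ---

def clipped_train_step(weight, grad, lr=0.1, max_grad=1.0):
    """Same update, but with the gradient clipped to [-max_grad, max_grad]."""
    grad = max(-max_grad, min(max_grad, grad))
    return weight - lr * grad


if __name__ == "__main__":
    targets = [10.0] * 5
    # A project that is happy with the shared default:
    print(train(0.0, targets))
    # A project that plugs in its own step without touching the shared loop:
    print(train(0.0, targets, step_fn=clipped_train_step))
```

The shared loop stays tiny and easy to test, while the project-specific behavior lives next to the project that needs it.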
Troubleshooting Tips
While using Scenic, you might encounter some hiccups. Here are some troubleshooting suggestions:
- Ensure your environment meets Python version requirements.
- Check for any missing packages mentioned in the specific project’s README.md or requirements.txt.
- If you encounter errors during training, verify your configurations and hyperparameters.
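The first two checks can be automated with the standard library. Here is a minimal sketch: the 3.9 minimum comes from the Getting Started steps, while the module list is only an assumed example, so take the real names from the project's README.md or requirements.txt. Note that the import name of a package can differ from its pip name.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in Getting Started

# Assumed example list; replace with the top-level import names you need.
REQUIRED_MODULES = ["jax", "flax", "ml_collections"]


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not python_ok():
        found = sys.version.split()[0]
        print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, found {found}")
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

Running this before launching a long job catches the most common setup problems in a second or two.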