How to Use Squirrel: A Guide to Collaborate, Load, and Transform Data Efficiently

Jun 18, 2023 | Data Science

Welcome to the world of Squirrel, a Python library that helps machine learning (ML) teams share, load, and transform data. In this guide, we will walk you through the essentials of using Squirrel in your projects.

What is Squirrel?

Squirrel is not just any library; it’s like having a flexible toolkit that allows ML teams to work collaboratively. Whether it’s speeding up data loads, reducing costs, or adapting to various data types, Squirrel has you covered!

  • SPEED: Avoid data stalls, so expensive GPUs do not sit idle while data loads.
  • COSTS: Decrease your cloud storage costs by sharding and loading data in bundles.
  • FLEXIBILITY: Adaptable to any dataset, including multimodal data types.
  • COLLABORATION: Simplified sharing of data and code within teams through a self-service model.

Getting Started with Squirrel

Streaming data to your ML model is quick and easy with Squirrel. Here’s how you can do it:

from squirrel.catalog import Catalog

# `augment` is your own image-augmentation function.
it = (Catalog.from_plugins()
    ["imagenet"]
    .get_driver()
    .get_iter("train")
    .map(lambda r: (augment(r["image"]), r["label"]))
    .batched(100))

Think of this code as a choreographed dance, read from top to bottom: the catalog lists the data sources registered by installed plugins, the imagenet entry picks one of them, get_driver returns the driver that knows how to read it, and get_iter streams the train data. The map step then augments each record's image, and batched groups the results into batches of 100.
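To make the map-then-batch idea concrete, here is a minimal plain-Python sketch of the same pattern. It does not use Squirrel and is not how Squirrel implements these steps; the helper names lazy_map and batched, the toy records, and the augment stand-in are all hypothetical, chosen only to mirror the shape of the snippet above.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def lazy_map(items: Iterable[T], fn: Callable[[T], U]) -> Iterator[U]:
    """Apply fn to each item only when the consumer asks for the next one."""
    for item in items:
        yield fn(item)


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Group consecutive items into lists of `size`; the last list may be shorter."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def augment(image: int) -> int:
    """Hypothetical stand-in for a real augmentation: doubles the value."""
    return image * 2


# Hypothetical records with the same keys as in the snippet above.
records = ({"image": i, "label": i % 2} for i in range(5))

pipeline = batched(
    lazy_map(records, lambda r: (augment(r["image"]), r["label"])),
    size=2,
)
batches = list(pipeline)  # nothing is computed until this line consumes the stream
```

Because every stage is a generator, no record is augmented until a batch is requested, which is the property that keeps a training loop fed without loading the whole dataset first. Squirrel's own behavior, for instance how it treats a final partial batch, may differ from this sketch.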

For a step-by-step guide, check out our full getting started tutorial.

Installation

To install Squirrel, open your terminal and run:

pip install squirrel-core

If you want all features, simply run:

pip install "squirrel-core[all]"

Alternatively, choose the specific dependencies you need:

pip install "squirrel-core[gcs,torch]"

Visit the installation documentation for more details.

Documentation and Community Resources

The complete Squirrel documentation is available on ReadTheDocs.

You may also want to explore the companion package, squirrel-datasets-core, which extends data access and transformation through custom drivers.

Contributing to Squirrel

Squirrel thrives on community contributions! If interested, please see our contribution guide to learn how you can help.

Troubleshooting

If you encounter issues while using Squirrel, here are a few troubleshooting steps:

  • Ensure you have all the necessary dependencies installed as indicated in the documentation.
  • Check your internet connection if accessing remote datasets.
  • Look for error messages in the terminal and consult the documentation or online forums for specific solutions.
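For the first step, a quick way to see which packages are actually present in your environment is to ask Python's standard library. The sketch below uses importlib.metadata; the helper name installed_versions is hypothetical, and the list of packages to check (squirrel-core plus torch and gcsfs as likely counterparts of the torch and gcs extras) is an assumption you should adjust to your own setup.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    report: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    # Assumed package names; replace them with the ones your project needs.
    for name, found in installed_versions(["squirrel-core", "torch", "gcsfs"]).items():
        print(f"{name}: {found or 'not installed'}")
```

Any package reported as not installed is a candidate for reinstalling with the matching extra.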

The Humans Behind Squirrel

Developed by Merantix Momentum, Squirrel was born from real-life challenges faced by a team of around 30 ML engineers. They sought a solution that enabled flexible data loading and transforming without the constraints typical of traditional setups. Whether for research or industry applications, Squirrel is here to help you on your journey!
