Overeasy

Jun 11, 2024 | Data Science


Create powerful zero-shot vision models!

Overeasy allows you to chain zero-shot vision models to create custom end-to-end pipelines for tasks like:

  • Bounding Box Detection
  • Classification
  • Segmentation (Coming Soon!)

All of this can be achieved without needing to collect and annotate large training datasets. Overeasy makes it simple to combine pre-trained zero-shot models to build powerful custom computer vision solutions.

Installation

It's as easy as:

bash
pip install overeasy

For installing extras, refer to our Docs.

Key Features

  • Agents: Specialized tools that perform specific image processing tasks.
  • Workflows: Define a sequence of Agents to process images in a structured manner.
  • Execution Graphs: Manage and visualize the image processing pipeline.
  • Detections: Represent bounding boxes, segmentation, and classifications.
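
To make the Agent and Workflow ideas concrete, here is a minimal plain-Python sketch of the chaining pattern. It does not use Overeasy: `ToyDetection`, `ToyWorkflow`, `drop_small`, and `rename` are hypothetical names invented for this illustration, and the library's real classes and signatures are the ones described in the Docs.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ToyDetection:
    """Hypothetical stand-in for a detection: a box plus a class label."""
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2)
    label: str


# In this sketch an "agent" is just a function from detections to detections.
Agent = Callable[[List[ToyDetection]], List[ToyDetection]]


class ToyWorkflow:
    """Runs agents in order and records every intermediate state."""

    def __init__(self, agents: List[Agent]):
        self.agents = agents

    def execute(self, detections: List[ToyDetection]):
        history = [detections]  # one entry per layer, starting with the input
        for agent in self.agents:
            detections = agent(detections)
            history.append(detections)
        return detections, history


def drop_small(dets: List[ToyDetection]) -> List[ToyDetection]:
    """Keep only boxes whose area is at least 100 square pixels."""
    return [
        d for d in dets
        if (d.box[2] - d.box[0]) * (d.box[3] - d.box[1]) >= 100
    ]


def rename(dets: List[ToyDetection]) -> List[ToyDetection]:
    """Map raw class names to the names we want to report."""
    mapping = {"hard hat": "has ppe", "no hard hat": "no ppe"}
    return [ToyDetection(d.box, mapping.get(d.label, d.label)) for d in dets]


detections = [
    ToyDetection((0, 0, 20, 20), "hard hat"),
    ToyDetection((0, 0, 5, 5), "no hard hat"),
]
result, history = ToyWorkflow([drop_small, rename]).execute(detections)
```

The `history` list plays the role of a very simplified execution graph: it keeps the state after each step so that every layer can be inspected later.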

Documentation

For more details on types, library structure, and available models please refer to our Docs.

Example Usage

Note: If you don’t have a local GPU, you can run our examples by making a copy of this Colab notebook.

Download example image

bash
wget "https://github.com/overeasy-sh/overeasy/blob/73adbaeba51f532a7023243266da826ed1ced6ec/examples/construction.jpg?raw=true" -O construction.jpg

Example workflow to identify whether each person on a work site is wearing PPE:

python
from overeasy import *
from overeasy.models import OwlV2
from PIL import Image

workflow = Workflow([
    # Detect each head in the input image
    BoundingBoxSelectAgent(classes=["person's head"], model=OwlV2()),
    # Applies Non-Maximum Suppression to remove overlapping bounding boxes
    NMSAgent(iou_threshold=0.5, score_threshold=0),
    # Splits the input image into images of each detected head
    SplitAgent(),
    # Classifies the split images using CLIP
    ClassificationAgent(classes=["hard hat", "no hard hat"]),
    # Maps the returned class names
    ClassMapAgent({"hard hat": "has ppe", "no hard hat": "no ppe"}),
    # Combines results back into a BoundingBox Detection
    JoinAgent()
])

image = Image.open("construction.jpg")
result, graph = workflow.execute(image)
workflow.visualize(graph)

Understanding the Code

Imagine you’re a director making a movie. In this case, the movie is a process for identifying if a person is wearing PPE. Each step of the workflow can be compared to the scenes in your movie:

  • BoundingBoxSelectAgent: The camera operator who finds each head in the crowd and frames it with a bounding box, using OwlV2.
  • NMSAgent: The editor who cuts duplicate takes, removing overlapping boxes so each head is framed only once.
  • SplitAgent: The director who gives each head its own scene by splitting the input image into one image per detected head.
  • ClassificationAgent: The props manager who tags each scene as “hard hat” or “no hard hat” using CLIP.
  • ClassMapAgent: The script supervisor who renames those tags to “has ppe” and “no ppe” for consistency.
  • JoinAgent: The final cut, which combines the labeled scenes back into a single bounding box detection.

Once the workflow is executed, you get both the results and an execution graph, which you can visualize to inspect every step.
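
To show what the NMS step does in principle, here is a minimal sketch of greedy Non-Maximum Suppression in plain Python. This is the textbook algorithm, not Overeasy's implementation; the helper names `iou` and `nms` and the exact comparison conventions (for example, whether a box at exactly the threshold is kept) are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5, score_threshold=0.0):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    # Discard low-scoring boxes, then visit the rest from best to worst.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for i in order:
        # Keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_threshold=0.5)
```

Here the first two boxes overlap with an IoU of 81/119 (about 0.68), so the lower-scoring one is suppressed and `kept` is `[0, 2]`.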

Diagram

Here’s a diagram of this workflow. Each layer in the graph represents a step in the workflow:

Execution Graph

The image and data attributes in each node are used together to visualize the current state of the workflow. Calling the visualize function on the workflow spawns a Gradio instance that displays this execution graph.

Troubleshooting

If you encounter issues during installation or execution, consider checking the following:

  • Ensure you have the latest versions of pip and Python installed.
  • Verify your network connection if installation fails.
  • Check if the required model files are downloaded properly.
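
As a starting point for the first check, the following sketch reports the Python version and the installed versions of a few packages using only the standard library. The function names `installed_version` and `environment_report` are hypothetical helpers written for this example; the sketch does not verify model downloads or network access.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("overeasy", "pip")):
    """Collect the Python version and the versions of the given packages."""
    report = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        report[name] = installed_version(name)
    return report


report = environment_report()
for name, version in report.items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

Including this output when you open an issue makes installation problems easier to reproduce.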


Support

If you have any questions or need assistance, please open an issue or reach out to us at help@overeasy.sh.


Let’s build amazing vision models together!
