Create powerful zero-shot vision models!
Overeasy allows you to chain zero-shot vision models to create custom end-to-end pipelines for tasks like:
- Bounding Box Detection
- Classification
- Segmentation (Coming Soon!)
All of this can be achieved without needing to collect and annotate large training datasets. Overeasy makes it simple to combine pre-trained zero-shot models to build powerful custom computer vision solutions.
Installation
It’s as easy as:
bash
pip install overeasy
For installing extras, refer to our Docs.
Key Features
- Agents: Specialized tools that perform specific image processing tasks.
- Workflows: Define a sequence of Agents to process images in a structured manner.
- Execution Graphs: Manage and visualize the image processing pipeline.
- Detections: Represent bounding boxes, segmentation, and classifications.
Documentation
For more details on types, library structure, and available models please refer to our Docs.
Example Usage
Note: If you don’t have a local GPU, you can run our examples by making a copy of this Colab notebook.
Download example image
bash
wget "https://github.com/overeasy-sh/overeasy/blob/73adbaeba51f532a7023243266da826ed1ced6ec/examples/construction.jpg?raw=true" -O construction.jpg
Example workflow to identify whether a person is wearing PPE on a work site:
python
from overeasy import *
from overeasy.models import OwlV2
from PIL import Image

workflow = Workflow([
    # Detect each head in the input image
    BoundingBoxSelectAgent(classes=['persons head'], model=OwlV2()),
    # Applies Non-Maximum Suppression to remove overlapping bounding boxes
    NMSAgent(iou_threshold=0.5, score_threshold=0),
    # Splits the input image into images of each detected head
    SplitAgent(),
    # Classifies the split images using CLIP
    ClassificationAgent(classes=['hard hat', 'no hard hat']),
    # Maps the returned class names
    ClassMapAgent({'hard hat': 'has ppe', 'no hard hat': 'no ppe'}),
    # Combines results back into a BoundingBox Detection
    JoinAgent()
])

image = Image.open('construction.jpg')
result, graph = workflow.execute(image)
workflow.visualize(graph)
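To make the `iou_threshold` and `score_threshold` arguments more concrete, here is a minimal pure-Python sketch of greedy Non-Maximum Suppression. It is an illustration of the general technique, not Overeasy's implementation: the `iou` and `nms` helpers, the `(x1, y1, x2, y2)` box format, and the choice of `>=` for the score filter are all assumptions made for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(
    boxes: List[Box],
    scores: List[float],
    iou_threshold: float = 0.5,
    score_threshold: float = 0.0,
) -> List[int]:
    """Return the indices of the boxes that survive, highest score first."""
    # Drop low-confidence boxes, then visit the rest from best to worst score.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


# Two near-duplicate boxes around one head, plus one separate box.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_threshold=0.5, score_threshold=0)
print(kept)  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.82, so it is suppressed; a higher `iou_threshold` would let more overlapping boxes through.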
Understanding the Code
Imagine you’re a director making a movie. In this case, the movie is a process for identifying whether a person is wearing PPE, and each step of the workflow is a member of your crew:
- BoundingBoxSelectAgent: The spotter who picks out each head in the crowd (here using OwlV2).
- NMSAgent: The editor who cuts duplicate takes, so overlapping bounding boxes for the same head are dropped.
- SplitAgent: The assistant director who gives each detected head its own scene by splitting it into a separate image.
- ClassificationAgent: The props manager who tags each scene as “hard hat” or “no hard hat” (using CLIP).
- ClassMapAgent: The script supervisor who renames those tags to “has ppe” and “no ppe” for continuity.
- JoinAgent: The final cut, which combines the per-head results back into a single bounding-box detection for the big picture.
Once the workflow is executed, you get both the result and an execution graph, which you can pass to workflow.visualize to inspect each step.
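The split, classify, map, and join steps described above can be sketched without any models at all. The toy below is only a conceptual illustration: the grid "image", the hard-coded boxes, and the `crop`, `stub_classifier`, and `run_pipeline` helpers are hypothetical stand-ins and are not part of the Overeasy API.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # assumed format: (x1, y1, x2, y2)
Image = List[List[int]]          # toy image: a grid of integer "pixels"


def crop(image: Image, box: Box) -> Image:
    """Split step: cut the region inside the box out of the image."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def stub_classifier(patch: Image) -> str:
    """Stand-in for a zero-shot classifier: pixel value 1 means 'hard hat'."""
    return "hard hat" if any(1 in row for row in patch) else "no hard hat"


def run_pipeline(
    image: Image,
    boxes: List[Box],
    classify: Callable[[Image], str],
    class_map: Dict[str, str],
) -> List[Tuple[Box, str]]:
    crops = [crop(image, b) for b in boxes]                  # split
    labels = [classify(c) for c in crops]                    # classify
    mapped = [class_map.get(name, name) for name in labels]  # map class names
    return list(zip(boxes, mapped))                          # join with boxes


toy_image = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
]
# Pretend a detector already returned these two head boxes.
head_boxes = [(0, 0, 2, 2), (4, 2, 6, 4)]

result = run_pipeline(
    toy_image,
    head_boxes,
    stub_classifier,
    {"hard hat": "has ppe", "no hard hat": "no ppe"},
)
print(result)  # [((0, 0, 2, 2), 'has ppe'), ((4, 2, 6, 4), 'no ppe')]
```

The point of the sketch is the data flow: each detection is classified in isolation on its own crop, and the join step reattaches the mapped label to the original box.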
Diagram
Here’s a diagram of this workflow. Each layer in the graph represents a step in the workflow:
The image and data attributes in each node are used together to visualize the current state of the workflow. Calling the visualize function on the workflow will spawn a Gradio instance that looks like this.
Troubleshooting
If you encounter issues during installation or execution, consider checking the following:
- Ensure you have the latest versions of pip and Python installed.
- Verify your network connection if installation fails.
- Check if the required model files are downloaded properly.
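As a quick way to gather the version details from the checklist above, the following standard-library sketch reports the running Python version and the installed versions of `pip` and `overeasy`. The `environment_report` helper is a hypothetical name introduced for this example; it does not check model files, since where those are stored is not specified here.

```python
import sys
from importlib import metadata


def environment_report(packages=("pip", "overeasy")):
    """Collect the Python version and installed versions of the given packages."""
    report = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed in this environment
    return report


if __name__ == "__main__":
    for name, version in environment_report().items():
        print(f"{name}: {version or 'not installed'}")
```

Including this output when you open an issue makes installation problems easier to reproduce.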
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Support
If you have any questions or need assistance, please open an issue or reach out to us at help@overeasy.sh.
Let’s build amazing vision models together!