Welcome to your guide on using knowledge-distilled versions of Stable Diffusion that are not only smaller but also significantly faster! Unofficially based on the methods outlined in BK-SDM, this repository harnesses knowledge distillation to create models that approach the quality of their larger counterparts while cutting inference time considerably.
Overview of the Repository Components
This repository comprises several essential scripts and configurations you will need for your training process:
- data.py: Contains scripts to download the training data.
- distill_training.py: Trains the distilled U-Net. Here you configure the model type (e.g., sd_small or sd_tiny), batch size, and other hyperparameters.
Additional training options such as LoRA training and checkpointing are available via the standard diffusers scripts.
Understanding Knowledge Distillation
Think of Knowledge Distillation training as a mentorship program. Imagine a seasoned chef (the large teacher model), who knows how to create exquisite dishes, teaching an eager apprentice (the student model) to replicate those creations. The apprentice learns not just by watching but by actively cooking smaller portions using simplified recipes, aiming to match the chef’s quality over time.
When training, the larger model acts as the teacher, while the smaller student model trains on a subset of the same data. Rather than learning from the data alone, the student is pushed to mimic the teacher's outputs and, at the feature level, its intermediate activations. The ultimate goal is for the smaller model to produce images that are almost indistinguishable from those generated by the larger teacher model.
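To make this concrete, here is a minimal sketch of one distillation training step in PyTorch. All names and signatures here are hypothetical placeholders, not the repository's actual API; the real loop lives in distill_training.py.

```python
import torch
import torch.nn.functional as F

def distillation_step(teacher_unet, student_unet, latents, timesteps, text_emb,
                      noise, output_weight=0.5, feature_weight=0.5):
    # The teacher is frozen: it only provides targets and receives no gradients.
    with torch.no_grad():
        teacher_pred, teacher_feats = teacher_unet(latents, timesteps, text_emb)

    student_pred, student_feats = student_unet(latents, timesteps, text_emb)

    # Standard denoising loss against the true noise target.
    task_loss = F.mse_loss(student_pred, noise)
    # Output-level KD: mimic the teacher's predicted noise.
    output_kd = F.mse_loss(student_pred, teacher_pred)
    # Feature-level KD: mimic intermediate U-Net activations, block by block.
    feature_kd = sum(F.mse_loss(s, t) for s, t in zip(student_feats, teacher_feats))

    return task_loss + output_weight * output_kd + feature_weight * feature_kd
```

Note that diffusers' U-Net models do not return intermediate features directly; in practice they are typically captured with forward hooks.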
Training Details & Settings
You can use the provided scripts to train your models, adjusting parameters to your needs. Here are the key training settings:
```python
lr = 1e-5
scheduler = "cosine"
batch_size = 32
output_weight = 0.5   # lambda_out in the final loss equation
feature_weight = 0.5  # lambda_feat in the final loss equation
```
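For reference, these two weights enter a BK-SDM-style objective, which adds output-level and feature-level distillation terms to the ordinary denoising (task) loss:

$$\mathcal{L} = \mathcal{L}_{\text{task}} + \lambda_{\text{out}}\,\mathcal{L}_{\text{outKD}} + \lambda_{\text{feat}}\,\mathcal{L}_{\text{featKD}}$$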
Model Usage
Here’s a small code snippet to help you get started:
```python
import torch
from diffusers import DiffusionPipeline
from diffusers import DPMSolverMultistepScheduler
from torch import Generator

path = "segmind/small-sd"  # path to the distilled model
prompt = "Faceshot Portrait of pretty young (18-year-old) Caucasian wearing a high neck sweater..."
negative_prompt = "deformed iris, semi-realistic, 3d, render, sketch..."

torch.set_grad_enabled(False)
torch.backends.cudnn.benchmark = True

with torch.inference_mode():
    gen = Generator("cuda")
    gen.manual_seed(1674753452)

    pipe = DiffusionPipeline.from_pretrained(
        path,
        torch_dtype=torch.float16,
        safety_checker=None,
        requires_safety_checker=False,
    )
    pipe.to("cuda")
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    pipe.unet.to(device="cuda", dtype=torch.float16, memory_format=torch.channels_last)

    img = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        width=512,
        height=512,
        num_inference_steps=25,
        guidance_scale=7,
        num_images_per_prompt=1,
        generator=gen,
    ).images[0]
    img.save("image.png")
```
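If you want the even smaller sd_tiny variant, the snippet works unchanged with a different checkpoint path (assuming the distilled weights are published under this name on the Hugging Face Hub):

```python
path = "segmind/tiny-sd"  # sd_tiny variant; everything else stays the same
```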
Training Instructions
Follow the outlined structure to train the model effectively:
- Specify `--distill_level` based on the model you aim to create (e.g., sd_small or sd_tiny), as shown in the launch sketch below.
- For nuanced control over your training loss, adjust `--output_weight` and `--feature_weight`.
- When resuming from checkpoints, set `prepare_unet=False`.
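Putting these together, a hypothetical launch might look like the following; only the three flags above appear in this guide, so check `python distill_training.py --help` for everything else:

```bash
python distill_training.py \
  --distill_level sd_small \
  --output_weight 0.5 \
  --feature_weight 0.5
```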
Troubleshooting Tips
Here are some common issues and their solutions:
- If your model isn’t generating images, verify the model path and prompts for accuracy.
- Error messages during training often stem from incorrect hyperparameters; recheck your settings, especially batch sizes.
- If performance is lagging, make sure you are actually running on the GPU (see the quick check below) and consider lowering the batch size for improved speed.
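For that last point, here is a quick, repo-agnostic way to confirm that PyTorch actually sees your GPU:

```python
import torch

if torch.cuda.is_available():
    print("Using GPU:", torch.cuda.get_device_name(0))
else:
    print("CUDA not available; generation will fall back to the much slower CPU.")
```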
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
This innovative approach to creating smaller, faster models is pivotal as we continue to make strides in AI. At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.