How to Get Started with LaDI-VTON

Nov 17, 2020 | Data Science

LaDI-VTON (Latent Diffusion Textual-Inversion Enhanced Virtual Try-On) opens a new realm in the virtual fashion industry by leveraging advanced generative models to create realistic images of clothing on virtual models. In this guide, we will walk you through the process of using LaDI-VTON, from installation to inference, and troubleshooting.

Getting Started

The first step to harnessing the power of LaDI-VTON is setting up your environment. We recommend using the Anaconda package manager to avoid dependency issues.

Installation

  1. Clone the repository:

     git clone https://github.com/miccunifi/ladi-vton

  2. Install the Python dependencies using the provided environment file:

     conda env create -n ladi-vton -f environment.yml
     conda activate ladi-vton

  3. Alternatively, create a new conda environment manually:

     conda create -n ladi-vton -y python=3.10
     conda activate ladi-vton
     pip install torch==2.0.1 torchvision==0.15.2 opencv-python==4.7.0.72 diffusers==0.14.0 transformers==4.27.3 accelerate==0.18.0 clean-fid==0.1.35 torchmetrics[image]==0.11.4 wandb==0.14.0 matplotlib==3.7.1 tqdm xformers
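If you go the manual route, the long pip one-liner is easy to mistype. As a convenience (not part of the official repo), the sketch below writes the same version pins to a requirements.txt so the install becomes `pip install -r requirements.txt`:

```python
# Sketch: generate a requirements.txt mirroring the pinned pip command above.
PINNED = {
    "torch": "2.0.1",
    "torchvision": "0.15.2",
    "opencv-python": "4.7.0.72",
    "diffusers": "0.14.0",
    "transformers": "4.27.3",
    "accelerate": "0.18.0",
    "clean-fid": "0.1.35",
    "torchmetrics[image]": "0.11.4",
    "wandb": "0.14.0",
    "matplotlib": "3.7.1",
}
UNPINNED = ["tqdm", "xformers"]  # no version pinned in the original command

def requirements_lines():
    lines = [f"{name}=={ver}" for name, ver in PINNED.items()]
    lines.extend(UNPINNED)
    return lines

if __name__ == "__main__":
    with open("requirements.txt", "w") as fh:
        fh.write("\n".join(requirements_lines()) + "\n")
```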

Data Preparation

LaDI-VTON works with two major datasets: DressCode and VITON-HD. Follow the steps to set them up:

For DressCode:

  1. Download the DressCode dataset.
  2. To enhance performance, use in-shop images with a white background.
  3. Download the pre-extracted masks linked in the repository README, and place them in the dataset folder alongside your images.

For VITON-HD:

  1. Download the VITON-HD dataset.
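Before moving on, it helps to confirm the dataset folders actually landed where you expect. This is an illustrative sanity check, not part of the repo; the subfolder names are assumptions, so consult the repository README for the exact layout each dataset requires:

```python
from pathlib import Path

# Sketch: verify a dataset root contains the expected subfolders before
# pointing --dresscode_dataroot / --vitonhd_dataroot at it. The names in
# `required` are examples; adjust them to the layout in the README.
def check_dataroot(root, required=("train", "test")):
    root = Path(root)
    missing = [name for name in required if not (root / name).is_dir()]
    return missing  # an empty list means the layout looks OK

if __name__ == "__main__":
    missing = check_dataroot("/data/vitonhd")
    if missing:
        print(f"Missing subfolders: {missing}")
```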

Inference with Pre-trained Models

To run inference on either dataset, use the command below:

python src/inference.py --dataset [dresscode | vitonhd] --dresscode_dataroot <path> --vitonhd_dataroot <path> --output_dir <path> --test_order [paired | unpaired] --category [all | lower_body | upper_body | dresses] --mixed_precision [no | fp16 | bf16] --enable_xformers_memory_efficient_attention --use_png --compute_metrics

Be sure to replace the placeholders with actual data paths.
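If you run inference repeatedly with different options, assembling the command in code is less error-prone than editing the one-liner by hand. A minimal sketch (the flag names come from the command above; the paths are placeholders, and the optional --use_png / --compute_metrics flags are omitted here):

```python
import shlex

# Sketch: build the inference argv programmatically.
def inference_cmd(dataset, dataroot_flag, dataroot, output_dir,
                  test_order="unpaired", category="all",
                  mixed_precision="fp16"):
    assert dataset in ("dresscode", "vitonhd")
    assert test_order in ("paired", "unpaired")
    return [
        "python", "src/inference.py",
        "--dataset", dataset,
        dataroot_flag, dataroot,
        "--output_dir", output_dir,
        "--test_order", test_order,
        "--category", category,
        "--mixed_precision", mixed_precision,
        "--enable_xformers_memory_efficient_attention",
    ]

if __name__ == "__main__":
    cmd = inference_cmd("vitonhd", "--vitonhd_dataroot",
                        "/data/vitonhd", "./results")
    print(shlex.join(cmd))  # paste or pass to subprocess.run
```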

Training Your Model

Training the model consists of several steps:

1. Train Warping Module:

python src/train_tps.py --dataset [dresscode | vitonhd] --dresscode_dataroot <path> --vitonhd_dataroot <path> --checkpoints_dir <path> --exp_name <name>

2. Train EMASC Module:

python src/train_emasc.py --dataset [dresscode | vitonhd] --dresscode_dataroot <path> --vitonhd_dataroot <path> --output_dir <path>

3. Train the Inversion Adapter:

python src/train_inversion_adapter.py --dataset [dresscode | vitonhd] --dresscode_dataroot <path> --vitonhd_dataroot <path> --output_dir <path>

4. Finally, train the VTO model:

python src/train_vto.py --dataset [dresscode | vitonhd] --dresscode_dataroot <path> --vitonhd_dataroot <path> --output_dir <path>
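The four stages above run in sequence, each as its own script. The sketch below strings them together so a failed stage stops the pipeline; it is a convenience wrapper, not part of the repo. The script names and flags mirror the commands above (note that train_tps.py takes --checkpoints_dir and --exp_name rather than --output_dir), while the dataroot, output directory, and the "warp" experiment name are placeholder assumptions:

```python
import subprocess

# Sketch: run the four training stages in order. dry_run=True only
# prints the commands instead of executing them.
def run_pipeline(dataset, dataroot_flag, dataroot, out, dry_run=True):
    stages = [
        ("src/train_tps.py", ["--checkpoints_dir", out, "--exp_name", "warp"]),
        ("src/train_emasc.py", ["--output_dir", out]),
        ("src/train_inversion_adapter.py", ["--output_dir", out]),
        ("src/train_vto.py", ["--output_dir", out]),
    ]
    executed = []
    for script, extra in stages:
        cmd = ["python", script, "--dataset", dataset,
               dataroot_flag, dataroot] + extra
        executed.append(cmd)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # raises on first failing stage
    return executed

if __name__ == "__main__":
    run_pipeline("dresscode", "--dresscode_dataroot", "/data/dresscode", "./ckpt")
```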

Troubleshooting

If you encounter issues while getting started, consider the following troubleshooting tips:

  • Ensure all dependencies are properly installed.
  • Check if your data paths are correctly set.
  • Verify that you have sufficient resources (memory, GPU capability) to run large models.
  • If you encounter CUDA errors, ensure that your GPU drivers are up to date.
  • For any unexpected errors, refer to the repository's GitHub Issues page for solutions.
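A couple of the tips above can be checked automatically before a long run. This minimal pre-flight sketch (my own helper, not from the repo) only confirms the NVIDIA driver tool is on PATH and the data path exists; it does not prove CUDA works with your installed torch build:

```python
import os
import shutil

# Sketch: quick pre-flight checks matching the troubleshooting tips.
def preflight(dataroot):
    return {
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
        "dataroot_exists": os.path.isdir(dataroot),
    }

if __name__ == "__main__":
    print(preflight("/data/vitonhd"))
```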


In Conclusion

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations. Enjoy creating stunning virtual try-ons with LaDI-VTON!
