DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation

Feb 22, 2023 | Data Science


Chong Zeng · Yue Dong · Pieter Peers · Youkang Kong · Hongzhi Wu · Xin Tong

SIGGRAPH 2024 Conference Proceedings

Project Page | arXiv | DiLightNet Model | DiLightNet Demonstration

DiLightNet is a method for fine-grained lighting control in text-driven, diffusion-based image generation. It works in three stages: provisional image generation, foreground synthesis, and background inpainting. This repository provides open-source access to the ControlNet model used in the second stage, a neural network that takes a provisional image, a mask, and radiance hints as input and generates a foreground image under the target lighting.

Environment Setup
Usage
Training
Community Contributions
Citation

Environment Setup

Radiance hints are rendered with the Blender Python binding (bpy), and the implementation requires Python 3.10 or later. We therefore recommend creating a conda environment with Python 3.10, CUDA, and PyTorch, then installing the remaining dependencies from the repository:

bash
conda create --name dilightnet python=3.10 pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia
conda activate dilightnet
git clone https://github.com/iamNCJ/DiLightNet
cd DiLightNet
pip install -r requirements.txt
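
Before rendering or running inference, it can help to confirm that the environment matches the requirements above. The following is a minimal sketch and not part of the DiLightNet repository: the helper name `check_environment` is hypothetical, the package list is an assumption based on the libraries this article mentions, and the check only confirms that each package can be located, not that it works.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 10), packages=("torch", "bpy", "diffusers")):
    """Report whether the interpreter and key packages look usable.

    Hypothetical helper for illustration; the default package list is an
    assumption, not something the project prescribes.
    """
    report = {"python_ok": sys.version_info[:2] >= tuple(min_python)}
    for name in packages:
        # find_spec locates a top-level package without importing it.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'missing'}")
```

Any entry reported as missing points to a step in the setup commands that did not complete.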

Usage

To use DiLightNet, first load the NeuralTextureControlNetModel class and the pretrained model weights:

python
from diffusers.utils import get_class_from_dynamic_module
NeuralTextureControlNetModel = get_class_from_dynamic_module(
    "dilightnet.model_helpers",
    "neuraltexture_controlnet.py",
    "NeuralTextureControlNetModel"
)
neuraltexture_controlnet = NeuralTextureControlNetModel.from_pretrained("DiLightNet/DiLightNet")

Inference with StableDiffusionControlNetPipeline

The core model of DiLightNet is based on Stable Diffusion 2.1. Here’s how to set up an inference pipeline:

python
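import torch
from diffusers import StableDiffusionControlNetPipeline

# Attach the DiLightNet ControlNet (loaded above) to the Stable Diffusion 2.1 base model.
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", controlnet=neuraltexture_controlnet,
)
# Random 16-channel placeholder; replace it with a real conditioning tensor.
cond_image = torch.randn((1, 16, 512, 512))
image = pipe("some text prompt", image=cond_image).images[0]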

As an analogy, think of the neural network as a skilled artist. The provisional image is the initial sketch. The mask is a stencil that marks which areas the artist should paint and which to leave blank. The radiance hints dictate the lighting of the scene, telling the artist where shadows and highlights should fall. Together, these three inputs give fine-grained control over how the generated foreground is lit.
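
The pipeline example above passes random noise as a 16-channel conditioning tensor. The sketch below shows one way such a tensor could be assembled from a mask, a provisional image, and radiance hints. It rests on assumptions that this article does not confirm: that the 16 channels break down as 1 mask + 3 provisional image + 4 RGB radiance hints, that they are stacked in that order, and that the provisional image is multiplied by the mask. The function name `build_condition` is hypothetical, and NumPy is used only to keep the example lightweight.

```python
import numpy as np


def build_condition(mask, provisional, hints):
    """Stack a mask, a provisional image, and radiance hints into (1, C, H, W).

    ASSUMPTIONS (not confirmed by the source): the channel order is
    mask -> masked provisional image -> hints, and four RGB hints are used,
    giving 1 + 3 + 4 * 3 = 16 channels.
    """
    mask = np.asarray(mask, dtype=np.float32)                # (H, W), values in [0, 1]
    provisional = np.asarray(provisional, dtype=np.float32)  # (H, W, 3)
    if provisional.shape != mask.shape + (3,):
        raise ValueError("provisional image must have shape (H, W, 3)")
    # Keep only the foreground of the provisional image.
    parts = [mask[..., None], provisional * mask[..., None]]
    for hint in hints:
        hint = np.asarray(hint, dtype=np.float32)
        if hint.shape != mask.shape + (3,):
            raise ValueError("every radiance hint must have shape (H, W, 3)")
        parts.append(hint)
    stacked = np.concatenate(parts, axis=-1)      # (H, W, C)
    return stacked.transpose(2, 0, 1)[None, ...]  # (1, C, H, W)


# Dummy inputs, only to show the resulting shape.
h, w = 64, 64
cond = build_condition(
    mask=np.ones((h, w)),
    provisional=np.zeros((h, w, 3)),
    hints=[np.zeros((h, w, 3)) for _ in range(4)],
)
print(cond.shape)  # (1, 16, 64, 64)
```

The result can be converted with `torch.from_numpy(cond)` before it is passed as `image=`. Consult the repository for the actual channel layout and value ranges before relying on this ordering.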

Troubleshooting

If you encounter issues with dependencies, ensure that you are using the correct versions of Python and the required libraries.
For problems related to image processing or model loading, verify the paths and formats of the input images.
If the generation output does not meet expectations, experiment with different seeds and lighting parameters.

Training

Preparing the training data takes several steps and requires careful organization of the 3D model data and rendering scripts. The process starts with scripts that render the 3D models under specified lighting conditions and ends with generating JSONL files that index the rendered data for training.
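
The JSONL step mentioned above could look roughly like the sketch below. JSONL simply means one JSON object per line. Every field name and file path in the records (`image`, `hint`, `text`, `renders/...`) is a hypothetical placeholder; the schema the project's training scripts actually expect is not described here.

```python
import json
from pathlib import Path


def write_jsonl(records, path):
    """Write one JSON object per line, the layout JSONL readers expect."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical records: every key and path here is a placeholder.
records = [
    {"image": "renders/obj_0001/light_00.png",
     "hint": "renders/obj_0001/hint_00.png",
     "text": "a ceramic teapot"},
    {"image": "renders/obj_0001/light_01.png",
     "hint": "renders/obj_0001/hint_01.png",
     "text": "a ceramic teapot"},
]

if __name__ == "__main__":
    out = write_jsonl(records, "train.jsonl")
    print(f"wrote {len(read_jsonl(out))} records to {out}")
```

Keeping one record per rendered sample makes the index easy to stream, filter, or shard without loading the whole file into memory.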

Community Contributions

We welcome contributions in various forms, such as improved image generation pipelines, community adaptations of the ControlNet model, and even enhancements for better sampling strategies. Feel free to open issues or submit pull requests to enrich this project!

Citation

If you find DiLightNet useful for your research, please cite it as follows:

@inproceedings{zeng2024dilightnet,
    title={DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation},
    author={Chong Zeng and Yue Dong and Pieter Peers and Youkang Kong and Hongzhi Wu and Xin Tong},
    booktitle={ACM SIGGRAPH 2024 Conference Papers},
    year={2024}
}
