Chong Zeng · Yue Dong · Pieter Peers · Youkang Kong · Hongzhi Wu · Xin Tong
SIGGRAPH 2024 Conference Proceedings
Project Page | arXiv | DiLightNet Model | DiLightNet Demonstration
DiLightNet is a method for fine-grained lighting control in text-driven, diffusion-based image generation. It works in three stages: provisional image generation, foreground synthesis, and background inpainting. This repository provides open-source access to the ControlNet model used in the second stage: a neural network that takes a provisional image, a mask, and radiance hints, and produces a foreground image lit under the target lighting conditions.
Environment Setup
We use the Blender Python binding (bpy) to render radiance hints. The implementation requires Python 3.10 or later, so we recommend creating a conda environment with Python 3.10, CUDA, and PyTorch:
bash
conda create --name dilightnet python=3.10 pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia
conda activate dilightnet
git clone https://github.com/iamNCJ/DiLightNet
cd DiLightNet
pip install -r requirements.txt
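Before moving on, it can help to confirm that the active environment matches the requirements above. The following is a minimal sketch and not part of the repository: the helper names `meets_min_python` and `missing_modules` are hypothetical, and the module list passed at the bottom assumes that `bpy`, `torch`, and `diffusers` are the packages worth checking.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the setup notes


def meets_min_python(version=None, minimum=MIN_PYTHON):
    """Return True if `version` (default: the running interpreter) is >= `minimum`."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package list for illustration; adjust to your requirements.txt.
    print("Python version OK:", meets_min_python())
    print("Missing modules:", missing_modules(["bpy", "torch", "diffusers"]))
```

Run it inside the activated `dilightnet` environment. An empty "Missing modules" list only shows that the packages can be found, not that their versions are compatible.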
Usage
To use DiLightNet, first load the NeuralTextureControlNetModel class and the model weights:
python
from diffusers.utils import get_class_from_dynamic_module
NeuralTextureControlNetModel = get_class_from_dynamic_module(
    "dilightnet.model_helpers",
    "neuraltexture_controlnet.py",
    "NeuralTextureControlNetModel"
)
neuraltexture_controlnet = NeuralTextureControlNetModel.from_pretrained("DiLightNet/DiLightNet")
Inference with StableDiffusionControlNetPipeline
The core model of DiLightNet is based on Stable Diffusion 2.1. Here’s how to set up an inference pipeline:
python
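import torch
from diffusers import StableDiffusionControlNetPipeline
# Build a Stable Diffusion 2.1 pipeline around the ControlNet loaded above.
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", controlnet=neuraltexture_controlnet,
)
# Random placeholder with the 16-channel conditioning shape; real inputs are built from the provisional image, mask, and radiance hints.
cond_image = torch.randn((1, 16, 512, 512))
image = pipe("some text prompt", image=cond_image).images[0]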
As an analogy, think of the neural network as a skilled artist. The provisional image you provide is the initial sketch. The mask is a stencil that tells the artist which areas to paint and which to leave blank. The radiance hints specify the lighting of the scene, so the artist can place shadows and highlights as they would in an actual painting. Together, these inputs give you fine-grained control over the final image.
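The conditioning tensor in the snippet above has 16 channels, and it may help to see one way those channels could be divided among the three inputs. The sketch below is an assumption, not something this README confirms: it supposes a 1-channel mask, a 3-channel RGB provisional image, and four 3-channel radiance hints, in that order. The names `ASSUMED_COMPONENTS`, `channel_layout`, and `total_channels` are hypothetical.

```python
# Assumed breakdown of the 16 conditioning channels (illustrative only):
# 1 mask channel + 3 RGB channels + four radiance hints of 3 channels each.
ASSUMED_COMPONENTS = [
    ("mask", 1),
    ("provisional_image", 3),
    ("radiance_hint_0", 3),
    ("radiance_hint_1", 3),
    ("radiance_hint_2", 3),
    ("radiance_hint_3", 3),
]


def channel_layout(components):
    """Map each component name to its half-open (start, stop) channel range."""
    layout = {}
    start = 0
    for name, width in components:
        layout[name] = (start, start + width)
        start += width
    return layout


def total_channels(components):
    """Total number of channels across all components."""
    return sum(width for _, width in components)


if __name__ == "__main__":
    for name, (start, stop) in channel_layout(ASSUMED_COMPONENTS).items():
        print(f"{name}: channels [{start}, {stop})")
    print("total:", total_channels(ASSUMED_COMPONENTS))
```

Under this assumed layout, the per-input tensors would be concatenated along the channel dimension to form the `(1, 16, 512, 512)` input. Check the repository's example code for the actual ordering before relying on it.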
Troubleshooting
- If you encounter issues with dependencies, ensure that you are using the correct versions of Python and the required libraries.
- For problems related to image processing or model loading, verify the paths and formats of the input images.
- If the generation output does not meet expectations, experiment with different seeds and lighting parameters.
Training
Training relies on rendered images of 3D models, so the 3D model data and rendering scripts need to be organized carefully. Data preparation proceeds in steps: scripts first render the 3D models under specified lighting conditions, and the results are then recorded in JSONL files used to manage the training data.
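To make the JSONL step concrete, here is a minimal sketch of writing and reading such a file with the Python standard library. It does not reproduce the repository's scripts: the helpers `write_jsonl` and `read_jsonl`, the field names (`image`, `mask`, `hints`, `text`), and the file paths are all hypothetical placeholders.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(records, path):
    """Write one JSON object per line to `path` and return it as a Path."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with Path(path).open("r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical record: one rendered view with its mask, hints, and caption.
EXAMPLE_RECORDS = [
    {
        "image": "renders/object_0001/view_00.png",
        "mask": "renders/object_0001/view_00_mask.png",
        "hints": ["renders/object_0001/view_00_hint_0.png"],
        "text": "a ceramic teapot",
    },
]

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out = write_jsonl(EXAMPLE_RECORDS, Path(tmp) / "train.jsonl")
        print(read_jsonl(out))
```

One record per line keeps the file appendable and lets a data loader stream samples without parsing the whole file at once.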
Community Contributions
We welcome contributions in various forms, such as improved image generation pipelines, community adaptations of the ControlNet model, and even enhancements for better sampling strategies. Feel free to open issues or submit pull requests to enrich this project!
Citation
If you find DiLightNet useful for your research, please cite it as follows:
@inproceedings{zeng2024dilightnet,
  title={DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation},
  author={Chong Zeng and Yue Dong and Pieter Peers and Youkang Kong and Hongzhi Wu and Xin Tong},
  booktitle={ACM SIGGRAPH 2024 Conference Papers},
  year={2024}
}

