DrugGPT applies natural language processing to pharmaceutical research. Built on the GPT architecture, it explores the vast space of possible molecules. This guide takes you step by step through deploying DrugGPT and using it to design potential drug candidates.
🚩 Introduction
DrugGPT is a generative model designed specifically for pharmaceutical applications. By leveraging up to 1.8 million data points related to protein-ligand binding, DrugGPT holds the promise of uncovering new molecules that can effectively bind specific proteins. The aim? To foster innovation in drug design and provide faster access to promising drug candidates.
📥 Deployment
To get started with DrugGPT, follow these deployment steps:
- Clone the repository
    git clone https://github.com/LIYUESEN/druggpt.git
    cd druggpt
Alternatively, you can visit our GitHub repo and click Code > Download ZIP to download the repository.
- Create a virtual environment
    conda create -n druggpt python=3.7
    conda activate druggpt
- Download Python dependencies
    pip install datasets transformers scipy scikit-learn
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
    conda install -c openbabel openbabel
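Before moving on, it can help to confirm that the packages installed above are visible from inside the druggpt environment. The following is a minimal sketch, not part of the DrugGPT repository: the helper name missing_modules is hypothetical, and the module list assumes the usual import names for these packages (scikit-learn imports as sklearn).

```python
from importlib.util import find_spec

# Import names for the packages installed above.
# Assumption: scikit-learn is imported as "sklearn", Open Babel as "openbabel".
REQUIRED_MODULES = ["datasets", "transformers", "scipy", "sklearn", "torch", "openbabel"]


def missing_modules(names):
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed dependencies were found.")
```

Run it with the druggpt environment activated. Any name it reports as missing points to the install command that needs to be repeated.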
🗝 How to Use
Once you’ve set up DrugGPT, you can harness its power through the drug_generator.py script. Below are the required parameters you’ll need and possible use cases:
Required Parameters:
- -p, --pro_seq: Input a protein amino acid sequence.
- -f, --fasta: Input a FASTA file. Only one of -p and -f should be specified.
- -l, --ligand_prompt: Input a ligand prompt.
- -e, --empty_input: Enable direct generation mode.
- -n, --number: Specify how many molecules to generate.
- -d, --device: Choose a hardware device (default is CUDA).
- -o, --output: Specify the output directory for generated molecules (default is ./ligand_output).
- -b, --batch_size: Define how many molecules will be generated per batch (default is 32).
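To make the relationships between these flags concrete, here is a minimal argparse sketch of an interface with the same options. It is an illustration only and is not taken from drug_generator.py: the function name build_parser is hypothetical, the mutually exclusive group is one way to express "only one of -p and -f", the lowercase "cuda" and "./ligand_output" defaults are assumptions based on the text above, and -n is left without a default because the guide does not state one.

```python
import argparse


def build_parser():
    """Build an illustrative parser mirroring the flags described above."""
    parser = argparse.ArgumentParser(
        description="Illustrative DrugGPT-style command line (not the real script)"
    )
    # -p and -f are alternatives: argparse rejects a command that gives both.
    target = parser.add_mutually_exclusive_group()
    target.add_argument("-p", "--pro_seq", type=str, help="protein amino acid sequence")
    target.add_argument("-f", "--fasta", type=str, help="path to a FASTA file")
    parser.add_argument("-l", "--ligand_prompt", type=str, help="ligand prompt")
    parser.add_argument("-e", "--empty_input", action="store_true",
                        help="enable direct generation mode")
    parser.add_argument("-n", "--number", type=int, help="how many molecules to generate")
    parser.add_argument("-d", "--device", default="cuda", help="hardware device")
    parser.add_argument("-o", "--output", default="./ligand_output",
                        help="output directory for generated molecules")
    parser.add_argument("-b", "--batch_size", type=int, default=32,
                        help="molecules generated per batch")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is self-contained.
args = build_parser().parse_args(["-f", "bcl2.fasta", "-n", "50"])
print(args.fasta, args.number, args.device, args.batch_size)
```

With this layout, a command that passes both -p and -f exits with a usage error instead of silently picking one of them.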
Example Usage:
- Input a protein FASTA file:
    python drug_generator.py -f bcl2.fasta -n 50
- Input the amino acid sequence of a protein:
    python drug_generator.py -p MAKQPSDVSSECDREGRQLQPAERPPQLRPGAPTSLQTEPQGNPEGNHGGEGDSCPHGSPQGPLAPPASPGPFATRSPLFIFMRRSSLLSRSSSGYFSFDTDRSPAPMSCDKSTQTPSPPCQAFNHYLSAMASMRQAEPADMRPEIWIAQELRRIGDEFNAYYARRVFLNNYQAAEDHPRMVILRLLRYIVRLVWRMH -n 50
- Provide a ligand prompt:
    python drug_generator.py -f bcl2.fasta -l COc1ccc(cc1)C(=O) -n 50
Note: If running in a Linux environment, enclose the ligand prompt in single quotes (').
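The quoting note matters because ligand prompts contain parentheses, which a Linux shell tries to interpret. If you launch the script from Python, the sketch below shows one way to handle this using the standard shlex module. The helper build_command is hypothetical and only assembles the flags described in this guide. It does not run DrugGPT.

```python
import shlex


def build_command(fasta_path, number, ligand_prompt=None):
    """Assemble the argument list for drug_generator.py from the flags in this guide."""
    cmd = ["python", "drug_generator.py", "-f", fasta_path, "-n", str(number)]
    if ligand_prompt:
        cmd += ["-l", ligand_prompt]
    return cmd


cmd = build_command("bcl2.fasta", 50, "COc1ccc(cc1)C(=O)")

# shlex.quote adds single quotes only where the shell needs them.
print(" ".join(shlex.quote(part) for part in cmd))
# prints: python drug_generator.py -f bcl2.fasta -n 50 -l 'COc1ccc(cc1)C(=O)'
```

Passing the list itself to subprocess.run(cmd) avoids the shell entirely, so no quoting is needed in that case. The quoted string is only for pasting into a terminal.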
🏗 Troubleshooting
If you encounter any issues while deploying or using DrugGPT, try the following troubleshooting tips:
- Ensure that all dependencies are correctly installed within the virtual environment.
- If you experience memory-related issues, consider reducing the --batch_size.
- Check that you are using the correct Python version, as DrugGPT requires Python 3.7.
- If the model fails to generate expected output, verify your input parameters for correctness.
- For persistent issues, consult the GitHub issues page for additional support.
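Two of the checks above can be scripted. The following preflight sketch is not part of DrugGPT: the function names are hypothetical, and passing "cpu" to --device when no GPU is available is an assumption about what the script accepts. It checks the interpreter version and asks PyTorch, if installed, whether CUDA is usable.

```python
import sys

# The guide states that DrugGPT requires Python 3.7.
REQUIRED_PYTHON = (3, 7)


def python_version_ok(version_info=sys.version_info):
    """Return True when the major and minor version match the required one."""
    return tuple(version_info[:2]) == REQUIRED_PYTHON


def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print("Python version matches 3.7:", python_version_ok())
print("Suggested --device value:", pick_device())
```

If the version check prints False, recreate the conda environment with python=3.7 before investigating other errors.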
At the Core of DrugGPT’s Operation: An Analogy
Think of using DrugGPT like being a chef in a lab kitchen. The recipes are your proteins and ligands, and the ingredients you can explore are the vast chemical compounds in the universe. Just as a chef would mix and match flavors to create an exquisite dish, you use DrugGPT to experiment with protein-ligand binding combinations, effectively discovering new “dishes” of drug candidates from the “pantry” of molecular structures available to you. The training on 1.8 million protein-ligand bindings serves as the chef’s experience—helping you create enticing new recipes (molecules) that have the potential to treat diseases.

