Welcome to the exciting world of PlantCaduceus, a cutting-edge DNA language model pre-trained on 16 angiosperm genomes! Whether you are a researcher, a machine learning enthusiast, or just curious about DNA sequence modeling, this guide will walk you through how to use the PlantCaduceus model effectively.
Model Overview
PlantCaduceus builds on the Caduceus and Mamba architectures and is trained with a masked language modeling objective, learning evolutionary conservation and DNA sequence grammar from a rich dataset spanning 160 million years of evolutionary history. Four versions of the model have been trained, each differing in size:
- PlantCaduceus_l20: 20 layers, 384 hidden size, 20M parameters
- PlantCaduceus_l24: 24 layers, 512 hidden size, 40M parameters
- PlantCaduceus_l28: 28 layers, 768 hidden size, 112M parameters
- PlantCaduceus_l32: 32 layers, 1024 hidden size, 225M parameters
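To make switching between checkpoints easy, you can keep the variants in a small lookup table. The sketch below is a hypothetical helper; the Hub IDs follow the pattern used later in this guide (`kuleshov-group/PlantCaduceus_l32`), so verify each repository name on the Hugging Face Hub before relying on it:

```python
# Hypothetical helper: map a variant name to its Hugging Face Hub ID.
# The IDs follow the naming pattern used in this guide; confirm them
# on the Hub before use.
PLANTCADUCEUS_VARIANTS = {
    "l20": "kuleshov-group/PlantCaduceus_l20",  # 20 layers, 384 hidden, ~20M params
    "l24": "kuleshov-group/PlantCaduceus_l24",  # 24 layers, 512 hidden, ~40M params
    "l28": "kuleshov-group/PlantCaduceus_l28",  # 28 layers, 768 hidden, ~112M params
    "l32": "kuleshov-group/PlantCaduceus_l32",  # 32 layers, 1024 hidden, ~225M params
}

def model_path_for(variant: str) -> str:
    """Return the Hub ID for a PlantCaduceus variant, e.g. 'l32'."""
    try:
        return PLANTCADUCEUS_VARIANTS[variant]
    except KeyError:
        raise ValueError(
            f"Unknown variant {variant!r}; choose from {sorted(PLANTCADUCEUS_VARIANTS)}"
        )
```

Smaller variants load faster and need less GPU memory; the larger ones generally produce richer representations.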
How to Use the Model
Using PlantCaduceus is a straightforward affair! Here’s how to set up and run the model in a Python environment:
```python
from transformers import AutoModelForMaskedLM, AutoTokenizer
import torch

# Pick the checkpoint and the device to run on.
model_path = 'kuleshov-group/PlantCaduceus_l32'
device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Load the pre-trained model and its tokenizer from the Hugging Face Hub.
model = AutoModelForMaskedLM.from_pretrained(model_path, trust_remote_code=True, device_map=device)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Encode a DNA sequence into token IDs.
sequence = "ATGCGTACGATCGTAG"
encoding = tokenizer.encode_plus(
    sequence,
    return_tensors="pt",
    return_attention_mask=False,
    return_token_type_ids=False,
)
input_ids = encoding["input_ids"].to(device)

# Run a forward pass and keep the hidden states for downstream use.
with torch.inference_mode():
    outputs = model(input_ids=input_ids, output_hidden_states=True)
```
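Because the forward pass requests `output_hidden_states=True`, the last layer's per-token embeddings are available in `outputs.hidden_states[-1]`. A common next step is to mean-pool them into one vector per sequence. Here is a minimal sketch of that pooling using a stand-in NumPy array in place of the real tensor (which you would get via `.cpu().numpy()`):

```python
import numpy as np

def mean_pool(hidden_states: np.ndarray) -> np.ndarray:
    """Average per-token embeddings into one vector per sequence.

    hidden_states: shape (batch, seq_len, hidden_size) — a stand-in for
    outputs.hidden_states[-1] converted to a NumPy array.
    """
    return hidden_states.mean(axis=1)

# Stand-in batch: 1 sequence, 16 tokens, hidden size 1024 (the l32 variant).
dummy = np.ones((1, 16, 1024), dtype=np.float32)
embedding = mean_pool(dummy)
print(embedding.shape)  # (1, 1024)
```

The resulting fixed-length vector can feed a downstream classifier or clustering step.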
Understanding the Code Analogy
Think of the code above as preparing a delicious recipe. Each step is crucial and builds on the last to create a final dish of information:
- Gathering Ingredients: The imports bring in the necessary libraries, like collecting all the ingredients you need for your recipe.
- Choosing Your Dish: Setting the model path and device (`cuda` or `cpu`) is like deciding what dish you’re going to make – some dishes need an oven, others can be done on a stovetop.
- Mixing Your Ingredients: Instantiating the model and tokenizer provides the structure to hold everything together, much like mixing your base ingredients to form a batter.
- Adding Your Flavors: Tokenizing the input sequence encodes the DNA sequence into a numerical format, similar to adding spices that flavor your dish.
- Cooking: The inference step runs the model to create outputs, akin to putting everything into the oven and waiting for your dish to bake.
Troubleshooting Tips
If you run into issues while using PlantCaduceus, here are a few troubleshooting steps:
- Ensure that the correct version of Python and the required libraries (`transformers` and `torch`) are installed.
- Check your device configuration. If CUDA is not recognized, make sure that the appropriate graphics drivers are installed.
- Look over the input sequence format. Make sure it adheres to the expected DNA format.
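For the last tip, a quick pre-check can catch malformed sequences before they reach the tokenizer. This is a hypothetical validator, not part of PlantCaduceus itself; the exact alphabet the tokenizer accepts (e.g. `N` or lowercase bases) may differ, so check its vocabulary for your use case:

```python
# Hypothetical pre-check: accept only the four canonical nucleotides.
VALID_BASES = set("ACGT")

def is_valid_dna(sequence: str) -> bool:
    """Return True if the sequence is non-empty and contains only A, C, G, T."""
    return bool(sequence) and set(sequence.upper()) <= VALID_BASES

print(is_valid_dna("ATGCGTACGATCGTAG"))  # True
print(is_valid_dna("ATGX"))              # False
```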
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

