How to Use CodeFuse-DeepSeek-33B for Code Generation

Feb 11, 2024 | Educational

Welcome to the world of AI-driven code generation! Today, we’re diving deep into how to effectively use the powerful CodeFuse-DeepSeek-33B model. This sophisticated model has been fine-tuned specifically for various code-related tasks to boost productivity and efficiency. Let’s embark on a journey to understand the model’s features, how to run it, and what to do if you encounter any bumps along the way.

Model Overview

CodeFuse-DeepSeek-33B is a 33-billion-parameter Code-LLM (Code Large Language Model) fine-tuned with QLoRA (quantized low-rank adaptation). It scores 78.65% pass@1 on the HumanEval benchmark, placing it among the strongest open models for code generation.
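For context on what that number measures: for each HumanEval problem, n candidate solutions are sampled, and pass@k estimates the probability that at least one of k drawn candidates passes the unit tests. Below is a minimal sketch of the standard unbiased estimator from the Codex paper; the function name and the example figures are ours, purely for illustration.

import math

def pass_at_k(n: int, c: int, k: int) -> float:
    # Unbiased pass@k estimator: n samples generated, c of them correct.
    if n - c < k:
        return 1.0  # every possible draw of k samples contains a correct one
    # 1 - C(n-c, k) / C(n, k), computed stably as a running product
    return 1.0 - math.prod(1.0 - k / i for i in range(n - c + 1, n + 1))

# Example: 10 samples per problem, 6 correct -> pass@1 estimate of 0.6
print(pass_at_k(n=10, c=6, k=1))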

Getting Started

Here’s a step-by-step guide to help you set up and utilize CodeFuse-DeepSeek-33B:

1. Requirements

  • Python version: 3.8
  • PyTorch: 2.0.0
  • Transformers: 4.33.2
  • SentencePiece
  • CUDA: 11.4

2. Setting Up the Environment

Ensure that you have installed the required libraries. You can do this through pip:

pip install torch==2.0.0 transformers==4.33.2 sentencepiece
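Before loading a 33-billion-parameter model, it's worth a quick sanity check that the versions match and a GPU is visible (a minimal sketch; all calls are standard PyTorch and Transformers):

import torch
import transformers

print("torch:", torch.__version__)                 # expect 2.0.0
print("transformers:", transformers.__version__)   # expect 4.33.2
print("CUDA available:", torch.cuda.is_available())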

3. Load the Model

Use the following code snippet to load the CodeFuse-DeepSeek-33B model:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_dir = "codefuse-ai/CodeFuse-DeepSeek-33B"

def load_model_tokenizer(model_path):
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    # The model's end-of-sentence token doubles as the padding token.
    tokenizer.eos_token = "<|end▁of▁sentence|>"
    tokenizer.pad_token = "<|end▁of▁sentence|>"
    tokenizer.eos_token_id = tokenizer.convert_tokens_to_ids(tokenizer.eos_token)
    tokenizer.pad_token_id = tokenizer.convert_tokens_to_ids(tokenizer.pad_token)
    # Left-padding keeps each prompt flush against its generated tokens,
    # which is what decoder-only models need for batched generation.
    tokenizer.padding_side = "left"

    # device_map="auto" spreads the 33B weights across available GPUs;
    # bfloat16 halves memory use relative to float32.
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        device_map="auto",
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,
    )
    return model, tokenizer

model, tokenizer = load_model_tokenizer(model_dir)
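Keep in mind that in bfloat16 the weights alone take roughly 66 GB of GPU memory (2 bytes per parameter), so device_map="auto" will typically shard the model across several GPUs. If VRAM is tight, a 4-bit quantized load along these lines is worth trying. This is a hedged sketch using the bitsandbytes integration in Transformers, not part of the official CodeFuse instructions; you'll also need pip install bitsandbytes accelerate, and some quality loss is possible:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumption: bitsandbytes and accelerate are installed alongside transformers.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # store weights in 4-bit, compute in bf16
)

model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    device_map="auto",
    quantization_config=quant_config,
    trust_remote_code=True,
)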

Understanding the Code with an Analogy

Think of the CodeFuse-DeepSeek-33B model as a skilled chef in a busy restaurant kitchen. Just as a chef combines ingredients and cooks based on specific recipes to create delicious meals, this AI model combines programming concepts to generate code based on the provided prompts. The training it underwent on a diverse array of coding tasks allows it to serve up perfectly crafted coding solutions quickly and adeptly.

Formatting Your Input

The model expects its input in a specific conversation template, and the prompt should always end with an empty bot: turn, from which the model continues generating. Here's how to format your messages:

Multi-Round with System Prompt:

system: System instructions
human: Human first-round input
bot: Bot first-round output<|end▁of▁sentence|>
human: Human second-round input
bot: Bot second-round output<|end▁of▁sentence|>
...
human: Human nth-round input
bot:

Single-Round without System Prompt:

human: User prompt...
bot:

Example Prompt for Code Generation:

human: # language: Python
from typing import List
def separate_paren_groups(paren_string: str) -> List[str]:
    # Input to this function is a string containing multiple groups of nested parentheses.
    # Your goal is to separate those groups into separate strings and return the list of those.
    # >>> separate_paren_groups('( ) (( )) (( )( ))')
    # ['()', '(())', '(()())']
bot:
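Rather than assembling these strings by hand, a small helper can render a message list into the template above. This is a minimal sketch; build_prompt and the (role, content) message structure are our own convention, not part of the model's API:

EOS = "<|end▁of▁sentence|>"

def build_prompt(messages):
    # messages is a list of (role, content) pairs; roles are
    # "system", "human", or "bot". The prompt always ends with an
    # empty bot turn, which the model then completes.
    parts = []
    for role, content in messages:
        suffix = EOS if role == "bot" else ""
        parts.append(f"{role}: {content}{suffix}")
    parts.append("bot: ")
    return "\n".join(parts)

prompt = build_prompt([
    ("human", "# language: Python\n# Write a function that reverses a string."),
])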

Running Your Model

Now that everything is set up, you can run the model with the snippet below; text_list holds one or more prompts formatted as described above:

# Reuse the prompt assembled above; any correctly formatted string works.
text_list = [prompt]

inputs = tokenizer(text_list, return_tensors="pt", padding=True, add_special_tokens=False).to("cuda")
input_ids = inputs["input_ids"]
attention_mask = inputs["attention_mask"]

# Note: with do_sample=False the model decodes greedily, so temperature
# and top_p are ignored; they only take effect when do_sample=True.
generation_config = GenerationConfig(
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
    temperature=0.1,
    max_new_tokens=512,
    num_return_sequences=1,
    num_beams=1,
    top_p=0.95,
    do_sample=False,
)

outputs = model.generate(
    inputs=input_ids,
    attention_mask=attention_mask,
    generation_config=generation_config,
)

# Slice off the prompt tokens so only the newly generated text is decoded.
gen_text = tokenizer.batch_decode(outputs[:, input_ids.shape[1]:], skip_special_tokens=True)
print(gen_text[0])
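Greedy decoding (do_sample=False) returns the single most likely completion, which is usually what you want for code. If you'd like more varied candidates, switch to sampling; the values below are reasonable starting points rather than official recommendations:

sampling_config = GenerationConfig(
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
    do_sample=True,       # sampling enabled, so temperature/top_p now apply
    temperature=0.2,      # low temperature keeps code completions focused
    top_p=0.95,
    max_new_tokens=512,
)

outputs = model.generate(
    inputs=input_ids,
    attention_mask=attention_mask,
    generation_config=sampling_config,
)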

Troubleshooting Common Issues

If you encounter any issues, consider the following troubleshooting tips:

  • Model Loading Errors: Ensure your model path is correct and that the necessary libraries are installed.
  • CUDA Errors: Verify that your installed CUDA version is compatible with your PyTorch build (see the diagnostic snippet after this list).
  • Memory Issues: If you run out of GPU memory, reduce the batch size or max_new_tokens, or load the model in 4-bit as sketched earlier.
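The sketch below prints the information needed for the CUDA and memory checks above; all calls are standard PyTorch, and the output naturally depends on your machine:

import torch

print("CUDA build:", torch.version.cuda)  # CUDA version PyTorch was built against
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, {props.total_memory / 1e9:.1f} GB total")
    print(f"  allocated: {torch.cuda.memory_allocated(i) / 1e9:.1f} GB")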

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Conclusion

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
