How to Use DeepSeek-Coder-V2 for Code Intelligence

Jul 5, 2024 | Educational

DeepSeek-Coder-V2 is a powerful open-source code language model that is setting new standards in code intelligence. In this article, we’ll show you how to use DeepSeek-Coder-V2 effectively for code completion, code insertion, and chat, and we’ll walk through common issues so your experience stays smooth.

Introduction

DeepSeek-Coder-V2 is a strong open-source addition to the code-model landscape, with reported performance on code-specific tasks comparable to leading proprietary models such as GPT-4 Turbo. Built on a Mixture-of-Experts (MoE) architecture, it delivers substantially improved coding and mathematical reasoning. Supported programming languages have grown from 86 to 338, and the context length extends to 128K tokens, so the model is well equipped for large, complex coding tasks.

How to Run Locally

To run DeepSeek-Coder-V2 locally, you will need a Python environment with Hugging Face’s Transformers library and, for the snippets below, a CUDA-capable GPU.
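Before loading the model, it helps to confirm the core dependencies are in place. The following is a minimal sketch, assuming torch and transformers are already installed (for example via pip install torch transformers); adjust package versions to your setup.

# Minimal environment check (a sketch; assumes torch and transformers are installed)
import torch
import transformers

print("transformers version:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))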

Inference with Hugging Face’s Transformers

To get started, you can use the following code snippets:

#### Code Completion


from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-Coder-V2-Lite-Base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-Coder-V2-Lite-Base", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()

input_text = "#write a quick sort algorithm"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
# Unpack the tokenizer output so input_ids and attention_mask are passed as keyword arguments
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

In this code, imagine you are a chef (the model) preparing a dish (the code). You look at a recipe (the input text) for a quick sort algorithm. Based on the recipe, you gather your ingredients (tokens) and start cooking (code generation). At the end, you plate your dish (output the result).
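The snippet above prints the prompt together with the completion. If you only want the newly generated text, one small post-processing sketch (not part of the original example) is to slice off the prompt tokens before decoding:

# Decode only the tokens generated after the prompt (reuses inputs and outputs from the completion example)
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))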

#### Code Insertion


from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-Coder-V2-Lite-Base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-Coder-V2-Lite-Base", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()

input_text = """<|fim▁begin|>def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    left = []
    right = []
<|fim▁hole|>        if arr[i] < pivot:
            left.append(arr[i])
        else:
            right.append(arr[i])
    return quick_sort(left) + [pivot] + quick_sort(right)<|fim▁end|>"""

inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True)[len(input_text):])

Think of this process as handing the chef a recipe with one step missing: the model sees what comes before and after the gap (the prefix and suffix) and writes the missing piece, in this case the loop over the remaining elements, so the finished function reads as a complete whole.
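If you build fill-in-the-middle prompts often, a tiny helper keeps the sentinel tokens in one place. This is a convenience sketch: the helper name is our own, and the sentinel strings are simply the ones used in the example above.

# Hypothetical helper (not part of the official API): wrap a prefix and suffix in the fill-in-the-middle sentinels
def build_fim_prompt(prefix: str, suffix: str) -> str:
    return f"<｜fim▁begin｜>{prefix}<｜fim▁hole｜>{suffix}<｜fim▁end｜>"

# Example: ask the model to fill the body of a function given its signature and return statement
prompt = build_fim_prompt("def add(a, b):\n", "    return result\n")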

#### Chat Completion


from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()

messages = [{'role': 'user', 'content': "write a quick sort algorithm in python."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, top_k=50, top_p=0.95, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))

In this case, you’re having a conversation with a knowledgeable friend (the model) who expertly guides you through the thought process of coding in real-time, responding to your questions and clarifications seamlessly.
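A natural next step is a follow-up turn. The sketch below is our own continuation of the example, not part of the official snippet: it appends the model’s reply to the message list and asks a second question.

# Sketch of a second turn: keep the conversation history and ask a follow-up question
reply = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True)
messages.append({'role': 'assistant', 'content': reply})
messages.append({'role': 'user', 'content': "Now add type hints and a docstring."})

inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, eos_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))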

Troubleshooting

While using DeepSeek-Coder-V2, you might encounter some common issues:

1. CUDA Runtime Error: Ensure you’re using a compatible GPU and have the necessary CUDA toolkit installed. If no suitable GPU is available, you can fall back to CPU inference (see the sketch after this list).
2. Model Download Issues: Verify your internet connection and that the model links are not broken.
3. Tokenization Errors: Check that you’re using the correct classes and methods from the Transformers library.
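As a fallback for the CUDA issue above, the sketch below loads the model on whatever device is available, using float32 on CPU instead of bfloat16. This is a convenience sketch rather than the official setup: CPU inference will be very slow, and memory requirements depend on the model variant.

# Fallback sketch: pick the device at runtime instead of calling .cuda() unconditionally
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16 if device == "cuda" else torch.float32

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-Coder-V2-Lite-Base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-Coder-V2-Lite-Base",
    trust_remote_code=True,
    torch_dtype=dtype,
).to(device)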

If you face obstacles, remember that our community and support are here for you!

For more troubleshooting questions/issues, contact our fxis.ai data scientist expert team.

Conclusion

By following this guide, you should now have a good understanding of how to get started with DeepSeek-Coder-V2 for your coding tasks. Harness the power of this advanced model to enhance your coding projects and endeavors. Let the code wizardry begin!
