How to Use DeepSeek Coder: Your Guide to Code Completion and Insertion

Mar 7, 2024 | Educational

homemayankDocumentsarticle-generation-using-llmresized_imagesreadme_15_140

Welcome aboard the programming express! Today, we will navigate through the exciting world of DeepSeek Coder, a powerful tool that employs cutting-edge code language models to assist you in your coding endeavors. Whether you’re looking for code completion, code insertion, or repository-level code generation, DeepSeek Coder has got you covered. Let’s delve into the details!

1. Introduction to DeepSeek Coder

DeepSeek Coder is no ordinary tool—it’s a state-of-the-art series of code language models, finely tuned over 2 trillion tokens. With a blend of 87% code and 13% natural language in both English and Chinese, it ensures you get optimal support for multiple programming languages. Available in sizes ranging from 1 billion to a whopping 33 billion parameters, DeepSeek Coder strives to meet your diverse coding requirements. Here are some of its notable features:

Massive Training Data: Modeled on an extensive dataset, ensuring high accuracy and versatility.
Highly Flexible & Scalable: Choose models from 1.3B to 33B as per your needs.
Superior Model Performance: Achieves top scores on various benchmarks such as HumanEval and MultiPL-E.
Advanced Code Completion Capabilities: Supported by an impressive window size of 16K for effective code completion tasks.

2. Model Summary

The deepseek-coder-33b-base model, with 33 billion parameters, utilizes Grouped-Query Attention trained on an incredible 2 trillion tokens.

Home Page: DeepSeek
Repository: deepseek-ai/deepseek-coder
Chat With DeepSeek Coder: DeepSeek-Coder

3. How to Use DeepSeek Coder

Now that we have an overview, let’s explore how to make the most out of DeepSeek Coder through practical examples.

Example 1: Code Completion

In this analogy, think of DeepSeek Coder as a chef who can complete your almost finished dish by suggesting the final ingredients needed to enhance its flavor. Below is how you can implement it:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained('deepseek-ai/deepseek-coder-33b-base', trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained('deepseek-ai/deepseek-coder-33b-base', trust_remote_code=True).cuda()

input_text = "# write a quick sort algorithm"
inputs = tokenizer(input_text, return_tensors='pt').cuda()
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Example 2: Code Insertion

Here, DeepSeek Coder acts like an architect helping you insert the missing walls in your structure. It fills in the gaps to make your code complete:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained('deepseek-ai/deepseek-coder-33b-base', trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained('deepseek-ai/deepseek-coder-33b-base', trust_remote_code=True).cuda()

input_text = """
def quick_sort(arr):
    if len(arr) = 1:
        return arr
    pivot = arr[0]
    left = []
    right = []
    for i in range(len(arr)):
        if arr[i] < pivot:
            left.append(arr[i])
        else:
            right.append(arr[i])
    return quick_sort(left) + [pivot] + quick_sort(right)
"""
inputs = tokenizer(input_text, return_tensors='pt').cuda()
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True)[len(input_text):])

Example 3: Repository Level Code Completion

Think of this process as gathering ingredients from a pantry. DeepSeek Coder helps you gather and assemble all the code components from your repository:

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained('deepseek-ai/deepseek-coder-33b-base', trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained('deepseek-ai/deepseek-coder-33b-base', trust_remote_code=True).cuda()

input_text = """
# utils.py
import torch
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

def load_data():
    iris = datasets.load_iris()
    X = iris.data
    y = iris.target
    # Standardize the data
    scaler = StandardScaler()
    X = scaler.fit_transform(X)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
    # Convert numpy data to PyTorch tensors
    X_train = torch.tensor(X_train, dtype=torch.float32)
    X_test = torch.tensor(X_test, dtype=torch.float32)
    y_train = torch.tensor(y_train, dtype=torch.int64)
    y_test = torch.tensor(y_test, dtype=torch.int64)
    return X_train, X_test, y_train, y_test

def evaluate_predictions(y_test, y_pred):
    return accuracy_score(y_test, y_pred)
# model.py
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

class IrisClassifier(nn.Module):
    def __init__(self):
        super(IrisClassifier, self).__init__()
        self.fc = nn.Sequential(
            nn.Linear(4, 16),
            nn.ReLU(),
            nn.Linear(16, 3)
        )

    def forward(self, x):
        return self.fc(x)

    def train_model(self, X_train, y_train, epochs, lr, batch_size):
        criterion = nn.CrossEntropyLoss()
        optimizer = optim.Adam(self.parameters(), lr=lr)
        
        dataset = TensorDataset(X_train, y_train)
        dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
        for epoch in range(epochs):
            for batch_X, batch_y in dataloader:
                optimizer.zero_grad()
                outputs = self(batch_X)
                loss = criterion(outputs, batch_y)
                loss.backward()
                optimizer.step()

    def predict(self, X_test):
        with torch.no_grad():
            outputs = self(X_test)
            _, predicted = outputs.max(1)
        return predicted.numpy()

# main.py
from utils import load_data, evaluate_predictions
from model import IrisClassifier as Classifier

def main():
    # Model training and evaluation
"""
inputs = tokenizer(input_text, return_tensors='pt').to(model.device)
outputs = model.generate(**inputs, max_new_tokens=140)
print(tokenizer.decode(outputs[0]))

4. License

This code repository is licensed under the MIT License. The use of DeepSeek Coder models is subject to the Model License. For more details, please view the LICENSE-MODEL.

5. Troubleshooting

Here are a few tips to help you overcome common issues when using DeepSeek Coder:

Ensure that your environment meets the requirements for using PyTorch and the Transformers library.
If you encounter errors while loading the model, check your internet connection and verify that you're using the correct model name.
For performance issues, consider utilizing a GPU to speed up the model inference.
In case of unexpected results, review your input for any syntax errors or logical flaws in your code.

If you need further assistance, feel free to chat with the community on the Discord server or explore other resources. For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

Stay Informed with the Newest F(x) Insights and Blogs

Tech News and Blog Highlights, Straight to Your Inbox