Code completion tools have become essential for developers to enhance productivity and streamline coding. With the code-autocomplete library, powered by the GPT-2 model, you can automatically complete your Python code snippets. In this blog, we will walk through the installation and usage of this powerful tool!
What You’ll Need
- Python installed on your machine (3.6 or above)
- Familiarity with basic Python programming
- Access to the internet for library installations
Setting Up the Code-Autocomplete Plugin
First, you need to install the code-autocomplete library, either with pip (pip install code-autocomplete) or by cloning the official GitHub repository and installing it from source.
Installing Dependencies
Before using the library, make sure the necessary dependencies are installed. You can do this by running:
pip install torch transformers
Using the Code Completion Model
Here’s how to use the GPT-2 model for code completion:
from autocomplete.gpt2_coder import GPT2Coder
# Load the distilled GPT-2 model fine-tuned on Python code
m = GPT2Coder("shibing624/code-autocomplete-distilgpt2-python")
# generate() returns a list of candidate completions; print the first one
print(m.generate("import torch.nn as")[0])
This snippet initializes the GPT2Coder and generates a code completion for the line you passed to it.
Diving Deeper: The Code Explained
Let’s break down the code above with an analogy. Think of writing code like planting seeds in a garden. The GPT-2 model acts as an experienced gardener, who knows what different seeds need to flourish. When you provide a seed (a code snippet), the gardener begins to visualize how the plant (the completed code) will grow. Just as the gardener understands what elements are required for each type of plant to thrive, the model uses its training to predict and generate appropriate code that complements your initial input.
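To make the analogy concrete, here is a toy "gardener": a tiny bigram model that learns, from a few example lines, which token most often follows which, and then greedily completes a prompt. This is only an illustration of next-token prediction; GPT-2 learns the same idea with a neural network over a much larger context and corpus, and the corpus lines below are made up for the demo.

```python
from collections import Counter, defaultdict

def train_bigrams(lines):
    """Count, for each token, which tokens tend to follow it."""
    follows = defaultdict(Counter)
    for line in lines:
        toks = line.split()
        for a, b in zip(toks, toks[1:]):
            follows[a][b] += 1
    return follows

def complete(prompt, follows, max_new=3):
    """Greedily append the most frequent successor token."""
    toks = prompt.split()
    for _ in range(max_new):
        nxt = follows.get(toks[-1])
        if not nxt:
            break  # the model has never seen this token followed by anything
        toks.append(nxt.most_common(1)[0][0])
    return " ".join(toks)

corpus = [
    "import torch.nn as nn",
    "import numpy as np",
    "import torch.nn as nn",
]
model = train_bigrams(corpus)
print(complete("import torch.nn", model))  # -> import torch.nn as nn
```

The real model works the same way in spirit: given everything typed so far, it ranks candidate next tokens and extends the code with the most plausible continuation.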
Generating Code from Prompts
To generate code from various prompts, use the following code snippet:
import os
from transformers import GPT2Tokenizer, GPT2LMHeadModel
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
tokenizer = GPT2Tokenizer.from_pretrained("shibing624/code-autocomplete-distilgpt2-python")
model = GPT2LMHeadModel.from_pretrained("shibing624/code-autocomplete-distilgpt2-python")
prompts = [
    'from torch import nn\nclass LSTM(Module):\n def __init__(self, *,\n n_tokens: int,\n embedding_size: int,\n hidden_size: int,\n n_layers: int):',
    'import numpy as np',
    'import torch',
    'import torch.nn as',
    'import java.util.ArrayList',
    'def factorial(n):',
]
for prompt in prompts:
    input_ids = tokenizer.encode(prompt, add_special_tokens=False, return_tensors='pt')
    outputs = model.generate(input_ids=input_ids,
                             max_length=64 + len(prompt),
                             temperature=1.0,
                             top_k=50,
                             top_p=0.95,
                             repetition_penalty=1.0,
                             do_sample=True,
                             num_return_sequences=1,
                             length_penalty=2.0,
                             early_stopping=True)
    decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print(decoded)
    print("=" * 20)
This code prepares various prompts and generates a code sample for each.
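The generate() call above is steered by sampling parameters: temperature rescales the logits, top_k keeps only the k most probable tokens, and top_p (nucleus sampling) keeps the smallest set of tokens whose combined probability exceeds p. As a self-contained illustration of what these knobs do, here is a toy sampler over a hand-written token distribution; it is a sketch of the idea, not the transformers implementation.

```python
import math
import random

def sample_next(logits, temperature=1.0, top_k=50, top_p=0.95, rng=None):
    """Toy temperature / top-k / top-p sampling over a dict of token -> logit."""
    rng = rng or random.Random(0)
    # Temperature: divide logits before softmax; lower values are greedier.
    probs = {t: math.exp(l / temperature) for t, l in logits.items()}
    z = sum(probs.values())
    probs = {t: p / z for t, p in probs.items()}
    # Top-k: keep only the k most probable tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Top-p (nucleus): keep the smallest prefix whose mass reaches top_p.
    kept, mass = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        mass += p
        if mass >= top_p:
            break
    # Sample from the surviving tokens, renormalized.
    z = sum(p for _, p in kept)
    r, acc = rng.random() * z, 0.0
    for tok, p in kept:
        acc += p
        if acc >= r:
            return tok
    return kept[-1][0]

# Pretend the model scored three candidate next tokens for "import torch.nn as".
print(sample_next({"nn": 2.0, "functional": 0.5, "os": -3.0}, top_k=2))
```

With top_k=1 this collapses to greedy decoding (always the highest-scoring token), which is why the blog snippet sets do_sample=True together with top_k and top_p to get varied but still plausible completions.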
Training Your Model
If you want to customize the code-autocomplete GPT-2 model, you can refer to the project's training setup. The commands to build the training dataset are as follows:
cd autocomplete
python create_dataset.py
To train your own model, consult the gpt2_coder.py file in the repository.
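As a rough sketch of what a dataset-preparation step like this typically does, the function below walks a directory of .py files and concatenates their contents into a single training text file. This is a hypothetical stand-in for illustration only; the repository's actual create_dataset.py may work differently.

```python
import os
import tempfile

def build_dataset(source_dir, out_path):
    """Concatenate all .py files under source_dir into one training file.
    Hypothetical sketch; the real create_dataset.py may differ."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for root, _dirs, files in os.walk(source_dir):
            for name in sorted(files):
                if name.endswith(".py"):
                    with open(os.path.join(root, name), encoding="utf-8") as f:
                        # Blank line between files so snippets stay separated.
                        out.write(f.read().rstrip() + "\n\n")
                    count += 1
    return count

# Demo on a throwaway directory holding two tiny source files.
with tempfile.TemporaryDirectory() as d:
    for i in range(2):
        with open(os.path.join(d, f"mod{i}.py"), "w") as f:
            f.write(f"x{i} = {i}\n")
    out = os.path.join(d, "train.txt")
    n = build_dataset(d, out)
    print(n)  # number of files merged
```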
Troubleshooting
If you encounter any issues while running the code, consider the following troubleshooting tips:
- Ensure all dependencies are correctly installed.
- Check that you are using the correct environment variables.
- Verify the model names and paths when loading models.
- Refer to the Hugging Face Transformers documentation for additional support.
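A quick way to act on the first tip is to check programmatically whether the required packages are importable in your current environment. This small stdlib-only helper (an illustrative snippet, not part of code-autocomplete) reports what is missing:

```python
import importlib.util

def check_dependencies(names):
    """Report which required packages can be imported in this environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

status = check_dependencies(["torch", "transformers"])
for name, ok in status.items():
    print(f"{name}: {'OK' if ok else 'MISSING - try pip install ' + name}")
```

Running it before the examples above makes "module not found" failures obvious up front instead of halfway through a generation call.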
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
With code-autocomplete, you have a powerful tool that harnesses the capabilities of GPT-2 for seamless code completion. Enjoy coding with your new AI-assisted partner!
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
