Leveraging large language models has never been easier thanks to tools like H2O LLM Studio and the Transformers library. In this article, we'll walk through setting up a model based on EleutherAI's Pythia architecture and using it to generate responses from prompts. Buckle up for a user-friendly exploration into the world of AI!
Getting Started with H2O LLM Studio
To start using the EleutherAI Pythia-based model, you first need to install the necessary libraries. You can do this conveniently using pip:
pip install transformers==4.30.2
pip install accelerate==0.20.3
pip install torch==2.0.1
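Before going further, it's worth confirming that the installs actually succeeded. Here is a minimal sanity-check sketch that reports whichever versions are present in your environment (and flags anything missing):

```python
# Sanity check: confirm the libraries import cleanly and report their versions.
import importlib

versions = {}
for pkg in ("transformers", "torch"):
    try:
        versions[pkg] = importlib.import_module(pkg).__version__
    except ImportError:
        versions[pkg] = None  # not installed -- rerun the pip command above

print(versions)
```

If either entry comes back as `None`, rerun the corresponding pip command before continuing.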
Using the Model on CPU
Once you have installed the required libraries, you can start generating responses. The function below loads the tokenizer and model, runs generation on the CPU, and decodes the answer:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

def generate_response(prompt, model_name):
    # Load the tokenizer and model; trust_remote_code allows custom code from the repo
    tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.float32,
        device_map='cpu',
        trust_remote_code=True,
    )
    model.cpu().eval()

    # Tokenize the prompt and keep the tensors on the CPU
    inputs = tokenizer(prompt, return_tensors='pt', add_special_tokens=False).to('cpu')

    # Beam search without sampling (do_sample=False, so temperature has no effect)
    tokens = model.generate(
        input_ids=inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        min_new_tokens=2,
        max_new_tokens=500,
        do_sample=False,
        num_beams=2,
        temperature=float(0.0),
        repetition_penalty=float(1.0),
        renormalize_logits=True,
    )[0]

    # generate() returns the prompt ids followed by the new ids; drop the prompt part
    tokens = tokens[inputs['input_ids'].shape[1]:]
    answer = tokenizer.decode(tokens, skip_special_tokens=True)
    return answer
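One detail worth highlighting: `model.generate` returns the prompt tokens followed by the newly generated tokens, which is why the code slices off the first `inputs['input_ids'].shape[1]` ids before decoding. A toy sketch with plain Python lists (hypothetical token ids, no model involved) shows the same idea:

```python
# Hypothetical token ids: generate() output begins with the prompt ids.
prompt_ids = [101, 2054, 2003]            # stands in for inputs['input_ids'][0]
generated = [101, 2054, 2003, 7592, 999]  # stands in for model.generate(...)[0]

# Keep only the newly generated ids, just like tokens[inputs['input_ids'].shape[1]:]
new_ids = generated[len(prompt_ids):]
print(new_ids)  # [7592, 999]
```

Without this slice, the decoded answer would start by echoing the prompt back at you.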
Breaking Down the Code: An Analogy
Imagine you’re a talented chef prepping for a special dinner. The ingredients are your input text, and the recipe is your code. Just like a chef uses a list of ingredients (prompt, model) to create a dish, the code utilizes the tokenizer and model to process the data.
Step by step:
- The tokenizer is your sous-chef, chopping your ingredients (the prompt) into manageable pieces.
- The model is the main chef, transforming those pieces into a delightful dish (the output) by applying different cooking techniques (the generation settings).
- Finally, you feast on the delicious result (the generated answer)!
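To make the sous-chef step concrete, here is a toy whitespace tokenizer. This is purely illustrative: real tokenizers like Pythia's split words into smaller subword pieces and map them to integer ids, but the "chopping into manageable pieces" idea is the same:

```python
def toy_tokenize(prompt):
    """Chop the prompt into pieces, like the tokenizer (sous-chef) does."""
    return prompt.lower().split()

pieces = toy_tokenize("Why is drinking water so healthy?")
print(pieces)  # ['why', 'is', 'drinking', 'water', 'so', 'healthy?']
```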
How to Generate Responses
Here’s a quick example of how you can utilize the `generate_response` function:
model_name = 'diegomiranda/text-to-cypher'
# The question is in Portuguese: "Return the Tax Law cases that are based on Law 939 of 1992?"
prompt = "Create a Cypher statement to answer the following question: Retorne os processos de Direito Tributário que se baseiam em lei 939 de 1992?"
response = generate_response(prompt, model_name)
print(response)
Using the Model on GPU
If you have a GPU, you can run the model there for much faster generation. First upgrade Transformers to the version the pipeline setup expects:
pip install transformers==4.31.0
The model on Hugging Face may require authentication. Log in with your access token by running:
import huggingface_hub
huggingface_hub.login("ACCESS_TOKEN")
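Pasting a token literal directly into code is easy to leak into version control. A safer pattern (a sketch, assuming a hypothetical `HF_TOKEN` environment variable you export in your shell first) reads it from the environment:

```python
import os

# HF_TOKEN is a hypothetical variable name -- export it in your shell first.
token = os.environ.get("HF_TOKEN")
if token is None:
    print("HF_TOKEN not set; export it before calling huggingface_hub.login(token)")
```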
Next, you can create the pipeline for generating responses:
from transformers import pipeline

generate_text = pipeline(
    model='diegomiranda/text-to-cypher',
    torch_dtype='auto',
    trust_remote_code=True,
    use_fast=True,
    device_map='cuda:0',
    token=True,
)
res = generate_text(
    "Why is drinking water so healthy?",
    min_new_tokens=2,
    max_new_tokens=500,
    do_sample=False,
    num_beams=2,
    temperature=float(0.0),
    repetition_penalty=float(1.0),
    renormalize_logits=True,
)
print(res[0]['generated_text'])
Troubleshooting Common Issues
Encountering issues is common in coding—don’t fret! Here are a few troubleshooting tips:
- Make sure your library versions match the requirements specified.
- If you face token-related issues, ensure you’ve entered your Hugging Face access token correctly.
- Verify compatibility with your hardware. If your code uses the CPU settings but you have a GPU available, switch `device_map` to `'cuda:0'` (and install a CUDA-enabled torch build) to get GPU performance.
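The troubleshooting tips above can be automated for the hardware check. Here is a small helper sketch that picks a sensible `device_map` value: it returns `'cuda:0'` only when torch is installed and reports a visible GPU, and falls back to `'cpu'` otherwise:

```python
def pick_device():
    """Return 'cuda:0' when a GPU is visible to torch, else 'cpu'."""
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda:0"
    except ImportError:
        pass  # torch not installed -- fall back to CPU
    return "cpu"

print(pick_device())
```

You can then pass `device_map=pick_device()` so the same script runs on both CPU-only and GPU machines.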
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

