How to Use SmolLM: Your Guide to State-of-the-Art Small Language Models

Jul 22, 2024 | Educational

Welcome to our guide on using SmolLM, a series of small yet powerful language models built for efficiency and strong performance at modest sizes. Available in three sizes (135M, 360M, and 1.7B parameters), SmolLM stands out thanks to its curated training dataset, Cosmo-Corpus. Whether you're a seasoned AI developer or a curious newcomer, this article will walk you through everything from installation to running your first model.

Table of Contents

  • Model Summary
  • Getting Started
  • Limitations
  • Training
  • License
  • Citation
  • Troubleshooting

Model Summary

SmolLM is built on the Cosmo-Corpus dataset, a curated mix of synthetic textbooks and stories (Cosmopedia v2), educational Python samples (Python-Edu), and educational web content (FineWeb-Edu). The models perform strongly for their size on common-sense reasoning and world-knowledge benchmarks, making them a practical choice for text generation across many domains.

For in-depth details regarding its benchmarks and performance, visit our full blog post.

Getting Started

Installation

Before you dive into the glorious world of SmolLM, you need to install the transformers library. Here’s how:

pip install transformers
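If you plan to spread the model across multiple GPUs with device_map="auto" (used later in this guide), you will also need the accelerate library:

pip install accelerate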

Running the Model on CPU/GPU/Multiple GPUs

Think of launching SmolLM like cooking a dish: first you gather all your ingredients (libraries and model weights), then you start cooking (processing data). Below is the code for running the model:

from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM-1.7B"
device = "cuda" # for GPU usage or "cpu" for CPU usage

# Load the tokenizer and model, then move the model to the chosen device
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

# Tokenize the prompt and generate a completion
inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(device)
outputs = model.generate(inputs)

print(tokenizer.decode(outputs[0]))
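By default, generate() returns only a short continuation. To get longer or more varied completions, you can pass standard generation parameters; the values below are illustrative starting points, not tuned recommendations:

# Illustrative generation settings; tune these for your use case
outputs = model.generate(
    inputs,
    max_new_tokens=100,  # allow up to 100 newly generated tokens
    do_sample=True,      # sample instead of greedy decoding
    temperature=0.6,     # lower values make output more deterministic
    top_p=0.92,          # nucleus sampling cutoff
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))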

Optimizing with Precision

If you want to trim memory usage, load the model in torch.bfloat16: half-precision weights roughly halve the memory footprint with little impact on output quality on supported GPUs. Here's how:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "HuggingFaceTB/SmolLM-1.7B"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# device_map="auto" places the model across available GPUs (requires the accelerate library);
# torch_dtype=torch.bfloat16 loads the weights in half precision
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto", torch_dtype=torch.bfloat16)
inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to("cuda")
outputs = model.generate(inputs)

print(tokenizer.decode(outputs[0]))
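If you need to go even leaner on memory, transformers can also load the weights quantized through its bitsandbytes integration. Below is a minimal sketch, assuming bitsandbytes and accelerate are installed (pip install bitsandbytes accelerate); the 8-bit setting is an illustrative choice, not an official recommendation:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

checkpoint = "HuggingFaceTB/SmolLM-1.7B"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Load the weights in 8-bit precision to cut memory use further (illustrative setting)
quantization_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    device_map="auto",
    quantization_config=quantization_config,
)

inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(model.device)
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))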

Limitations

Despite its strengths, SmolLM has limitations. The models primarily operate in English and may not generate factually accurate content every time. They can reflect biases present in the training data. Thus, it’s important to use them as assistive tools and verify critical information before acting on it.

Training

SmolLM was pretrained for 500,000 steps on 1 trillion tokens, using 64 H100 GPUs and the Nanotron training framework. Details of the model architecture can be found in our full blog post.

License

SmolLM is released under the Apache 2.0 license, which permits broad use, modification, and redistribution, including in commercial applications.

Citation

If you need to cite SmolLM in your work, here’s the format:

@misc{allal2024SmolLM,
      title={SmolLM - blazingly fast and remarkably powerful},
      author={Loubna Ben Allal and Anton Lozhkov and Elie Bakouch and Leandro von Werra and Thomas Wolf},
      year={2024},
}

Troubleshooting

If you encounter issues while using SmolLM, consider checking the following:

  • Ensure all installation commands have executed successfully without errors.
  • Confirm that the right device (CPU/GPU) is being used for your system; the snippet after this list shows a simple way to select one automatically.
  • If you are running on multiple GPUs, make sure you have installed the accelerate library.
  • Double-check your prompt text and tokenized inputs for correctness before generating.
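For the device check in particular, a small PyTorch snippet like this lets your script fall back to CPU automatically when no GPU is present:

import torch

# Prefer the GPU when one is available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")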

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
