Introducing Sujet Finance 8B v0.1: Your Financial Language Model

Apr 26, 2024 | Educational

Welcome to Sujet Finance 8B v0.1, a language model built for finance! It is a fine-tuned version of Llama 3 8B, trained on the Sujet Finance Instruct-177k dataset. Let’s dive into its capabilities!

🎯 Fine-Tuning Focus

In this initial fine-tuning iteration, we’ve focused on three key financial tasks:

✅❌ Yes/No Questions
- Description: This task involves answering financial questions that require a simple yes or no response.
- Class Distribution:
  - Train Set: 5,265 yes examples, 5,302 no examples
  - Eval Set: 1,340 yes examples, 1,303 no examples

📂 Topic Classification
- Description: The model classifies financial texts into specific finance-related categories such as company news, markets, earnings, and more.
- Class Distribution:
  - Train Set: Balanced across 20 classes, with 29-40 examples per class
  - Eval Set: Varies across classes, ranging from 4 to 15 examples per class

😊😐😡 Sentiment Analysis
- Description: This task involves analyzing financial texts to categorize sentiments as positive, negative, neutral, bearish, or bullish.
- Class Distribution:
  - Train Set: 1,160 positive, 1,155 negative, 1,150 neutral, 1,133 bearish, and 1,185 bullish examples
  - Eval Set: 281 positive, 286 negative, 291 neutral, 308 bearish, and 256 bullish examples

📜 Inference Code

Once you have installed Unsloth from its GitHub repository, you are ready to use the Sujet Finance model! The example below loads the model and runs a single sentiment analysis query:

python
from unsloth import FastLanguageModel

max_seq_length = 2048
dtype = None
load_in_4bit = False

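# The three {} placeholders are filled by .format() with the instruction, the input, and an empty response.
alpaca_prompt = "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n### Instruction:\n{}\n### Input:\n{}\n### Response:\n{}"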

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="sujet-ai/Sujet-Finance-8B-v0.1",
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
    token="your hf token here",
)

example = {
    "system_prompt": "You are a financial sentiment analysis expert. Your task is to analyze the sentiment expressed in the given financial text. Only reply with bearish, neutral, or bullish.",
    "user_prompt": "Expedia's Problems Run Deeper Than SEO Headwinds.",
}

inputs = tokenizer(
    [alpaca_prompt.format(example["system_prompt"], example["user_prompt"], "")],
    return_tensors="pt"
).to("cuda")

outputs = model.generate(**inputs, max_new_tokens=2048, use_cache=True, pad_token_id=tokenizer.eos_token_id)
output = tokenizer.batch_decode(outputs)[0]
response = output.split("### Response:")[1].strip()
print(response)
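
The prompt-building and response-parsing steps can be factored into small helpers, which makes it easier to reuse the same template for the other two tasks. The sketch below is an illustration only: the helper names, the yes/no system prompt, and the default end-of-sequence string are assumptions, not part of the published model card.

```python
# Alpaca-style template with named placeholders (layout assumed to match the one used above).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n"
    "### Instruction:\n{instruction}\n"
    "### Input:\n{input_text}\n"
    "### Response:\n{response}"
)


def build_prompt(system_prompt: str, user_prompt: str) -> str:
    """Fill the template, leaving the response slot empty for generation."""
    return ALPACA_TEMPLATE.format(
        instruction=system_prompt, input_text=user_prompt, response=""
    )


def extract_response(decoded: str, eos_token: str = "<|end_of_text|>") -> str:
    """Return the text after the response marker, without the EOS token.

    The default eos_token is an assumption; pass tokenizer.eos_token to be safe.
    """
    answer = decoded.split("### Response:", 1)[1]
    return answer.replace(eos_token, "").strip()


# Hypothetical yes/no query; the system prompt wording is illustrative.
prompt = build_prompt(
    "You are a financial expert. Answer the question with yes or no.",
    "Is revenue the same as net income?",
)
print(prompt)
```

The resulting string can be passed to `tokenizer([...], return_tensors="pt")` in place of the inline `alpaca_prompt.format(...)` call.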

🏗️ Training Methodology

To keep any single class from dominating, we used a balanced training approach. Our dataset preparation process selects a roughly equal number of examples from each subclass within the three focus tasks. The result is a model that can handle a diverse range of financial questions and topics.
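
One common way to balance classes is to downsample each class to the size of the smallest one. The sketch below shows that idea under stated assumptions: the function name, the `label` field, and the fixed seed are hypothetical, and since the published counts are close but not identical across classes, the actual preparation procedure probably differs in its details.

```python
import random
from collections import Counter, defaultdict


def balance_by_label(examples, label_key="label", seed=0):
    """Downsample every class to the size of the smallest class."""
    by_label = defaultdict(list)
    for ex in examples:
        by_label[ex[label_key]].append(ex)

    smallest = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed so the selection is reproducible

    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], smallest))
    rng.shuffle(balanced)
    return balanced


# Toy data: 5 "bullish" examples and 3 "bearish" examples.
toy = [{"label": "bullish", "text": f"b{i}"} for i in range(5)]
toy += [{"label": "bearish", "text": f"s{i}"} for i in range(3)]

balanced = balance_by_label(toy)
print(Counter(ex["label"] for ex in balanced))
```

After balancing, both toy classes contain three examples each.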

The final balanced training dataset consists of 17,036 examples, while the evaluation dataset contains 4,259 examples.
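
As a quick sanity check, the per-task counts listed earlier can be reconciled with these totals. The arithmetic below assumes the totals are simply the sum of the three tasks, which the post does not state explicitly; under that assumption, the topic classification share can be derived by subtraction.

```python
# Per-class counts as listed in the class distributions above.
train_yes_no = 5265 + 5302
train_sentiment = 1160 + 1155 + 1150 + 1133 + 1185
eval_yes_no = 1340 + 1303
eval_sentiment = 281 + 286 + 291 + 308 + 256

TRAIN_TOTAL = 17036
EVAL_TOTAL = 4259
NUM_TOPIC_CLASSES = 20

# Whatever is left over is attributed to topic classification (assumption).
topic_train = TRAIN_TOTAL - train_yes_no - train_sentiment
topic_eval = EVAL_TOTAL - eval_yes_no - eval_sentiment

print(topic_train, topic_train / NUM_TOPIC_CLASSES)  # 686 34.3
print(topic_eval, topic_eval / NUM_TOPIC_CLASSES)    # 194 9.7
```

The implied averages of 34.3 training and 9.7 evaluation examples per topic class fall inside the stated ranges of 29-40 and 4-15, so the figures are mutually consistent.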

🔧 Model Specifications

- Base Model: Llama 3 8B
- Fine-Tuning Technique: LoRA (Low-Rank Adaptation)
- Learning Rate: 2e-4
- Weight Decay: 0.01
- Epochs: 1
- Quantization: float16 for vLLM

📊 Evaluation Results

We’ve put our model to the test, comparing its performance against the base Llama 3 model on our evaluation dataset.

We consider a response correct if the true answer appears within the first 10 words generated by the model. This strict criterion rewards responses that state the answer up front rather than burying it in a longer explanation.
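
The first-10-words criterion can be sketched as a small scoring function. The details here are assumptions rather than the evaluation code actually used: matching is case-insensitive, punctuation is ignored, and multi-word labels such as "company news" must appear as consecutive words.

```python
import re

WORD_PATTERN = re.compile(r"[a-z0-9']+")


def is_correct(response: str, true_answer: str, window: int = 10) -> bool:
    """True if true_answer appears within the first `window` words of response."""
    words = WORD_PATTERN.findall(response.lower())[:window]
    target = WORD_PATTERN.findall(true_answer.lower())
    if not target:
        return False
    n = len(target)
    # Slide over the truncated response looking for the label as a contiguous run.
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def accuracy(responses, answers, window: int = 10) -> float:
    """Fraction of responses judged correct under the window criterion."""
    pairs = list(zip(responses, answers))
    return sum(is_correct(r, a, window) for r, a in pairs) / len(pairs)


print(is_correct("Bearish.", "bearish"))                        # True
print(is_correct("This headline is about Company News.", "company news"))  # True
print(accuracy(["Yes", "no idea"], ["yes", "yes"]))             # 0.5
```

Note that a check like this also accepts a response that lists several labels within the window, so it measures whether the answer is present early, not whether it is the only answer given.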

🛠️ Troubleshooting

If you encounter issues while using the Sujet Finance model, here are a few troubleshooting tips:

- Ensure you have installed the Unsloth library properly.
- Verify your hardware supports the model’s requirements (like GPU for CUDA).
- Check for API token validity if authentication fails.
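
These checks can be partly automated. The following is a minimal diagnostic sketch, not an official tool: the function name and report keys are made up for illustration, and it assumes the Hugging Face token is supplied through the `HF_TOKEN` environment variable, which is only one of several ways to authenticate. It cannot tell whether a token is actually valid, only whether one is set.

```python
import importlib.util
import os


def check_environment() -> dict:
    """Report whether the basic prerequisites for running the model look present."""
    report = {
        "unsloth_installed": importlib.util.find_spec("unsloth") is not None,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
        "hf_token_set": bool(os.environ.get("HF_TOKEN")),
    }
    if report["torch_installed"]:
        try:
            import torch

            report["cuda_available"] = bool(torch.cuda.is_available())
        except Exception:
            # A broken PyTorch install is treated the same as no CUDA.
            report["cuda_available"] = False
    return report


for name, ok in check_environment().items():
    print(f"{name}: {'ok' if ok else 'not detected'}")
```

If you pass the token directly through the `token` argument of `from_pretrained`, a "not detected" result for `hf_token_set` is expected and harmless.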

Analogy to Understand the Model

Think of the Sujet Finance model as a highly specialized chef in a vast culinary world. Just as a chef masters various cooking techniques and cuisines, this model has delved deep into financial tasks and fine-tuned its skills accordingly. It has been trained on different types of ‘dishes’ such as yes/no questions (quick decision snacks), topic classification (main course selections), and sentiment analysis (dessert preferences). Each training ‘dish’ prepares the chef to cater to a diverse range of customers (financial queries), ensuring that they deliver not just good food but a delightful dining experience in the world of finance.
