How to Fine-Tune Transformer Models for Detecting Trolling, Aggression, and Cyberbullying

Sep 12, 2024 | Educational

In this blog, we’ll explore how to fine-tune transformer models to identify negative online behaviors such as trolling, aggression, and cyberbullying. The guide is structured to be user-friendly, so both newcomers and experienced developers can follow and implement the approach presented in our research paper.

Overview

Our approach was presented in the paper: Mishra, Sudhanshu, Shivangi Prasad, and Shubhanshu Mishra. 2020. “Multilingual Joint Fine-Tuning of Transformer Models for Identifying Trolling, Aggression and Cyberbullying at TRAC 2020.” Understanding how to implement it can help create safer online spaces.

Getting Started with the Code

To start using the models, follow the code snippet below.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
from pathlib import Path
from scipy.special import softmax
import numpy as np

TASK_LABEL_IDS = {
    'Sub-task A': ['OAG', 'NAG', 'CAG'],
    'Sub-task B': ['GEN', 'NGEN'],
    'Sub-task C': ['OAG-GEN', 'OAG-NGEN', 'NAG-GEN', 'NAG-NGEN', 'CAG-GEN', 'CAG-NGEN']
}
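# OAG/CAG/NAG = overtly/covertly/non-aggressive (Sub-task A);
# GEN/NGEN = gendered/non-gendered (Sub-task B); Sub-task C combines both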

model_version = 'databank'  # anything else loads the released model from the Hugging Face Hub
if model_version == 'databank':
    # Make sure you have downloaded the released model file and unzipped it
    # under 'databank_model'. We assume the archive unpacks to the structure:
    # databank_model/<lang>/<task>/output/<base_model>/model
    model_path = next(Path('databank_model').glob('*/*/output/*/model'))
    _, lang, task, _, base_model, _ = model_path.parts
    tokenizer = AutoTokenizer.from_pretrained(base_model)
    model = AutoModelForSequenceClassification.from_pretrained(str(model_path))
else:
    lang, task, base_model = 'ALL', 'Sub-task C', 'bert-base-multilingual-uncased'
    # Hub model IDs follow the pattern TRAC2020_<lang>_<task letter>_<base model>,
    # e.g. socialmediaie/TRAC2020_ALL_C_bert-base-multilingual-uncased
    base_model = f'socialmediaie/TRAC2020_{lang}_{task.split()[-1]}_{base_model}'
    tokenizer = AutoTokenizer.from_pretrained(base_model)
    model = AutoModelForSequenceClassification.from_pretrained(base_model)

# Put the model in eval mode for inference;
# to fine-tune it further, switch back with model.train()
model.eval()

task_labels = TASK_LABEL_IDS[task]

sentence = "This is a good cat and this is a bad dog."
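# Prepend the [CLS] token expected by BERT-style classifiers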
processed_sentence = f"{tokenizer.cls_token} {sentence}"
tokens = tokenizer.tokenize(processed_sentence)
indexed_tokens = tokenizer.convert_tokens_to_ids(tokens)
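# Wrap in a list to add a batch dimension: shape (1, sequence_length)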
tokens_tensor = torch.tensor([indexed_tokens])

with torch.no_grad():
    # Older transformers versions return a tuple of outputs; on recent
    # versions use `logits = model(tokens_tensor).logits` instead
    logits, = model(tokens_tensor, labels=None)

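# Convert logits to probabilities, then pick the highest-scoring label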
preds = logits.detach().cpu().numpy()
preds_probs = softmax(preds, axis=1)
preds = np.argmax(preds_probs, axis=1)
preds_labels = np.array(task_labels)[preds]

print(dict(zip(task_labels, preds_probs[0])), preds_labels)
```

Understanding the Code: An Analogy

Think of this code as assembling a team of superheroes to combat online negativity. Each part of the code plays a crucial role in forming the squad:

  • Importing Libraries: Just as we need different superheroes with specific powers, we import libraries that each bring a unique capability.
  • Setting Up the Model: We specify which model to use, like choosing which superhero leads the charge. Whether we load the checkpoint from the databank release or from the Hugging Face Hub, this choice sets the tone for the mission.
  • Tokenization: Think of tokenization as translating a villain’s complex dialogue into something our heroes can understand. It breaks the text into sub-word pieces the model can process (see the sketch after this list).
  • Model Inference: Here we evaluate how our heroes (the model) respond to the villain’s statements (the input sentence) and gauge their strengths (the predicted probabilities).
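
To make the tokenization step concrete, here is a minimal sketch, assuming the same bert-base-multilingual-uncased tokenizer used in the snippet above:

```python
from transformers import AutoTokenizer

# Assumes the same base model as the snippet above
tokenizer = AutoTokenizer.from_pretrained('bert-base-multilingual-uncased')

sentence = "This is a good cat and this is a bad dog."
tokens = tokenizer.tokenize(f"{tokenizer.cls_token} {sentence}")
indexed = tokenizer.convert_tokens_to_ids(tokens)

print(tokens[:6])   # sub-word tokens, starting with the [CLS] marker
print(indexed[:6])  # the integer vocabulary IDs the model actually sees
```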

Using the Models

Once everything is set up correctly, you can start feeding in sentences and classifying the type of trolling or aggression they contain.
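
To classify more than one sentence at a time, you can wrap the inference steps into a small helper. The sketch below is ours rather than part of the paper’s code; it assumes the model, tokenizer, task_labels, and imports from the snippet above:

```python
def classify_sentences(sentences):
    """Return one predicted label per input sentence."""
    model.eval()
    labels = []
    for sentence in sentences:
        tokens = tokenizer.tokenize(f"{tokenizer.cls_token} {sentence}")
        ids = tokenizer.convert_tokens_to_ids(tokens)
        with torch.no_grad():
            logits, = model(torch.tensor([ids]), labels=None)
        probs = softmax(logits.detach().cpu().numpy(), axis=1)
        labels.append(task_labels[int(np.argmax(probs, axis=1)[0])])
    return labels

print(classify_sentences(["You are awful.", "Have a nice day!"]))
```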

Troubleshooting Tips

  • If the model fails to load, make sure the paths to your model files are correct.
  • If the output looks wrong, double-check the input text and confirm that the task (and its label set) matches the model you loaded.
  • Running out of memory? Process your data in smaller batches (see the sketch below).
  • For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
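
For the memory tip above, a common fix is to score sentences a few at a time instead of all at once. The following is a sketch under the assumption of a reasonably recent transformers version, where tokenizers are callable and model outputs expose a .logits attribute:

```python
def classify_in_batches(sentences, batch_size=8):
    """Score sentences in small chunks to keep memory usage bounded."""
    all_labels = []
    for start in range(0, len(sentences), batch_size):
        batch = sentences[start:start + batch_size]
        # Pad each chunk to a common length so it forms a single tensor
        encoded = tokenizer(batch, return_tensors='pt', padding=True, truncation=True)
        with torch.no_grad():
            logits = model(**encoded).logits
        all_labels.extend(task_labels[i] for i in logits.argmax(dim=1).tolist())
    return all_labels
```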

Conclusion

This guide offers a solid foundation for fine-tuning transformer models aimed at detecting trolling, aggression, and cyberbullying. By leveraging these models, you contribute to creating a more positive online environment. At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
