Detecting hate speech and racism across languages is more important than ever. This blog will guide you through using a fine-tuned Spanish BERT model built for exactly this purpose, based on BETO (the Spanish BERT).
Getting Started
This model was fine-tuned on the Datathon Against Racism dataset to detect racism in Spanish-language text. Below are the steps to use this pre-trained model efficiently.
Requirements
- Python installed on your machine.
- A virtual environment (optional but recommended).
- Install the Transformers library using pip:
pip install transformers
Step-by-Step Implementation
Follow these steps to begin utilizing the racism detection model:
1. Import the Required Libraries
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
2. Load the Model and Tokenizer
Use the model name that is specifically meant for this task. In this case, we’ll use `w-m-vote-nonstrict-epoch-2` for detection.
model_name = "w-m-vote-nonstrict-epoch-2"
tokenizer = AutoTokenizer.from_pretrained("dccuchile/bert-base-spanish-wwm-uncased")
full_model_path = f"MartinoMensio/racism-models-{model_name}"
model = AutoModelForSequenceClassification.from_pretrained(full_model_path)
3. Create a Pipeline for Text Classification
pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
4. Prepare Your Texts
Next, you can prepare the texts you want to analyze:
texts = [
"Y porqué es lo que hay que hacer con los menas y con los adultos también!!!! NO a los inmigrantes ilegales!!!!",
"Es que los judíos controlan el mundo"
]
5. Analyze Your Texts
Finally, run the pipeline you created and inspect the results:
print(pipe(texts))
The output will consist of labels (either “racist” or “non-racist”) and corresponding confidence scores for each text.
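The pipeline returns one dictionary per input text. As a minimal post-processing sketch (the label strings and scores below are illustrative assumptions, not real model output — check `model.config.id2label` for the actual label names):

```python
# Hypothetical pipeline output: one {"label", "score"} dict per input text.
# The label strings and scores here are assumed for illustration only.
results = [
    {"label": "racist", "score": 0.97},
    {"label": "non-racist", "score": 0.88},
]

# Keep only texts classified as racist with high confidence,
# recording their index in the original list and the model's score.
flagged = [
    (i, r["score"])
    for i, r in enumerate(results)
    if r["label"] == "racist" and r["score"] >= 0.9
]
print(flagged)
```

A threshold like `0.9` lets you trade recall for precision: raise it to flag fewer, more certain cases.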
Understanding the Model’s Predictions
Think of this model as a referee in a match of words. It evaluates each player (the text) against learned rules (the training data) and decides whether they are playing fair (non-racist) or foul (racist). Just as a referee draws on experience to judge a game, the model uses the patterns it learned during training to judge the nature of each text it analyzes.
Troubleshooting
Here are some common issues you might face and their solutions:
- Import Errors: Ensure you have installed the latest version of the Transformers library. If not, re-run the pip install command.
- Path Issues: Double-check the full model path. Ensure you’ve set the correct path to the model files.
- Performance Issues: Running this model can require substantial computational power. If it runs slowly, try a machine with more resources, or reduce the number or length of your input texts.
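One simple way to keep memory usage down is to feed the pipeline smaller batches instead of the whole list at once. A minimal sketch (the `batch_size` value of 8 is an arbitrary assumption; tune it to your hardware):

```python
def batched(items, batch_size=8):
    """Yield successive fixed-size chunks of a list."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Stand-in for a large list of texts to classify
texts = [f"text {i}" for i in range(20)]

# Process each chunk separately, e.g.: results.extend(pipe(chunk))
chunks = list(batched(texts, batch_size=8))
print([len(c) for c in chunks])  # prints [8, 8, 4]
```

Collecting results chunk by chunk keeps peak memory roughly constant regardless of how many texts you need to score.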
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

