Combating racism can be supported by technology and data. In this tutorial, we will explore how to use a fine-tuned model named m-vote-nonstrict-epoch-4 that specializes in detecting racist language in Spanish. The model is fine-tuned from BETO, the Spanish BERT, on the Datathon Against Racism dataset.
Preparing Your Environment
Before diving into the code, make sure the necessary packages are installed. You need the transformers library (plus a backend such as PyTorch). You can install it with pip:
pip install transformers
Load the Model and Tokenizer
Now that you are set up, let’s load the model and the tokenizer using Python. We will use the following code snippet:
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
model_name = "m-vote-nonstrict-epoch-4"
tokenizer = AutoTokenizer.from_pretrained("dccuchile/bert-base-spanish-wwm-uncased")
full_model_path = f"MartinoMensio/racism-models-{model_name}"  # hyphen, not slash: a Hub repo id contains a single org/name slash
model = AutoModelForSequenceClassification.from_pretrained(full_model_path)
pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
texts = [
    # "and why, it's what must be done with the menas [unaccompanied foreign minors] and with adults too!!!! NO to illegal immigrants!!!!"
    "y porqué es lo que hay que hacer con los menas y con los adultos también!!!! NO a los inmigrantes ilegales!!!!",
    # "It's that the Jews control the world"
    "Es que los judíos controlan el mundo"
]
print(pipe(texts))
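Once the pipeline is created, you may want to turn its raw output into simple yes/no flags. Below is a minimal sketch; the helper name `flag_racist` and the threshold parameter are our own additions, and it assumes the pipeline returns one dict with "label" and "score" keys per input, which is the standard text-classification pipeline output:

```python
def flag_racist(texts, pipe, threshold=0.5):
    """Return (text, is_racist, score) tuples for each input text.

    Assumes `pipe` returns one {'label': ..., 'score': ...} dict per text,
    as Hugging Face text-classification pipelines do.
    """
    results = pipe(texts)
    return [
        (text, res["label"] == "racist" and res["score"] >= threshold, res["score"])
        for text, res in zip(texts, results)
    ]
```

You could then call `flag_racist(texts, pipe, threshold=0.7)` to flag only high-confidence detections.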
Understanding the Code
Imagine you’re a librarian in a massive library filled with books (our texts). The library has an AI (the model) that can look at the content and tell you if a particular book promotes harmful ideas or prejudices (racism). Here’s a breakdown of what we did:
- Importing Libraries: We brought in the right tools (libraries) so our AI can understand the Spanish texts.
- Preparing the Model and Tokenizer: We loaded a specified model that knows how to identify racism in language.
- Creating a Pipeline: This is like setting up a counter in our library where you hand over a book, and the librarian passes it to the AI for analysis.
- Text Analysis: Finally, we let the AI analyze our selected texts to classify them as either racist or non-racist.
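Under the hood, the model produces raw logits for each label, and the pipeline converts them to probabilities with a softmax before reporting the top label. Here is a minimal sketch of that conversion; the logit values are made up for illustration, as the real ones come from the model:

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating
    exps = [math.exp(x - max(logits)) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one input: index 0 = non-racist, index 1 = racist
logits = [-1.2, 2.3]
probs = softmax(logits)
# The pipeline reports the label with the highest probability as "label",
# and that probability as "score"
```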
The output is a list of classifications, each with a score indicating the model's confidence in its label. For example:
# Output could look like:
# [{'label': 'racist', 'score': 0.9791656136512756}, {'label': 'non-racist', 'score': 0.996966540813446}]
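Each score is the model's confidence in the label it assigned to that text. A small sketch that turns the raw output into a readable summary; the `predictions` list here simply mirrors the illustrative numbers above rather than a live model run:

```python
# Illustrative pipeline output, mirroring the example values above
predictions = [
    {"label": "racist", "score": 0.9791656136512756},
    {"label": "non-racist", "score": 0.996966540813446},
]

# Build one human-readable line per input text
summary = [
    f"Text {i + 1}: {p['label']} (confidence {p['score']:.2%})"
    for i, p in enumerate(predictions)
]
print("\n".join(summary))
# → Text 1: racist (confidence 97.92%)
#   Text 2: non-racist (confidence 99.70%)
```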
Troubleshooting
If you encounter issues while running the code, consider the following troubleshooting tips:
- Ensure that all the dependencies are correctly installed using pip.
- Check the model name and path for typos; the repository id must match exactly.
- Verify your Python version is compatible with the transformers library.
- If you get a connection-related error, make sure you have internet access, since the model is downloaded from the Hugging Face Hub on first use.
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

