Are you curious whether that Twitter account is a bot or a genuine human? The tweetbotornot R package is here to help! It uses machine learning to estimate the probability that a Twitter account is a bot. This guide walks you through installation, usage, and troubleshooting. Let’s get started!
Features of tweetbotornot
- Uses machine learning algorithms to classify Twitter accounts as either bots or humans.
- The default model achieves an impressive accuracy of 93.53% for bots and 95.32% for non-bots.
- The fast model has slightly lower accuracy at 91.78% for bots and 92.61% for non-bots.
- The overall accuracy for the default model is 93.8%.
- The overall accuracy for the fast model is 91.9%.
Installation
Follow the steps below to install the tweetbotornot package:
# Install from CRAN
install.packages("tweetbotornot")
# Install the development version from Github
if (!requireNamespace("remotes", quietly = TRUE)) {
install.packages("remotes")
}
remotes::install_github("mkearney/tweetbotornot")
API Authorization
Before you can begin classifying accounts, you’ll need to authorize with Twitter’s API. Here’s how:
- Log into Twitter and use R in an interactive session, which will prompt you to authorize the rtweet client, or
- Create an app within Twitter (you need a developer account) to obtain your own API token. This method offers better stability and permissions.
Please refer to the rtweet package documentation for detailed instructions on creating a Twitter app and generating your token.
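If you go the app route, token creation might look roughly like the sketch below. It assumes an rtweet version that still provides create_token() (releases before 1.0; newer releases use different authentication helpers, so check the documentation for your installed version). The environment variable names and the app name are hypothetical placeholders, not something the package requires.

```r
# Read credentials from environment variables so they never sit in a script.
# These variable names are hypothetical; use whatever names you prefer.
creds <- c(
  consumer_key    = Sys.getenv("TWITTER_CONSUMER_KEY"),
  consumer_secret = Sys.getenv("TWITTER_CONSUMER_SECRET"),
  access_token    = Sys.getenv("TWITTER_ACCESS_TOKEN"),
  access_secret   = Sys.getenv("TWITTER_ACCESS_SECRET")
)

if (all(nzchar(creds)) && requireNamespace("rtweet", quietly = TRUE)) {
  # "my_bot_checker" is a placeholder; use the name of your own Twitter app.
  token <- rtweet::create_token(
    app             = "my_bot_checker",
    consumer_key    = creds[["consumer_key"]],
    consumer_secret = creds[["consumer_secret"]],
    access_token    = creds[["access_token"]],
    access_secret   = creds[["access_secret"]]
  )
} else {
  message("Credentials or rtweet not available; skipping token creation.")
}
```

Keeping the keys in environment variables (for example in your .Renviron file) means you can share the script without sharing your credentials.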
Using tweetbotornot
Using the package is straightforward. Here’s an analogy: imagine a librarian who sorts people into ‘friends’ or ‘strangers’ based on their behavior. You hand over a list of names, and the librarian estimates how likely each person is to be a friend from characteristics they can observe.
The tweetbotornot() function operates similarly. You input a vector of Twitter usernames or IDs, and it evaluates them using its trained models.
# Load the package
library(tweetbotornot)
# Select users
users <- c(
  "realdonaldtrump", "netflix_bot", "kearneymw", "dataandme",
  "hadleywickham", "ma_salmon", "juliasilge", "tidyversetweets",
  "American__Voter", "mothgenerator", "hrbrmstr"
)
# Get bot or not estimates
data <- tweetbotornot(users)
# Arrange results by probability estimates
data[order(data$prob_bot), ]
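Once you have probability estimates, a common next step is to turn them into labels. The sketch below does that on a small mock data frame so it runs without API access. The account names, probabilities, and the 0.5 cutoff are all illustrative assumptions; only the prob_bot column name comes from the code above.

```r
# Mock estimates standing in for the output of tweetbotornot(users).
# Account names and probabilities are made up for illustration.
estimates <- data.frame(
  screen_name = c("example_human", "example_feed", "example_bot"),
  prob_bot    = c(0.04, 0.62, 0.99),
  stringsAsFactors = FALSE
)

# Hypothetical cutoff; tune it to how costly false positives are for you.
threshold <- 0.5

# Label each account, then sort from most to least bot-like.
estimates$label <- ifelse(estimates$prob_bot >= threshold,
                          "likely bot", "likely human")
flagged <- estimates[order(estimates$prob_bot, decreasing = TRUE), ]
print(flagged)
```

A single cutoff is the simplest choice; accounts with estimates near the threshold probably deserve a manual look rather than an automatic label.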
Integration with rtweet
If you already collect tweets with the rtweet package, you can pass the timeline data you have gathered directly to botornot() instead of a vector of usernames:
# Load rtweet, which provides get_timelines()
library(rtweet)
# Get the most recent 100 tweets from each user
tmls <- get_timelines(users, n = 100)
# Pass the returned data to botornot()
data <- botornot(tmls)
# Arrange by probability estimates
data[order(data$prob_bot), ]
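If you want to filter the tweets themselves by how bot-like their author is, you can join the estimates back onto the tweet data. The following is a minimal sketch with mock data frames; the column names (screen_name, text) and the 0.5 cutoff are assumptions for illustration, so adjust them to match the data you actually have.

```r
# Mock timeline data: one row per tweet (hypothetical columns).
tweets <- data.frame(
  screen_name = c("acct_a", "acct_a", "acct_b", "acct_c"),
  text        = c("first", "second", "third", "fourth"),
  stringsAsFactors = FALSE
)

# Mock estimates: one row per account.
estimates <- data.frame(
  screen_name = c("acct_a", "acct_b", "acct_c"),
  prob_bot    = c(0.91, 0.12, 0.47),
  stringsAsFactors = FALSE
)

# Left join so every tweet keeps its row even if an estimate is missing.
tweets_scored <- merge(tweets, estimates, by = "screen_name", all.x = TRUE)

# Keep tweets from accounts below a hypothetical 0.5 cutoff.
# which() drops rows whose estimate is NA instead of returning NA rows.
likely_human_tweets <- tweets_scored[which(tweets_scored$prob_bot < 0.5), ]
print(likely_human_tweets)
```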
Using the fast Model
For large datasets, you can switch to the fast model by setting fast = TRUE. You give up some accuracy (see the figures under Features), but you can obtain significantly more estimates within each 15-minute API rate-limit window:
# Get bot or not estimates using fast model
data <- botornot(users, fast = TRUE)
# Arrange by probability estimates
data[order(data$prob_bot), ]
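For user lists that exceed what a single rate-limit window allows, one option is to process them in batches and pause between batches. The sketch below is one way to do that, not part of the package: estimate_in_batches() is a hypothetical helper, and the default batch size of 180 is a placeholder assumption that you should replace with the limit that applies to your token. A stub classifier is used so the example runs offline.

```r
# Hypothetical helper: classify users in batches, pausing between batches.
# `classify` is any function that takes a character vector of users and
# returns a data frame, for example tweetbotornot.
estimate_in_batches <- function(users, classify,
                                batch_size = 180, pause_secs = 15 * 60) {
  stopifnot(batch_size >= 1)
  # Split users into consecutive groups of at most batch_size.
  batches <- split(users, ceiling(seq_along(users) / batch_size))
  results <- vector("list", length(batches))
  for (i in seq_along(batches)) {
    results[[i]] <- classify(batches[[i]])
    # Wait between batches, but not after the last one.
    if (i < length(batches)) Sys.sleep(pause_secs)
  }
  do.call(rbind, results)
}

# Stub classifier so the sketch runs without API access.
stub_classify <- function(x) {
  data.frame(screen_name = x, prob_bot = 0.5, stringsAsFactors = FALSE)
}

demo <- estimate_in_batches(paste0("user", 1:5), stub_classify,
                            batch_size = 2, pause_secs = 0)
print(demo)
```

In real use you would pass tweetbotornot (or a wrapper that sets fast = TRUE) as the classify argument and keep the 15-minute pause.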
Note
The package was renamed from “botrnot” to “tweetbotornot” in June 2018. Make sure you install and load tweetbotornot; older tutorials and scripts may still refer to the previous name.
Troubleshooting
Here are some common troubleshooting tips to help you out:
- If you receive authorization errors, double-check your API tokens and ensure that your application has the appropriate permissions.
- For package installation issues, make sure your R environment is updated, and your internet connection is stable.
- If you hit API rate limits before all of your accounts are classified, consider the fast model (fast = TRUE), which allows more estimates per 15-minute window at some cost in accuracy.
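When a call fails partway through a script, it helps to catch the error and report it instead of letting the whole run stop. The wrapper below is a minimal sketch of that idea. safe_classify() is a hypothetical helper, and the failing function only simulates an authorization error so the example runs offline.

```r
# Hypothetical wrapper: run a classifier and return NULL on failure,
# printing the error message so you can see what went wrong.
safe_classify <- function(users, classify) {
  tryCatch(
    classify(users),
    error = function(e) {
      message("Classification failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Simulated failure standing in for a real authorization error.
failing <- function(x) stop("401 Unauthorized (simulated)")
result <- safe_classify("some_user", failing)

# A classifier that succeeds passes its result through unchanged.
working <- function(x) data.frame(screen_name = x, prob_bot = 0.1)
ok <- safe_classify("some_user", working)
```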
