This article shows how to use TODS (Time-series Outlier Detection System) for automated outlier detection on time-series data, and closes with troubleshooting tips for the most common problems. Whether you’re working with multivariate time-series data or looking for efficient anomaly detection methods, TODS has you covered.
What is TODS?
TODS is a full-stack automated machine learning system designed for outlier detection on multivariate time-series data. Think of it as a modern Swiss Army knife for data scientists, offering various tools for data processing, feature analysis, detection algorithms, and even a reinforcement module to refine the detection process with human expertise.
Key Features of TODS
- Full Stack Machine Learning System: TODS covers every stage of the pipeline, from data processing and feature analysis to detection algorithms and a reinforcement module that brings in human expertise.
- Wide Range of Algorithms: It incorporates all point-wise detection algorithms found in PyOD, along with advanced collective detection algorithms like DeepLog and Telemanom.
- Automated Machine Learning: TODS aims to construct an optimal pipeline for your data, without requiring prior expertise, by automatically searching for the best combination of modules.
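To make "point-wise detection" from the list above concrete, here is a minimal sketch of the idea using a plain z-score rule. It is an illustration only: it does not use the TODS or PyOD APIs, and the function name `zscore_outliers` and the threshold of 2.5 are assumptions chosen for this example.

```python
import statistics


def zscore_outliers(values, threshold=2.5):
    """Return the indices of points whose |z-score| exceeds the threshold.

    Hypothetical helper for illustration; not part of TODS or PyOD.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # a constant series has no point-wise outliers
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]


# A short series with one obvious spike at index 7.
series = [10, 11, 9, 10, 12, 10, 11, 50, 10, 9]
print(zscore_outliers(series))
```

Each point is judged on its own value, which is what distinguishes point-wise methods from collective ones such as DeepLog and Telemanom, which look at patterns across a sequence of points.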
How to Install TODS
Before you dive into detection, let’s set up your environment. TODS requires Python 3.7+ and pip 19+. Follow these steps to ensure a smooth installation:
- Install necessary system packages (Ubuntu/Debian):
    sudo apt-get install libssl-dev libcurl4-openssl-dev libyaml-dev build-essential libopenblas-dev libcap-dev ffmpeg
- Clone the repository:
    git clone https://github.com/datamllab/tods.git
- Navigate to the directory and install:
    cd tods
    pip install -e .
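After installing, a quick sanity check can catch version problems early. The sketch below uses only the Python standard library to compare the interpreter and pip versions against the requirements stated above and to check whether a package named `tods` is importable. The function name `check_environment` is hypothetical and not part of TODS.

```python
import sys
from importlib import util


def check_environment(min_python=(3, 7), min_pip_major=19, package="tods"):
    """Return a list of human-readable problems; an empty list means all checks passed.

    Hypothetical helper for illustration; not part of TODS.
    """
    problems = []

    # Compare the running interpreter against the minimum Python version.
    if sys.version_info[:2] < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, "
            f"found {sys.version_info[0]}.{sys.version_info[1]}"
        )

    # pip exposes its version string as pip.__version__.
    try:
        import pip
        if int(pip.__version__.split(".")[0]) < min_pip_major:
            problems.append(f"pip {min_pip_major}+ required, found {pip.__version__}")
    except ImportError:
        problems.append("pip is not importable in this environment")

    # find_spec returns None when a top-level package cannot be located.
    if util.find_spec(package) is None:
        problems.append(f"package '{package}' is not installed in this environment")

    return problems


for problem in check_environment():
    print(problem)
```

If nothing is printed, the environment meets the stated requirements and `tods` can be located by the interpreter.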
Using TODS: An Example Pipeline
Once you have TODS installed, you can start building your detection pipeline. Here’s a simple analogy: imagine you’re a chef preparing a meal. TODS helps you select the right ingredients and cooking methods to create a delicious dish (your detection model).
Here’s how you can load a dataset and evaluate it with a default pipeline:
import pandas as pd
from tods import schemas as schemas_utils
from tods import generate_dataset, evaluate_pipeline
table_path = 'datasets/anomaly/raw_data/yahoo_sub_5.csv' # Path to dataset
target_index = 6 # Target column index
metric = 'F1_MACRO' # Metric for evaluation
# Read data and generate dataset
df = pd.read_csv(table_path)
dataset = generate_dataset(df, target_index)
# Load the default pipeline
pipeline = schemas_utils.load_default_pipeline()
# Run the pipeline
pipeline_result = evaluate_pipeline(dataset, pipeline, metric)
print(pipeline_result)
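To help interpret the score, the sketch below computes a macro-averaged F1 by hand. It assumes that the `'F1_MACRO'` metric corresponds to the standard definition (the unweighted mean of the per-class F1 scores, here over the normal label 0 and the outlier label 1). The function `f1_macro` and the toy labels are illustrative and not part of TODS.

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (standard macro-F1 definition).

    Hypothetical helper for illustration; not part of TODS.
    """
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for cls in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); score 0.0 when the class never appears.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


# Toy labels: 0 = normal, 1 = outlier.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
print(f1_macro(y_true, y_pred))  # class 0 F1 = 0.75, class 1 F1 = 0.5 -> 0.625
```

Because both classes count equally, a detector that labels everything as normal scores poorly on macro-F1 even when outliers are rare, which is why this kind of metric suits imbalanced anomaly data.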
Troubleshooting
If you encounter any issues while using TODS, here are a few troubleshooting ideas:
- Installation Errors: Ensure all the system packages listed above are installed and that you are running Python 3.7+ and pip 19+.
- Data Not Loading: Verify that the path to your dataset is correct and that the file is a CSV that pandas can read.
- Pipeline Errors: Double-check that the target index points to an existing column and that the metric name is spelled correctly (for example, 'F1_MACRO').
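Several of the checks above can be automated before a pipeline is run. The following sketch uses only the standard library to verify that the dataset file exists, that it parses as a CSV with consistent rows, and that the target index falls inside the column range. The helper name `check_inputs` is hypothetical and not part of TODS.

```python
import csv
import os


def check_inputs(table_path, target_index):
    """Return a list of problems found with the dataset path and target index.

    Hypothetical helper for illustration; not part of TODS.
    """
    if not os.path.isfile(table_path):
        return [f"File not found: {table_path}"]

    with open(table_path, newline="") as handle:
        rows = list(csv.reader(handle))

    if len(rows) < 2:
        return ["CSV needs a header row and at least one data row"]

    problems = []
    n_columns = len(rows[0])

    # The target index must refer to an existing column (0-based).
    if not 0 <= target_index < n_columns:
        problems.append(
            f"target_index {target_index} is out of range for {n_columns} columns"
        )

    # Rows with a different number of fields usually indicate a malformed file.
    ragged = [n for n, row in enumerate(rows[1:], start=2) if len(row) != n_columns]
    if ragged:
        problems.append(f"rows with an unexpected number of fields: {ragged[:5]}")

    return problems


# Path and index taken from the example pipeline above.
print(check_inputs("datasets/anomaly/raw_data/yahoo_sub_5.csv", 6))
```

An empty list means the file and index passed these basic checks; any remaining error is more likely to come from the pipeline or metric configuration.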