An End-to-End Benchmark Suite for Univariate Time-Series Anomaly Detection
Introduction
Welcome to TSB-UAD, an open-source benchmark suite designed specifically for evaluating univariate time-series anomaly detection methods. We provide a comprehensive collection of 12,686 time series with annotated anomalies across a variety of domains. This extensive dataset ensures high variability in anomaly types, ratios, and sizes, making it easier for researchers and developers alike to assess their methods effectively.
Quick Start
The quickest way to get started with TSB-UAD is to install it with pip:
pip install tsb-uad
Installation
A manual installation requires the following tools:
- git
- conda (either Anaconda or Miniconda)
Step-by-Step Installation
- Download the datasets from the links provided:
- Clone the repository using git and change to its root directory:
git clone https://github.com/TheDatumOrg/TSB-UAD.git
cd TSB-UAD
- Create and activate a conda environment:
conda env create --file environment.yml
conda activate TSB
- Install the dependencies:
pip install -r requirements.txt
- Finally, install TSB-UAD:
pip install TSB-UAD
Note: NormA and Series2Graph need to be installed manually from the downloaded zip files.
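Once installation finishes, a quick import check can save debugging time later. The sketch below uses only the standard library. The module name `TSB_UAD` is taken from the import statements in this guide's IForest example, and the helper name `is_installed` is purely illustrative.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be located on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "TSB_UAD" is the import name used in this guide's example code.
    for name in ("numpy", "pandas", "TSB_UAD"):
        status = "found" if is_installed(name) else "MISSING"
        print(f"{name}: {status}")
```

If a package is reported as missing, confirm that the TSB conda environment is active before reinstalling anything.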
Benchmarking Datasets
TSB-UAD features not only public real datasets but also synthetic and artificial datasets designed for comprehensive benchmarking of anomaly detection methods. Think of it like a vast library, where each book (dataset) contains unique stories (anomalies) that help reveal patterns and insights in the realm of univariate time-series.
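The IForest example in this guide reads a benchmark file as a header-less CSV whose first column holds the values and whose second column holds the labels. Assuming that layout, and assuming labels are 0 for normal points and non-zero for anomalous ones (check this against the files you download), a minimal standard-library sketch for loading a series and computing its anomaly ratio might look like this. The names `load_series` and `anomaly_ratio` and the inline sample are illustrative only.

```python
import csv
import io


def load_series(handle):
    """Read (value, label) rows from a header-less two-column CSV."""
    values, labels = [], []
    for row in csv.reader(handle):
        if not row:  # skip blank lines
            continue
        values.append(float(row[0]))
        labels.append(int(float(row[1])))
    return values, labels


def anomaly_ratio(labels):
    """Fraction of points labelled anomalous (label != 0)."""
    return sum(1 for y in labels if y != 0) / len(labels)


# Tiny made-up sample standing in for a real benchmark file.
sample = io.StringIO("0.1,0\n0.2,0\n5.0,1\n0.15,0\n")
values, labels = load_series(sample)
print(len(values), anomaly_ratio(labels))
```

For a real file, pass an open file handle instead of the `io.StringIO` object.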
Using Anomaly Detectors
TSB-UAD includes implementations of several anomaly detection algorithms. Below is an analogy to help you understand one of them:
Consider a librarian looking for misplaced books. A book that looks nothing like its neighbours can be singled out after a glance or two, while a book that fits its shelf takes many comparisons to tell apart. The IForest (Isolation Forest) algorithm works in a similar way: it partitions the data with random splits and gives higher anomaly scores to subsequences that are isolated after fewer splits.
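To make the isolation idea concrete, here is a toy one-dimensional sketch written for this guide. It is not the implementation TSB-UAD uses (a real Isolation Forest works on multi-dimensional data and subsamples it), and all names are illustrative. It repeatedly cuts the value range at a random point and counts how many cuts are needed before a given point stands alone. Points far from the bulk of the data tend to be isolated after fewer cuts.

```python
import random


def isolation_depth(point, sample, rng, max_depth=50):
    """Count random cuts needed until `point` is alone in its partition."""
    current = list(sample)
    depth = 0
    while len(current) > 1 and depth < max_depth:
        lo, hi = min(current), max(current)
        if lo == hi:  # remaining values are identical; cannot split further
            break
        cut = rng.uniform(lo, hi)
        # Keep only the side of the cut that contains `point`.
        if point < cut:
            current = [v for v in current if v < cut]
        else:
            current = [v for v in current if v >= cut]
        depth += 1
    return depth


def mean_depth(point, sample, n_trees=200, seed=0):
    """Average isolation depth over many independent random partitions."""
    rng = random.Random(seed)
    total = sum(isolation_depth(point, sample, rng) for _ in range(n_trees))
    return total / n_trees


# A tight cluster near 10 plus one far-away value.
data = [10.0 + 0.1 * i for i in range(30)] + [50.0]
print("inlier :", mean_depth(data[15], data))
print("outlier:", mean_depth(50.0, data))
```

The far-away value should report a noticeably smaller mean depth than the point in the middle of the cluster, which is the signal an Isolation Forest turns into an anomaly score.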
Here’s an example of how to use the IForest anomaly detector:
import math
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from TSB_UAD.models.iforest import IForest
from TSB_UAD.models.feature import Window
from TSB_UAD.utils.slidingWindows import find_length
from TSB_UAD.vus.metrics import get_metrics

# Load the series: column 0 holds the values, column 1 the anomaly labels
df = pd.read_csv('data/benchmark/ECG_MBA_ECG805_data.out', header=None).to_numpy()
data = df[:, 0].astype(float)
label = df[:, 1].astype(int)

# Estimate a window length and convert the series into overlapping subsequences
slidingWindow = find_length(data)
X_data = Window(window=slidingWindow).convert(data).to_numpy()

# Fit the detector and read its per-window anomaly scores
clf = IForest(n_jobs=1)
clf.fit(X_data)
score = clf.decision_scores_

# Scale the scores to [0, 1], then pad both ends so there is one score per point
score = MinMaxScaler(feature_range=(0,1)).fit_transform(score.reshape(-1,1)).ravel()
score = np.array([score[0]]*math.ceil((slidingWindow-1)/2) + list(score) + [score[-1]]*((slidingWindow-1)//2))

# Evaluate the scores against the labels and print every metric
results = get_metrics(score, label, metric='all', slidingWindow=slidingWindow)
for metric in results.keys():
    print(metric, ":", results[metric])
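The padding step near the end of the example is easy to misread. A sliding window of length w over n points yields n - w + 1 windows, so (assuming the detector returns one score per window) there are fewer scores than labels. The example repeats the first score ceil((w-1)/2) times at the front and the last score (w-1)//2 times at the back, which restores length n. A dependency-free sketch of that bookkeeping, with hypothetical helper names and a stand-in score instead of a real detector, follows.

```python
import math


def sliding_windows(series, window):
    """Return all contiguous subsequences of length `window`."""
    return [series[i:i + window] for i in range(len(series) - window + 1)]


def pad_scores(scores, window):
    """Extend per-window scores to one score per original point."""
    front = [scores[0]] * math.ceil((window - 1) / 2)
    back = [scores[-1]] * ((window - 1) // 2)
    return front + list(scores) + back


series = [0.0, 0.1, 0.0, 0.2, 4.0, 0.1, 0.0, 0.1]
window = 4
windows = sliding_windows(series, window)

# Stand-in score: the largest value in each window (not a real detector).
scores = [max(w) for w in windows]
padded = pad_scores(scores, window)

print(len(series), len(windows), len(padded))
```

Whatever the window length, the front and back padding together add w - 1 entries, so the padded scores always line up with the labels.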
Troubleshooting
If you encounter any issues during installation or while running your models, here are a few troubleshooting tips:
- Make sure you have all necessary dependencies installed correctly. Check your environment.yml and requirements.txt files.
- Verify that the dataset paths are correct.
- If an anomaly detector isn’t working as expected, review the parameters you’ve set; ensure they align with your dataset characteristics.
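Some of the checks above can be automated before running a detector. The following sketch uses hypothetical helper names and only the standard library. It confirms that a dataset file exists and that there is exactly one score per label, since the evaluation compares the two sequences point by point.

```python
import os
import tempfile
from pathlib import Path


def check_dataset_path(path):
    """Raise a descriptive error if `path` is not an existing file."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(
            f"Dataset not found: {p} (working directory: {Path.cwd()})"
        )
    return p


def check_lengths(score, label):
    """Raise ValueError unless there is exactly one score per label."""
    if len(score) != len(label):
        raise ValueError(
            f"{len(score)} scores but {len(label)} labels; "
            "check the window padding step"
        )


# Demonstration with a temporary file standing in for a benchmark file.
with tempfile.NamedTemporaryFile(suffix=".out", delete=False) as tmp:
    tmp.write(b"0.1,0\n0.2,1\n")
demo_path = check_dataset_path(tmp.name)
check_lengths([0.3, 0.9], [0, 1])
print("checks passed for", demo_path.name)
os.unlink(tmp.name)  # clean up the temporary file
```

Printing the working directory in the error message helps when a relative path such as `data/benchmark/...` is resolved from an unexpected location.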
Conclusion
TSB-UAD is a powerful tool to support the evaluation of univariate time-series anomaly detection methods. Its extensive dataset and robust feature set are essential for researchers and practitioners in the field.