Table of Contents
- Introduction
- Comparison with Related Libraries
- Installation
- Documentation
- Getting Started
- Evaluation and Benchmarking
- Technical Report and Citing Merlion
Introduction
Merlion is a Python library for time series intelligence. Think of it as a multi-purpose toolkit for analyzing temporal data and making predictions from it. It provides a streamlined end-to-end machine learning framework for forecasting future trends, detecting anomalies, and identifying change points.
Comparison with Related Libraries
The comparison covers Merlion alongside Prophet, Alibi Detect, Kats, darts, statsmodels, nixtla, GluonTS, RRCF, STUMPY, Greykite, and pmdarima. Merlion supports every feature in the comparison:
- Univariate Forecasting
- Multivariate Forecasting
- Univariate Anomaly Detection
- Multivariate Anomaly Detection
- Pre Processing
- Post Processing
- AutoML
- Ensembles
- Benchmarking
- Visualization
Installation
Setting up Merlion is as easy as pie! You can install it directly from PyPI with the following command:
pip install salesforce-merlion
For those who like to tinker and customize, you can also install it from source. Here’s how:
git clone https://github.com/salesforce/Merlion.git
pip install -e Merlion
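Remember, the repository contains two packages: the main merlion library and ts_datasets, which provides loaders for time series datasets. The examples below import ts_datasets, so when working from source you will likely want to install it as well (pip install -e Merlion/ts_datasets/).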
Documentation
For example code and a deeper introduction to Merlion, see the examples in the Merlion repository. Extensive API documentation is also available, and the technical report describes Merlion’s architecture and experimental results.
Getting Started
Let’s dive into the exciting part: using Merlion! The web-based GUI dashboard makes it easy to experiment with models on custom datasets. To launch it, install Merlion with the dashboard dependency (pip install "salesforce-merlion[dashboard]") and run:
python -m merlion.dashboard
Access the dashboard at http://localhost:8050.
Anomaly Detection
Anomaly detection is like finding a needle in a haystack: the goal is to flag the few points in a time series that deviate from its normal behavior. You can spot such irregularities using the DefaultDetector model:
from merlion.utils import TimeSeries
from ts_datasets.anomaly import NAB
# Load Data
time_series, metadata = NAB(subset="realKnownCause")[3]
train_data = TimeSeries.from_pd(time_series[metadata.trainval])
test_data = TimeSeries.from_pd(time_series[~metadata.trainval])
test_labels = TimeSeries.from_pd(metadata.anomaly[~metadata.trainval])
# Train Model
from merlion.models.defaults import DefaultDetectorConfig, DefaultDetector
model = DefaultDetector(DefaultDetectorConfig())
model.train(train_data=train_data)
# Get Predictions
test_pred = model.get_anomaly_label(time_series=test_data)
Finally, we can visualize how well our model is performing!
from merlion.plot import plot_anoms
import matplotlib.pyplot as plt
fig, ax = model.plot_anomaly(time_series=test_data)
plot_anoms(ax=ax, anomaly_labels=test_labels)
plt.show()
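To put a number on what the plot shows, predicted labels can be compared with the ground truth. The sketch below is a library-independent illustration that works on plain lists of 0/1 labels. The helper name pointwise_prf is hypothetical and not part of Merlion, and treating any nonzero prediction as an anomaly is an assumption. In practice, Merlion's own evaluation utilities are likely the better choice, since they handle time series alignment.

```python
def pointwise_prf(pred_labels, true_labels):
    """Point-wise precision, recall and F1 for binary anomaly labels.

    Hypothetical helper for illustration; any truthy (nonzero) value
    is treated as "anomalous".
    """
    if len(pred_labels) != len(true_labels):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(pred_labels, true_labels))
    tp = sum(1 for p, t in pairs if p and t)        # true positives
    fp = sum(1 for p, t in pairs if p and not t)    # false alarms
    fn = sum(1 for p, t in pairs if not p and t)    # missed anomalies

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy labels, not taken from the NAB dataset
pred = [0, 1, 1, 0, 0, 1]
true = [0, 1, 0, 0, 1, 1]
precision, recall, f1 = pointwise_prf(pred, true)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Point-wise scoring is the simplest option. It penalizes a detector that fires a few steps late within a long anomaly window, so window-aware metrics are often preferred for real evaluations.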
Forecasting
Forecasting in Merlion is like predicting the weather: a model trained on past data estimates future values. The following code uses the DefaultForecaster model to replicate the results from the forecasting dashboard:
from merlion.utils import TimeSeries
from ts_datasets.forecast import M4
# Load Data
time_series, metadata = M4(subset="Hourly")[0]
train_data = TimeSeries.from_pd(time_series[metadata.trainval])
test_data = TimeSeries.from_pd(time_series[~metadata.trainval])
# Train Model
from merlion.models.defaults import DefaultForecasterConfig, DefaultForecaster
model = DefaultForecaster(DefaultForecasterConfig())
model.train(train_data=train_data)
# Get Predictions
test_pred, test_err = model.forecast(time_stamps=test_data.time_stamps)
Your forecasting predictions can also be visualized to validate accuracy.
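Alongside a plot, a numeric error metric gives a more precise read on accuracy. The sketch below computes two common ones, sMAPE and RMSE, on plain Python lists. The function names are hypothetical and the sMAPE formula is the usual 0 to 200 variant, which is an assumption here rather than a statement of how Merlion computes it. The values are toy numbers, not M4 results.

```python
import math


def smape(actual, predicted):
    """Symmetric mean absolute percentage error, on a 0-200 scale."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, y_hat in zip(actual, predicted):
        denom = abs(y) + abs(y_hat)
        # Both values zero means a perfect prediction: contribute no error.
        total += 0.0 if denom == 0 else abs(y - y_hat) / denom
    return 200.0 * total / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy ground truth and forecast
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
print(f"sMAPE = {smape(actual, predicted):.2f}")
print(f"RMSE  = {rmse(actual, predicted):.2f}")
```

sMAPE is scale-free, which makes it convenient for comparing series of different magnitudes, while RMSE stays in the units of the data and weights large misses more heavily.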
Evaluation and Benchmarking
Merlion features an evaluation pipeline that simulates live deployment on historical data, allowing for realistic model performance assessment. This is how it works:
- Train an initial model on historical data.
- Retrain the model at regular intervals with new data.
- Collect predictions for the intervening periods.
- Compare predictions with real outcomes to measure effectiveness.
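The steps above can be sketched in a few lines of plain Python. This is a simplified illustration of the idea, not Merlion's evaluation pipeline: the "model" is a naive forecaster that predicts the mean of everything seen so far, and the function name rolling_evaluate and its parameters are hypothetical.

```python
from statistics import mean


def rolling_evaluate(values, train_size, retrain_every):
    """Simulate live deployment with a naive mean forecaster.

    Returns the mean absolute error over all out-of-sample points.
    """
    if train_size <= 0 or retrain_every <= 0:
        raise ValueError("train_size and retrain_every must be positive")

    errors = []
    # Step 1 happens on the first iteration (initial training window);
    # step 2 happens on each later iteration (retrain with new data).
    for start in range(train_size, len(values), retrain_every):
        history = values[:start]
        forecast = mean(history)  # "training" the naive model

        # Step 3: predict the period until the next retraining.
        window = values[start:start + retrain_every]

        # Step 4: compare predictions with what actually happened.
        errors.extend(abs(y - forecast) for y in window)

    if not errors:
        raise ValueError("no data left to evaluate after the training window")
    return mean(errors)


# Toy series: train on the first 4 points, retrain every 2 points
series = [1, 2, 3, 4, 5, 6, 7, 8]
mae = rolling_evaluate(series, train_size=4, retrain_every=2)
print(f"rolling MAE = {mae}")
```

The key property is that every prediction is made using only data available before the predicted period, which is what makes the resulting error a realistic estimate of deployed performance.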
Technical Report and Citing Merlion
For greater insights into the technical aspects, visit the technical report. If you utilize Merlion in your projects, kindly cite it with the provided BibTeX.
Troubleshooting
If you encounter issues while using Merlion, consider the following:
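- Ensure you have installed the external dependencies your models require. Some forecasting models depend on OpenMP and some anomaly detection models depend on the Java Development Kit (JDK); how to install them depends on your operating system.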
- If your historical data isn’t loading correctly, check your data format: TimeSeries.from_pd expects a pandas DataFrame or Series indexed by time.
- For PySpark distributed backend setup, ensure you follow the integration guide accordingly.