Anomaly Detection - ilya-khadykin/notes-outdated GitHub Wiki
TO DO:
- http://datascience.stackexchange.com/questions/6547/open-source-anomaly-detection-in-python
- https://anomaly.io/anomaly-detection-twitter-r/
An anomaly is any deviation from the normal pattern
There are a lot of anomalies.
The following algorithm is suited to a monitoring system:
- Learn the normal behaviour of your data
Every signal has a normal behaviour and we have to learn it
At any point in time your model should return a range of expected values for the metric with some probability
Understand and classify the data distribution - you need a classification algorithm
You have to be adaptive since patterns can change, which means adjusting the learning rate (for example, accelerating it)
Exponential forgetting - Score
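The idea of learning a baseline with exponential forgetting and returning a range of expected values can be sketched roughly as follows. This is only an illustration, not the method from the talk: the class name `ForgettingBaseline`, the parameters `alpha`, `z` and `warmup`, and the assumption that values are roughly normally distributed (so a band of `z` standard deviations covers most of them) are all made up for the example.

```python
import math


class ForgettingBaseline:
    """Running mean and variance with exponential forgetting (illustrative only)."""

    def __init__(self, alpha=0.1, z=3.0, warmup=5):
        self.alpha = alpha    # forgetting factor in (0, 1]; larger adapts faster
        self.z = z            # band half-width in standard deviations (assumed)
        self.warmup = warmup  # observations to see before flagging anything
        self.count = 0
        self.mean = 0.0
        self.var = 0.0

    def expected_range(self):
        """Return the (low, high) band the next value is expected to fall in."""
        half_width = self.z * math.sqrt(self.var)
        return self.mean - half_width, self.mean + half_width

    def is_anomalous(self, x):
        """True if x falls outside the expected band once warm-up is over."""
        if self.count < self.warmup:
            return False
        low, high = self.expected_range()
        return x < low or x > high

    def update(self, x):
        """Fold a new observation in, down-weighting older ones geometrically."""
        self.count += 1
        if self.count == 1:
            self.mean = float(x)
            return
        diff = x - self.mean
        incr = self.alpha * diff
        self.mean += incr
        self.var = (1.0 - self.alpha) * (self.var + diff * incr)


baseline = ForgettingBaseline(alpha=0.1)
for value in [10, 12, 8, 10, 12, 8, 10]:
    baseline.update(value)

print(baseline.expected_range())
print(baseline.is_anomalous(50))  # far outside the learned band
print(baseline.is_anomalous(10))  # close to the learned mean
```

Raising `alpha` makes the baseline forget old data faster, which is one simple way to accelerate adaptation when a pattern changes.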
- Determine the type of anomaly and its significance or importance
How long has the anomaly been present? - duration-based scoring probability model
How significant is the deviation from the normal pattern?
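One possible way to combine the two questions above (duration and size of the deviation) into a single score is sketched below. The function `anomaly_score`, its parameters `threshold` and `duration_scale`, and the saturating exponential form are assumptions chosen for illustration; the notes do not specify the actual scoring model.

```python
import math


def anomaly_score(deviations, threshold=3.0, duration_scale=5.0):
    """Score the most recent run of anomalous points, in the range [0, 1).

    deviations: deviations from the baseline in standard deviations,
    oldest first. All names and constants here are hypothetical.
    """
    # Walk backwards to collect the current uninterrupted anomalous run.
    run = []
    for z in reversed(deviations):
        if abs(z) < threshold:
            break
        run.append(abs(z))
    if not run:
        return 0.0

    duration = len(run)            # how long has the anomaly been present?
    excess = max(run) - threshold  # how far past the threshold did it go?

    # Both terms grow towards 1 and saturate, so neither can dominate.
    duration_term = 1.0 - math.exp(-duration / duration_scale)
    magnitude_term = 1.0 - math.exp(-excess / threshold)
    return duration_term * magnitude_term


short_spike = anomaly_score([0.2, 0.5, 5.0])
long_outage = anomaly_score([5.0] * 10)
print(short_spike, long_outage)
```

With these choices a long-lasting deviation scores higher than a single spike of the same size, and a quiet signal scores zero.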
- Grouping anomalies - creating a graph of relationships
Determine whether there is a correlation with anomalies in other metrics in your monitoring system
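A minimal sketch of grouping anomalies through a graph of relationships is shown below. It assumes, purely for illustration, that two anomalies are related when their time windows overlap or lie within `max_gap` seconds of each other; a real system would likely use a richer notion of correlation. The function `group_anomalies` and the metric names are hypothetical.

```python
from collections import defaultdict


def group_anomalies(anomalies, max_gap=60):
    """Group metrics whose anomaly windows are close in time.

    anomalies: dict mapping metric name -> (start, end) in seconds.
    Returns a list of sets, one per connected component of the graph.
    """
    names = list(anomalies)
    graph = defaultdict(set)

    # Add an edge between every pair of anomalies that nearly overlap.
    for i, a in enumerate(names):
        a_start, a_end = anomalies[a]
        for b in names[i + 1:]:
            b_start, b_end = anomalies[b]
            if a_start <= b_end + max_gap and b_start <= a_end + max_gap:
                graph[a].add(b)
                graph[b].add(a)

    # Connected components via depth-first search.
    groups, seen = [], set()
    for name in names:
        if name in seen:
            continue
        stack, component = [name], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        groups.append(component)
    return groups


windows = {
    "cpu": (100, 200),
    "latency": (150, 260),
    "errors": (300, 320),
    "disk": (5000, 5100),
}
groups = group_anomalies(windows)
print(groups)
```

Because groups are connected components, `cpu` and `errors` end up together through `latency` even though their own windows are too far apart to be linked directly.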
- Apache Kafka - event queue processor
- Elasticsearch
- Apache Hive - daily analytics jobs
- Cassandra - real time storage
- Dr Ira Cohen - Discovering real time anomalies in large scale time series signals - https://www.youtube.com/watch?v=SrOM2z6h_RQ
http://stackoverflow.com/questions/33801034/real-time-anomaly-detection