Various
- Webapp js - first element in the list does not load timeseries data as per
  #17 added a Known bugs section

Modified:
skyline/webapp/templates/panorama.html

- Apply time zone fix to skyline.js -added a Known bugs section as per
  #18

Modified:
skyline/webapp/templates/now.html
- Added an overview image for Mirage
- Added Order Matters section to Mirage explaining Analyzer routing of ALERT
  tuples
- Attempted to make the Mirage doc page flow better and read more clearly

Added:
docs/images/crucible/webapp/skyline.webapp.basic.overview.png
docs/skyline.mirage.overview.uml
Modified:
docs/mirage.rst

- Various other minor docs changes

Modified:
docs/installation.rst
docs/overview.rst
docs/webapp.rst
earthgecko committed Aug 1, 2016
1 parent 6587d12 commit b522a07
Showing 20 changed files with 492 additions and 227 deletions.
6 changes: 0 additions & 6 deletions docs/_build/html/_sources/installation.txt
Original file line number Diff line number Diff line change
Expand Up @@ -84,12 +84,6 @@ Steps
cd /opt/skyline/github
git clone https://github.com/earthgecko/skyline.git

.. code-block:: bash

mkdir -p /opt/skyline/github
cd /opt/skyline/github
git clone https://github.com/earthgecko/skyline.git

- Once again using the Python-2.7.12 virtualenv, install the requirements using
the virtualenv pip, this can take a long time, the pandas install takes quite
a while.
Expand Down
197 changes: 133 additions & 64 deletions docs/_build/html/_sources/mirage.txt
Original file line number Diff line number Diff line change
Expand Up @@ -4,60 +4,39 @@ Mirage

The Mirage service is responsible for analyzing selected timeseries at custom
time ranges when a timeseries seasonality does not fit within
:mod:`settings.FULL_DURATION`.
:mod:`settings.FULL_DURATION`. Mirage allows for testing of real time data
and algorithms in parallel to Analyzer. Mirage was inspired by Abe Stanway's
Crucible and the desire to extend the temporal data pools available to Skyline
in an attempt to handle seasonality better, reduce noise and increase signal,
specifically on seasonal metrics.

The Mirage app allows for second order resolution analysis of metrics that
have a ``SECOND_ORDER_RESOLUTION_HOURS`` defined in their Analyzer alert tuple
:mod:`settings.ALERTS` setting.
An overview of Mirage
=====================

Mirage is fed by Analyzer.
Mirage gets timeseries data from Graphite.
- Mirage is fed specific user defined metrics by Analyzer.
- Mirage gets timeseries data for metrics from Graphite.
- Mirage does not have its own ``ALERT`` settings; it uses :mod:`settings.ALERTS`
just like Analyzer does.
- Mirage also sends anomaly details to Panorama, like Analyzer does.

Analyzer's :mod:`settings.FULL_DURATION` somewhat limits Analyzer's usefulness
for metrics that have a seasonality / periodicity that is greater than
:mod:`settings.FULL_DURATION`. Increasing :mod:`settings.FULL_DURATION` to
anything above 24 hours (86400) is not necessarily realistic or useful, because
the greater the :mod:`settings.FULL_DURATION`, the greater memory required for
Redis and the longer Skyline analyzer will take to run.

Mirage uses the user-defined seasonality for a metric
(``SECOND_ORDER_RESOLUTION_HOURS``) and if Analyzer finds a metric to be
anomalous at :mod:`settings.FULL_DURATION` and the metric alert tuple has
`SECOND_ORDER_RESOLUTION_HOURS` and :mod:`settings.ENABLE_MIRAGE` is ``True``,
Analyzer will push the metric variables to the Mirage check file for Mirage to
surface the metric's timeseries at its defined seasonality, in real time from
Graphite in json format and then analyze the timeseries to determine if the
datapoint that triggered analyzer, is anomalous at the metric's true
seasonality.
.. figure:: images/crucible/mirage/skyline.mirage.overview.png
:alt: An overview of Mirage

A real world example with tenfold.com
-------------------------------------
`Fullsize overview image <_images/skyline.mirage.overview.png>`_ for a clearer picture.

:blak3r2: Our app logs phone calls for businesses and I want to be able to
detect when VIP phone systems go down or act funny and begin flooding us with
events. Our work load is very noisy from 9-5pm... where 9-5 is different for
each customer depending on their workload so thresholding and modeling isn't
good.
Why Mirage?
-----------

:earthgecko: Yes, Mirage is great at user defined seasonality, in your case
weekday 9-5 peaks, evening drop offs, early morning and weekend lows - multi
seasonal, Mirage is the ticket.
Your best bet would be to try 7days (168) as your SECOND_ORDER_RESOLUTION_HOURS
value for those app log metrics, however, you may get away with a 3 day
window, it depends on the metrics really, but it may not be noisy at 3 days
resolution, even at the weekends.
Also bear in mind, Mirage does some "normalizing" if your have aggregations
in Graphite (e.g retentions), due to Mirage probably pulling aggregated data,
however it is analyzing the timeseries at the aggregated resolution so it is
"normalised" as the data point that Analyzer triggered on is ALSO aggregated
in that timeseries resolution. So intuitively on may think it may miss it in
the aggregation then. True, but Analyzer will likely trigger on the next run
again if it IS anomalous, anomalous metrics normally trigger multiple,
multiple times (hence the EXPIRATION_TIME settings), so when Analyzer pushes
to Mirage again, each aggregation is more likely to trigger anomalous, IF it
is anomalous at the user defined full duration. A little flattened maybe, a
little lag maybe, but less noise, more signal.
Analyzer's :mod:`settings.FULL_DURATION` somewhat limits Analyzer's usefulness
for metrics that have a seasonality / periodicity that is greater than
:mod:`settings.FULL_DURATION`. This means Analyzer is not great in terms of
"seeing the bigger picture" when it comes to metrics that have a weekly pattern
as well as a daily pattern, for example.

Increasing :mod:`settings.FULL_DURATION` to anything above 24 hours (86400) is
not necessarily realistic or useful, because the greater the
:mod:`settings.FULL_DURATION`, the greater memory required for Redis and the
longer Analyzer will take to run.

What Mirage can and cannot do
=============================
Expand Down Expand Up @@ -85,6 +64,9 @@ consumption in an office building is a good example of a multi-seasonal data set

For now let us just consider the daily and weekly seasonality.

The difference between the Analyzer and Mirage views of a timeseries
--------------------------------------------------------------------

.. plot::

# A bit of a contrived example...
Expand Down Expand Up @@ -224,25 +206,55 @@ For now let us just consider the daily and weekly seasonality.

As we can see above, on a Saturday morning the energy consumption does not
increase as it normally does during the week days. Analyzer would probably find
the metric to be anomalous if :mod:`settings.ANALYZER_CRUCIBLE_ENABLED` was set
to 86400 (24 hours), Saturday morning would seem anomalous.
the metric to be anomalous if :mod:`settings.FULL_DURATION` was set to 86400 (24
hours), Saturday morning would seem anomalous.

However, if the metric's alert tuple was set up with a
``SECOND_ORDER_RESOLUTION_HOURS`` of 168, Mirage would analyze the data point
against a week's worth of data points and the Saturday and Sunday daytime data
points would have less probability of triggering as anomalous. *The above
image is plotted as if the Mirage ``SECOND_ORDER_RESOLUTION_HOURS`` was set to
image is plotted as if the Mirage* ``SECOND_ORDER_RESOLUTION_HOURS`` *was set to
172 hours just so that the trailing edges can be seen.*

A real world example with tenfold.com
-------------------------------------

:blak3r2: Our app logs phone calls for businesses and I want to be able to
detect when VIP phone systems go down or act funny and begin flooding us with
events. Our work load is very noisy from 9-5pm... where 9-5 is different for
each customer depending on their workload so thresholding and modeling isn't
good.

:earthgecko: Yes, Mirage is great at user defined seasonality, in your case
weekday 9-5 peaks, evening drop offs, early morning and weekend lows - multi
seasonal, Mirage is the ticket.
Your best bet would be to try 7days (168) as your SECOND_ORDER_RESOLUTION_HOURS
value for those app log metrics, however, you may get away with a 3 day
window, it depends on the metrics really, but it may not be noisy at 3 days
resolution, even at the weekends.

Mirage "normalizes"
-------------------

Mirage is a "tuning" tool for seasonal metrics and it is important to understand
that Mirage is probably using aggregated data (unless your Graphite is not using
retentions and aggregating) and due to this Mirage will lose some resolution
resulting in it being less sensitive to anomalies than Analyzer is.

So Mirage does some "normalizing" if you have aggregations in Graphite (e.g.
retentions), however it is analyzing the timeseries at the aggregated resolution
so it is "normalised" as the data point that Analyzer triggered on is ALSO
aggregated in the timeseries resolution that Mirage is analyzing.
Intuitively one may think Mirage may then miss the anomaly in the aggregation.
This is true to an extent, but Analyzer will likely trigger multiple times if
the metric **IS** anomalous, so when Analyzer pushes to Mirage again, each
aggregation is more likely to trigger as anomalous, **IF** the metric is
anomalous at the user defined full duration. A little flattened maybe, a little
lag maybe, but less noise, more signal.
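
The flattening described above can be illustrated with a minimal sketch. It
assumes Graphite rolls data up by averaging and uses a hypothetical rollup
factor of 10; the ``aggregate`` helper is illustrative only and is not part of
Skyline or Graphite:

```python
from statistics import mean


def aggregate(values, factor):
    """Average consecutive groups of `factor` values (assumed averaging rollup)."""
    return [mean(values[i:i + factor]) for i in range(0, len(values), factor)]


# Sixty one-minute data points with a single spike (contrived data).
raw = [10.0] * 60
raw[45] = 110.0

# Hypothetical 10-minute rollup of the same data.
rolled = aggregate(raw, 10)

raw_peak = max(raw)        # the spike at full resolution
rolled_peak = max(rolled)  # the same spike after averaging
```

The spike is still the largest value in the rolled-up series, but it is much
smaller relative to the baseline, which is why Mirage is somewhat less sensitive
than Analyzer on aggregated data.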

Setting up and enabling Mirage
==============================


By default Mirage is disabled, various Mirage options can be configured in the
``settings.py`` file and Analyzer and Mirage can be configured as appropriate
for your environment.
Expand All @@ -252,20 +264,76 @@ absolute path):

.. code-block:: bash

sudo mkdir -p $MIRAGE_CHECK_PATH
sudo mkdir -p $MIRAGE_DATA_FOLDER
mkdir -p $MIRAGE_CHECK_PATH
mkdir -p $MIRAGE_DATA_FOLDER


Configure ``settings.py`` with some :mod:`settings.ALERTS` alert tuples that
have the ``SECOND_ORDER_RESOLUTION_HOURS`` defined. For example below is an
Analyzer only :mod:`settings.ALERTS` tuple that does not have Mirage enabled as
it has no ``SECOND_ORDER_RESOLUTION_HOURS`` defined:

.. code-block:: python

ALERTS = (
("stats_counts.http.rpm.publishers.*", "smtp", 300), # --> Analyzer sends to alerter
)

To enable Analyzer to send the metric to Mirage we append the metric alert tuple
in :mod:`settings.ALERTS` with the ``SECOND_ORDER_RESOLUTION_HOURS`` value.
Below we have used 168 hours to get Mirage to analyze **any** anomalous metric
in the "stats_counts.http.rpm.publishers.*" namespace using 7 days worth
of timeseries data from Graphite:

.. code-block:: python

ALERTS = (
# ("stats_counts.http.rpm.publishers.*", "smtp", 300), # --> Analyzer sends to alerter
("stats_counts.http.rpm.publishers.*", "smtp", 300, 168), # --> Analyzer sends to Mirage
)
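
As a rough sketch of what the fourth tuple element implies, the hours value can
be converted into a time range to request from Graphite. The ``mirage_window``
helper below is hypothetical and only illustrates the arithmetic; it is not
Skyline's implementation:

```python
import time


def mirage_window(alert, now=None):
    """Return (from_ts, until_ts) for a Mirage enabled tuple, else None.

    Assumes the optional fourth element is SECOND_ORDER_RESOLUTION_HOURS.
    """
    if len(alert) < 4:
        # No SECOND_ORDER_RESOLUTION_HOURS defined: Analyzer only.
        return None
    hours = alert[3]
    until_ts = int(now if now is not None else time.time())
    return until_ts - hours * 3600, until_ts


analyzer_only = ("stats_counts.http.rpm.publishers.*", "smtp", 300)
mirage_enabled = ("stats_counts.http.rpm.publishers.*", "smtp", 300, 168)
```

With 168 hours the range spans 604800 seconds, seven times the 86400 seconds
discussed for :mod:`settings.FULL_DURATION`.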

Order Matters
-------------

.. warning:: It is important to note that Mirage enabled metric namespaces must
be defined before non Mirage enabled metric namespace tuples as Analyzer uses
the first alert tuple that matches.

So for example, with some annotation:

.. code-block:: python

ALERTS = (
("skyline", "smtp", 1800),
("stats_counts.http.rpm.publishers.seasonal_pub1", "smtp", 300, 168), # --> To Mirage
("stats_counts.http.rpm.publishers.seasonal_pub_freddy", "smtp", 300, 168), # --> To Mirage
("stats_counts.http.rpm.publishers.*", "smtp", 300), # --> To alerter
)

The above ensures that if Analyzer finds seasonal_pub1 or seasonal_pub_freddy
anomalous, it sends the metric to Mirage because those tuples have 168 defined,
instead of firing an alert as it does for all other
``stats_counts.http.rpm.publishers.*`` metrics.
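
The first match behaviour can be sketched as follows. This is a simplified
illustration, not Analyzer's code: the ``route`` function is hypothetical and it
assumes each namespace is treated as a regular expression matched from the start
of the metric name:

```python
import re


def route(metric, alerts):
    """Return 'mirage', 'alerter' or None based on the FIRST matching tuple."""
    for alert in alerts:
        # Assumption: the namespace is matched as a regex from the start.
        if re.match(alert[0], metric):
            return "mirage" if len(alert) > 3 else "alerter"
    return None


GOOD_ORDER = (
    ("skyline", "smtp", 1800),
    ("stats_counts.http.rpm.publishers.seasonal_pub1", "smtp", 300, 168),
    ("stats_counts.http.rpm.publishers.seasonal_pub_freddy", "smtp", 300, 168),
    ("stats_counts.http.rpm.publishers.*", "smtp", 300),
)

# Same tuples, but the wildcard namespace is listed before the specific ones.
BAD_ORDER = (
    ("skyline", "smtp", 1800),
    ("stats_counts.http.rpm.publishers.*", "smtp", 300),
    ("stats_counts.http.rpm.publishers.seasonal_pub1", "smtp", 300, 168),
    ("stats_counts.http.rpm.publishers.seasonal_pub_freddy", "smtp", 300, 168),
)

seasonal = "stats_counts.http.rpm.publishers.seasonal_pub1"
good_result = route(seasonal, GOOD_ORDER)
bad_result = route(seasonal, BAD_ORDER)
```

Under these assumptions the seasonal metric is routed to Mirage only when its
specific tuple is listed before the wildcard namespace.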

Configure ``settings.py`` with some alert tuples that have the
``SECOND_ORDER_RESOLUTION_HOURS`` defined, e.g.:
The below would NOT have the desired effect of analysing the metrics
seasonal_pub1 and seasonal_pub_freddy with Mirage:

.. code-block:: python

ALERTS = (
("skyline", "smtp", 1800),
("stats_counts.http.rpm.publishers.*", "smtp", 300, 168),
("stats_counts.http.rpm.publishers.*", "smtp", 300), # --> To alerter
("stats_counts.http.rpm.publishers.seasonal_pub1", "smtp", 300, 168), # --> NEVER gets reached
("stats_counts.http.rpm.publishers.seasonal_pub_freddy", "smtp", 300, 168), # --> NEVER gets reached
)

Hopefully it is clear that the first ``stats_counts.http.rpm.publishers.*``
alert tuple would route ALL to alerter and seasonal_pub1 and seasonal_pub_freddy
would never get sent to be analyzed by Mirage.

Enabling
--------

And ensure that ``settings.py`` has Mirage options enabled, specifically the
basic ones:

Expand All @@ -275,19 +343,16 @@ basic ones:
ENABLE_FULL_DURATION_ALERTS = False
MIRAGE_ENABLE_ALERTS = True

Start Mirage:
Start Mirage and restart Analyzer:

.. code-block:: bash

cd skyline/bin
sudo ./mirage.d start

./mirage.d start
./analyzer.d restart

Mirage allows for testing of real time data and algorithms in parallel to
Analyzer allowing for comparisons of different timeseries and/or algorithms.
Mirage was inspired by Crucible and the desire to extend the temporal data pools
available to Analyzer in an attempt to handle seasonality better, reduce noise
and increase signal, specifically on seasonal metrics.
Rate limited
------------

Mirage is rate limited to analyze 30 metrics per minute; this is by design and
desired. Surfacing data from Graphite and analyzing ~1000 data points in a
Expand All @@ -300,6 +365,10 @@ signals would still be sent.
What Mirage does
================

- If Analyzer finds a metric to be anomalous at :mod:`settings.FULL_DURATION`
and the metric alert tuple has ``SECOND_ORDER_RESOLUTION_HOURS`` and
:mod:`settings.ENABLE_MIRAGE` is ``True``, Analyzer will push the metric
variables to the Mirage check file.
- Mirage watches for added check files.
- When a check is found, Mirage determines what the configured
``SECOND_ORDER_RESOLUTION_HOURS`` is for the metric from the tuple in
Expand Down
2 changes: 1 addition & 1 deletion docs/_build/html/_sources/overview.txt
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,7 @@ Skyline uses to following technologies and libraries at its core:
4. **scipy** - `SciPy`_ Library - Fundamental library for scientific computing
5. **pandas** - `pandas`_ - Python Data Analysis Library
6. **mysql/mariadb** - a database - `MySQL`_ or `MariaDB`_
7. **:red:`re`:brow:`brow`** - Skyline uses a modified port of Marian
7. :red:`re`:brow:`brow` - Skyline uses a modified port of Marian
Steinbach's excellent `rebrow`_

.. _Etsy: https://www.etsy.com/
Expand Down
2 changes: 1 addition & 1 deletion docs/_build/html/_sources/webapp.txt
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ A basic overview of the Webapp
==============================

.. figure:: images/crucible/webapp/skyline.webapp.basic.overview.png
:alt: A simplified workflow of Skyline
:alt: A basic overview of the Webapp


Deploying the Webapp
Expand Down
5 changes: 0 additions & 5 deletions docs/_build/html/installation.html
Original file line number Diff line number Diff line change
Expand Up @@ -254,11 +254,6 @@ <h2>Steps<a class="headerlink" href="#steps" title="Permalink to this headline">
git clone https://github.com/earthgecko/skyline.git
</pre></div>
</div>
<div class="highlight-bash"><div class="highlight"><pre><span></span>mkdir -p /opt/skyline/github
<span class="nb">cd</span> /opt/skyline/github
git clone https://github.com/earthgecko/skyline.git
</pre></div>
</div>
<ul class="simple">
<li>Once again using the Python-2.7.12 virtualenv, install the requirements using
the virtualenv pip, this can take a long time, the pandas install takes quite
Expand Down
Binary file modified docs/_build/html/mirage-1.pdf
Binary file not shown.