syscheck

syscheck is a simple dashboard, powered by Tornado, for monitoring the go/no-go state of arbitrary systems. A series of “checkers” can be defined to periodically check the status of distributed systems and display the results in a browser. It was originally developed to monitor laser frequencies, optical cavities, and other components of quantum physics experiments, but it can be used to get a quick visual overview of the up/down status of any assortment of systems.

syscheck comes with a number of built-in checkers to handle some common use cases. These include checkers for:

  • HTTP: see that an HTTP(S) endpoint is accessible
  • JSON: check that a given HTTP request results in specific JSON values
  • Redis: like JSON checkers, but with Redis!
  • Supervisor: see if a process started by Supervisor is up

syscheck is licensed under the BSD license. The source code can be found on Bitbucket.

Getting started

Install using pip:

$ pip install syscheck

To start checking systems with syscheck, create a Python file and instantiate a SystemsMonitor:

from syscheck.monitor import SystemsMonitor

monitor = SystemsMonitor()

Now say you want to monitor that the web server on localhost is up. This requires creating a new Checker and registering it with the SystemsMonitor:

from syscheck.checker import HttpChecker

monitor.register(
    HttpChecker('http://localhost'),
    description='localhost Apache')

Categories are also supported so that you can group systems based on type or function:

from syscheck.checker import PortChecker, SupervisorChecker

monitor.register_category(
    'Servers',
    [
        (PortChecker('localhost', 1234), 'My cool service'),
        (SupervisorChecker('localhost:9001/RPC2', 'thing'), 'Another thing')
    ]
)

Now it’s time to start checking:

from tornado.ioloop import IOLoop
from syscheck.web import CheckerApp

app = CheckerApp(monitor)
app.listen(8080)
IOLoop.instance().start()

Point your browser to http://localhost:8080 and see it in action!

Creating custom checkers

From time to time, it may be useful to create custom checkers. For example, say you have a wavelength meter and want to ensure that a given laser has a frequency within certain limits. If you have an HTTP service running which can be queried to return this status, you could define a LaserChecker:

import json
import tornado.gen
import tornado.httpclient
from syscheck.checker import Checker

class LaserChecker(Checker):
    def __init__(self, channel, f_min, f_max,
                 hostname='http://waveserver:1729'):
        self.channel = channel
        self.f_min = f_min
        self.f_max = f_max
        self.hostname = hostname

    @tornado.gen.coroutine
    def check(self):
        try:
            # Query the wavelength meter service without blocking the IOLoop.
            response = yield tornado.httpclient.AsyncHTTPClient().fetch(
                self.hostname + '/frequencies/', request_timeout=1.5)
            frequencies = json.loads(response.body)
            freq = float(frequencies['frequencies'][self.channel])
            # Pass only if the measured frequency lies within the limits.
            result = self.f_min <= freq <= self.f_max
        except Exception:
            # Treat timeouts, bad JSON, or a missing channel as a failure.
            result = False
        raise tornado.gen.Return(result)

Web frontend

The included web frontend provides a simple dashboard for checking the status of all your systems at a glance. Running the CheckerApp instance automatically injects the following command-line options:

  --debug                          Enable debug mode (default False)
  --period                         Approximate time between checks in seconds
                                   (default 5)
  --port                           HTTP port to serve on (default 8999)
  --title                          Page title to use (default Systems Check)
  --url-prefix                     URL subdir

Additional options exist to configure logging. See the tornado.options documentation for details.

Endpoints

The index route (http://localhost:<port>/) serves the default dashboard template and static files. Updates are sent via server-sent events at the /stream route. If desired, these can be monitored with, for example, curl. The server can also be polled by external services at /status to return all statuses from the most recent check in JSON format.
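As a rough illustration of consuming the /stream route from a script rather than a browser, the sketch below parses the generic server-sent events wire format (`data:` lines terminated by a blank line). The helper name `iter_sse_data` is hypothetical and not part of syscheck, and the payload format that syscheck actually sends is not assumed here; payloads are returned as plain strings.

```python
def iter_sse_data(lines):
    """Yield the data payload of each server-sent event.

    `lines` is any iterable of text lines, for example a decoded HTTP
    response body read line by line. This is a generic SSE parser
    sketch, not code taken from syscheck.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip('\r\n')
        if not line:
            # A blank line terminates the current event.
            if buffer:
                yield '\n'.join(buffer)
                buffer = []
        elif line.startswith('data:'):
            value = line[len('data:'):]
            # A single leading space after the colon is not part of the data.
            if value.startswith(' '):
                value = value[1:]
            buffer.append(value)
        # Comment lines (starting with ':') and other fields are ignored.
```

The lines could come from any HTTP client that exposes the response incrementally; an event is only yielded once its terminating blank line has arrived.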

Warnings

New in version 0.3.0.

Sometimes, a system that you want to monitor is not critical. In this case, upon registration, a checker can be set to indicate a warning in the event of a failed check rather than an outright failure. This manifests as a different color in the web interface.
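The relationship between a check result and the warning flag can be sketched as a small three-state mapping. This is only an illustration of the behavior described above; the function name `display_state` and the state labels are hypothetical and do not come from syscheck.

```python
def display_state(passed, warn=False):
    """Map a check result and its warn flag to a dashboard state.

    The labels 'ok', 'warning', and 'failure' are illustrative only.
    """
    if passed:
        # The warn flag has no effect while the check passes.
        return 'ok'
    return 'warning' if warn else 'failure'
```

In other words, the flag only changes how a failed check is presented; a passing check looks the same either way.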

API documentation

Systems monitor

class syscheck.monitor.SystemsMonitor(loop=None)[source]

Monitors multiple systems for yes/no status.

check(*args, **kwargs)[source]

Check the status of all systems.

Parameters:
  • timeout (float) – time before assuming a system is down, in seconds

get_categories()[source]

Return a list of unique checker categories.

jsonize()[source]

Return a JSON-serializable version of the checkers dict, i.e., remove the Checker object from each item.

register(checker, description='', category='Systems', warn=False)[source]

Register a new system checker.

Parameters:
  • checker (syscheck.Checker) –
  • description (str) –
  • category (str) –
  • warn (bool) – flag as a warning when the check returns False

register_category(category, checkers)[source]

Register a group of checkers all belonging to one category.

Parameters:
  • category (str) – name of the category
  • checkers (list) – list of tuples containing Checkers and, optionally, descriptions and warning statuses

Example:

monitor.register_category(
    'Category',
    [
        (DummyChecker(), 'First dummy checker'),
        (DummyChecker(), 'Second dummy checker', True)
    ]
)

Checkers

class syscheck.checker.Checker[source]

Base class for all checkers. Subclass this class to define your own checkers.

check()[source]

Override this method to check the status of an arbitrary system. This method should return a boolean, or at least something that behaves sensibly as one.

It is highly recommended, though not required, to implement check() as a Tornado coroutine in order to minimize blocking.

class syscheck.checker.DummyChecker(return_value=None)[source]

A fake checker used for testing. This will either randomly return True or False, or always return one or the other if the keyword argument return_value is set.

syscheck.checker.HTTPChecker

alias of HttpChecker

class syscheck.checker.HttpChecker(url, timeout=1.5, code=200)[source]

Check if an HTTP response (for a GET request) returns an expected HTTP status code.

Parameters:
  • url (str) – URL to check
  • timeout (float) – timeout in seconds
  • code (int) – expected HTTP status code (default: 200)

syscheck.checker.JSONChecker

alias of JsonChecker

class syscheck.checker.JsonChecker(url, dictionary)[source]

Check that one or more keys have particular values in an HTTP JSON response.

If multiple keys/values are specified, the result of the check will only be True if all key/value pairs match the input dictionary.

Parameters:
  • url (str) – URL to get JSON response from
  • dictionary (dict) – compare keys/values with the JSON response
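The all-pairs-must-match rule described above can be sketched with the standard library alone. This is not the JsonChecker implementation; `json_matches` is a hypothetical helper that assumes the response body is a top-level JSON object.

```python
import json


def json_matches(body, expected):
    """Return True only if every key/value pair in `expected` appears
    with an equal value in the JSON object encoded in `body`."""
    try:
        data = json.loads(body)
    except ValueError:
        # An unparseable body counts as a failed check.
        return False
    if not isinstance(data, dict):
        return False
    return all(key in data and data[key] == value
               for key, value in expected.items())
```

Keys present in the response but absent from the expected dictionary are ignored, so the response may carry extra fields.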
class syscheck.checker.PortChecker(host, port)[source]

Check if a TCP port is open.

Note that this is somewhat of a hack and will only work properly if firewall rules allow it.
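For intuition, one common way to test whether a TCP port is open is to attempt a connection with a short timeout. The sketch below shows that approach using the standard library; it is an assumption about the general technique, not the PortChecker source, and `port_is_open` is a hypothetical name.

```python
import socket


def port_is_open(host, port, timeout=1.5):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection raises OSError on refusal, timeout, or
        # name-resolution failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A firewall that silently drops packets makes the attempt time out instead of being refused, which is one reason a check like this depends on the firewall rules in place.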

class syscheck.checker.RedisChecker(host, dictionary, **redis_kwargs)[source]

Check that one or more keys have certain values in a Redis database.

Parameters:
  • host (str) – Redis hostname
  • dictionary (dict) – dictionary to match Redis keys/values
  • **redis_kwargs

    kwargs to pass to the Redis constructor

class syscheck.checker.SerializedRedisChecker(host, key, packing, dictionary, **redis_kwargs)[source]

Check multiple key/value pairs that are serialized into a single Redis key/value pair with either JSON or msgpack.

class syscheck.checker.SupervisorChecker(host, name)[source]

Check that a process managed by a supervisord instance is running.

See the supervisor documentation on how to configure supervisord to accept XMLRPC requests over HTTP.

Parameters:
  • host (str) – hostname of the XMLRPC server
  • name (str) – the process name to check
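To show what such a query looks like on the wire, the sketch below asks supervisord for a process's state through its XML-RPC interface using the standard library. It is an illustration rather than the SupervisorChecker source: `process_is_running` and the `proxy_factory` parameter are hypothetical, and the URL in the docstring assumes supervisord's HTTP server listens on port 9001.

```python
from xmlrpc.client import Fault, ProtocolError, ServerProxy


def process_is_running(url, name, proxy_factory=ServerProxy):
    """Return True if supervisord reports the process `name` as RUNNING.

    `url` is the full XML-RPC endpoint, e.g. 'http://localhost:9001/RPC2'.
    `proxy_factory` exists only so the function can be exercised
    without a live supervisord instance.
    """
    proxy = proxy_factory(url)
    try:
        info = proxy.supervisor.getProcessInfo(name)
    except (Fault, ProtocolError, OSError):
        # Unknown process name, HTTP error, or unreachable server.
        return False
    return info.get('statename') == 'RUNNING'
```

Any state other than RUNNING (for example STOPPED or FATAL) is treated as down in this sketch.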

Web