util/timer.lua @ 11523:5f15ab7c6ae5
Statistics: Rewrite statistics backends to use OpenMetrics
The metric subsystem of Prosody has had some shortcomings from
the perspective of the current state-of-the-art in metric
observability.
The OpenMetrics standard [0] is a formalization of the data
model (and serialization format) of the well-known and
widely-used Prometheus [1] software stack.
The previous stats subsystem of Prosody did not map well to that
format (see e.g. [2] and [3]); the key reason is that it was
trying to do too much math on its own ([2]) while lacking
first-class support for "families" of metrics ([3]) and
structured metric metadata (despite the `extra` argument to
metrics, there was no standard way of representing common things
like "tags" or "labels").
Even though OpenMetrics has grown from the Prometheus world of
monitoring, it maps well to other popular monitoring stacks
such as:
- InfluxDB (labels can be mapped to tags and fields as necessary)
- Carbon/Graphite (labels can be attached to the metric name with
dot-separation)
- StatsD (see Graphite, assuming Graphite is used as the
backend, which is the default)
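As an illustration of the Carbon/Graphite mapping, here is a minimal sketch that is not taken from Prosody. The helper name `graphite_name`, the metric name, and the choice to append only label values (in sorted key order) are all assumptions made for this example.

```lua
-- Hypothetical helper, not part of Prosody: flatten a metric name plus its
-- labels into a dot-separated Graphite path.
local function graphite_name(name, labels)
	local keys = {};
	for key in pairs(labels) do
		keys[#keys + 1] = key;
	end
	table.sort(keys); -- sort label keys so the resulting path is stable

	local parts = { name };
	for _, key in ipairs(keys) do
		-- Dots separate path components in Graphite, so replace them in values
		local value = tostring(labels[key]):gsub("%.", "_");
		parts[#parts + 1] = value;
	end
	return table.concat(parts, ".");
end

-- Assumed example metric and labels
print(graphite_name("c2s_connections", { host = "example.com", type = "c2s" }));
```

Sorting the keys is a design choice for this sketch: `pairs` has no defined iteration order, so without it the same metric could be emitted under different paths.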
The util.statsd module has been ported to use the OpenMetrics
model as a proof of concept. An implementation which exposes
the util.statistics backend data as Prometheus metrics is
ready for publishing in prosody-modules (most likely as
mod_openmetrics_prometheus to avoid breaking existing 0.11
deployments).
At the same time, the previous measure()-based API had one major
advantage: It is really simple and easy to use without requiring
lots of knowledge about OpenMetrics or similar concepts. For that
reason as well as compatibility with existing code, it is preserved
and may even be extended in the future.
However, code relying on the `stats-updated` event or on
`get_stats` from `statsmanager` will break because the data
model has changed completely. In the case of `stats-updated`, the
code will simply not run (the event was renamed in order
to avoid conflicts); the `get_stats` function has been removed
completely (so any attempt to call it will cause a
traceback).
Note that the measure_*_event methods have been removed from
the module API. I was unable to find any uses or documentation
and thus deemed they should not be ported. Re-implementation is
possible when necessary.
[0]: https://openmetrics.io/
[1]: https://prometheus.io/
[2]: #959
[3]: #960
author | Jonas Schäfer <jonas@wielicki.name> |
---|---|
date | Sun, 18 Apr 2021 11:47:41 +0200 |
parent | 11264:2cdcf55c6dd5 |
child | 12975:d10957394a3c |
-- Prosody IM
-- Copyright (C) 2008-2010 Matthew Wild
-- Copyright (C) 2008-2010 Waqas Hussain
--
-- This project is MIT/X11 licensed. Please see the
-- COPYING file in the source package for more information.
--

local indexedbheap = require "util.indexedbheap";
local log = require "util.logger".init("timer");
local server = require "net.server";
local get_time = require "util.time".now
local type = type;
local debug_traceback = debug.traceback;
local tostring = tostring;
local xpcall = require "util.xpcall".xpcall;
local math_max = math.max;
local pairs = pairs;

if server.timer then
	-- The selected net.server implements this API, so defer to that
	return server.timer;
end

local _ENV = nil;
-- luacheck: std none

local _add_task = server.add_task;

local _server_timer;
local _active_timers = 0;
local h = indexedbheap.create();
local params = {};
local next_time = nil;

local function _traceback_handler(err)
	log("error", "Traceback[timer]: %s", debug_traceback(tostring(err), 2));
end

local function _on_timer(now)
	local peek;
	local readd;
	while true do
		peek = h:peek();
		if peek == nil or peek > now then break; end
		local _, callback, id = h:pop();
		local param = params[id];
		params[id] = nil;
		--item(now, id, _param);
		local success, err = xpcall(callback, _traceback_handler, now, id, param);
		if success and type(err) == "number" then
			if readd then
				readd[id] = { callback, err + now };
			else
				readd = { [id] = { callback, err + now } };
			end
			params[id] = param;
		end
	end

	if readd then
		for id,timer in pairs(readd) do
			h:insert(timer[1], timer[2], id);
		end
		peek = h:peek();
	end

	if peek ~= nil and _active_timers > 1 and peek == next_time then
		-- Another instance of _on_timer already set next_time to the same value,
		-- so it should be safe to not renew this timer event
		peek = nil;
	else
		next_time = peek;
	end

	if peek then
		-- peek is the time of the next event
		return peek - now;
	end
	_active_timers = _active_timers - 1;
end

local function add_task(delay, callback, param)
	local current_time = get_time();
	local event_time = current_time + delay;

	local id = h:insert(callback, event_time);
	params[id] = param;
	if next_time == nil or event_time < next_time then
		next_time = event_time;
		if _server_timer then
			_server_timer:close();
			_server_timer = nil;
		else
			_active_timers = _active_timers + 1;
		end
		_server_timer = _add_task(next_time - current_time, _on_timer);
	end
	return id;
end

local function stop(id)
	params[id] = nil;
	local result, item, result_sync = h:remove(id);
	local peek = h:peek();
	if peek ~= next_time and _server_timer then
		next_time = peek;
		_server_timer:close();
		if next_time ~= nil then
			_server_timer = _add_task(math_max(next_time - get_time(), 0), _on_timer);
		end
	end
	return result, item, result_sync;
end

local function reschedule(id, delay)
	local current_time = get_time();
	local event_time = current_time + delay;
	h:reprioritize(id, delay);
	if next_time == nil or event_time < next_time then
		next_time = event_time;
		_add_task(next_time - current_time, _on_timer);
	end
	return id;
end

return {
	add_task = add_task;
	stop = stop;
	reschedule = reschedule;
};
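
The callback contract used by `_on_timer` (a callback receives `now, id, param`, and a numeric return value re-arms it that many seconds later) can be tried outside Prosody with a simplified stand-in. The sketch below is not the module above: it swaps `util.indexedbheap` for a sorted array, swaps `net.server` for a manually advanced clock, and passes `now` explicitly to `add_task`. The names `queue`, `run_due` and `by_time` are assumptions for this example.

```lua
-- Simplified stand-in for util.timer, for illustration only.
local queue = {};   -- pending tasks, kept sorted by due time
local next_id = 0;

local function by_time(a, b) return a.time < b.time; end

local function add_task(now, delay, callback, param)
	next_id = next_id + 1;
	queue[#queue + 1] = { time = now + delay, callback = callback, id = next_id, param = param };
	table.sort(queue, by_time);
	return next_id;
end

local function stop(id)
	for i, task in ipairs(queue) do
		if task.id == id then
			table.remove(queue, i);
			return true;
		end
	end
	return false;
end

-- Run every task due at `now`. As in _on_timer, tasks that ask to be
-- repeated are re-added only after the loop, so a callback returning 0
-- cannot spin forever within one call.
local function run_due(now)
	local readd = {};
	while queue[1] and queue[1].time <= now do
		local task = table.remove(queue, 1);
		local ok, ret = pcall(task.callback, now, task.id, task.param);
		if ok and type(ret) == "number" then
			task.time = now + ret;
			readd[#readd + 1] = task;
		end
	end
	for _, task in ipairs(readd) do
		queue[#queue + 1] = task;
	end
	table.sort(queue, by_time);
end

-- Usage: one repeating task and one one-shot task on a simulated clock
local fired = {};
local ticker = add_task(0, 1, function (now, id, param)
	fired[#fired + 1] = param .. "@" .. now;
	return 2; -- run again two seconds from now
end, "tick");
add_task(0, 2, function (now) fired[#fired + 1] = "once@" .. now; end);

for now = 1, 5 do run_due(now); end
stop(ticker); -- cancel the repeating task
print(table.concat(fired, ", "));
```

A sorted array keeps the sketch short; the real module uses an indexed binary heap so that insertion, removal by id and reprioritisation stay cheap with many timers.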