Broadly speaking, there is a few different ways of expressing application state in such a way that it's useful debugging: Logs (implemented in M2 with the excellent psr/log interface) Time series data (also known as metrics). Things like Data Dog, Promethiums, Sensu and so on. Traces; primarily expressed by opentracing.io, but also Zipkin, Lightstep This feature request is to track the implementation of #2; the expression of time series data in the application, as well as the export of this data via whatever mechanism might be setup to receive it. Time series data exposed by an application can provide extremely rapid feedback on an applications performance. This information can be used for a number of useful processes such as: Allowing infrastructure to make decisions around whether to modify the applications runtime environment Debugging the application performance during runtime Alerting when an application violates it's SLO Note: I am doing some work on this with PHP more generally tracked on github. Best place to contrib beyond core support is there. There is currently no simple way to express such time series data in Magento. There are several efforts that have been done to implement this in a third party, vendor specific way including: https://github.com/sitewards/magento-prometheus https://github.com/magento-hackathon/Hackathon_Datadog https://newrelic.com/partner/magento https://github.com/janpapenbrock/magento-statsd However, these suffer form a few drawbacks. In particular, There is no standard API (such as the psr/log api) that can be invoked in third party extensions reliably It locks the users into a hosting infrastructure; undesirable for the continuity of the project. Additionally, there are various metrics that would prove extremely useful when debugging Magento: Cache hit rates (https://github.com/magento/magento2/issues/11151) Cron metrics, such as last execution per job, queue length etc. Indexer status (expressed as 1 or 0) Cache enabled status (expressed as 1 or 0) Cache invalidations issued Orders in various states (processing, shipped etc) Customer logins (aggregate, not per customers) HTTP status codes However, the utilities aren't limited to the above. In particular, third party code often has reason to express it's health (such as the code developed by an agency) to ensure custom logic is functioning correctly, and that a feature is delivering sufficient value to jusify it's cost. General Background It is suggested the implementer have some experience implementing and using application instrumentation. There is a body of knowledge in this area that is immense of it's own right, and not a task that is so simple. Beyond that, the goal of the work should be to checkpoint the data, but defer all processing to the implementing time series databases. It should also have minimal impact on performance. Lastly, high cardinarily data can be expressed through logs or traces, and is perhaps not something that is easily expressed by time series metrics (though honeycomb et. al would beg to differ) Primitives The open source monitoring package Prometheus expresses the primitives: counter gauge histograma summary The author has implemented this in several places, and has found uses for counter and gauge in a repeated way, as well as a limited use for histogram. To begin, the implementation of "guage" and "counter" would be amply sufficient to justify the majority of cases. This is supported by both Prometheus and DataDog, and it's presumed the majority of time series databases. Non Goals It should not be a goal of this work to implement the metrics view in Magento. This can be approached as a separate task, but the primary intent of this work is to express time series data in such a way it can be consumed by third party services. It should not be a goal of this work to track the time associated with the time series data. The sampling is left to the third party application implementations. It should not be a goal of this work to instrument third party applications (redis). They have their own implementations, and should be considered separate services. It's unclear whether instrumenting the PHP runtime itself would be beneficial, though this can likely be handled in the bespoke cases it's required by third party utilisation of the above library. Interface Ideally there should be an interface similar to the psr/log exposed that allows the checkpointing of metrics. Something like: <?php
class Foo
{
private $oMetricRegistry;
/**
* Dependencies automagically wired by DI
*/
public function __construct(
// Singleton injected that stores the metrics. Invariably a global, persists towards the end of the request
\Magento\Framework\Metrics $oMetricRegistry
) {
$this->oMetricRegistry;
// Ensure the metric exists. This operation is an idempotent "create if not exists" type operation.
$this->oMetricRegistry->register(
// metric identifier
'vendor_extension_foo_thing',
// metric type. Can be one of "count" or "gauge"; potentially "histogram".
'count',
// label keys. Labels are mechanisms to slice time series data, supported by both data dog and prometheus
[
'attribute'
]
);
}
public function doThing()
{
// An operation changing the state of the metric. In this case, an increment -- it's a count.
$this->oMetricRegistry->increment(
// The metric identifier
'vendor_extension_foo_thing',
// The label values.
['value']
);
}
} This would allow an extremely simple API to quickly expose the data for ingestion into time series. Storage Ideally, the storage engine should be pluggable, but implement: APCu In Memory (or "null" when considering parity with psr/log) Redis MySQL This would cater the wide range of hosting constraints that Magento can operate with, with a predisposition for not changing exising behaviour (checkpoint to memory but flush at the end of each request) Exposition There are various ways that the data may be exposed. It's suggested the core team deliberately not support any mechanim (except perhaps a native admin panel one, expressed as separate work). Instead, rely on third party / community vendors to provide the required libraries, similar to psr/log. Broader PHP community involvement It is the authors hope that this conversation is extended to the broader PHP community, perhasp through the involvement in FIG. A single interface for time series data would have ramifications for the langauge more generally. Future Work Once implemented, Magento's introspectability could be extended further by implementing traces. Related Reading: http://www.brendangregg.com/usemethod.html https://prometheus.io/docs/introduction/overview/ https://github.com/DataDog/php-datadogstatsd https://github.com/Jimdo/prometheus_client_php
... View more