We help you understand the performance and availability of your services by providing blackbox telemetry for your Prometheus servers to scrape. @discordianfish
We released a release candidate of @PrometheusIO 2.2.0 today with loads of bug fixes and enhancements. Please test it and give us feedback! https://t.co/Nn2YxtMpiA
Service fully recovered. The outage was caused by a Kubernetes upgrade, with the root cause being a networking change made in the past weeks. It's still unclear what exactly caused the outage, but we reverted the networking changes.
I've spent the last weeks/months building a CloudFormation installer for reasonably secure multi-master Kubernetes clusters and today blogged about it: https://t.co/UmZoyFobqH
We <3 all monitoring systems and updated https://t.co/bg7SsFfBhi to reflect that. Thanks to Prometheus' open API, you can use Latency.at with almost any monitoring system.
Google Search will be using page speed as a ranking factor for mobile searches: https://t.co/g20noKcdJa. Now's the time to audit your site with Lighthouse and PageSpeed Insights, and to check your stats in the Chrome User Experience Report! #perfmatters
Checking out @LatencyAt, not tried it yet but very interesting idea, providing global infra for blackbox probing, but rather than building some proprietary dashboard thingy you just scrape it with your own @PrometheusIO setup ...
The open @PrometheusIO metrics exposition format allows you to use our service with many other monitoring systems, like @datadoghq, @influxdb, @sensu, @zabbix and more. Here is how: https://t.co/rSttVvPP3N
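To illustrate why the open exposition format makes this kind of integration easy, here is a minimal sketch of reading it with nothing but the Python standard library. The sample payload, its values, and the `region` label are made up for illustration, and `parse_exposition` is a hypothetical helper, not part of any official client. It handles only plain samples (name, optional labels, value, optional timestamp) and skips comment lines.

```python
import re

# Illustrative payload in the Prometheus text exposition format.
# Values and the "region" label are assumptions, not real service output.
SAMPLE = """\
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 1
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds{region="example"} 0.25
"""

# metric_name, optional {labels}, value, optional integer timestamp
LINE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
# label_name="label value" pairs, allowing backslash escapes in the value
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


samples = parse_exposition(SAMPLE)
for name, labels, value in samples:
    print(name, labels, value)
```

From here, each tuple could be forwarded to whatever ingestion API the target monitoring system offers. For production use, a maintained parser is a safer choice than this simplified regex.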
We lost the quorum on the etcd used to manage the satellites cluster due to missing affinity constraints. We added the constraints and redeployed. The service should be fully available again.