Tech:SLO

Miraheze’s Site Reliability Engineering (SRE) team has crafted a set of Service Level Objectives (SLOs) to measure the performance of the critical services that we operate. Miraheze does not offer any formal Service Level Agreements (SLAs), so failure to meet the objectives below does not carry any financial or legal penalty. There is, however, an agreed internal escalation process for breached SLOs.

SLO Monitoring Process

  • SLOs are monitored as monthly averages. At the end of each month or the beginning of the next, we publish the results in the relevant table below, showing our performance and whether each objective was met that month.
  • Once the results are published, any failed SLO is highlighted for internal review.
  • Following a first failure, a Phorge task is created to review the performance data and evaluate whether the failure can be attributed to factors beyond our control, or whether something needs to be done or implemented to ensure success the following month.
  • SLOs are then re-evaluated the next month. If a second failure in a row occurs, a new Phorge task is created and assigned to the other SRE team for an external-team review. That team reviews the failings that led to the objective not being achieved.
  • In the rare circumstance that an SLO is failed three months in a row, the matter is escalated to the Board of Directors for review, because the series of consecutive failures needs to be considered.
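
The escalation steps above can be sketched as a small function that maps a run of consecutive monthly failures to the next action. This is a minimal illustration, not Miraheze's actual tooling: the names `Escalation`, `consecutive_failures`, and `escalation_for` are hypothetical, and treating more than three failures in a row the same as three is an assumption the process does not spell out.

```python
from enum import Enum


class Escalation(Enum):
    # Hypothetical labels for the steps described in the process above.
    NONE = "no action"
    INTERNAL_REVIEW = "Phorge task: internal review of performance data"
    EXTERNAL_REVIEW = "Phorge task: review by the other SRE team"
    BOARD_REVIEW = "escalate to the Board of Directors"


def consecutive_failures(history: list[bool]) -> int:
    """Count failed months at the end of the history (True = objective met)."""
    streak = 0
    for met in reversed(history):
        if met:
            break
        streak += 1
    return streak


def escalation_for(history: list[bool]) -> Escalation:
    """Map the current failure streak to the escalation step."""
    streak = consecutive_failures(history)
    if streak == 0:
        return Escalation.NONE
    if streak == 1:
        return Escalation.INTERNAL_REVIEW
    if streak == 2:
        return Escalation.EXTERNAL_REVIEW
    # Assumption: three or more failures in a row all go to the Board.
    return Escalation.BOARD_REVIEW
```

For example, a history of `[True, False, False]` (oldest month first) gives a streak of two and so returns `Escalation.EXTERNAL_REVIEW`, while a met objective in the latest month resets the streak to zero.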

Infrastructure SLOs

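| Service | Type | Objective | Dec 22 | Jan 23 | Feb 23 | Mar 23 | Apr 23 | May 23 | Jun 23 | Jul 23 | Aug 23 | Sep 23 | Oct 23 | Nov 23 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Bastion | Availability | 99.5% | 100% | 100% | 100% | | | | | | | | | |
| Cache Proxy | Availability | 75%/99.5% | 75%/99.5% | 75%/99.5% | 75%/99.9% | | | | | | | | | |
| Cache Proxy | Error | 7.5% | 6.87% | 3.81% | 4.16% | | | | | | | | | |
| Cloud | Availability | 99.5% | 100% | 100% | 100% | | | | | | | | | |
| DNS | Availability | 99.5% | 100% | 100% | 100% | | | | | | | | | |
| DNS | Error | 0.5% | 0.2% | 0.15% | 0.16% | | | | | | | | | |
| DNS | Latency | 5ms | 3.32ms | 3.23ms | 3.43ms | | | | | | | | | |
| ElasticSearch | Availability | 99.5% | 100% | 100% | 100% | | | | | | | | | |
| Graylog | Availability | 99.5% | 100% | 100% | 100% | | | | | | | | | |
| Graylog | Error | 0.5% | 0% | 0% | 0% | | | | | | | | | |
| Graylog | Latency | 5ms | 0.65ms | 0.75ms | 1.01ms | | | | | | | | | |
| LDAP | Availability | 99.5% | 100% | 100% | 100% | | | | | | | | | |
| LDAP | Error | 1% | 1.71% | 0.34% | 0.83% | | | | | | | | | |
| LDAP | Latency | 30s | 9.61s | 10.80s | 28.08s | | | | | | | | | |
| MariaDB | Availability | 99.5% | 98.7% | 100% | 100% | | | | | | | | | |
| MariaDB | Error | 5% | 7.95% | 0.01% | 0.01% | | | | | | | | | |
| Phorge | Availability | 99.5% | 100% | 99.90% | 99.90% | | | | | | | | | |
| Phorge | Latency | 5s | 0.44s | 0.57s | 0.64s | | | | | | | | | |
| Puppet | Availability | 99.5% | 100% | 100% | 99.99% | | | | | | | | | |
| Puppet | Latency | 30ms | 17ms | 18.40ms | 20.90ms | | | | | | | | | |
| Swift | Availability | 99.5% | 100% | 100% | 100% | | | | | | | | | |
| Swift | Error | 1% | 0.07% | 1.06% | 0.75% | | | | | | | | | |
| Swift | Latency | 1s | 0.50s | 0.54s | 0.52s | | | | | | | | | |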
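
As a rough illustration of how a monthly result might be compared with its objective, the sketch below assumes that Availability is met when the measured value is at or above the objective, and that Error and Latency are met when the measured value is at or below it. The page does not state these comparison rules, so they are assumptions, and `objective_met` is a hypothetical helper. The sample figures are the Dec 22 values from the table above.

```python
def objective_met(slo_type: str, objective: float, measured: float) -> bool:
    """Return True if the measured monthly average satisfies the objective.

    Assumption: Availability is a floor; Error and Latency are ceilings.
    Both values must be expressed in the same unit (%, ms, or s).
    """
    if slo_type == "Availability":
        return measured >= objective
    if slo_type in ("Error", "Latency"):
        return measured <= objective
    raise ValueError(f"unknown SLO type: {slo_type!r}")


# Dec 22 figures taken from the Infrastructure SLOs table.
print(objective_met("Availability", 99.5, 98.7))  # MariaDB availability -> False
print(objective_met("Error", 1.0, 1.71))          # LDAP error rate      -> False
print(objective_met("Latency", 5.0, 3.32))        # DNS latency in ms    -> True
```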

MediaWiki SLOs

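| Service | Type | Objective | Dec 22 | Jan 23 | Feb 23 | Mar 23 | Apr 23 | May 23 | Jun 23 | Jul 23 | Aug 23 | Sep 23 | Oct 23 | Nov 23 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| JobQueue | Availability | 99.5% | 95.30% | 99.90% | 100% | | | | | | | | | |
| JobQueue | Error | 1.5% | 1.8% | 3.37% | 0.02% | | | | | | | | | |
| MediaWiki | Availability | 99% | 96.5% | 99.30% | 99.50% | | | | | | | | | |
| MediaWiki | Error | 3% | 2.03% | 1.54% | 0.35% | | | | | | | | | |
| MediaWiki | Latency | 3s | 1.41s | 1.41s | 1.35s | | | | | | | | | |
| Memcached | Availability | 99.5% | 100% | 100% | 100% | | | | | | | | | |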
