We're currently experiencing extended downtime following a system reboot on our primary server. We are actively working to restore all services. We'll update here as soon as everything is back online. Apologies for the inconvenience.
3/ What we shipped today:
- A network watchdog: probes Cloudflare, Google DNS, and Quad9 every 15s via raw TCP. If 2 of 3 fail, notifications are suppressed within 15–30 seconds, before any quorum cycle completes. Suppression lifts automatically within seconds of recovery.
- A global spike detector: if 30%+ of monitors flip DOWN within 3 minutes, notifications are suppressed.
- An admin freeze API for immediate manual override.
Sorry for the noise. A monitoring tool that cries wolf is worse than no monitoring tool.
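A rough sketch of how the watchdog and spike detector rules above could be expressed, purely for illustration. The probe addresses, function names, and the `SpikeDetector` class are assumptions, not the actual implementation; only the 2-of-3 rule, the 30% threshold, and the 3-minute window come from the post.

```python
import socket
from collections import deque

# Assumed probe targets: public resolvers reached over TCP port 53.
PROBE_TARGETS = [("1.1.1.1", 53), ("8.8.8.8", 53), ("9.9.9.9", 53)]


def tcp_probe(host, port, timeout=2.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def network_looks_down(probe_results, max_failures=1):
    """True when more than max_failures probes failed (2 of 3 by default)."""
    failures = sum(1 for ok in probe_results if not ok)
    return failures > max_failures


class SpikeDetector:
    """Tracks monitors that flipped to DOWN inside a sliding time window."""

    def __init__(self, total_monitors, threshold=0.30, window_seconds=180):
        self.total_monitors = total_monitors
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._flips = deque()  # (timestamp, monitor_id), oldest first

    def record_down_flip(self, monitor_id, now):
        self._flips.append((now, monitor_id))

    def is_spike(self, now):
        # Discard flips that fell out of the window.
        while self._flips and now - self._flips[0][0] > self.window_seconds:
            self._flips.popleft()
        distinct = {monitor_id for _, monitor_id in self._flips}
        if self.total_monitors <= 0:
            return False
        return len(distinct) / self.total_monitors >= self.threshold
```

In this sketch, a scheduler would call `tcp_probe` for each target every 15 seconds and suppress notifications while either `network_looks_down(...)` or `is_spike(...)` returns True.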
This morning, our infrastructure provider had a 16-minute network outage severe enough to take down their own website (HTTP 500).
Our monitoring server sits on that provider, so it went unreachable and every user's monitors simultaneously reported DOWN, triggering false alerts across the board. Here's what happened and what we shipped to prevent it 🧵
2/ Immediate fix: we auto-resolved all false incidents, flagged the affected check results as false positives, and excluded them from uptime calculations. No user's uptime % was harmed.
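To make the "excluded from uptime calculations" step concrete, here is a minimal sketch. The `CheckResult` type and `uptime_percent` function are hypothetical names for illustration; the real data model is not described in the post.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    up: bool
    false_positive: bool = False  # set when a result is flagged after the fact


def uptime_percent(results):
    """Uptime over all results that are not flagged as false positives."""
    counted = [r for r in results if not r.false_positive]
    if not counted:
        return None  # nothing left to measure
    up_count = sum(1 for r in counted if r.up)
    return 100.0 * up_count / len(counted)
```

Flagged results are dropped from both the numerator and the denominator, so a burst of false DOWN results leaves the percentage unchanged.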
Most incident workflows are still broken:
- monitoring in one tool
- on-call in another
- status pages updated manually
All just to understand one outage.
We're fixing that:
- fewer alerts
- more context in one place
- better visibility during incidents
Less noise. Faster response.
https://t.co/CKVnNUFTPd
More regions, more control
New monitoring regions:
🇿🇦 Johannesburg (ZA)
🇧🇷 São Paulo (BR)
Paid plans can now choose a preferred Primary Region for checks.
Better regional accuracy, less "works for me" noise.
#uptime #monitoring #SaaS
New: On-Call Scheduling 🚨
Define rotations (daily/weekly/custom), add overrides, and auto-assign new incidents to whoever is on-call right now.
Escalation is included if an incident isn't acknowledged in time.
Docs: https://t.co/6FgetWmMAw
#IncidentManagement #SRE
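For readers curious how a rotation with overrides and escalation might resolve "who is on-call right now", here is a small sketch. The function names, the override tuple shape, and the timeout handling are assumptions for illustration and not the product's API.

```python
from datetime import datetime, timedelta, timezone


def on_call_at(rotation, start, shift_length, when, overrides=()):
    """Return the person on-call at `when`.

    rotation: ordered list of names; start: when the first shift begins;
    shift_length: timedelta (1 day, 7 days, or custom);
    overrides: iterable of (start, end, person), checked before the rotation.
    """
    for o_start, o_end, person in overrides:
        if o_start <= when < o_end:
            return person
    shifts_elapsed = (when - start) // shift_length  # whole shifts since start
    return rotation[shifts_elapsed % len(rotation)]


def needs_escalation(created_at, acknowledged_at, now, ack_timeout):
    """True if the incident is still unacknowledged after ack_timeout."""
    return acknowledged_at is None and now - created_at >= ack_timeout
```

Example: with a weekly rotation of three people starting on a Monday, a timestamp eight days later falls in the second shift, unless an override covers that moment.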
3 months ago: 1M uptime checks.
Today: 7M+.
The number isn't the point.
The timing is.
Checks spike during deploys, DNS changes, and partial outages.
Exactly when trust is fragile.
Status pages are boring.
Until you really need one.
Incident communication should not depend on the same infrastructure that's failing.
This failure mode keeps repeating because it's uncomfortable to design around.
Reliance on CDNs for status pages introduces circular failure modes.
How to architect an incident communication system that remains available during major edge network outages.
{ author: @IT_mafija }
https://t.co/1yKP2vrF01
New: 3rd-party Dependencies 🎯
Show vendor status (Cloudflare, Twilio, Webex, etc.) directly on your status page so users instantly know if the issue is yours or upstream.
Stop the "is it just me?" chaos.
Docs: https://t.co/Fwe3ZnUOBB
#SaaS #IncidentManagement #DevOps #devopslife
Status page best practices (2026):
- host it separately (so it stays up when your app is down)
- component-level status (not one green dot)
- predictable update cadence + "next update by…"
- incident + maintenance templates you can copy/paste
Full playbook:
https://t.co/kCVbWNhTe7
#SRE #DevOps #IncidentManagement
Most outage tickets aren't new info. They're the same question: "Is this just me?"
A status page answers it once, publicly, instead of 100 times in support.
Key: fast first update + next update time. Details can wait.
https://t.co/6dbgC12yQN
#SaaS #IncidentManagement #CustomerSupport
Public vs private status page?
Public = trust + fewer "is it down?" tickets + SEO.
Private = sensitive/internal/customer-specific details.
Common fear: "public makes us look unreliable."
Reality: silence looks worse.
You can run both.
https://t.co/Ke2JU6g7o0
#SaaS #DevOps #IncidentManagement
Do small SaaS products need a status page?
Usually: yes.
Because when stuff breaks:
- users panic
- inbox explodes
- "silence" looks careless
A status page = one source of truth + fewer tickets + trust.
https://t.co/9dFmHYpApo
#SaaS #DevOps #IncidentManagement
(4/4) We're just getting started in 2026. Privacy-first monitoring & crystal-clear status pages - built for SREs, indie teams, and anyone who hates surprise downtime.
What feature are you most excited to try?
Drop it below!
https://t.co/CKVnNUFTPd
#SRE #DevOps #IncidentManagement #indiehackers #IndieHacker
2026 started with a BANG for @getStatusPage!
In the first 12 days, we shipped:
- Multi-language support (🇫🇷/🇷🇸/🇩🇪/🇸🇦 - almost complete)
- Teams feature + billing usage + snapshots
- Privacy-first status page analytics: Now with Swetrix support!
- Public RSS/Atom feeds for incidents (/rss.xml & /atom.xml) 📡
and tons more fixes and improvements.
Here's our changelog link if you'd like to see the details: https://t.co/ElXDnE5EIj
(1/4)
#IndieSaaS #StatusPage #PrivacyFirst
(3/4) Public visibility & comms upgrades:
- RSS + Atom feeds + autodiscovery in "Subscribe" modal
- New blog series: why status pages matter, status pages vs pure monitoring, reducing tickets, public vs private. Great for sharing!
- Simplified dashboard: Operational view with health/incidents/live updates
- Hourly history: compact segmented progress bar + better tooltips
- UI polish everywhere: purple brand consistency, better incident timelines, translations, email deliverability checks, robots.txt/noindex fixes, and more.
(Pro tip: Feeds are live on every status page now - try yours!)
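Feed autodiscovery conventionally works through `<link rel="alternate">` tags in a page's HTML. The sketch below shows how a client could find such links using only the Python standard library. The sample markup it would run against is an assumption; the actual page source is not shown in this thread.

```python
from html.parser import HTMLParser

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects (type, href) pairs from <link rel="alternate"> feed tags."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        href = attr.get("href")
        if "alternate" in rel and mime in FEED_TYPES and href:
            self.feeds.append((mime, href))


def discover_feeds(html):
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds
```

Feed readers that support autodiscovery do roughly this when given a page URL instead of a feed URL.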