A Weeklong Outage That Went Undetected by Downtime Monitors

Feature Image for blog post on Uptime Monitoring. Blog of Amar Vyas

I experienced the “perfect storm” this week with my website. This morning, I was looking up analytics on this blog, and the number of visits had dropped to 10 percent of the usual daily traffic. I am aware that Google's algorithms have changed (again) and many are experiencing a drop in traffic… but the numbers were alarming. What exactly went wrong?


Possible causes I thought of and checked:

a. Changing registrars (long story short, my former registrar pooped and I moved several domains to Porkbun and CloudFlare, or CF). But all domains were on CF, so this should not have been an issue.
b. SSL issue: but the certificates were valid, and SSL Labs showed an A or A+ rating.
c. The site is managed through Gridpane, which reported no errors.
d. None of the three uptime monitors sent an alert about downtime. I use Hetrix Tools, Uptime Robot and Screpy.
e. Ping to the server showed no result, probably because I use a VPN that has been blocked.

Finally, I fired up the control panel at the VPN provider and, lo and behold, the server was offline. I started it again, and half an hour later all systems were working.

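I wonder why the alerts did not trigger. Maybe because I was only checking the domain and not the server's IP address? With the domain behind CF, a monitor that checks only the domain may be getting its answer from CF rather than from my server.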
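To illustrate the idea of checking both the domain and the server itself, here is a minimal sketch in Python using only the standard library. It is not how any of the monitoring services above work internally; the function names are my own, and the URL and IP address in the usage note are placeholders. It assumes the server accepts direct connections on port 443, which may not be the case if a firewall only lets the CDN through.

```python
import http.client
import socket
import urllib.error
import urllib.request


def check_url(url, timeout=10.0):
    """Return True if a GET request to `url` answers with a status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # DNS failure, timeout, refused connection, TLS error, or HTTP 4xx/5xx.
        return False


def check_tcp(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def diagnose(domain_ok, origin_ok):
    """Combine the two results into a short, human-readable verdict."""
    if domain_ok and origin_ok:
        return "all good"
    if domain_ok and not origin_ok:
        return "origin server unreachable, although the domain still answers"
    if origin_ok:
        return "server reachable, but the domain check fails (DNS, proxy, or TLS?)"
    return "site and server both unreachable"


def run_checks(domain_url, origin_ip):
    """Check the public URL and the server's IP (both supplied by the caller)."""
    return diagnose(check_url(domain_url), check_tcp(origin_ip))
```

Calling `run_checks("https://example.com/", "203.0.113.10")` with your own URL and server IP returns one of the four verdicts. The second verdict is the case that went unnoticed here: the domain check alone looks healthy while the server itself is down.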

Way Forward for Uptime Monitoring using Downtime Monitors

Update February 2022: I have revisited the setup for the uptime monitors that I use: Hetrix Tools, Uptime Robot, Screpy, and Freshping. Each service will now have two different monitors, checking the site at different intervals. I hope this resolves the issue of not receiving alerts!

Read my other posts on the topic of uptime monitoring.


