Emergency protocol for notification system
Find out if the notification system is operational
There is a quick way to find out whether the notification system is operational:
- Go to https://www.interaction-design.org/admin/nova and check the card "System statuses", "Notification system" line.
If the notification system is down, then oh boy, oh boy, we’re in trouble. But not that much... Because problems are a part of life and they can happen. That’s why we prepared this guide for you!
The command might've temporarily failed
Try running the following commands in order and see whether the notification system comes back online.
```bash
ssh forge@interaction-design.org
cd interaction-design.org/current
php artisan status:notification_system:send_checks
```
If it works then go back to coding, otherwise proceed with the rest of the documentation.
If the alarm goes off again later, it's almost certainly a problem in either GSuite or Mailgun.
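The retry described above can also be scripted. This is only a sketch: it assumes the artisan command exits with a non-zero status when the check fails, which this guide does not confirm. The `retry` and `send_checks` helpers, the attempt count, and the delay are illustrative names and values, not part of our tooling.

```bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or MAX_ATTEMPTS runs have failed.
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Hypothetical wrapper: runs the status check on the server in one SSH call.
# Assumption: the command signals failure through its exit status.
send_checks() {
  ssh forge@interaction-design.org \
    'cd interaction-design.org/current && php artisan status:notification_system:send_checks'
}

# Example (values are arbitrary): up to 3 attempts, 60 seconds apart.
# retry 3 60 send_checks
```

If every attempt fails, move on to the checks below rather than raising the attempt count.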
Check Horizon
Head over to our Horizon dashboard and check the pending jobs. If notifications are going out without a problem, then the problem is most likely in either GSuite or Mailgun.
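If the dashboard itself is unreachable, the same information can be approximated from the command line. The sketch below assumes our Horizon is Laravel Horizon, which ships `horizon:status`, and that failed jobs are recorded where Laravel's `queue:failed` can list them. `remote_artisan` is a hypothetical helper, not an existing script.

```bash
# Hypothetical helper: run a single artisan command on the server.
# Only suitable for simple arguments without spaces or shell metacharacters.
remote_artisan() {
  ssh forge@interaction-design.org \
    "cd interaction-design.org/current && php artisan $*"
}

# Examples:
# remote_artisan horizon:status   # reports whether Horizon is running
# remote_artisan queue:failed     # lists recorded failed jobs
```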
Check if GSuite is down
https://www.google.com/appsstatus
The status of the notification system is asserted by sending an email from our platform (using the notification system) to a mailbox residing on our GSuite account. If GSuite is down, then the status of the notification system may be a false negative, namely, the notification system might not actually be down.
Don't forget to recheck since it might take a while for the failure reports to show up.
It might help to manually send an email to a non-Gmail recipient to check whether it arrives.
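One way to send such a test email is through Mailgun's HTTP messages API. Note that this bypasses our notification system entirely, so it only shows whether Mailgun can deliver to a non-Gmail mailbox. The environment variables `MAILGUN_API_KEY`, `MAILGUN_DOMAIN`, and `MAILGUN_API_BASE`, the sender address, and the `send_test_email` helper are all assumptions made for this sketch.

```bash
# Hypothetical helper: send a plain-text test email through Mailgun.
# Requires MAILGUN_API_KEY and MAILGUN_DOMAIN (assumed variable names).
# MAILGUN_API_BASE defaults to the US endpoint; EU accounts use
# https://api.eu.mailgun.net instead.
send_test_email() {
  recipient=$1
  if [ -z "$recipient" ] || [ -z "$MAILGUN_API_KEY" ] || [ -z "$MAILGUN_DOMAIN" ]; then
    echo "usage: MAILGUN_API_KEY=... MAILGUN_DOMAIN=... send_test_email RECIPIENT" >&2
    return 2
  fi
  curl --silent --show-error --fail \
    --user "api:${MAILGUN_API_KEY}" \
    "${MAILGUN_API_BASE:-https://api.mailgun.net}/v3/${MAILGUN_DOMAIN}/messages" \
    --form-string "from=Notification check <check@${MAILGUN_DOMAIN}>" \
    --form-string "to=${recipient}" \
    --form-string "subject=Notification system manual check" \
    --form-string "text=Manual delivery test. Safe to ignore."
}

# Example (placeholder recipient):
# send_test_email someone@example.com
```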
Check the status of Mailgun
We use Mailgun as our transactional email provider. It is a third-party service, so we depend on its operational status.
Two common scenarios can happen:
- Mailgun is down, which will result in a lot of jobs failing in Horizon and being reported to our #errors-production channel in Slack.
- Mailgun's performance is degraded, which will result in emails piling up in Mailgun.
There are a couple of things you can do to check the status of Mailgun:
- Check Mailgun status as they report it: http://status.mailgun.com
- Check the latest tweets of other people for Mailgun issues: https://x.com/search?f=tweets&vertical=default&q=%40mail_gun&lang=en
- Check our own logs on Mailgun to see whether there is a concurrent problem among our requests: https://mailgun.com/app/logs
Check our DMARC policy
The problem might be an email delivery issue.
- Check our DMARC reports on MxToolbox
- Request a new delivery report from MxToolbox:
```bash
ssh forge@interaction-design.org
cd interaction-design.org/current
php artisan notification:deliverability:audit
```
- Check Slack #dev channel for the report
- If the report does not arrive, recheck the GSuite and Mailgun status pages for new reports.
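While waiting for the report, the published DMARC policy can be read straight from DNS. This is a quick sanity check and not a replacement for the MxToolbox reports. It assumes `dig` is installed locally and that the record lives at the standard `_dmarc` subdomain; `dmarc_policy` is a hypothetical helper.

```bash
# Print the policy (the value of the "p=" tag) from a DMARC TXT record.
# The "sp=" (subdomain policy) tag is deliberately not matched.
dmarc_policy() {
  printf '%s\n' "$1" \
    | tr -d '"' \
    | tr ';' '\n' \
    | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
    | sed -n 's/^p=//p' \
    | head -n 1
}

# Example: look up the live record and print its policy.
# dmarc_policy "$(dig +short TXT _dmarc.interaction-design.org)"
```

The printed value is one of `none`, `quarantine`, or `reject`. Empty output means no `p=` tag was found, which usually means no DMARC record is published at that name.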