Current state of the Notification system

Generally speaking, the notification system is in good shape. There are no critical bugs, and it’s easy to create new notifications as well as delete unused ones. It has a good self-check mechanism that detects faults and notifies developers about them.

If you haven’t read the documents on the notification system, the first thing you should do is read them, starting with the README. After that, this document will make more sense.

Who’s using the notification system?

The notification system is used by many teams.

  • The Design team uses it to design notifications. They also run UI tests to check that the designs are implemented correctly.
  • The QA team uses it to test notifications. They verify that each notification is delivered under the correct conditions.
  • The Marketing team uses it to request new notifications or updates to existing ones. They can also track replies to sent notifications with the help of the Member Experience team.
  • The Development team uses it to develop and maintain notifications.

Testing the notifications

Example notifications on the admin panel

You can go to https://www.interaction-design.org/admin/notifications to see examples of notifications. They’re mostly email notifications.

For each notification listed as an example, you can send yourself an email. The idea is to see how the email looks in a real inbox. Email technology is old and mostly outdated, so an email may look different in a mailbox than it does on our web page.

Note that the email sent out is a copy of the last successful instance of the notification.

Videos -- deprecated

There used to be videos on how to test each notification, but the videos are all obsolete.

Testing notifications manually

This is one of the most complicated ways to test a notification since it requires a lot of preparation.

For test instances, you can check the example notification page and follow the trigger conditions to put the system in the correct state. In many cases you might need help from a developer to successfully perform a test.

Testing notifications automatically

We have started testing notifications automatically to check that they are triggered under the correct conditions.

Most of our notifications now have automated tests. We also use Behat to write business-driven tests that are high-level and readable by business people as well as QA.

Newsletters

We don’t use the notification system to send emails to newsletter subscribers. However, we do use the same Mailgun batch email API to send newsletters in batches.

Newsletters are sent to 150K+ subscribers, and a full send takes 2-3 hours to finish.
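
As a rough illustration of what sending in batches means, the sketch below splits a subscriber list into fixed-size batches. It is written in Python for brevity and is not our production code: `chunk_recipients`, `BATCH_SIZE`, and the example addresses are hypothetical, and the batch size of 1,000 is an assumption that should be checked against Mailgun's documentation.

```python
from typing import Iterator, List, Sequence

# Assumed per-request recipient limit; verify against Mailgun's documentation.
BATCH_SIZE = 1000


def chunk_recipients(recipients: Sequence[str],
                     batch_size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `batch_size` recipients."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(recipients), batch_size):
        yield list(recipients[start:start + batch_size])


# Hypothetical subscriber list, roughly the size mentioned above.
subscribers = [f"member{i}@example.com" for i in range(150_000)]
batches = list(chunk_recipients(subscribers))
```

With these assumed numbers, a list of 150K subscribers turns into 150 separate requests.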

Notifying developers when the notification system is down

It doesn't make sense to use the notification system to warn developers that the notification system is down, does it? This is why we send these emails through a separate, lower-level mechanism instead of relying on the notification system.

If we cannot deliver the emails to developers, we fall back to sending them an SMS.
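
The email-then-SMS fallback can be sketched as follows. This is a minimal Python illustration, not the actual implementation: `alert_developers` and the two sender callables are hypothetical names, and the sketch assumes a sender raises an exception when delivery fails.

```python
from typing import Callable


def alert_developers(message: str,
                     send_email: Callable[[str], None],
                     send_sms: Callable[[str], None]) -> str:
    """Try email first; fall back to SMS if the email cannot be delivered.

    Returns the channel that was used. Both senders are hypothetical
    callables that raise an exception on failure.
    """
    try:
        send_email(message)
        return "email"
    except Exception:
        # Email delivery failed, so use the backup channel.
        send_sms(message)
        return "sms"


# Usage with stand-in senders: the email sender always fails here.
sent_sms = []


def failing_email(message: str) -> None:
    raise ConnectionError("mail transport unavailable")


channel = alert_developers("Notification system is down",
                           failing_email, sent_sms.append)
```

Passing the senders in as arguments keeps the alert path independent of the notification system and makes the fallback easy to exercise in a test.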

Known problems

Huge number of logs

In order to prevent spamming our members with notifications, we use frequency limit checks. These checks rely on the notification logs to work properly. This requires:

  1. One log for every notification we send
  2. A consistent state for every log (pending, sent, failed, or canceled)

This has a huge maintenance cost, since we have more than 35M logs and the number grows by ~1M per month.
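
To make the dependency on logs concrete, here is a minimal Python sketch of a log-based frequency limit check. It is an illustration under assumptions, not our implementation: `NotificationLog`, `may_send`, and the field names are hypothetical, and the rule that only `pending` and `sent` logs count towards the limit is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable

# Assumption: only these states count towards the limit.
COUNTED_STATES = {"pending", "sent"}


@dataclass
class NotificationLog:
    member_id: int
    notification: str
    state: str          # pending, sent, failed, or canceled
    created_at: datetime


def may_send(logs: Iterable[NotificationLog], member_id: int, notification: str,
             now: datetime, window: timedelta, max_per_window: int) -> bool:
    """Return True if sending one more notification stays within the limit."""
    recent = sum(
        1 for log in logs
        if log.member_id == member_id
        and log.notification == notification
        and log.state in COUNTED_STATES
        and timedelta(0) <= now - log.created_at <= window
    )
    return recent < max_per_window


now = datetime(2024, 1, 10, 12, 0)
logs = [
    NotificationLog(1, "reminder", "sent", now - timedelta(days=1)),
    NotificationLog(1, "reminder", "failed", now - timedelta(hours=2)),
    NotificationLog(2, "reminder", "pending", now - timedelta(hours=1)),
]
```

In this model, a missing log lets a duplicate through, and a log stuck in the wrong state either blocks a legitimate send or allows an extra one. This is why both requirements above matter.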

Difficult to test manually

The notification system is complex, and notification chains are among its most complex pieces. Each notification in a chain requires an additional set of criteria to be met before it is sent. It also requires a certain amount of time, a delay in most cases, to pass before it is actually sent.

This makes testing these types of notifications manually almost impossible. Although we have a lot of automated tests in place to verify these conditions, we still need more tests to make sure the underlying system works properly.
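
The combination of delay and criteria is easier to check in an automated test, where the current time can be passed in rather than waited for. The Python sketch below shows the idea under simplifying assumptions: `ChainStep`, `due_steps`, and the example chain are hypothetical, and the sketch ignores whether a step has already been sent.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Dict, List


@dataclass
class ChainStep:
    name: str
    delay: timedelta                      # time to wait after the chain starts
    criteria: Callable[[Dict], bool]      # extra condition checked at send time


def due_steps(steps: List[ChainStep], started_at: datetime,
              now: datetime, member: Dict) -> List[str]:
    """Return names of steps whose delay has elapsed and whose criteria hold."""
    return [
        step.name for step in steps
        if now >= started_at + step.delay and step.criteria(member)
    ]


# Hypothetical chain: a welcome message, then a reminder three days later
# that is only sent if the member has not enrolled in a course yet.
chain = [
    ChainStep("welcome", timedelta(0), lambda m: True),
    ChainStep("reminder", timedelta(days=3), lambda m: not m["enrolled"]),
]
start = datetime(2024, 1, 1, 9, 0)
```

A test can then call `due_steps` with `now` set three days after `start` and assert on the result, which a manual tester could only reproduce by waiting or by asking a developer to adjust the data.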

Backward compatibility

Making changes, even very small ones, to notifications can have drastic side effects. When notifications are queued, we use PHP serialization to serialize the notification, its properties, and the recipients. Changes such as renaming or moving a notification or changing its parameters can break deserialization of notifications that are already queued and cause an avalanche of failures in production.
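
The failure mode can be modelled in a few lines. The Python sketch below is a simplified stand-in and does not reproduce PHP's serialization format or its exact behavior: `serialize`, `unserialize`, the class names, and the lookup tables are all hypothetical. It only shows why a payload that stores a class name stops working once that name no longer exists.

```python
import json
from typing import Callable, Dict


def serialize(class_name: str, properties: Dict) -> str:
    """Store the class name next to the notification's properties."""
    return json.dumps({"class": class_name, "properties": properties})


def unserialize(payload: str,
                known_classes: Dict[str, Callable[..., object]]) -> object:
    """Rebuild a notification by looking up the stored class name."""
    data = json.loads(payload)
    factory = known_classes.get(data["class"])
    if factory is None:
        raise LookupError(f"Unknown notification class: {data['class']}")
    return factory(**data["properties"])


class WelcomeEmail:
    def __init__(self, member_id: int) -> None:
        self.member_id = member_id


# Payload queued before the rename (hypothetical class names).
queued = serialize("App\\Notifications\\Welcome", {"member_id": 42})

# Classes known to the code before and after the rename.
before_rename = {"App\\Notifications\\Welcome": WelcomeEmail}
after_rename = {"App\\Notifications\\WelcomeEmail": WelcomeEmail}
```

Here `unserialize(queued, before_rename)` rebuilds the notification, while `unserialize(queued, after_rename)` raises `LookupError`. Renaming a constructor parameter fails in a similar way, because the stored property names no longer match.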

Dependency between FQCN, logs, and notification chains

Notification logs and chains rely on the actual fully qualified class name (FQCN) of the notification and notification chain. This becomes problematic when we try to rename or move a notification.