Spec: Activity Stream Notifications - ckan/ckan GitHub Wiki

We want both on-site notification (a bubble in the top-right of the site that tells you how many new activities you have) and email notification of new activities in your dashboard activity stream. They're together in one spec because the two features interact with each other.

See this thread on ckan-dev for discussion, especially about how to implement the email notifications: http://lists.okfn.org/pipermail/ckan-dev/2012-October/003297.html

Interaction

How should email notifications and on-site notifications interact, in terms of which activities are considered seen or unseen?

  • If the user has viewed her activity stream on the site then we know that she has seen the activities and it would be annoying if we then sent her a notification email about them.

  • But if we have sent an email notification for an activity we do not know if the user has received and read the email so we might not want to mark the activity as seen on the site.

Suggested solution:

  • Viewing your activity stream on the site automatically marks all of your activities as "seen" (you do not have to click a "mark as read" button)

  • If activities have been marked as seen by viewing them on the site, then email notifications of those activities will not be sent

  • But sending an email notification for an activity does not mark the activity as seen on the site. When the user logs in to the site it will still say that she has unseen activities, and she has to view her activity stream to mark them as seen. (The email will contain a link directly to the page with the activity stream, and clicking the link will automatically mark the activities as seen.)

  • If CKAN has sent an email notification for an activity it will not send another notification for the same activity, even if the user has not logged in and looked at her dashboard so the activity is still marked as unseen on the site. The reason is that it might be annoying if CKAN kept sending repeated notifications for the same activities.

This means that behind the scenes we need two separate concepts: seen/unseen activities for the on-site notification, and unsent activity notifications for the email notification. Viewing your activity stream on the site marks all of your activities as seen and deletes any unsent activity notifications that were waiting to be sent by email, without sending them. Sending an email notification for some activities deletes them from the unsent notifications queue but does not mark them as seen.
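The interaction rules above can be sketched as a small in-memory model. This is only an illustration: the NotificationState class and its method names are hypothetical and are not part of the CKAN model.

```python
from dataclasses import dataclass, field


@dataclass
class NotificationState:
    """Hypothetical per-user state, used only to illustrate the rules."""
    unseen: set = field(default_factory=set)  # drives the on-site bubble
    unsent: set = field(default_factory=set)  # waiting to be emailed

    def add_activity(self, activity_id):
        # A new activity is both unseen on the site and not yet emailed.
        self.unseen.add(activity_id)
        self.unsent.add(activity_id)

    def view_dashboard(self):
        # Viewing the stream marks everything as seen and drops any
        # queued email notifications without sending them.
        self.unseen.clear()
        self.unsent.clear()

    def send_email(self):
        # Emailing empties the unsent queue but leaves seen/unseen alone,
        # so the same activity is never emailed twice.
        sent = sorted(self.unsent)
        self.unsent.clear()
        return sent
```

Calling send_email() twice in a row returns an empty list the second time, while the unseen set stays populated until view_dashboard() is called.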

On-Site Notifications

We want to add a "bubble" to the top-right of the site when logged in, that shows the number of "unseen" (or "new") activities you have in your dashboard activity stream.

The main problem with implementing this is that there is currently no concept of "seen" and "unseen" activities in the model.

We need to record the seen/unseen status of an activity on a per-user basis: the same activity may be seen by some users and unseen by others, so a simple boolean seen/unseen column on the activity streams table is not an option.

Suggested solution:

  • Add a datetime column to the user model, recording the time they last looked at their dashboard activity stream. (Alternatively add a new dashboard db table to hold this.)

  • Add a logic/action/get.py function user_new_activities_count() (or dashboard_new_activities_count()) that returns the number of unseen activities in a user's dashboard by counting activities newer than the time the user last viewed her dashboard.

  • Add a logic/action/update.py function user_mark_activities_as_read() (or dashboard_mark_activities_as_read()) that updates the time that the user last viewed her dashboard.

  • Then we just need to add the frontend for it. user_new_activities_count needs to be made available to the templates (every page), and loading the user dashboard page needs to call user_mark_activities_as_read (probably implement this in the user controller).
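A minimal sketch of the timestamp approach, assuming activities are represented by their datetimes and the user by a plain dict. The function names, the "last_viewed" key and the now parameter are illustrative assumptions, not CKAN's actual action functions.

```python
from datetime import datetime


def new_activities_count(activity_timestamps, last_viewed):
    """Count activities newer than the user's last dashboard view.

    activity_timestamps: datetimes of the activities in the dashboard.
    last_viewed: datetime of the last dashboard view, or None if the
    user has never viewed her dashboard.
    """
    if last_viewed is None:
        return len(activity_timestamps)
    return sum(1 for ts in activity_timestamps if ts > last_viewed)


def mark_activities_as_read(user, now=None):
    """Record that the user has just viewed her dashboard."""
    # "last_viewed" is a hypothetical field name for the new column.
    user["last_viewed"] = now if now is not None else datetime.now()
    return user
```

Because only one timestamp is stored per user, no per-activity seen flag is needed, and the same activity can be seen by one user and unseen by another.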

Email Notifications

User Stories

  1. User: I want to receive notification emails when there are new activities in my dashboard, so I don't have to login to the site to find out about new activity.

    1a. I want to receive these emails in digest form, not one email per activity.

    1b. I want the emails to contain links to my dashboard where I can see my activity stream.

    1c. I do not want to receive more than one email about the same activity.

    1d. If I have already seen an activity by visiting my dashboard page on the site, I do not want to receive an email notification about that activity.

  2. Sysadmin: I want to be able to disable or enable email notifications for an entire site. (Config file setting and/or setting in database.)

    2a. If email notifications were turned off then a sysadmin turns them on again, do not send users email about activity that happened while email notifications were turned off, only send emails about new activity.

  3. Sysadmin: I want to set the maximum frequency emails will be sent out at (hourly, daily, weekly...)

  4. User: I want to be able to turn off email notifications just for myself by setting an option on a user preference page.

    4a. When I receive an email notification I want it to contain a link to this user preference page.

    4b. If email notifications are disabled site-wide by a sysadmin, I do not want to see the email notifications on/off preference in my user preferences because it would be useless.

    4c. If I had email notifications turned off and then I turn them on again, I do not want to receive a lot of notifications about activities that happened while I had email notifications turned off. Only send me notifications about new activities.

Questions

What goes in the email subject?

What goes in the email body?

What is the email's From address?

Suggestion, at least for the initial implementation, is that the email would not contain the actual activity streams content, it would just say "You have new activity from {WEBSITE}" in the subject, repeat in the body with link to dashboard page on website.
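As a rough sketch of that initial format, the function below builds the subject and body from a site title and a dashboard URL. The function name and both parameters are assumptions made for illustration; the exact wording of the body is still an open question.

```python
def render_notification(site_title, dashboard_url):
    """Build a generic notification with no activity content in it."""
    subject = "You have new activity from {0}".format(site_title)
    # The body repeats the subject and links to the dashboard page.
    body = "{0}\n\nView your dashboard: {1}".format(subject, dashboard_url)
    return {"subject": subject, "body": body}
```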

Beyond the initial implementation, if there is only one new activity, then we could put the activity in the email subject, e.g. "seanh created the dataset foo". What if there are multiple activities wrapped into one email? Could we make the subject more interesting? Some sort of summary e.g. "7 new datasets in group foo; 13 changes to dataset bar" etc. Needs more thought.

"Push" model implementation [deprecated]

This approach won't work unless we can figure out some way to decide at activity_create time which users an activity should be addressed to, i.e. for which users will dashboard_activity_list() return this activity?

Add a notifications_outbox database table and model class.

  • The create_activity() logic action function and the activity streams session extension need to add rows to the notifications outbox (these are the only two places where new activities are created). Add one row to the table per activity.
  • What columns should the notifications_outbox table have? user (id or name of the user to notify), type (the type of notification e.g. activity stream, determines what notification renderer is used to render the email body), content

Add a notification mailer job that consumes rows from the outbox table and emails them:

  • After an email is sent successfully, the rows are deleted from the outbox table.
  • Implement the mailer job as a paster command, and have a cron job call it.
  • The notification mailer can then be reused for sending notifications about other things besides activity streams.
  • Don't send one mail per activity, instead send multiple activities in one mail. Each time the mailer job runs it will send at most one email per user. The mailer job will group all rows from the outbox table that have the same user into a single email.
  • The maximum frequency of emails can be configured by the site admin by setting how often the cron job runs.
  • The mailer job should check if the user has email notifications turned off. If she does, it should just delete all her rows from the outbox table without sending anything.
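The grouping step of such a mailer job might look like the sketch below. It assumes outbox rows are dicts with the user, type and content columns suggested above and that per-user preferences are available as a dict; group_outbox and emails_enabled are hypothetical names.

```python
from collections import defaultdict


def group_outbox(rows, emails_enabled):
    """Group outbox rows into at most one digest per user.

    rows: list of dicts with keys 'user', 'type' and 'content'.
    emails_enabled: dict mapping user -> bool (the user's preference).
    Returns a dict mapping user -> list of contents for one email.
    """
    digests = defaultdict(list)
    for row in rows:
        if emails_enabled.get(row["user"], False):
            digests[row["user"]].append(row["content"])
        # Rows for users who have notifications turned off are dropped
        # here, i.e. deleted without anything being sent.
    return dict(digests)
```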

CKAN already has a Mailer class that sends emails e.g. for password resets, organization application emails, see:

  • ckan/controllers/user.py
  • ckan/tests/lib/test_mailer.py
  • ckan/tests/mock_mail_server.py
  • ckan/tests/misc/test_mock_mail_server.py
  • ckan/tests/functional/test_user.py

"Pull" model implementation

This approach avoids having to solve the problem of figuring out at activity_create time what users an activity is addressed to. But it does mean that the email notifier job has to call dashboard_activity_list() for every user (whether the user has any unsent email notifications or not) each time it runs.

  • get_notifications(user_id, since)

    Returns a list of notifications, probably just dicts with keys 'subject' and 'body', that need to be sent to the given user.

    This is where we handle grouping multiple emails into a single digest email and deciding what the subject and body of the email should be. In the initial implementation always group all activities into one email. Later we could make it put certain kinds of important activities into their own individual emails with more specific subjects and bodies e.g. "Rufus Pollock is now following you on datahub.io!"

    since should be datetime of the last email notification that was successfully sent to this user.

    This would call dashboard_activity_list() for the user, filter the list to just those newer than since, then format an email subject and body for them.

  • send_notification(user, notification_dict)

    Try to send the given email notification and raise an exception if sending fails.

  • get_and_send_notifications()

    The periodic job that loops over all users in CKAN.

    Calls get_notifications(user_id, since) for each user. This means that it needs to store since for each user somewhere.

    If get_notifications() returns any notifications for the user, loops over the notifications and calls send_notification(user, notification_dict) for each.

    If send_notification() returns successfully, updates since for that user.
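The three functions above could fit together roughly as follows. This is a sketch under assumptions: since is kept in a plain dict keyed by user id, the two helper functions are passed in as arguments, and the current time is supplied by the caller as now. None of this is prescribed by the spec.

```python
def get_and_send_notifications(users, since_by_user, get_notifications,
                               send_notification, now):
    """Loop over all users and send each her pending notifications.

    since_by_user maps user id -> time of the last successful send.
    It is updated in place and returned.
    """
    for user in users:
        since = since_by_user.get(user["id"])
        notifications = get_notifications(user["id"], since)
        try:
            for notification in notifications:
                send_notification(user, notification)
        except Exception:
            # Sending failed: leave `since` unchanged so that the next
            # run picks up the same activities again.
            continue
        if notifications:
            since_by_user[user["id"]] = now
    return since_by_user
```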

since: The user's since time is slightly more complicated than shown. It needs to be the time of the last successfully sent email notification or the last time the user viewed her activity stream on the site, whichever is newer.
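In code, this rule is just the maximum of the two timestamps, with either one possibly missing. The function and parameter names below are illustrative.

```python
def effective_since(last_email_sent, last_dashboard_view):
    """Return the newer of the two times, or None if neither is set."""
    candidates = [t for t in (last_email_sent, last_dashboard_view)
                  if t is not None]
    return max(candidates) if candidates else None
```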

send_notification: A preference needs to be added (e.g. to the user model) where users can turn email notifications on or off. If a user has email notifications turned off then send_notification() should just return successfully without doing anything so the user's since date will be updated without any email being sent. It should do the same thing if the user does not have an email address.

send_notification failures: What should get_and_send_notifications() do if send_notification() fails? If for example sending an email is failing because the user's email address is wrong, we don't want unsent notifications to pile up forever so when the user fixes her email address she suddenly gets a lot of emails (or one gigantic email). At the same time if sending fails for a temporary reason, we might want to keep the emails and try again later. Maybe try up to 3 times.
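One possible way to implement the "try up to 3 times" idea is to keep a failure counter next to since and give up once the limit is reached. The state dict, its keys and record_send_result are assumptions made for this sketch; the spec leaves the policy open.

```python
MAX_ATTEMPTS = 3  # the tentative "maybe try up to 3 times" from above


def record_send_result(state, succeeded, now):
    """Return an updated copy of a user's notification state.

    state: dict with keys 'since' and 'failed_attempts'.
    """
    new_state = dict(state)
    if succeeded:
        new_state["since"] = now
        new_state["failed_attempts"] = 0
        return new_state
    new_state["failed_attempts"] = state.get("failed_attempts", 0) + 1
    if new_state["failed_attempts"] >= MAX_ATTEMPTS:
        # Give up: advance `since` so that old notifications do not
        # pile up and arrive all at once later.
        new_state["since"] = now
        new_state["failed_attempts"] = 0
    return new_state
```

With this policy a temporary failure is retried on the next runs, while a permanently wrong email address costs at most three attempts per batch of activities.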