Login and Authentication internals - ckan/ckan GitHub Wiki

This was written as part of the review process for #6560. It might not be up to date.

Process Overview

CKAN < 2.10 (Repoze.who)

Log in a user

  1. User visits /user/login (user.login endpoint)
  2. Plugins implementing IAuthenticator.login() are called; if a plugin returns a response, that response is returned
  3. If not, the default login form is rendered, pointing to the Repoze login handler path (/login_generic), which is defined in who.ini
  4. We use three custom repoze plugins. I believe they are called roughly in this order:
  • ckan.lib.repoze_plugins.friendly_form: Plugin used to handle the web login logic
  • ckan.lib.authenticator: The plugin that checks the user name and password against the database
  • ckan.lib.repoze_plugins.auth_tkt: We override repoze's auth_tkt plugin to be able to customize the cookie settings
  5. At the end of the process we end up with an auth_tkt cookie, which is what marks the user as logged in. There is no server state (i.e. session info) that affects whether a user is logged in.
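
To illustrate what "no server state" means in the last step, here is a minimal sketch of a self-contained signed cookie. It is not the real auth_tkt format: the `make_ticket` and `read_ticket` names, the `signature!user_id` layout and the secret are all assumptions made for illustration.

```python
import hashlib
import hmac
from typing import Optional

SECRET = b"not-a-real-secret"  # hypothetical signing key, for illustration only


def make_ticket(user_id: str, secret: bytes = SECRET) -> str:
    """Return 'signature!user_id', a cookie value that carries its own proof."""
    digest = hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{digest}!{user_id}"


def read_ticket(cookie_value: str, secret: bytes = SECRET) -> Optional[str]:
    """Return the user id if the signature checks out, otherwise None."""
    digest, sep, user_id = cookie_value.partition("!")
    if not sep:
        return None  # malformed value, no separator found
    expected = hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the digest matched.
    return user_id if hmac.compare_digest(digest, expected) else None
```

Because the server only needs the secret to validate the cookie, deleting any server-side session data has no effect on whether the user is considered logged in.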

Identify a user on a request

  1. The Repoze Middleware reads the auth_tkt cookie and puts the value in the WSGI environ REMOTE_USER key (if a user is logged in)

  2. Flask calls ckan_before_request(), which in turn calls identify_user():

    2.1. Calls plugins implementing IAuthenticator.identify(). The process stops if 1) a plugin returns a response, or 2) a plugin sets g.user

    2.2. If we still haven't identified the user, we fall back to the default logic:
      a. Check the WSGI environ REMOTE_USER key (for browser requests)
      b. Check the Authorization header (for API requests)

    2.3. All these paths end up setting g.user (and g.userobj)
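
The order of checks in step 2 can be sketched as a plain function. This is a simplified stand-in, not CKAN's identify_user(): plugins are modelled as callables that return a user name (the real interface lets them return a response or set g.user), and the `api_tokens` mapping is a hypothetical replacement for the database lookup.

```python
from typing import Callable, Iterable, Mapping, Optional

# Hypothetical plugin signature: (environ, headers) -> user name or None.
Identifier = Callable[[Mapping[str, str], Mapping[str, str]], Optional[str]]


def identify_user(
    environ: Mapping[str, str],
    headers: Mapping[str, str],
    plugin_identifiers: Iterable[Identifier],
    api_tokens: Mapping[str, str],
) -> Optional[str]:
    """Return the identified user name, or None for an anonymous request."""
    # 2.1: plugins get the first chance; the first one that identifies wins.
    for identify in plugin_identifiers:
        user = identify(environ, headers)
        if user:
            return user
    # 2.2.a: browser requests, where the middleware filled in REMOTE_USER.
    user = environ.get("REMOTE_USER")
    if user:
        return user
    # 2.2.b: API requests, identified by the Authorization header.
    token = headers.get("Authorization")
    if token:
        return api_tokens.get(token)
    return None
```

In the real flow the result ends up in g.user and g.userobj rather than being returned.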

CKAN >= 2.10 (Flask-Login)

Log in a user

  1. User visits /user/login (user.login endpoint)

  2. Plugins implementing IAuthenticator.login() are called; if a plugin returns a response, that response is returned

  3. If not, the default login form is rendered, pointing to this same endpoint (/login_generic)

  4. If it gets a POST request we call ckan.lib.authenticator.ckan_authenticator().

    4.1. Plugins implementing IAuthenticator.authenticate() are called; if a plugin returns a user object, it is returned

    4.2. If not, it falls back to ckan.lib.authenticator.default_authenticate(), which checks the user name and password against the database

  5. Back in user.login, if the authenticator returned a user object, we call Flask-Login's login_user(). This stores the identified user's id in the Flask session object (session["_user_id"]). If the server session data (by default stored in files in /tmp/{site_id}) is deleted, the user is logged out.
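
Step 4 above, the authenticator chain, can be sketched as follows. The function names and the in-memory `users` mapping are hypothetical stand-ins, not CKAN's implementation; in particular, a real user store keeps hashed passwords rather than plain text.

```python
import hmac
from typing import Callable, Iterable, Mapping, Optional

Identity = Mapping[str, str]  # assumed shape: {"login": ..., "password": ...}
Authenticator = Callable[[Identity], Optional[str]]


def check_password(identity: Identity, users: Mapping[str, str]) -> Optional[str]:
    """Toy default authenticator: compare against an in-memory user store."""
    login = identity.get("login", "")
    stored = users.get(login)
    if stored is None:
        return None
    supplied = identity.get("password", "")
    # Plain-text comparison is for illustration only.
    if hmac.compare_digest(stored.encode("utf-8"), supplied.encode("utf-8")):
        return login
    return None


def authenticate_chain(
    identity: Identity,
    plugin_authenticators: Iterable[Authenticator],
    users: Mapping[str, str],
) -> Optional[str]:
    """Ask each plugin in turn, then fall back to the default check."""
    for authenticate in plugin_authenticators:
        user = authenticate(identity)
        if user is not None:
            return user  # 4.1: the first plugin that returns a user wins
    return check_password(identity, users)  # 4.2: default fallback
```

A caller would pass the returned user on to login_user() only when the result is not None.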

Identify a user on a request

  1. Flask-Login reads the user identifier from the session (session["_user_id"]) and, if it is present, loads a user object using the login_manager.user_loader. Otherwise it tries to identify the user from the request (i.e. the Authorization header) and loads the user using the login_manager.request_loader. Regardless of the method, it sets the flask_login.current_user proxy to the user object.

  2. Flask calls ckan_before_request(), which in turn calls identify_user():

    2.1. Calls plugins implementing IAuthenticator.identify(). The process stops if:
      • a plugin returns a response, or
      • a plugin calls flask_login.login_user (or sets g.user)

    2.2. If not already set, we set g.user and g.userobj for backwards compatibility
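
The session-first, request-second lookup described in step 1 can be sketched without Flask at all. This is a simplified model, not Flask-Login's code: the `load_current_user` name and the two lookup mappings are hypothetical stand-ins for the user_loader and request_loader callbacks, and the remember-me cookie is left out.

```python
from typing import Mapping, Optional


def load_current_user(
    session: Mapping[str, str],
    headers: Mapping[str, str],
    users_by_id: Mapping[str, str],
    users_by_token: Mapping[str, str],
) -> Optional[str]:
    """Return the user name for this request, or None for an anonymous user."""
    # Session path: the equivalent of the user_loader callback.
    user_id = session.get("_user_id")
    if user_id is not None:
        user = users_by_id.get(user_id)
        if user is not None:
            return user
    # Request path: the equivalent of the request_loader callback.
    token = headers.get("Authorization")
    if token:
        return users_by_token.get(token)
    return None
```

Note that an empty session (for example, after the server-side session data is deleted) simply falls through to the request path, which is why browser users end up anonymous in that case.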

Cookies and Sessions

CKAN < 2.10

As mentioned in the previous section, the logged-in user identifier is stored in the auth_tkt cookie. We customize repoze's defaults so that this cookie is set with Secure=false, HttpOnly=true, SameSite=Lax by default, and so that these values can be changed through config options.

Additionally, Flask creates its own ckan session cookie, but by default nothing is stored in it, as CKAN uses Beaker as the session backend, with the local file backend enabled by default to store the session contents.

CKAN >= 2.10

With Flask-Login, the auth_tkt cookie no longer exists; the logged-in user identifier is stored in the Flask session. Because of the default Beaker backend, this session information is stored on the server. This means that if the session data is deleted, the user will be logged out.

Because Beaker's default backend stores sessions in local files (under /tmp), this can happen, for instance:

  • if the CKAN container is redeployed in a Docker / cloud setup
  • if the sessions are periodically cleaned by an external script

Here's a summary of the behaviour changes between CKAN versions:

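| Action                | CKAN < 2.10          | CKAN >= 2.10                                       |
|-----------------------|----------------------|----------------------------------------------------|
| Clear cookies         | User logged out      | User logged out (if remember_me cookie is deleted) |
| Clear server sessions | User still logged in | User logged out                                    |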
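
The two rows of this summary can be reproduced with a toy model. It is only a sketch under simplifying assumptions: the function names and session ids are hypothetical, signature checks are skipped, and the remember_me cookie is ignored.

```python
from typing import Mapping


def logged_in_before_2_10(cookies: Mapping[str, str]) -> bool:
    """CKAN < 2.10: only the auth_tkt cookie matters."""
    return "auth_tkt" in cookies


def logged_in_from_2_10(
    cookies: Mapping[str, str],
    server_sessions: Mapping[str, Mapping[str, str]],
) -> bool:
    """CKAN >= 2.10: the cookie only points at server-side session data."""
    session_id = cookies.get("ckan")
    session_data = server_sessions.get(session_id, {})
    return "_user_id" in session_data


# CKAN < 2.10: clearing server sessions changes nothing, clearing cookies logs out.
old_cookies = {"auth_tkt": "signed-value"}
print(logged_in_before_2_10(old_cookies), logged_in_before_2_10({}))

# CKAN >= 2.10: both the cookie and the server-side session are required.
new_cookies = {"ckan": "session-abc"}
sessions = {"session-abc": {"_user_id": "u1"}}
print(logged_in_from_2_10(new_cookies, sessions), logged_in_from_2_10(new_cookies, {}))
```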

The way to keep the old behaviour with the Beaker backend is to store the session data in the cookie itself (but this stores all session data, not just the user identifier):

# ckan.ini
beaker.session.type = cookie
beaker.session.validate_key = tT0ka0aOAWEPbvSOSog7c4ZFu

# These should probably be defaults anyway
beaker.session.httponly = True
beaker.session.secure = True
# or Strict
beaker.session.samesite = Lax

One thing to note is that by default Flask without the Beaker middleware would store the session information in the cookie.