External IDs - CDCgov/prime-simplereport GitHub Wiki

External IDs generally

Many of our entities have some identity outside of our system. Ideally, we could use these identities as durable IDs (e.g. student ID, staff ID, LOINC Code, CLIA ID). Sadly, it turns out that most of these are neither unique nor required, so we can only use them for user convenience, not for the data model.

Mostly this is OK, but in the specific case of entities that are tied to authorization, we need something more reliable.

The behavior of organization_external_id

Why do we even have this field? Orgs have a database ID already!

The point of the organization external ID is to tie the organization in the database to something in the outside world. The internal ID is generated (randomly, at present) when the organization record is created in the DB; an external ID can be passed in at creation time and does not have to be random. The practical benefits of this are:

  • It is possible to set up users and authorization in the IDP before actually setting up the organization, facilities, and patients in SimpleReport itself.
  • More importantly in real life, this makes it possible to write tests for the authorization system that are not completely bananas and impossible to read or maintain.
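The distinction between the two IDs can be sketched as follows. This is a minimal illustration in Python, not the project's actual code: `Organization` and `OrganizationRegistry` are hypothetical names, and the in-memory dictionary only stands in for the real organization table. The internal ID is generated when the record is created, while the external ID is supplied by the caller and is only checked for uniqueness.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Organization:
    name: str
    # Supplied by the caller, so it can be known before the record exists.
    external_id: str
    # Generated randomly at creation time; not predictable in advance.
    internal_id: uuid.UUID = field(default_factory=uuid.uuid4)


class OrganizationRegistry:
    """Hypothetical in-memory stand-in for the organization table."""

    def __init__(self) -> None:
        self._by_external_id = {}

    def create(self, name: str, external_id: str) -> Organization:
        # The only requirement on an external ID is that it is a unique string.
        if external_id in self._by_external_id:
            raise ValueError(f"external ID already in use: {external_id}")
        org = Organization(name=name, external_id=external_id)
        self._by_external_id[external_id] = org
        return org
```

Because the external ID is chosen up front, a test or a provisioning script can refer to it before the organization exists, which is not possible with the generated internal ID.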

The only actual requirement for an external ID in this context is that it be a unique string, so we could perfectly easily have it be a UUID string as well, just one we generate outside the API. The advantage of a purely random external ID is that you will almost certainly never have a collision, and there is no chance of the ID containing misleading information (for instance, an organization whose external ID resembles a different organization's name). The disadvantage is that it contains no information at all: if you are in the Okta console trying to quickly see who the site admins are for Via Elegante, you will have to go find the external ID in the API and then use that to search the group list, rather than just scrolling down until you see "PROD-TENANT:VIA-ELEGANTE-TUCSON:ADMIN" (or possibly "PROD-TENANT:viaelegante.org:ADMIN"? That might be simpler and more deterministic).
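To make the trade-off concrete, here is a rough sketch of how a group name could be derived from an external ID. The `PREFIX:EXTERNAL_ID:ROLE` layout is inferred only from the example names above; the function names and the rule that components may not contain a colon are assumptions for illustration, not the project's actual implementation.

```python
import uuid


def group_name(tenant_prefix: str, external_id: str, role: str) -> str:
    """Build a group name of the assumed form PREFIX:EXTERNAL_ID:ROLE."""
    for part in (tenant_prefix, external_id, role):
        # Assumption: ':' is the separator, so no component may contain it.
        if not part or ":" in part:
            raise ValueError(f"invalid group name component: {part!r}")
    return f"{tenant_prefix}:{external_id}:{role}"


def parse_group_name(name: str) -> tuple:
    """Split a group name back into (prefix, external_id, role)."""
    parts = name.split(":")
    if len(parts) != 3:
        raise ValueError(f"unexpected group name format: {name!r}")
    return (parts[0], parts[1], parts[2])


# A human-readable external ID yields a group you can spot by eye.
readable = group_name("PROD-TENANT", "VIA-ELEGANTE-TUCSON", "ADMIN")

# A random external ID is just as valid, but tells the reader nothing.
opaque = group_name("PROD-TENANT", str(uuid.uuid4()), "ADMIN")
```

Both names are equally usable by the authorization code; the difference only shows up when a person has to find the right group in a long list.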

So why do organizations have them and not facilities?

When it comes time to add facility-level authorization, some of the logic around human-readable IDs will be moot, because we will likely use a user profile attribute rather than groups to store the list of facilities. But if we want to be able to pre-provision authorization lists, a similar field on facilities will be needed, and if we want to easily debug problems with assignments, human-readable IDs would not be terrible there either.