5 6 2024 Tech Team Report - QualitativeDataRepository/TechnicalTeam GitHub Wiki

5-6-2024

Logged Tasks

Date  Task  Hours (Main)  Hours (EOLS)  Hours (PII)  Hours (QDAS)
29-Apr-2024 Reporting, meeting, add DV /sso endpoint to trigger passive login check, explore Drupal options for triggering call to /sso 5
30-Apr-2024 Add just_logged_in cookie in Drupal, try adding JS to check the cookie and call the /sso endpoint, try handling CORS issue w.r.t. calling /sso from a Drupal page, HEAL Workshop call, generate list of prod accounts with potential username collisions, coord w/HEAL re: data access API calls. 4
1-May-2024 Test no-cors mode, switch to only resetting flag to avoid CORS on OIDC redirects, fix CORS handling in Dataverse, add session to fetch call and allow creds in CORS, test. Fix nginx CORS re fonts - using * to avoid caching issues, fix android fav icon 404s from site.manifest file, add publisher code to DataCite metadata, coord w/HEAL re: getting file ids/metadata. 7
2-May-2024 Fix missing use stmt, investigate/fix null in redirect cookie code, investigate odd session sharing issue/~fix, continue writing/refactoring DataCite xml code 5
3-May-2024 Revert/redeploy w/o the regex change that caused an SSO loop, work through more DataCite fields. 6

Operations

  • Fixed a CORS/caching issue (on dev/stage so far) in nginx that caused failures loading fonts. Drupal hosts the fonts. If you went to Dataverse or Keycloak first, things were fine because we had added the required CORS headers for those hosts. However, if you hit Drupal first, the browser would cache the fonts along with the fact that the response had no CORS info (not needed and not sent when called from a Drupal page), and it would then fail to load them on later Dataverse/Keycloak pages. The fix adds CORS info for all three hosts and sets the allowed origin to '*' so that the cached response works across all three. ('*', i.e. all, is required because CORS doesn't allow specifying more than one host. CORS headers won't be sent if the request comes from anywhere else, but nominally if you go to a QDR page first, you'd now get CORS headers that would allow the fonts to be retrieved by other sites.)
  • Generated the list of prod accounts that will have conflicts with shortened usernames derived from email addresses. I have a list back from Sebastian of some accounts that can be deleted. The rest will need to be modified, e.g. by getting a '2' at the end of the name when we update.
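
As a rough illustration of the font fix described above, the sketch below models the header decision and the browser-side check in TypeScript. It is not the actual nginx configuration: the three origins are placeholders, and the function names `fontCorsHeaders` and `passesCorsCheck` are hypothetical. How the server identifies the requesting site (e.g. from the Origin or Referer header) is also an assumption.

```typescript
// Placeholder origins: the report does not list the real host names.
const QDR_ORIGINS = new Set<string>([
  "https://drupal.example.org",
  "https://dataverse.example.org",
  "https://keycloak.example.org",
]);

// Headers to attach to a font response. Requests from one of the three
// known sites get "*"; requests from anywhere else get no CORS header.
function fontCorsHeaders(requestOrigin: string | undefined): Record<string, string> {
  if (requestOrigin === undefined || !QDR_ORIGINS.has(requestOrigin)) {
    return {};
  }
  return { "Access-Control-Allow-Origin": "*" };
}

// Simplified browser check for a non-credentialed cross-origin load:
// the header must be "*" or exactly equal to the page's origin.
function passesCorsCheck(allowOrigin: string | undefined, pageOrigin: string): boolean {
  return allowOrigin === "*" || allowOrigin === pageOrigin;
}
```

The second function shows why a single echoed origin is not enough once the response is cached: a font cached with `https://drupal.example.org` as the allowed origin would fail the check on a Dataverse page, while a cached `*` passes on all three sites.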

SSO

  • Developed a fix for passive login not being triggered in Dataverse. Drupal now has a client-side script that calls Dataverse after login to reset Dataverse so that it tries passive login on the next call. The Drupal script is triggered by a cookie that is set on successful login and configured to be readable in JavaScript, and it has to send the Dataverse session cookie in its call. Dataverse has to respond with appropriate CORS headers and then remove the passive-login-checked attribute from the session so that the first time the user goes to a Dataverse page, they will be logged in. (FWIW - the one scenario where this can still fail is when the OIDC session has expired but the Drupal one hasn't. Right now, Keycloak has a ~30 minute idle time, so if a user logs in to Drupal, waits 30+ minutes and then goes to Dataverse, the passive login would fail. We could lengthen the idle time if we see this in practice, or add to the sync mechanism to handle this case.)
  • Fixed ~simple issues identified in testing, plus an SSO loop introduced by a bad regex change.
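
The client-side half of the passive login fix above could look roughly like the following TypeScript sketch. It is not the deployed Drupal script. The cookie name `just_logged_in` and the `/sso` endpoint come from the task log; the function names, the `FetchLike` type, and passing the cookie string and fetch function in as parameters (to keep the sketch testable outside a browser) are assumptions.

```typescript
// Minimal shape of the fetch call this sketch needs (a subset of the browser API).
type FetchLike = (
  url: string,
  init: { credentials: "include" }
) => Promise<{ ok: boolean }>;

// True if the named cookie appears in a document.cookie-style string.
function hasCookie(cookieString: string, name: string): boolean {
  return cookieString
    .split(";")
    .map((part) => part.trim())
    .some((part) => part === name || part.startsWith(name + "="));
}

// If Drupal just logged the user in, ask Dataverse to reset its
// "passive login already checked" flag. Returns true only when the
// call was made and Dataverse answered successfully.
async function resetPassiveLoginIfJustLoggedIn(
  cookieString: string,
  ssoUrl: string,
  fetchFn: FetchLike
): Promise<boolean> {
  if (!hasCookie(cookieString, "just_logged_in")) {
    return false; // no fresh login, nothing to reset
  }
  // credentials: "include" sends the Dataverse session cookie cross-origin.
  const response = await fetchFn(ssoUrl, { credentials: "include" });
  return response.ok;
}
```

In a browser this would presumably be called with `document.cookie`, the Dataverse `/sso` URL, and the global `fetch`. Because the request carries credentials, the Dataverse response has to name the Drupal origin explicitly in Access-Control-Allow-Origin and send Access-Control-Allow-Credentials: true; a '*' origin is not accepted by browsers for credentialed requests.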

Drupal

  • Updated dev/stage to Drupal 10.2.6, latest upgrade module

Dataverse

  • Fixed issue with android icon warnings in the browser console. The paths were incorrect in a manifest file.
  • Getting close to a first pass through the DataCite XML generation code. Keeping track of places where the current DataCite/QDR DataCite/OpenAire code/DataCite XML schema/examples differ and documenting what I've chosen so people can agree/request changes.
  • Investigated a weird ~bug where one user would see the warning message for a previous user (with a deactivated account). It appears to happen because we trigger the warning during a passive login attempt, which doesn't refresh the page, so the warning only appears on the next page refresh, which in this case was when the new user logged in. It does not appear to have security implications, and given the rarity of the use case and that it is JSF-specific and won't apply with the SPA, I decided not to pursue a fix at this time.

HEAL

  • Attended the first workshop call and forwarded the slides/paper draft
  • Coordinated w.r.t. using the Dataverse file access API calls, getting file ids/metadata

TKLabels

  • FYI: I was cc'd on an email w.r.t. helping LocalContext use the external vocabulary API for integration - no action yet.

AnnoRep

Discussion

  • FYI - I'll be out some/all of next Monday - may not be available for the 5 PM call.
  • NSF POSE - suggesting interested parties join the Sustainability WG to coordinate, esp. w/ international sites.
  • HEAL travel - have a flight and hotel for 6/6-6/7.
  • FYI: There are more security notices for Keycloak - should raise the priority of getting to 24.0.3+
  • (From last week) Any reason to keep old branches? (They take space on the Jenkins machine). Unless someone can think of a reason, I'll go ahead and delete anything that's been merged. Most of them are for PRs to IQSS since I've been making most QDR changes directly on the develop branch.
  • Any old accounts (e.g. ones with collisions) to delete from prod before the SSO update? Any preferred mechanism for resolving the collisions? Nominally for most I've added '2' at the end of the older account, but this pattern (or deleting) doesn't always work, e.g. there are three different info@ addresses that are truly different accounts and not cases where a person has changed institutions (and presumably won't be using their old account).
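
As one possible input to the collision question above, here is a small TypeScript sketch of a numbered-suffix scheme that also covers groups of three or more. It is only a proposal, not the script used to generate the prod list. It assumes the shortened username is the lowercased part of the email before the '@' and that accounts are passed in priority order, with the account that should keep the bare name first. The `Account` shape and function names are hypothetical.

```typescript
// Hypothetical account shape for illustration.
interface Account {
  id: number;
  email: string;
}

// Assumed derivation: lowercased local part of the email address.
function shortUsername(email: string): string {
  const at = email.indexOf("@");
  return (at === -1 ? email : email.slice(0, at)).toLowerCase();
}

// Assign a unique username per account. The first account with a given
// short name keeps it; later ones get the lowest free suffix 2, 3, ...
function assignUsernames(accounts: Account[]): Map<number, string> {
  const taken = new Set<string>();
  const assigned = new Map<number, string>();
  for (const account of accounts) {
    const base = shortUsername(account.email);
    let candidate = base;
    let suffix = 2;
    while (taken.has(candidate)) {
      candidate = base + String(suffix);
      suffix += 1;
    }
    taken.add(candidate);
    assigned.set(account.id, candidate);
  }
  return assigned;
}
```

With three distinct accounts at info@a.org, info@b.org and info@c.org, passed in that order, this yields `info`, `info2` and `info3`. Whether numbered names are acceptable for accounts like these is exactly the open question.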

Plans

  • Keycloak update to 24.0.2
  • Work on MFA w.r.t. authentication issue #43 (MFA, etc.)
  • Work on metadata issue #44 (more metadata to DataCite, etc.)
  • Fix Stata-14 ingest by allowing file inspection during direct upload or adjusting the Stata ingester.
  • Fix #113 if possible
  • Matomo - investigate event-level tracking via tag manager, remove non-working google scripts
  • AnnoRep - explore round-trip, configure auto-start and log rotation
  • Ops
    • check missing globalidcreationdates and fix via /modifyRegistration or alternative
  • Dataverse
    • Make PR for guestbook adding datasetversion fix
    • Popup info accessibility - IQSS likes the recommendations from the source I linked to, so this can be implemented along those lines.
  • QDAS Previewer
    • Updates per request
    • Investigate writing aux file/previewing lower-sensitivity version and/or other write options
  • TBD: FRDR Security