10 17 2022 Tech Team Report - QualitativeDataRepository/TechnicalTeam GitHub Wiki

10-17-2022

Logged Tasks

| Date | Task | Hours (Main) | Hours (EOLS) | Hours (PII) | Hours (QDAS) |
|---|---|---|---|---|---|
| 3-Oct-2022 | Report, meeting, coord re: guestbook changes | 1 | | | |
| 4-Oct-2022 | Investigate SSO issue, fix placeholder overruns via CSS, create #9022 | 2 | | | |
| 5-Oct-2022 | Final 5.12 merge, create 5.12-qdr branch, add CSS fixes/handle parent field, deploy to stage, update guestbook estimate PR | 2 | | | |
| 10-Oct-2022 | Drupal core, honeypot update on dev, SSO improvement, fix curate command, investigate Google archive fail, CKEditor 5 on dev | 7 | | | |
| 11-Oct-2022 | Investigate Firefox CSS fail, continue trying to find Google conflict | 5 | | | |
| 12-Oct-2022 | Revert changes from investigating, try avoiding problematic call, review/coord re: Google search report, find javaagent conflict, rebuild statsd-jvm-profiler with updated guava, deploy, test, coord with Seba | 6 | | | |
| 13-Oct-2022 | Clean up pom changes, deploy updated statsd* to stage, add commons-text sec fix and redeploy to dev/stage | 3 | | | |
| 14-Oct-2022 | Update dev/guestbook branches from dev | 1 | | | |

Dataverse

  • Merged all v5.12 updates; created the v5.12-qdr branch and deployed it to stage
  • Developed a CSS fix for metadata edit fields that overrun when placeholders are long. It does not currently work in Firefox, which has yet to finalize its support for the :has selector that the fix relies on.
  • Fixed the Curate/Update-current-version functionality in the case of a dataset using one of its files as a thumbnail
  • Discovered and eventually fixed the issue causing Google Archiving to fail. I first assumed that the underlying class version conflict had to come from changes to which libraries are included in v5.12, so I systematically started removing 5.12 updates. I also checked for the conflicting class in Payara itself (not included). Seeing no other possibility, I tried rewriting some of the archiving code to avoid the problematic method, without success. Ultimately, I wrote some debugging code to list where the offending class was physically being loaded from and discovered it was in a monitoring program in /opt, completely outside the Payara folder tree, buried in statsd-jvm-profiler.jar (which bundles all of its required dependencies rather than using separate jars). Once I found this, I was able to download the open-source code and update it to use the latest version of the common class, which removes the conflict because Dataverse also uses the latest version. I verified that Google archiving now works and tried to check that our monitoring (i.e., via Grafana) still works. Since I know little about the monitoring setup, I also asked Seba whether he could detect any problems (still awaiting his analysis).
  • In trying to identify the cause of the issue above, I made multiple changes to the Dataverse pom.xml file. Some of those had to be reverted but others simplify the dependencies. As part of the fix, I cleaned these changes up and have them in our dev and v5.12 branches.
  • Last Thursday, a potential security issue in the Apache commons-text library was announced. Although subsequent review suggests we are probably not vulnerable, I went ahead and updated QDR to the latest/fixed version prior to the fix being incorporated into the IQSS version.
  • Created a branch for guestbook-at-request development
  • Investigated the SSO issue that leaves us with page views where only one of Drupal/Dataverse is logged in. I discovered that some time ago we had changed the id of a login element from 'login' to 'qdr-login', which broke the SSO script's detection of when Drupal was logged in. I fixed this and deployed to dev/stage. I have not seen the issue since but, immediately after the fix was deployed, Michael reported seeing it again. Since the SSO script gets cached, that may have been due to an old/unfixed copy being used, but it is also possible that another problem remains.
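
The debugging step described in the Google Archiving item above (listing where the offending class is physically loaded from) could look roughly like the following. This is a minimal sketch under stated assumptions, not the code actually used: the names ClassOrigin and locate are hypothetical, the default class name is only a placeholder, and the real conflicting class would need to be supplied. It relies only on standard JDK calls (Class.getResource and ProtectionDomain.getCodeSource).

```java
import java.net.URL;
import java.security.CodeSource;

/** Hypothetical helper: reports where the JVM found the bytes for a class. */
final class ClassOrigin {

    /** Returns the URL of the class's .class resource, or "unknown" if it cannot be determined. */
    static String locate(Class<?> c) {
        // Resolve the .class file through the class's own loader.
        String resource = "/" + c.getName().replace('.', '/') + ".class";
        URL url = c.getResource(resource);
        if (url != null) {
            return url.toString();
        }
        // Fall back to the code source, which can be null (e.g. for bootstrap classes).
        CodeSource src = c.getProtectionDomain().getCodeSource();
        return (src != null && src.getLocation() != null)
                ? src.getLocation().toString()
                : "unknown";
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Placeholder default; pass the fully qualified name of the suspect class instead.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        Class<?> c = Class.forName(name);
        System.out.println(name + " loaded from " + locate(c)
                + " by loader " + c.getClassLoader());
    }
}
```

The result depends on the class loader in effect, so a call like ClassOrigin.locate(SuspectClass.class) would need to run inside the application (for example, logged from the failing code path) rather than from a standalone command line. A jar path outside the Payara tree, such as one under /opt, would then point to a copy supplied by something other than Dataverse.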

Drupal

  • Updated core (security fix, only exploitable by content creators) and honeypot modules; deployed to dev/stage
  • Turned on CKEditor 5 for some page types on develop. V5 will be the only option in Drupal 10, so switching will be required at some point. CKEditor 5 is configurable and has plugins, so if there are issues in using it for QDR purposes, I can look into options.

Discussion

  • CKEditor - v5 is on develop and can be tested
  • With community interest (ASU, JHU, others), I've updated the sort by tag/folder PR that was initially closed when Dataverse was planning other dataset page changes.
  • Sebastian has been active adding/updating IQSS issues related to metadata/DataCite, etc. It would be good to review which ones are ready for QDR development. I've tried to follow but am not sure I've caught everything.
  • Discovered that there's a reported issue related to datasetversion not being recorded in the guestbook (see https://github.com/IQSS/dataverse/issues/5864). I've had a background note to check into that, and this issue provides a repeatable use case to look into.

Plans

  • QDAS Previewer
    • Updates per request
    • Investigate writing aux file/previewing lower-sensitivity version and/or other write options
  • AnnoRep - continue to explore/fix docx/pdf github issues
    • Deploy updates to dev/stage/prod
  • Dataverse
    • Develop guestbook at request based on ADA's original work
    • Popup info accessibility - IQSS likes the recommendations from the source I linked to, so this can be implemented along those lines.
    • Still want to investigate the guestbook responses re version info not being included.
  • TBD: FRDR Security
  • Other tasks as discussed in strategic planning