8 03 2020 Tech Team Report - QualitativeDataRepository/TechnicalTeam GitHub Wiki

8-03-2020

Logged Tasks

Date | Task | Hours (Main) | Hours (EOLS) | Hours (PII)
27-Jul-2020 | Report, meeting, SMTP module update - dev/stage, investigate reporting apps source code/download from GitHub, develop query for unique downloads & per month, update /metrics to show 26 months (QDR DV lifetime) | 6 | |
28-Jul-2020 | Debug/fix prod memory issue, help anonymize dataset, debug 502 timeouts, explore adding TSVs/graphs to dataverse-metrics | 4 | |
29-Jul-2020 | Review/test DANS metrics | 3 | |
30-Jul-2020 | Adapt SP and DANS to create file metrics API, comment on IQSS#6766 | 4 | |
31-Jul-2020 | Rewrite metrics API to add per Dataverse, include MDC metrics, add unique downloads query, debug/test | 8 | |

Summary

My focus this week was on Dataverse metrics:

  • Updated dataverse-metrics app to show full 'QDR on Dataverse' lifetime (26 months)
  • Developed an SQL query to get 'unique download' metrics (the number of distinct users per dataset who made a download), and added a metrics API endpoint for it
  • Investigated how to add TSV files/graphs to dataverse-metrics (then tested with GDCC support, using a simple query to get Dataverse versions)
  • Reviewed/tested the additional metrics developed by DANS (source code linked from IQSS#6766 but not implemented in IQSS Dataverse yet). I documented these in our Google spreadsheet, starting at row 31. I also looked into how DANS managed per-Dataverse responses.
  • Added a file size and MIME type metric API based on the Scholars Portal query
  • Adapted the overall metrics API to allow per-Dataverse reporting, using DANS code and updating the metrics caching code and database table to manage per-Dataverse results
  • Added Make Data Count (MDC) metrics at a per-Dataverse level to the metrics API
  • Started debugging and testing the new API endpoints, queries, caching, etc.
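
To illustrate the 'unique download' idea from the list above, here is a minimal sketch in Python using an in-memory SQLite database. The `download_log` table and its columns are hypothetical stand-ins, not the Dataverse schema, and the query is not the one deployed at QDR; it only shows the general technique of counting distinct users per dataset, overall and per month.

```python
import sqlite3

# Hypothetical, simplified download log; the real Dataverse tables differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE download_log (
    dataset_id    INTEGER NOT NULL,
    user_id       INTEGER,          -- NULL for anonymous downloads
    downloaded_at TEXT NOT NULL     -- ISO 8601 timestamp
);
INSERT INTO download_log VALUES
    (1, 10,   '2020-06-03T09:00:00'),
    (1, 10,   '2020-06-03T09:05:00'),
    (1, 11,   '2020-07-15T12:00:00'),
    (2, 10,   '2020-07-20T08:30:00'),
    (2, NULL, '2020-07-21T10:00:00');
""")


def unique_downloads(conn):
    """Distinct users per dataset; repeat downloads by one user count once."""
    return conn.execute("""
        SELECT dataset_id, COUNT(DISTINCT user_id)
        FROM download_log
        GROUP BY dataset_id
        ORDER BY dataset_id
    """).fetchall()


def unique_downloads_by_month(conn):
    """Distinct users per dataset for each calendar month (YYYY-MM)."""
    return conn.execute("""
        SELECT substr(downloaded_at, 1, 7) AS month,
               dataset_id,
               COUNT(DISTINCT user_id)
        FROM download_log
        GROUP BY month, dataset_id
        ORDER BY month, dataset_id
    """).fetchall()


print(unique_downloads(conn))
print(unique_downloads_by_month(conn))
```

Note that `COUNT(DISTINCT user_id)` skips NULL values, so anonymous downloads are not counted in this sketch; whether and how to count them is a design decision the report does not specify.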

I also worked on a few tasks related to operations:

  • updated the Drupal 8 SMTP module and deployed to dev and stage
  • debugged a memory issue on prod (missed update of a Glassfish jar during our migration to prod-2)
  • investigated the 502 errors on dev/stage, confirming that they come from the load balancers and that they occur when the time-to-first-byte is > 10 seconds
  • removed an author name from the contributors to a data project to help prepare for anonymous review
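
The 502 investigation above can be reproduced with a small timing check. The following is a rough sketch, not the procedure actually used: the function names are hypothetical, and the 10-second limit is taken from the observation above rather than from any load balancer configuration.

```python
import time
import urllib.error
import urllib.request

# Threshold observed in the report; assumed here, not read from LB config.
LB_LIMIT_SECONDS = 10.0


def measure_ttfb(url, timeout=60.0):
    """Return (HTTP status, seconds until the response headers arrived)."""
    start = time.monotonic()
    try:
        # urlopen returns once the status line and headers are received,
        # which approximates the time-to-first-byte.
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        # Error responses such as 502 are raised as HTTPError.
        status = err.code
    return status, time.monotonic() - start


def consistent_with_lb_timeout(status, elapsed, limit=LB_LIMIT_SECONDS):
    """True when a 502 arrived only after the assumed limit had elapsed."""
    return status == 502 and elapsed >= limit
```

Calling `measure_ttfb` against a slow dev/stage endpoint and passing the result to `consistent_with_lb_timeout` gives a quick way to separate load balancer cutoffs from 502s that originate elsewhere.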

Plans

  • Sync stage from prod to help with testing
  • Complete Dataverse metrics: fix the API updates as needed and develop graphs/TSV export buttons for them (with per-Dataverse capabilities on the metrics page)

and possibly:

  • file DOI reservations
  • file replace in draft datasets
  • Drupal 9

For Discussion

  • How to sync content for stage: a read-only look at the prod S3 bucket, or copy some content to the stage S3 bucket?
  • FYI: Some potential time away over the next 2-3 weeks