7 18 2022 Tech Team Report - QualitativeDataRepository/TechnicalTeam GitHub Wiki
7-18-2022
Logged Tasks
Date | Task | Hours (Main) | Hours (EOLS) | Hours (PII) | Hours (QDAS) |
---|---|---|---|---|---|
11-Jul-2022 | Report, meeting, investigate Zip previewer re: access pattern/performance | 1 | 2 | | |
12-Jul-2022 | Continue comparing internal vs. JavaScript internal zip access, start debugging seek issue | 3 | | | |
13-Jul-2022 | Coordinate/fix v5.11-related Drupal outage on prod and title issue, deploy update to dev/stage as well, investigate low download counts w.r.t. db vacuum/analyze, start refactor/test of QDAS seekable channel, debug/propose fixes for missing checksum value (#8841), update metatags module | 4 | 1 | | |
14-Jul-2022 | PR #8844 to fix missing checksum, debug zip file error | 2 | 4 | | |
15-Jul-2022 | Find Apache bug | 2 | | | |
Operations
- Fixed delayed upgrade issue on prod (see Drupal item 2)
- Investigated the total download count, which is ~2500 too low; recommend a vacuum/analyze of the guestbookresponse table
Drupal
- Updated metatags module
- Updated to reflect changes in OAI_ORE export field names
Dataverse
- Investigated, reported (#8841), and fixed an issue with checksums missing from the file display; created PR #8844
QDAS
- Started comparing JavaScript and Java/Apache Commons Compress retrieval of files from within zip files
- Ran into an intermittent problem getting files via the QDAS previewer; discovered a simple bug in an Apache class, tested a fix, and am reporting it today
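
For context on the access pattern being compared, here is a minimal Python sketch of retrieving a single file from within a zip over a seekable source. It is not the QDAS previewer, the JavaScript code, or the Apache Commons Compress code; `CountingReader`, `build_archive`, and `read_one_entry` are hypothetical names used only for illustration. It shows that a reader with seek support needs only the central directory and the requested entry, not the whole archive.

```python
import io
import zipfile


class CountingReader(io.BytesIO):
    """In-memory seekable source that records how many bytes are read."""

    def __init__(self, data: bytes):
        super().__init__(data)
        self.bytes_read = 0

    def read(self, size=-1):
        chunk = super().read(size)
        self.bytes_read += len(chunk)
        return chunk


def build_archive() -> bytes:
    """Create a small test archive: one large entry and one small entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("big.bin", b"\0" * 1_000_000)
        zf.writestr("notes/readme.txt", "hello from inside the zip")
    return buf.getvalue()


def read_one_entry(archive: bytes, name: str):
    """Return (entry contents, bytes read from the source) for one entry."""
    source = CountingReader(archive)
    with zipfile.ZipFile(source) as zf:
        # zipfile seeks to the central directory at the end of the archive,
        # then to the requested entry; other entries are never read.
        data = zf.read(name)
    return data, source.bytes_read


if __name__ == "__main__":
    archive = build_archive()
    data, bytes_read = read_one_entry(archive, "notes/readme.txt")
    print(data.decode(), f"({bytes_read} of {len(archive)} bytes read)")
```

Because every seek has to land on the right offset, a bug in the seekable layer could plausibly cause failures like the ones noted above, though that link is a guess.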
Discussion
- The fast estimate of total downloads added in a recent Dataverse release (for when a direct count of table rows gets too slow) depends on accurate table statistics. A community report indicated an undercount of 1-2% with 150K downloads. I checked our table and found a ~3.5% undercount (~2500/69K). Testing on dev/stage (which had much smaller undercounts) suggests that running a manual analyze (or vacuum/analyze) on the guestbookresponse table makes the estimate much more accurate. Discussion with IQSS made me realize that the table is handled by autovacuum, but since the table is mostly write-once, autovacuum essentially never triggers. I'd suggest at least a manual vacuum/analyze, but want to make sure there's no issue with that first. Assuming this works as expected, we may want to monitor the estimate and/or run vacuum/analyze periodically.
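
The check described above can be sketched in Python as follows. This assumes the fast estimate is read from PostgreSQL's planner statistic `pg_class.reltuples`, which `VACUUM` and `ANALYZE` refresh; that is an assumption about how Dataverse computes it, not something confirmed here. The SQL strings are standard PostgreSQL, `undercount_pct` is a hypothetical helper, and the figures are the approximate ones from this report, taking 69K as the exact row count (also an assumption).

```python
# Standard PostgreSQL statements for comparing and refreshing the counts.
# Assumption: the fast estimate is based on pg_class.reltuples.
EXACT_COUNT_SQL = "SELECT count(*) FROM guestbookresponse;"
ESTIMATE_SQL = (
    "SELECT reltuples::bigint FROM pg_class "
    "WHERE relname = 'guestbookresponse';"
)
REFRESH_STATS_SQL = "VACUUM ANALYZE guestbookresponse;"


def undercount_pct(exact: int, estimate: int) -> float:
    """Percentage by which the estimate falls short of the exact row count."""
    if exact <= 0:
        raise ValueError("exact count must be positive")
    return 100.0 * (exact - estimate) / exact


if __name__ == "__main__":
    exact = 69_000             # approximate figure from the report (assumed exact count)
    estimate = exact - 2_500   # estimate that is ~2500 low
    print(f"undercount: {undercount_pct(exact, estimate):.1f}%")
```

Running the two `SELECT` statements before and after `REFRESH_STATS_SQL` and feeding the results to `undercount_pct` would give a simple way to monitor whether the estimate is drifting again.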
Plans
- AnnoRep - continue to explore/fix docx/pdf GitHub issues
- Deploy updates to dev/stage/prod
- Drupal
  - Make homepage resilient vs. Dataverse failures
- Dataverse
  - Popup info accessibility - IQSS likes the recommendations from the source I linked to, so this can be implemented along those lines.
- QDAS planning/design/prototyping
  - Explore switch to JavaScript retrieval of files within zip.
- Still want to investigate the guestbook responses re version info not being included.
- TBD: FRDR Security
- Other tasks as discussed in strategic planning