09.27.2021 TT Agenda

People

@nniicc @adam3smith @stahs @qqmyers

Discussion Topics

Admin

  • Do we know, or can we quickly find out, since when the email opt-in has been part of the login flow?

Seba & Jim Coordination

  • Software updates: MySQL - is a --force deploy still needed to resolve the issue reported in Slack? Ready for the PHP 8 deploy, with Postgres 9.6 -> 13, at any time.

Seba

  • Deploying from Docker image - will / can we merge the prototype branch?

Jim

Roadmap Discussion

QDR infrastructure is stable, and most of our roadmap development for the year has been completed. We should start prioritizing new roadmap items that we can achieve to maintain QDR's stability and improve curation processes.

  • QDAS project... In the narrative proposal we described the following: "...the tool’s eventual design will be shaped by the results of our research under this grant, a significant amount of development can begin in advance. For example, from previous research we know that an archiving tool must have the following basic functionalities:"

    • the tool will need to be able to read and understand REFI-QDA’s XML format,
    • unzip the contents of the .qdpx project files and decide how they are best stored in a repository
    • and build basic exploration functionalities based on .qdpx files. (A rough sketch of the first two functionalities follows this list.)
  • Allow links to datasets outside of the Harvesting context [still on hold, yes?]

  • Integrating curation tags and assignments into the GitHub workflow [maybe a collaboration with Michael?]

  • FRDR Secure data prototype - I'm still keen to do this but am waiting for a response from Alex's collaborator before next steps become clear
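As a rough illustration of the first two QDAS functionalities above (reading the REFI-QDA XML and unzipping the package), a minimal Python sketch. It assumes the common REFI-QDA layout in which a .qdpx file is a zip archive containing a project XML file (project.qde) plus a sources/ folder; those file names are assumptions for illustration, not details from the proposal.

```python
import zipfile
import xml.etree.ElementTree as ET

def inspect_qdpx(path):
    """List a QDPX package's contents and parse its project XML."""
    with zipfile.ZipFile(path) as qdpx:
        # A .qdpx is a zip; sources live alongside the project file.
        print("Package contents:", qdpx.namelist())
        # The project file carries the codebook, sources, and codings.
        with qdpx.open("project.qde") as project:
            root = ET.parse(project).getroot()
        print("Project root element:", root.tag)
    return root

# inspect_qdpx("example.qdpx")
```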

Notes

Can we tell when the opt-in email was sent?

Seba is deploying from the Docker image to understand how to operate it correctly.

  • If we can do this on Dev ... we'll eventually take the same approach on Prod
  • Vercel / Anno-Rep will be a simple sub-domain app - it won't need author / user validation through SSO (Nic was wrong about this :/ )

Where to deploy the Vercel app?

  • Medium term, we have to run this off a personal account because of cost
  • Long-term, we will have to do something different...
    • SK to look into Vercel's open-source option again

AWS lost a bunch of East Coast servers ...

  • The VPN went down as a result - this caused outage alerts, but we were back up relatively quickly

Long-term Planning...

In discussion of our QDAS project...

The initial idea is that a user comes to QDR with a QDPX package...

At deposit - an intermediate dialogue appears where the user can unzip the package and select what they want to upload (deposit)

We then

  • Repair the QDPX manifest to reflect any removed files (sketched below)
  • Rezip the package - so that anyone can download the codebook, files, etc. as a single bundle
  • Also archive the individual files ...
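A minimal sketch of the repair-and-rezip steps above. The manifest handling is hypothetical: the `path` attribute and the project.qde file name are stand-ins for whatever the REFI-QDA project XML actually uses, and would need to be checked against the schema.

```python
import zipfile
import xml.etree.ElementTree as ET

def repair_and_rezip(src_path, dst_path, removed):
    """Copy a QDPX package, dropping deselected files and pruning the
    manifest entries that point at them. The 'path' attribute is an
    assumption about the project XML, not a confirmed schema detail."""
    with zipfile.ZipFile(src_path) as src:
        root = ET.parse(src.open("project.qde")).getroot()
        # Collect manifest entries pointing at removed files, then prune.
        stale = [(parent, child) for parent in root.iter()
                 for child in parent if child.get("path") in removed]
        for parent, child in stale:
            parent.remove(child)
        # Write a fresh zip: updated manifest plus the surviving files.
        with zipfile.ZipFile(dst_path, "w") as dst:
            dst.writestr("project.qde", ET.tostring(root))
            for name in src.namelist():
                if name != "project.qde" and name not in removed:
                    dst.writestr(name, src.read(name))

# repair_and_rezip("deposit.qdpx", "archived.qdpx",
#                  removed={"sources/interview3.docx"})
```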

Eventually ...

We want to build a QDAS explorer tool ...

  • This would allow a user to do simple things like query by code name and see all portions of a transcript that have been coded with it (see the sketch below)...
  • Additional features will be fleshed out through the QDAS interviews / surveys
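A hypothetical sketch of the query-by-code-name idea. The Code/Coding element names and the guid/name/codeRef attributes are assumptions about the project XML, not verified against the REFI-QDA standard; namespaces are stripped for brevity.

```python
import xml.etree.ElementTree as ET

def _local(tag):
    """Strip any XML namespace, e.g. '{urn:...}Code' -> 'Code'."""
    return tag.rsplit("}", 1)[-1]

def passages_for_code(project_xml, code_name):
    """Return GUIDs of codings that use the named code.
    Element/attribute names are illustrative assumptions."""
    root = ET.parse(project_xml).getroot()
    # Find the GUID(s) of codes with the requested name ...
    guids = {el.get("guid") for el in root.iter()
             if _local(el.tag) == "Code" and el.get("name") == code_name}
    # ... then collect every coding that references one of them.
    return [el.get("guid") for el in root.iter()
            if _local(el.tag) == "Coding" and el.get("codeRef") in guids]
```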

Dataverse - what's the right design for the deposit process?

  • Would you do this (that is, unpack and select which files to deposit) with other zip files ...?

  • Or is this unique to QDAS ... [Probably not - but let's ask IQSS first]

  • Storage - let's duplicate: store both the individual files and a re-zip of the files that get deposited

    • Auxiliary Files in DV are being used for Differential Privacy (we should understand a little better how those might be of value at the deposit and storage phases of QDAS archiving)
  • QDPX is relatively simple - we would have to modify the manifest (which is an XML file) ... but we can validate that

  • Two things for the rezip process...

    • Validation of the XML is relatively trivial ...
    • The XML still validates if it points at external files that don't exist (e.g., Interview 3 is referenced in the XML and needs to be removed), so we need a separate check for dangling references (see the sketch below)
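A sketch of both checks, assuming lxml is available and a local copy of the REFI-QDA schema is at hand (the Project.xsd path is a placeholder); the `path` attribute used for the dangling-reference check is likewise an assumption about the project XML.

```python
import zipfile
from lxml import etree

def validate_package(qdpx_path, xsd_path="Project.xsd"):
    """Schema-validate the project XML, then check for dangling
    file references, which schema validation alone won't catch."""
    schema = etree.XMLSchema(etree.parse(xsd_path))
    with zipfile.ZipFile(qdpx_path) as qdpx:
        doc = etree.parse(qdpx.open("project.qde"))
        schema.assertValid(doc)  # raises DocumentInvalid on failure
        # A 'path' attribute pointing at a missing zip member is the
        # Interview-3 case from the notes; flag those separately.
        members = set(qdpx.namelist())
        return [el.get("path") for el in doc.getroot().iter()
                if el.get("path") and el.get("path") not in members]
```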

Next Steps on QDAS

  • Jim has a QDPX file as an example
  • We need design work at the deposit and storage stages before we can figure out the order of operations
  • SK to work on Jim's contract at SU for the IMLS work

Harvesting

  • Background research to figure out the right way to handle harvesting a set
    • Dataset IDs are what gets selected - so as long as we can generate / discover those, we should be able to do this...
    • May be fairly trivial ...
    • OAI-PMH feed - a dataset is created in the harvesting repository but points at the original repository ... (see the sketch below)
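For background, pulling the dataset identifiers in a set over OAI-PMH is a standard ListIdentifiers request. A minimal sketch; the endpoint URL and set name are placeholders, not QDR's actual configuration.

```python
import requests
import xml.etree.ElementTree as ET

OAI = "https://dataverse.example.edu/oai"  # placeholder endpoint
NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

def dataset_ids(set_spec):
    """Yield OAI identifiers for every record in the given set."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": "oai_dc",
              "set": set_spec}
    while True:
        root = ET.fromstring(requests.get(OAI, params=params).content)
        for header in root.iter("{http://www.openarchives.org/OAI/2.0/}header"):
            yield header.findtext("oai:identifier", namespaces=NS)
        # Follow the resumptionToken until the set is exhausted.
        token = root.findtext(".//oai:resumptionToken", namespaces=NS)
        if not token:
            break
        params = {"verb": "ListIdentifiers", "resumptionToken": token}

# for pid in dataset_ids("qdr-set"): print(pid)
```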

Next Steps on Harvesting

  • Look at what we already have from OAI-PMH harvesting of sets in Dataverse
  • Write out use cases that we can / can't solve with existing functionality

Michael and Github integrations

  • Once SK introduces the project, have a conversation about software architecture

Less fun stuff from SK

  • As issues in the Dataverse repository
  • Email registration and usernames - need to do more design work to understand where usernames appear - Jim is on this

Assigned Tasks / Decisions