DASH6 - nmdp-bioinformatics/dash GitHub Wiki
Data Standards Hackathon
DaSH 6: Heidelberg, pre-EFI
To focus the activity at this event, we have developed a list of "scrum-team" topics with tangible goals. We will form teams around some of these topics. Each team should plan to produce code collaboratively and to have a functional demo at the end of the event.
- Prepare the "feature service" for production use
  - Implement a versioning system for the feature service
  - Add authentication and security layers
  - Add curation capability linking sequences to:
    - the IPD-KIR and IMGT/HLA Database release version and accession number (a more accurate way of identifying an allele than its name)
    - the sequence submitter and their "de novo" identifiers for sequences they have submitted, associated with a given GFE notation (a set of locus/feature/rank/accession coordinates)
  - Populate the feature service with all versions of HLA.xml and KIR.dat
  - Improve documentation; make sure the documentation is sufficient to inform an API for analytics; expand on the BTOP-like pairwise difference annotation
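The curation model described above, where a feature is identified by locus/term/rank coordinates, each distinct sequence at those coordinates receives an accession number, and curation metadata (submitter, database release) hangs off that accession, can be sketched as a toy in-memory registry. All class and method names here are illustrative, not the feature service's actual API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FeatureKey:
    """Coordinates that identify a feature: locus, term, rank."""
    locus: str   # e.g. "HLA-A"
    term: str    # e.g. "exon"
    rank: int    # e.g. 2

class FeatureStore:
    """Toy accession registry: each distinct sequence observed at a given
    (locus, term, rank) gets the next integer accession for that key."""

    def __init__(self):
        self._accessions = {}  # (FeatureKey, sequence) -> accession number
        self._next = {}        # FeatureKey -> next unassigned accession
        self._curation = {}    # (FeatureKey, accession) -> curation metadata

    def register(self, key, sequence, submitter=None, db_release=None):
        k = (key, sequence)
        if k not in self._accessions:
            acc = self._next.get(key, 1)
            self._next[key] = acc + 1
            self._accessions[k] = acc
        acc = self._accessions[k]
        if submitter or db_release:
            # Link the accession back to who submitted it and which
            # IPD-KIR / IMGT/HLA release it came from.
            self._curation[(key, acc)] = {
                "submitter": submitter,
                "db_release": db_release,
            }
        return acc

store = FeatureStore()
key = FeatureKey("HLA-A", "exon", 2)
a1 = store.register(key, "GCTCCCACTCC",
                    submitter="lab-42", db_release="IMGT/HLA 3.31.0")
a2 = store.register(key, "GCTCCCACTCC")   # same sequence -> same accession
a3 = store.register(key, "GCTCCCACTGG")   # new sequence -> next accession
```

The point of the sketch is that accessions are stable per (locus, term, rank, sequence) tuple, so the curation layer can attach release versions and submitter identities without changing how GFE coordinates are assigned.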
- GFE service enhancements
  - Experiment with AWS configurations
  - Add a database cache
  - Volume test
  - Test the installation process
  - Add clients and tools
  - Rename the service (service-gfe-submission -> service-gfe)
  - Connect to the feature service with authentication
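One of the items above is a database cache. Since computing a GFE for a raw sequence involves annotation plus feature-service lookups, a cache keyed by a normalized hash of the input sequence avoids repeating that work. This is a minimal sketch with a stand-in for the real annotation call; none of these names come from the actual service:

```python
import hashlib

def seq_digest(sequence: str) -> str:
    """Normalize (uppercase, strip whitespace) and hash a sequence
    to produce a stable cache key."""
    return hashlib.sha256(sequence.upper().strip().encode()).hexdigest()

class GfeCache:
    """Toy cache in front of an expensive sequence -> GFE computation."""

    def __init__(self, compute):
        self._compute = compute  # callable: sequence -> GFE string
        self._store = {}         # digest -> GFE string
        self.hits = 0
        self.misses = 0

    def get(self, sequence):
        key = seq_digest(sequence)
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = self._compute(sequence)
        return self._store[key]

# Stand-in for the real annotation pipeline, used here for illustration only.
cache = GfeCache(lambda s: f"GFE({len(s)})")
g1 = cache.get("ACGT")
g2 = cache.get("acgt ")  # normalizes to the same key -> cache hit
```

Keying on a digest of the normalized sequence, rather than the raw input, means trivially different submissions (case, trailing whitespace) share one cache entry; a production version would back `_store` with the database rather than a dict.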
- GFE service validation
  - Validate alignments using "features" from the HLA.xml and KIR.dat files
  - Develop automated tests, possibly using Neo4j
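An automated check of the kind proposed above could assert that the annotated features, taken in order, tile the full allele sequence exactly as described by HLA.xml or KIR.dat. The following is a sketch of such a check, with hypothetical toy data rather than real reference features:

```python
def validate_features(full_sequence, features):
    """Check that ordered feature sequences tile the full sequence exactly.

    features: list of (term, rank, sequence) tuples in genomic order.
    Returns a list of problem descriptions (empty list means valid).
    """
    problems = []
    joined = "".join(seq for _, _, seq in features)
    if joined != full_sequence:
        problems.append("concatenated features != full sequence")
    pos = 0
    for term, rank, seq in features:
        # Each feature must occur at the position implied by its predecessors.
        if full_sequence[pos:pos + len(seq)] != seq:
            problems.append(f"{term} {rank} misplaced at offset {pos}")
        pos += len(seq)
    return problems

# Toy example: a short "allele" with UTR/exon/intron features.
features = [
    ("five_prime_UTR", 1, "AT"),
    ("exon", 1, "GCG"),
    ("intron", 1, "TT"),
    ("exon", 2, "CAA"),
]
issues = validate_features("ATGCGTTCAA", features)
```

Wrapped in a test framework, assertions like this would run per locus and per database release, which is where a graph store such as Neo4j could hold the expected feature coordinates.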
- Improve current capabilities of the annotation pipeline (within the GFE service)
  - Make it a Maven project; push the code to Maven Central
  - Allow the GFE service to use different versions of the annotation pipeline
  - Add the capability to run ABO
- Haplotype Frequency Curation Service (HFCu) development
  - Build the shell of the backend based on the population service and check the code into GitHub