Project Structure - AndersenLab/CAENDR GitHub Wiki

Overview

This document outlines the high-level structure of the CaeNDR repository, with notes on external data sources.

The codebase consists primarily of two types of components, both contained in the src directory. Together they make up almost all of the project's source code, so nearly all development work happens in this directory.

| Name | Path | Description | Code | Wiki |
|---|---|---|---|---|
| Modules | src/modules | Individual applications that may be run separately. Many different modules across the project. | link | link |
| Package | src/pkg/caendr | A library of shared code used by all modules. One single package shared across the project. | link | link |

In addition, there are a few other top-level folders that might be of interest:

  • env - Environment files for configuring variables, secrets, etc.
  • tf - Terraform configuration, for deployment. (Might be deprecated?)

Service Modules

Each service module ("module") has its own directory under src/modules, containing its source code, Dockerfiles, Makefiles, etc. Modules are individual applications that may be run separately, either constantly in the background or on demand.
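As a quick way to see this layout in a local checkout, the following is a minimal sketch that lists the folders under src/modules and reports whether each one has build files at its root. It assumes the files are named exactly `Dockerfile` and `Makefile`, which this page does not confirm; the function name `list_modules` is hypothetical and not part of the repository. Nested modules such as src/modules/api/pipeline-task will show up only as their parent folder (`api`).

```python
from pathlib import Path


def list_modules(repo_root="."):
    """Map each folder under src/modules to the build files found at its root.

    Assumption: build files are named exactly "Dockerfile" and "Makefile".
    """
    modules_dir = Path(repo_root) / "src" / "modules"
    report = {}
    if not modules_dir.is_dir():
        # Not run from a CaeNDR checkout (or the layout differs): nothing to report.
        return report
    for entry in sorted(modules_dir.iterdir()):
        if not entry.is_dir():
            continue  # skip stray files sitting next to the module folders
        report[entry.name] = {
            name: (entry / name).is_file() for name in ("Dockerfile", "Makefile")
        }
    return report


if __name__ == "__main__":
    for name, files in list_modules().items():
        missing = [f for f, present in files.items() if not present]
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{name}: {status}")
```

Run from the repository root, this prints one line per module folder. A "missing" entry is not necessarily a problem, since a module may keep its build files in a subfolder.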

Site Module(s)

These modules are used to build the user-facing CaeNDR website. NOTE that only site-v2 should be used!

| Name | Path | Description | Code | Wiki |
|---|---|---|---|---|
| Site v1 (DEPRECATED) | src/modules/site | The original codebase for the old "CeNDR" site (not a typo). | link | link |
| Site v2 (Current) | src/modules/site-v2 | The codebase for the current CaeNDR site, as of early 2025. | link | link |

Tool Modules

These modules are used to build & run the site tools, i.e. the lab-produced tool codebases that perform specific data operations.

| Name | Path | Description | Code | Wiki |
|---|---|---|---|---|
| Database Operations | src/modules/db_operations | The parent module that runs all tools. | link | link |
| Heritability Proxy | src/modules/heritability_proxy | Code to pull, build, & run the Heritability tool Docker image. | link | link |
| Indel Primer | src/modules/indel_primer | Source code for the Pairwise Indel Finder tool. | link | link |
| Nemascan Proxy | src/modules/nemascan-proxy | Code to pull, build, & run the NemaScan tool Docker image. | link | link |

Note that the Heritability and Nemascan tools are built & run entirely from separate repositories, whereas the Indel Finder tool is defined in this repository.

Background Modules

These are miscellaneous modules that run in the background of the site. They are typically specialized for a single task or set of tasks, and are run as needed.

| Name | Path | Description | Code | Wiki |
|---|---|---|---|---|
| API Pipeline Tasks | src/modules/api/pipeline-task | API that handles requests for jobs: scheduling, marking as complete, etc. | link | link |
| Image Thumbnail Generator | src/modules/img_thumb_gen | Automatically generates thumbnail-size images on strain picture file uploads. | link | link |
| Maintenance | src/modules/maintenance | Fallback "maintenance" page if site is down (?) | link | link |

Package

The CaeNDR Python package contains, among other things:

  • libraries for accessing cloud services
  • models defining CaeNDR internal data
  • utility functions that are commonly reused
  • service libraries for interacting with CaeNDR data

Each service module (described above) includes the CaeNDR package in addition to the dependencies listed in its own requirements.txt file.