IMPROVEMENT PLAN - AtlasOfLivingAustralia/documentation GitHub Wiki

Living Atlases Documentation Wiki - Improvement Plan

Created: 2026-04
Status: Active
Spec: See CONTRIBUTING.md


Context

This wiki is the primary technical documentation for the Living Atlases (LA) platform, used by GBIF nodes and ALA adopters worldwide. Over time it has accumulated:

  • Untracked draft files mixed in with published pages
  • Legacy content (especially biocache-store era) without deprecation notices
  • Binary files (videos, PDFs, exports) that don't belong in a wiki repo
  • Underrepresentation of la-toolkit as the primary install method

The goal of this plan is to clean up the repository, mark legacy content clearly, improve English quality, and make the documentation reflect the current recommended workflows.


Phase 0 - Spec and governance ✅

  • Create CONTRIBUTING.md with editing spec
  • Create drafts/ folder
  • Create old-probably-safe-to-delete/ folder
  • Create this IMPROVEMENT-PLAN.md

Phase 1 - Reorganise untracked files

Move untracked files into the appropriate folder without breaking any existing tracked wiki links.

Move to drafts/

| File | Reason |
|---|---|
| Ongoing-Maintenance.md | Empty draft |
| Troubleshooting-docker.md | Informal notes, useful in future |
| images-service-migration-to-1.0.md | Technical notes without wiki format |
| agents.md | Internal AI knowledge base, keep but not as a public page |
| 2025-03-Madrid-Workshow-documentation-intro.md | Workshop presentation draft |

Move to old-probably-safe-to-delete/

| File/Dir | Reason |
|---|---|
| SBDI-pres.md / SBDI-pres.html | External presentation, outdated |
| Presenting-the-La-Toolkit.html | Binary export |
| Presenting-the-La-Toolkit.pdf | Binary export |
| Presenting-the-La-Toolkit.pptx | Binary export |
| Presenting-the-La-Toolkit/ (dir) | Binary/export directory |
| conflu/ | Confluence exports, historical reference only |
| bin/ | Loose scripts, out of place |
| *.webm (all Kazam recordings, connectivity, etc.) | Large binary files |
| *.mp4 (upgrading.mp4) | Large binary file |
| creation.gif, creation.webm | Binary media |
| monitor.webm, monitor-small.webm | Binary media |
| pre-deploy.webm, deploy.webm | Binary media |
| templates.webm, templates-small.webm | Binary media |
| test.webm | Binary media |
| signal-desktop-keyring.gpg | System file, should never be here |
| Add-Search-Engines-to-Chrome.md2 | Duplicate with wrong extension |
| image-2022-06-01_16:01:07.png | Loose image, likely orphan |
| image-2022-06-01_16:23:30.png | Loose image, likely orphan |
| Untracked img/ files | Audit references before moving |

Note: Presenting-the-La-Toolkit.md (the .md file) stays in root because it is tracked in git. Link to the la-toolkit GitHub repo directly in Quick Start, not to this presentation.
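
The sorting described above could be sketched roughly as follows. This is a minimal illustration, not the script actually used: the suffix list is inferred from the tables above, the helper names are hypothetical, and every suggestion still needs manual review (for example, SBDI-pres.md is an untracked .md file that belongs in old-probably-safe-to-delete/, not drafts/).

```python
import subprocess
from pathlib import PurePosixPath

DRAFTS = "drafts"
OLD = "old-probably-safe-to-delete"

# Assumption: suffixes treated as binary media or exports, inferred from the tables above.
BINARY_SUFFIXES = {".webm", ".mp4", ".gif", ".pdf", ".pptx", ".html", ".gpg", ".md2"}


def untracked_files(repo_dir="."):
    """Return the paths git reports as untracked (ignored files excluded)."""
    out = subprocess.run(
        ["git", "ls-files", "--others", "--exclude-standard"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def suggest_target(path):
    """Suggest a destination folder, or None when a human should decide."""
    p = PurePosixPath(path)
    if p.parts[0] in {DRAFTS, OLD}:
        return None  # already sorted into one of the two folders
    suffix = p.suffix.lower()
    if suffix in BINARY_SUFFIXES:
        return OLD
    if suffix == ".md":
        return DRAFTS  # default guess only; outdated pages may belong in OLD
    return None  # images and anything else: audit references first
```

Tracked files never appear in the `git ls-files --others` output, so pages such as Presenting-the-La-Toolkit.md are left alone by construction.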

Status: ✅ Done (2026-04)


Phase 2 - English and typo corrections (tracked files only)

Safety rule: Body text only. Never touch anchor names, wiki links, code blocks, or example URLs.

Process:

  1. Audit script lists candidates per file
  2. Manual review of each suggestion
  3. One commit per file

Known issues already identified:

  • Github-good-pratices.md → body text: "pratices" → "practices" (filename stays)
  • LA-Quick-Start-Guide.md → various minor issues to audit
  • General: inconsistent capitalisation in headings across multiple files
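
An audit script honouring the safety rule might look something like the sketch below. It is an assumption about how such a script could work, not the one used for this phase: the typo dictionary is a one-entry placeholder, and the masking covers fenced code, inline code, Markdown link targets, wiki links, and bare URLs, but not HTML anchor names, which would need an extra pattern.

```python
import re

# Hypothetical misspelling list; a real audit would use a fuller dictionary.
KNOWN_TYPOS = {"pratices": "practices"}

FENCE = re.compile(r"^\s*(```|~~~)")
# Inline code, Markdown link targets, wiki links, and bare URLs are masked out.
PROTECTED = re.compile(r"`[^`]*`|\]\([^)]*\)|\[\[[^\]]*\]\]|https?://\S+")


def typo_candidates(markdown_text):
    """Yield (line_number, wrong_word, suggestion) for body text only."""
    in_fence = False
    for number, line in enumerate(markdown_text.splitlines(), start=1):
        if FENCE.match(line):
            in_fence = not in_fence  # entering or leaving a fenced code block
            continue
        if in_fence:
            continue
        visible = PROTECTED.sub(" ", line)
        for word in re.findall(r"[A-Za-z]+", visible):
            fix = KNOWN_TYPOS.get(word.lower())
            if fix:
                yield number, word, fix
```

The function only lists candidates; each one still goes through manual review before the per-file commit.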

Status: ✅ Done (2026-04) - 11 files corrected


Phase 3 - Mark legacy / deprecated content

Add blockquote legacy notices at the top of deprecated sections. Do not remove content.

Template:

> ⚠️ **Legacy:** This section describes the legacy `biocache-store` backend.
> For new installations, use [pipelines](Pipelines-process-overview) instead.

Files needing legacy notices:

| File | What to mark |
|---|---|
| Jenkins-For-Biocache-Store-And-Other-LA-Tasks.md | Entire page (biocache-store era) |
| Data-ingestion.md | biocache-store sections |
| Data-management.md | biocache-store sections |
| Sample-And-Index.md | Verify: biocache-store or pipelines? |
| Infrastructure-Requirements.md | biocache-store aka biocache-cli mention |
| LA-Quick-Start-Guide.md | Remove ala-demo recommendation; replace with la-toolkit |
| building-from-source.md | Audit for outdated build instructions. File is empty, no action needed |
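
For pages that are legacy in their entirety, prepending the template could be automated along these lines. This is only a sketch under stated assumptions: the function name is hypothetical, it handles whole-page notices only (section-level notices still need manual placement), and it treats any existing `**Legacy:**` marker as proof that the page is already marked.

```python
LEGACY_NOTICE = (
    "> ⚠️ **Legacy:** This section describes the legacy `biocache-store` backend.\n"
    "> For new installations, use [pipelines](Pipelines-process-overview) instead.\n"
)


def add_legacy_notice(page_text, notice=LEGACY_NOTICE):
    """Prepend the notice unless the page already carries a Legacy marker."""
    if "**Legacy:**" in page_text:
        return page_text  # idempotent: never add a second notice
    # The existing content is kept verbatim below the notice (deprecate, don't delete).
    return notice + "\n" + page_text
```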

Status: ✅ Done (2026-04)


Phase 4 - la-toolkit prominence in Quick Start

Current state: LA-Quick-Start-Guide.md Install section mentions:

  1. ala-install ansible (as primary)
  2. generator-living-atlas (as helper)
  3. ala-demo playbook (obsolete, to be removed)

Target state:

  1. la-toolkit: recommended primary method, linked to https://github.com/living-atlases/la-toolkit
  2. ala-install + generator-living-atlas: described as the underlying mechanism la-toolkit automates (for advanced/manual use)
  3. ala-demo: removed from recommendations

Key constraint: Do not change the filename or any anchor URLs. Edit body text only.

Status: ✅ Done (2026-04)


Phase 5 - KB-assisted audit

Use the living-atlas-kb MCP (collection la_toolkit_kb, ~470k documents from ALA/LA source repos) to verify documentation accuracy against real source code, detect dead links, find coverage gaps, and mark remaining legacy content. Each edit cites KB evidence; one commit per file.

Methodology, for each target page: enumerate verifiable claims → query KB → record finding in drafts/kb-audit-findings.md (not committed) → manual review → edit body text only → commit.
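
The "record finding" step could be kept consistent with a small helper like the one below. The record fields and the Markdown layout are assumptions made for illustration; the plan itself only fixes the file path drafts/kb-audit-findings.md. The KB query is deliberately left out, since its interface is not described here.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    page: str      # wiki page the claim comes from
    claim: str     # the verifiable statement being checked
    evidence: str  # what the KB query returned, summarised by the reviewer
    verdict: str   # e.g. "confirmed", "outdated", "no evidence"

    def to_markdown(self):
        """Render the finding as one nested bullet for the findings file."""
        return (
            f"- **{self.page}**: {self.claim}\n"
            f"  - Evidence: {self.evidence}\n"
            f"  - Verdict: {self.verdict}\n"
        )


def append_finding(finding, path="drafts/kb-audit-findings.md"):
    """Append one finding to the local, uncommitted findings file."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(finding.to_markdown())
```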

⚠️ CAS note: CAS is still in use. Do not mark CAS as legacy. OIDC is documented as a newer alternative, not a mandatory replacement.

5.1 - Installation & deployment

Pages: LA-Quick-Start-Guide.md, Infrastructure-Requirements.md, Before-Start-Your-LA-Installation.md, Requirements.md, LA-Deployment-Types.md, Common-challenges-in-LA-new-deployments.md, Installation-of-ala-demo.md, Setup-a-LA-demo-using-microstack.md

Focus: supported OS versions, Ansible versions, la-toolkit role, Vagrant deprecation status.

Status: ✅ Done - Ubuntu version updated, Vagrant deprecated, image-service path fixed, Ansible version gap filled.

5.2 - Data ingestion: pipelines vs biocache-store

Pages: Data-ingestion.md, Data-management.md, Data-mappings.md, Sample-And-Index.md, Pipelines-process-overview.md, pipelines-extra-steps.md, Spark-hadoop-commands-for-pipelines.md, Name-indexer.md, Jenkins-for-Pipelines.md, Jenkins-For-Biocache-Store-And-Other-LA-Tasks.md

Focus: verify Phase 3 legacy notices are correct; find remaining biocache-store mentions without notice.

Status: ✅ Done - Legacy notices confirmed on Data-ingestion.md, Data-management.md, Sample-And-Index.md, Jenkins page; added to Data-mappings.md (STEP 5) and Name-indexer.md (Vagrant tutorial + Reindexing section).

5.3 - Auth & security (⚠️ CAS still in use)

Pages: ALA-AUTH-with-CAS-2.md, CAS-postinstall-steps.md, OIDC.md, API-Keys.md, Basic-Auth-in-your-LA-node-without-CAS.md, Secure-your-LA-infrastructure.md, User-Roles-and-Services.md, CAS row in CONTRIBUTING.md

Focus: CAS and OIDC as parallel valid paths; verify config params against source; fix CONTRIBUTING.md CAS row.

Status: ✅ Done - CONTRIBUTING.md CAS row corrected (CAS still active, OIDC is alternative not replacement); CAS pages verified accurate; OIDC.md framing confirmed correct; User-Roles-and-Services.md GitHub link verified live.

5.4 - Troubleshooting & i18n

Pages: Troubleshooting.md, Troubleshooting-biocache‐service.md, Known-issues-in-LA-Internationalization.md, Introduction.md (i18n), Translate.md, Know-issues.md

Focus: accuracy of error/flow descriptions, log paths, commands.

Status: ✅ Done - Legacy notice added to Troubleshooting.md (biocache-store debug subsection); BuildConfig.groovy legacy notices added to ALA-Internationalization-(i18n).md and Introduction.md; Troubleshooting-biocache-service.md and Known-issues pages needed no changes.

5.5 - Dead link sweep (cross-cutting)

Check all tracked pages for broken GitHub and external links. Use KB for GitHub paths, WebFetch for external URLs.

Known candidates: biocache-store/blob/... links, tinyurl.com/la-bootstrap-sessions, generator.l-a.site, crowdin.com/project/ala-i18n.

Status: ✅ Done - tinyurl, generator.l-a.site, crowdin, biocache-store blob, ala-cas-5 commit all LIVE; Spark 0.9.1 link (Infrastructure-Requirements.md) was dead (HTTP 404) → fixed to /docs/latest/.
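
The external-URL half of the sweep could be approximated with the standard library, as in the sketch below. It is not the tooling used for this phase (that was the KB plus WebFetch); the function names and the User-Agent string are made up for illustration. Some servers reject HEAD requests, so any non-200 result would still need a manual check before a link is declared dead.

```python
import re
import urllib.error
import urllib.request

URL_PATTERN = re.compile(r"https?://[^\s)>\]\"']+")


def extract_urls(markdown_text):
    """Return unique external URLs in order of first appearance."""
    seen = []
    for url in URL_PATTERN.findall(markdown_text):
        url = url.rstrip(".,;:")  # drop sentence punctuation glued to the URL
        if url not in seen:
            seen.append(url)
    return seen


def link_status(url, timeout=10):
    """Return the HTTP status code, or None when the host is unreachable."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-sweep-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # e.g. 404 for a dead page
    except OSError:
        return None  # DNS failure, refused connection, or timeout
```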

5.6 - Gap analysis

Topics present in source code but absent/thin in docs (e.g. step-by-step la-toolkit guide, full OIDC setup). Drafts only: no new committed pages unless filling a clear 404 or gap.

Status: ✅ Done - Ansible version requirement (2.17.3 core / 10.3.0 community) added to Before-Start-Your-LA-Installation.md; Java/Grails version gap noted but skipped (insufficient direct KB evidence).

5.7 - Apply & close

Apply reviewed edits, update sub-phase statuses here, add KB methodology note to CONTRIBUTING.md.

Status: ✅ Done


Files to audit for dead links / 404s

  • Any github.com/AtlasOfLivingAustralia/biocache-store/blob/... links - repo may have changed
  • https://tinyurl.com/la-bootstrap-sessions - external URL, verify alive
  • https://generator.l-a.site/ - verify alive
  • https://crowdin.com/project/ala-i18n/ - verify alive

Out of scope (explicit)

  • No new content pages unless filling a clear 404 or gap
  • No renaming of tracked files (breaks URLs)
  • No removal of existing content (deprecate, don't delete)
  • No binary files added to wiki root