1 GitHub in gov - Openscapes/2021-noaa-nmfs GitHub Wiki
GitHub has been used by government agencies and staff for many years; some agencies are heavy users (GSA) and others are newcomers. Recently, though, there has been a bigger shift toward using GitHub not just for code and project tracking but also as an application or website delivery platform. How agencies, organizations, teams, and individuals organize their GitHub projects is evolving. Currently (Sept 2021), the following is roughly how things are organized in NOAA.
Terms
Product = software (for example, an R or Python package), a dataset or database, a report with all its pieces (data, code, text), a report template, a collection of scripts for some task, or an application like an API or mobile app.
Repository = GitHub repository that may or may not be a product. A repository is like a folder on your computer. It is an organizational component but not necessarily, and actually often not, a product. Along with the repository come important management tools at the repo level: issue tracking, project boards, releases, a landing page, and automation of tasks.
Organization = A collection of related repositories (products or not) AND the organization-level project management tools associated with those (project boards, team discussions, landing page). Anyone can create a GH organization. Think of it like a collection of folders on your computer. Individuals or teams can use these, e.g. Eli has 3 individual GH orgs for different projects, 3 team orgs for team projects, and is a member of a few branded GH organizations.
Branded organization = An organization that is associated with a particular official organization, like the Pacific Marine Environmental Laboratory. An organization might be at the agency level, like US Geological Survey, but most of the action is at a lower organizational level, with the higher level either mirroring or linking to a lower-level organization, or acting as the platform for an official release while development happens lower down. This isn't always the case; for example, U.S. General Services Administration has hundreds of repos and contributors.
Individual GH account = An individual's GH account. You can have personal repositories that are not in a GH organization. For many people, their personal repos are a mix of sandbox/junk stuff, stuff they've copied, and real work (stuff they are not planning to delete). For example, I (Eli) have 43 individual repos in my personal account, but they are all sandboxy things. Anything that is project related is in a GH organization. Note: keep personal and work GH accounts separate. For example, for me anything related to math, fisheries science, or climate science is within my work realm and is on my work GH, but I also toy with sports statistics and that is in a personal GH account.
General structure
- Official public products
  - A product used widely or for critical analyses. These go through a structured review and testing process of some sort; how formal that process is varies with the agency and the product.
  - A minor product that is public but not for critical analyses and might just be for a small, specific task. These tend to go through testing by the individual who prepares the repo.
- Unofficial public repos
  - GH used to track code and projects for an individual or team. Most GH repos that you see on NOAA branded GH organizations fall into this category. These repos are not finished products, but the material is open source.
  - Personal repositories that are work-related, perhaps only tangentially. For example, the repos you make during Openscapes fall into this category.
- Private repos
  - Use of cloud storage for sensitive or confidential data is rapidly evolving within the federal system, so who knows how this will look in 2-5 years. But right now, do not put sensitive or confidential data or information into github.com, even in private repos. For NWFSC folks, we have an internal GH server for that purpose.
  - Many teams or individuals use private repos for development work or for material not meant for the public.
GitHub in NOAA & NMFS
Be aware that this structure is rapidly evolving. Currently NOAA doesn't have a single organization site for official products. NOAA is a branded GH org that has links to other NOAA-affiliated GitHub orgs. The list of NMFS-affiliated GH orgs is not even remotely complete.
- NOAA Fisheries Integrated Toolbox is a cross-center group that has been working for a number of years to provide a branded GH organization for NMFS public GH products and resources, training, and tools for creating products. The resources are in the left nav bar on that page, so scroll down.
- Fish and Fisheries Tools is an example of a branded set of GH-hosted NMFS tools. Its GitHub org includes a resource folder for common elements and organization.
- Within NWFSC, branded GH orgs have the name nwfsc-xyz: CB, CB Math Biology, CB Math Bio time series, CB OA Lab, FRAM, FRAM Assessment
- NEFSC: EDAB
- PIFSC: PIFSC
- AFSC: AFSC assessments, NMML
- SEFSC: SEFSC
- SWFSC: Major open source code and data producer (e.g. many CRAN packages), but I can't find any GH organizations. I did find many individual GH accounts.
GitHub in other agencies
- US Geological Survey has a major GitHub presence: USGS, USGS-R
- NASA does too
- U.S. General Services Administration has close to 900 repos on its org.
- Cybersecurity and Infrastructure Security Agency
- Environmental Protection Agency
- Department of Labor
- US Forest Service
- The US Department of Education's College Scorecard is all open source and on GitHub. Note that this was an early open-source-in-government project, and it is not on the Department of Education's own GitHub organization (which is minimal) but on the contractor's GitHub organization, RTI International.
- USDA
- National Park Service
- See also this list by GitHub: GitHub in Government. Scroll down to the bottom for the US Research Labs.
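Repo counts like the ones quoted in this list drift over time. As a rough way to check a current number, here is a minimal sketch that reads the `public_repos` field from the public GitHub REST API endpoint `GET /orgs/{org}`. The helper names (`org_url`, `public_repo_count`, `fetch_org`) and the `User-Agent` string are made up for illustration, and the org login `GSA` used in the note below is an assumption about that agency's GitHub handle, not something stated on this page.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def org_url(org_login: str) -> str:
    """Build the REST endpoint for one organization's public profile."""
    return f"{API_ROOT}/orgs/{org_login}"


def public_repo_count(org_payload: dict) -> int:
    """Pull the public repo count out of a decoded /orgs/{org} response."""
    return int(org_payload["public_repos"])


def fetch_org(org_login: str) -> dict:
    """Fetch an organization's public profile.

    Unauthenticated requests are rate limited by GitHub, so this is only
    suitable for occasional checks.
    """
    request = urllib.request.Request(
        org_url(org_login),
        headers={
            "Accept": "application/vnd.github+json",
            # Hypothetical client name; GitHub expects some User-Agent.
            "User-Agent": "gov-github-survey-sketch",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

With network access, `public_repo_count(fetch_org("GSA"))` would return whatever count GitHub reports at that moment. Private repos are not included in `public_repos`, so this only reflects an org's public footprint.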