[DevDoc] Git and GitHub Best Practices - Rothamsted/knetminer GitHub Wiki

TODO: many things to be added

Normal SCM workflow

There are various workflow models to manage git updates and branching (ref 1, ref 2).

In a small team like KnetMiner, usually something as simple as the following works.

  • For external collaborators and newcomers, it's best that they fork our repositories and, when ready, send pull requests back. The aligning and merge practices described below should be applied in this case too.

  • Closer collaborators normally commit/push on the main branch of the repo they work with. The idea is if you have some changes that can be sent to the main branch reasonably quickly (a few days) and they won't break everything (eg, won't change common interfaces radically), then it's reasonable to send that to the main branch, once you have done a local merge and tried to verify it works (see below for details).

  • When pushing to the main branch, a good practice is to ensure, as far as possible, that the code you commit meets a minimum level of functionality and quality.

    • For instance: Maven, NPM or a similar build system can complete a build, including passing the automated tests (do write tests, and develop by mixing main code and tests).
    • This implies that you try the same build locally (on your own PC), before a push.
  • In many repos, we have continuous integration workflows that automatically try to build the most recent version of the repo code. This is where we can spot build problems. After your own pushes, pay attention to possible failure notifications coming from a CI workflow.

  • Of course, this is not absolute: a CI build might fail despite your diligence when pushing changes. After all, that is what it's there for. The point is to minimise how often it happens.

  • When you start making some changes, including when you start a new branch (see below), do a 'git pull' (or fetch/inspect/pull), that is, start making changes from an up-to-date version of the code base.

  • Similarly, before pushing a round of changes you wrote, do a git pull again and, if necessary, check how things were merged and verify the result (build, test), before pushing.

    • When in doubt (ie, you see commits on the main branch via github/web), you can do a 'git fetch' first, compare your changes to the remote changes that the fetch operation downloaded into a separate branch, and then merge yours with the latter. 'git pull' is actually the combination of a fetch and then a merge into the current branch. The fetch operation aligns a separate remote-tracking branch (eg, origin/main, stored as refs/remotes/origin/main), which keeps track of the corresponding one in github. See the git/github documentation for details.
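The fetch, inspect, merge sequence described above can be tried safely in a throwaway sandbox. The following is only a sketch under stated assumptions: the directory names (seed, remote.git, mine, colleague), the file names and the 'dev' identity are all made up, and a local bare repository stands in for github.

```shell
set -eu
sandbox="$(mktemp -d)"   # throwaway area, nothing here touches a real repo
cd "$sandbox"
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.org
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.org

# Build a tiny history, then a bare copy that plays the role of github
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "v1" > seed/app.txt
git -C seed add app.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed remote.git
git clone -q remote.git mine        # your clone
git clone -q remote.git colleague   # someone else's clone

# A colleague pushes a change to main
echo "colleague" > colleague/other.txt
git -C colleague add other.txt
git -C colleague commit -q -m "Add other.txt"
git -C colleague push -q origin main

# Meanwhile, you commit locally
cd mine
echo "mine" > mine.txt
git add mine.txt
git commit -q -m "Add mine.txt"

# Fetch only updates origin/main, your branch and files are untouched
git fetch -q origin
incoming="$(git log --oneline main..origin/main)"   # commits you don't have yet
echo "$incoming"
git diff --stat main...origin/main                  # what those commits change

# Once you are happy with what you saw, merge
git merge -q --no-edit origin/main
```

After the merge, build and test as usual before pushing.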

When to open branches

  • Open a branch when you have to make more significant changes. For instance:
    • a big bugfix, which you expect to need days (a week or more) and affects large portions of the code
    • a new feature that is likely to affect how the rest of the code works significantly (eg, changing much-used interfaces or many interfaces)
    • experimenting with radical new changes or features that aren't immediately planned as part of the main code
  • These kinds of branches are usually named feature branches

Please, talk to involved developers before pushing a new branch on a repository. We should avoid the proliferation of many branches and tags, as well as confusing, untidy and unclear branch/tag names.

When to merge branches from/to main

  • All of what follows aims at minimising the pain of putting together diverged changes

  • If you're working on your branch, try to merge from main/master often, that is, try not to let your feature branch diverge too much from the main line of development: having to re-align 1000 changes made over a month is usually much more difficult than re-aligning every 2-3 days

    • There might be exceptions, eg, you're making such radical changes, that such frequent alignment isn't easy to do
  • Do such alignment before merging back to main and verify it (build, test, manual tests of UIs, etc). Possibly, at the time you want to merge things back to main, ask other developers to stop pushing for a moment (eg, a few hours)

Aligning from main to your branch (and vice versa)

I do this by using an IDE or equivalent (eg, Eclipse, IntelliJ). An IDE allows you to inspect differences between your (say, current) branch and another branch like 'main'. It also allows you to selectively bring differences from the other branch (say, main) into your current one, at different levels (single rows of a file, a whole file, an entire directory). So, you can have a high degree of control over what you merge and how you change the working copy of your current branch.

You can do that before the final merge of your own branch into main, in three steps:

  1. Commit the changes you have made to re-align your local branch to the main one. At this point, your branch should be fully aligned with main (ie, it contains all of main's changes, plus your own)
  2. Build, test, test and test
  3. Do what I call a 'formal merge', that is: do a 'git merge' while working in your branch, by using the 'ours' mode, ie, telling git that, in case of differences/conflicts, the valid version is the one in the current (your) branch (see below for details on how to do that with git). In practice, you're keeping your branch as you have modified it in the previous steps (eg, manual merge using the IDE) and the merge operation doesn't actually make any changes (to your branch), but it marks the git history in a useful way: by telling you did an alignment (this is clearly visible in the log graph)

Merging back to the main branch

When your feature/bugfix/whatever is ready, ie, after you have done the steps above, you can merge it back to the main branch. This can be done this way:

  1. Switch to the main branch
  2. Ensure no further commits were pushed to the main repo
  3. Do a merge from your to-be-merged branch into main (the current branch), and do it in 'theirs' mode (see git commands below). In other words, previously you have aligned your branch to main, now you're bringing those changes and updates back to main, so the valid branch is yours
  4. Build, test, test and test
  5. Push

Naming branches

  • Please, name new branches like this: YYYYMMDD-<specific-meaning>.
  • Use the year-to-day format, so that branches are easily sorted by git and other tools
  • Omit the day or even the month, depending on how long you expect to work on the branch.
  • After the date prefix, use a semantically meaningful and specific name, so that everyone (including yourself) knows what it's about, even weeks, months or years later

Good Examples:

  • 20220512-lucene-boost-experiment: probably I'll work on this for one or a couple of days at most
  • 202308-spring-migration: maybe it will require a month or longer, day omitted
  • 2021-java-9-migration: will require months and won't be done again in 2021, month and day omitted
  • 20230731-table-sorting-fix-issue-27: branch related to a specific issue

Bad Examples:

  • ui-gene-selector: prefix them with date markers
  • 202308-changes: which changes? It's too generic, not clear at all what it means
  • 202308-ui-changes: better, but still too generic
  • 20220418-my-tmp-code-rev: if it's temp and yours, then don't push it to github.
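The format part of this convention can be checked mechanically. Below is a minimal sketch: is_valid_branch_name is a hypothetical helper (not part of git or of our scripts), and restricting the name to lowercase letters, digits and dashes is an assumption drawn from the examples above. It can only check the shape of the name: a too-generic name like 202308-changes still passes, so judging the meaning remains up to you.

```shell
# Hypothetical helper: succeeds if the name looks like YYYY[MM[DD]]-<specific-meaning>
is_valid_branch_name() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9]{4}([0-9]{2}){0,2}-[a-z0-9]+(-[a-z0-9]+)*$'
}

for name in 20220512-lucene-boost-experiment 202308-spring-migration \
            2021-java-9-migration ui-gene-selector; do
  if is_valid_branch_name "$name"; then
    echo "ok:  $name"
  else
    echo "bad: $name"   # eg, the date prefix is missing
  fi
done
```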

"Closing" branches

Please, do the following 'closing operations' when you think the work on a given branch is concluded, at least for the time being (as you'll see, a branch can be re-opened if needed).

In older versioning systems like SVN, it was possible to close a branch: ie, the branch was marked as 'closed' and no commit could be added to it until it was possibly re-opened.

This is useful to have a record of the repo evolution, to check what changes were introduced by eg, a feature branch, to keep the list of current branches short and possibly, to re-open an old branch.

Since git doesn't allow for closing branches, this has to be emulated, as shown here:

# Create a tag in the archive/ namespace, referring to the same commit as the branch to close
# The branch name is reused as the tag name; always use the same archive/ prefix
git tag "archive/2023-example-branch" "2023-example-branch"

# Delete the branch
git branch --delete 2023-example-branch

# Propagate everything from local to upstream (ie, github)
git push origin --delete "2023-example-branch"
git push --tags

This is easy to do with this script.

The same approach (and the linked script) can be used to archive tags too and clean up the tag list.

Please, ensure you've merged the branch before closure (with a 'formal' commit, see TODO).

Please, do not leave too many branches open for a long time. Also, please do not simply delete branches when they aren't needed anymore: keeping a historical record using the approach above is preferable and can be useful in future.
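Re-opening an archived branch is not shown above. One possible way, sketched here in a throwaway local repository, is to recreate the branch from its archive/ tag. The file names and the 'dev' identity are made up, and there is no remote in this sandbox, so the push steps are left out.

```shell
set -eu
sandbox="$(mktemp -d)"   # throwaway area, nothing here touches a real repo
cd "$sandbox"
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.org
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.org
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/main
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# A feature branch with one commit, merged back into main
git checkout -q -b 2023-example-branch
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"
git checkout -q main
git merge -q --no-ff --no-edit 2023-example-branch

# "Close" the branch: tag it under archive/, then delete it
git tag "archive/2023-example-branch" "2023-example-branch"
git branch -q --delete 2023-example-branch

# Later: "re-open" it by recreating the branch from the archive tag
git checkout -q -b 2023-example-branch "archive/2023-example-branch"
git log --oneline -1
```

The recreated branch points at the same commit the original branch had when it was closed.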

Useful git commands

Merge with "hard ours"

git merge --no-ff -s ours main

Your local branch wins on all the differences, the commit history results in a merge, but nothing from main was actually taken.

As explained above, this can be useful to mark an alignment that you made manually from main onto your local branch.

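This is different from -X ours: -s ours doesn't look at main at all and takes our version only, whereas -X ours still merges the non-conflicting changes from main and prefers our version only where there are conflicts.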
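The effect of a 'hard ours' merge can be verified in a throwaway repository. This is just a sketch with made-up file names and a made-up 'dev' identity. Note that the manual alignment step is deliberately skipped here, to show that the merge itself really takes nothing from main.

```shell
set -eu
sandbox="$(mktemp -d)"   # throwaway area, nothing here touches a real repo
cd "$sandbox"
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.org
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.org
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/main
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Diverge: one commit on the branch, one on main
git checkout -q -b 2023-example-branch
echo "branch version" > app.txt
git commit -q -am "Change app.txt on the branch"
git checkout -q main
echo "main only" > main-only.txt
git add main-only.txt
git commit -q -m "Add main-only.txt on main"

# The 'formal merge', done while on the branch
git checkout -q 2023-example-branch
before="$(git rev-parse 'HEAD^{tree}')"
git merge -q --no-ff --no-edit -s ours main
after="$(git rev-parse 'HEAD^{tree}')"

# The branch content is unchanged: main-only.txt was NOT brought in...
[ "$before" = "$after" ] && echo "branch content unchanged"
# ...yet the history now records main as merged
git merge-base --is-ancestor main HEAD && echo "main is recorded as merged"
```

This is why the manual alignment has to come first: anything from main that you did not bring in by hand is silently left out.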

Merging with "hard theirs" and resetting a branch with a previous version

This takes the remote branch and replaces the current one completely (be careful). This works the same for either the case when you're bringing the incoming branch into the current one, or when you want to completely roll back your current branch to a previous version/commit (that commit has the role of the incoming branch).

The option git merge -s recursive -Xtheirs isn't enough for this, because it only prefers the incoming version where there are conflicts, while non-conflicting changes from the current branch are kept. You need the following instead (Ref):

git checkout main # assuming you want to do it on this
git reset --hard dev-branch # dev-branch is brought into main; use a commit hash instead to roll back to that commit

At this point, you might also need to merge the remote main without undoing the above (eg, when you're rolling back changes that were already pushed to remote/upstream):

git merge -s ours origin/main

Now, git push, and remember you'll lose the (remote) commits that changed the branch.

As explained above, this is useful when merging back to main, once the incoming branch has been aligned manually.
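The rollback variant of this procedure can be rehearsed in a sandbox before trying it on a real repository. What follows is a sketch under assumptions: a local bare repository (remote.git) stands in for github, and all names are made up. It shows that, after the reset and the 'ours' merge, a plain push works without --force.

```shell
set -eu
sandbox="$(mktemp -d)"   # throwaway area, nothing here touches a real repo
cd "$sandbox"
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.org
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.org
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "v1" > seed/app.txt
git -C seed add app.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed remote.git   # plays the role of github
git clone -q remote.git work
cd work

# A bad change gets committed and pushed
echo "bad" > app.txt
git commit -q -am "Bad change"
git push -q origin main
bad="$(git rev-parse HEAD)"
good="$(git rev-parse HEAD~1)"

# Roll main back to the good commit, then record the remote history as merged
git reset -q --hard "$good"
git merge -q --no-edit -s ours origin/main
git push -q origin main   # a regular push, no --force needed
cat app.txt               # the content is the one from the good commit
```

The bad commit stays in the history, but its changes are no longer in the branch content.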

Creating a branch from local changes

git switch -c <new-branch>

Your uncommitted working tree modifications are carried over to the new branch, where you can then commit them.
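Here is a small sandbox sketch of this (made-up file, branch and identity names). It shows that the pending change follows you to the new branch and only becomes a commit when you commit it there, while main stays as it was.

```shell
set -eu
sandbox="$(mktemp -d)"   # throwaway area, nothing here touches a real repo
cd "$sandbox"
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.org
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.org
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/main
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# You start editing on main, then realise it deserves its own branch
echo "work in progress" >> app.txt
git switch -q -c 2023-example-branch

# The change is still pending (modified, not committed)
pending="$(git status --short)"
echo "$pending"

# Commit it on the new branch; main is left untouched
git commit -q -am "Work started on main, committed on the new branch"
git log --oneline main..HEAD
```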

Downloading a branch from remote

git checkout -b experimental origin/experimental

The remote branch is copied onto the local 'experimental' branch. Remember that a git clone contains both the 'experimental' branch (where you will work locally) and the origin/experimental branch (a mirror of the remote/github one).

Source

This might not work when the branch is recent. In that case, try this first:

git remote update
git fetch

Alternatively:

git fetch
git checkout <branch>

Changing the remote pointer

Your local git repo clone knows where the main repo is by means of the 'remote' commands:

git remote set-url origin https://github.com/user/repo2.git
git remote -v # shows the results

When you do a fork, there is also an 'upstream' pointer, see documentation about forks.

Tags

Ref

  • List tags: git tag
  • Add a tag: git tag -a 'v1.3.1' -m 'Version 1.3.1'
  • Push all, tags included (not done by default): git push --tags
  • Delete a tag on the remote repository
    • First delete it from the local repository: git tag --delete xxx
    • then issue: git push --force origin :refs/tags/xxx
  • Checkout a given tag: git checkout tags/<tag_name>

Deleting a tag or branch locally, after remote deletion

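Do this to remove the local remote-tracking references (eg, origin/xxx) of branches that were deleted remotely. Local branches of the same name are not touched (delete them with git branch --delete xxx), and tags deleted remotely are only removed if you add --prune-tags: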

git fetch --prune
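What pruning does and does not remove can be checked in a sandbox. The sketch below assumes a local bare repository (remote.git) standing in for github, and simulates the remote deletion by deleting the branch directly in it. All names are made up.

```shell
set -eu
sandbox="$(mktemp -d)"   # throwaway area, nothing here touches a real repo
cd "$sandbox"
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.org
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.org
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "v1" > seed/app.txt
git -C seed add app.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed remote.git   # plays the role of github
git clone -q remote.git mine
cd mine

# Publish a branch, then go back to main
git checkout -q -b 2023-example-branch
git push -q origin 2023-example-branch
git checkout -q main

# Someone deletes the branch on the remote
git -C ../remote.git branch -q -D 2023-example-branch

git fetch -q --prune origin
tracking="$(git branch --remotes --list 'origin/2023-example-branch')"
local_copy="$(git branch --list '2023-example-branch')"
echo "remote-tracking: '${tracking}'"   # empty: the reference was pruned
echo "local:           '${local_copy}'" # still there

# The local branch has to be removed explicitly
git branch -q --delete 2023-example-branch
```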

Amending a Commit

This is to change a commit message (eg, in case of typos):

git commit --amend

Then, edit the old message. If the commit has already been pushed, you then need git push --force [branch]
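The reason a forced push is needed can be seen in a sandbox: amending replaces the commit with a new one that has a different hash. This is a sketch with made-up names, and it passes the new message with -m instead of opening an editor, so that it can run unattended.

```shell
set -eu
sandbox="$(mktemp -d)"   # throwaway area, nothing here touches a real repo
cd "$sandbox"
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.org
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.org
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/main
echo "v1" > app.txt
git add app.txt

# A commit with typos in its message
git commit -q -m "Intial comit"
old="$(git rev-parse HEAD)"

# Fix the message; -m avoids the interactive editor
git commit -q --amend -m "Initial commit"
new="$(git rev-parse HEAD)"

git log -1 --format=%s
[ "$old" != "$new" ] && echo "the commit hash changed"
```

Since the remote still has the old commit, a regular push of the amended one is rejected as a non-fast-forward update.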
