Dividing existing repository history in submodules - sauter-hq/git-guidelines GitHub Wiki

Projects often begin with a non-modular structure, and over their lifetime modules get developed and decoupled from the root project. Another common case is an imported SVN repository that keeps different modules in different folders, which you want to make separately cloneable.

There are different ways to divide a repository into multiple repositories, and Git supports several mechanisms for this.

The most standard way is submodules, which is also the solution supported by any Git client. To divide a repository into several repositories, you can use the following script:

#!/bin/bash
#
# generate-repo-from-subdir.sh source-repo subdirectories target-repo
#

if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
        echo "Usage : "
        echo -e "\t generate-repo-from-subdir.sh <source-repo> <subdirectories> <target-repo>"
        echo -e "\t \t source-repo \t \t Local path to the repository containing the subdirectories and their history."
        echo -e "\t \t subdirectories \t \t Subdirectories to extract as a repository. For several, put them in \"\" separated by spaces; this is useful when a directory was renamed at some point in the history."
        echo -e "\t \t target-repo \t \t Local path to the repository which should match the subdirectories' history and state."
        exit 1
fi

source_repository="$1"
subdirectories="$2"
target_repo="$3"

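# Remember where we started so that relative paths still resolve after each cd.
start_dir="$(pwd)"

git clone --no-hardlinks "${source_repository}" "${target_repo}"

cd "${target_repo}" || exit 1
# Rewrite every commit so that only the listed subdirectories remain; commits left empty are pruned.
# ${subdirectories} is expanded inside the filter string, so each path becomes a separate pathspec.
git filter-branch --index-filter "git rm --cached -qr --ignore-unmatch -- . && git reset -q \$GIT_COMMIT -- ${subdirectories}" --prune-empty -- --all

# Detach the new repository from the source, then repack and prune.
git remote rm origin
git gc --aggressive
git prune

cd "${start_dir}" && cd "${source_repository}" || exit 1
# ${subdirectories} is intentionally unquoted so that each path is passed as its own argument.
git rm -r -- ${subdirectories}
# Note: git submodule add accepts a single path, so this step assumes one subdirectory.
git submodule add "../${target_repo}.git" ${subdirectories}
git commit -m "ADMIN: Replace ${subdirectories} by ../$(basename "${target_repo}").git submodule"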

For example, to extract a framework-baf-bac.git repository from visionModules, we would execute the following command:

./generate-repo-from-subdir.sh visionModules "bam/bac/ src/com/sauter_controls/vision/bac/ test/com/sauter_controls/vision/bac/" framework-baf-bac

As a result, the visionModules repository is cloned as framework-baf-bac, and in this clone all commits that do not affect the folders bam/bac/, src/com/sauter_controls/vision/bac/ and test/com/sauter_controls/vision/bac/ are removed from the history. When the script finishes, the new repository contains only the file changes made by commits in these directories.
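The effect of the index filter can be tried out safely on a throwaway repository before touching a real one. The following is only a minimal sketch: the directory names keep/ and drop/, the repository names source and extracted, and the variables commit_count and touched_paths are hypothetical and chosen for illustration. FILTER_BRANCH_SQUELCH_WARNING only suppresses the start-up warning and delay that recent Git versions add to filter-branch.

```shell
#!/bin/bash
# Hypothetical demo: build a tiny repository with two directories, keep/ and drop/,
# then extract keep/ with the same index filter as the script above.
set -e
demo_dir="$(mktemp -d)"
cd "${demo_dir}"

git init -q source
cd source
git config user.email "demo@example.com"
git config user.name "Demo"
mkdir keep drop
echo a > keep/a.txt && git add . && git commit -q -m "add keep/a.txt"
echo b > drop/b.txt && git add . && git commit -q -m "add drop/b.txt"
echo c >> keep/a.txt && git add . && git commit -q -m "change keep/a.txt"
cd ..

# Clone, then rewrite the clone so that only keep/ survives in every commit.
git clone -q --no-hardlinks source extracted
cd extracted
export FILTER_BRANCH_SQUELCH_WARNING=1
git filter-branch --index-filter \
  "git rm --cached -qr --ignore-unmatch -- . && git reset -q \$GIT_COMMIT -- keep/" \
  --prune-empty -- --all >/dev/null 2>&1

# Number of commits left on the current branch, and every path those commits touch.
commit_count="$(git rev-list --count HEAD)"
touched_paths="$(git log --name-only --format= | grep . | sort -u)"
echo "commits: ${commit_count}"
echo "paths: ${touched_paths}"
```

In this demo the commit that only touched drop/ becomes empty after filtering, so --prune-empty removes it and the remaining history mentions only files under keep/.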

If a directory was renamed, run the script with all the names the folder of interest had over its history. If a name was reused for something else, run the extraction on commit ranges instead; if that is too complex, working with grafts is the best way.
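One possible way to find candidate names is to list every top-level directory that ever held a file in the history. This is only a sketch and not part of the original script: the repository name renamed, the directory names old_name and new_name, and the variable historical_dirs are hypothetical. It only reports top-level directories, and deciding which of them are earlier names of the folder of interest still has to be done by hand.

```shell
#!/bin/bash
# Hypothetical demo: a repository in which old_name/ was renamed to new_name/.
set -e
demo_dir="$(mktemp -d)"
cd "${demo_dir}"

git init -q renamed
cd renamed
git config user.email "demo@example.com"
git config user.name "Demo"
mkdir old_name
echo a > old_name/a.txt && git add . && git commit -q -m "add old_name/a.txt"
git mv old_name new_name && git commit -q -m "rename old_name to new_name"

# List every path touched in any commit on any ref, keep the first path
# component of paths that live in a directory, and remove duplicates.
# --no-renames reports a rename as a deletion plus an addition, so both names appear.
historical_dirs="$(git log --all --no-renames --name-only --format= | grep / | cut -d/ -f1 | sort -u)"
echo "${historical_dirs}"
```

The resulting names could then be passed together, quoted and separated by spaces, as the subdirectories argument of the script.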