Git statistics - AD-EYE/AD-EYE_Core GitHub Wiki
Using Hercules
Hercules (https://github.com/src-d/hercules#installation) is a tool that extracts statistics about a git repository. The results can be plotted with the labours tool.
Analysing and plotting directly
./hercules --granularity=1 --burndown --languages="python" /home/adeye/adeye_temp/AD-EYE_Core | labours -m burndown-project
The language can be replaced by c++ or by matlab.
Analysing and saving the results
./hercules --granularity=1 --burndown --languages="python" --pb /home/adeye/adeye_temp/AD-EYE_Core > analysis_results.pb
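To analyse several languages in one go, the command above can be wrapped in a loop. The following is a minimal sketch, not part of the original page: the function name analyse_languages, the HERCULES variable and the analysis_<language>.pb file names are hypothetical, while the hercules flags are the ones used above.

```shell
# Hypothetical helper: run the burndown analysis once per language and
# save each result to analysis_<language>.pb in the current directory.
# HERCULES may point to the hercules binary (defaults to ./hercules).
analyse_languages() {
    repo=$1
    shift
    for lang in "$@"; do
        "${HERCULES:-./hercules}" --granularity=1 --burndown \
            --languages="$lang" --pb "$repo" > "analysis_${lang}.pb" || return 1
    done
}

# Example call (commented out so that sourcing this snippet runs nothing):
# analyse_languages /home/adeye/adeye_temp/AD-EYE_Core python c++ matlab
```

Each resulting file can then be plotted or combined as described in the following sections.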
Combining saved results and saving the combination as pb
./hercules combine results1.pb results2.pb results3.pb > results123.pb
Combining saved results and plotting the combination
./hercules combine results1.pb results2.pb results3.pb | labours -m burndown-project
Plotting results saved in pb format
./hercules combine results.pb | labours -m burndown-project
Skipping certain folders/files
./hercules --burndown --first-parent --pb --skip-blacklist --blacklisted-prefixes="prefix to skip" /repo_folder | labours -f pb -m burndown-project
Plotting with better time resolution (the analysis results must be piped into the command)
labours -m burndown-project --resample=month
labours -m burndown-project --resample=raw #shows commit granularity
Cleaning the repositories from noise
Some commits added a lot of lines of code that were not newly written: they were either duplicated or came from external projects that were added to the repository. To remove this noise, the history must be rewritten to a clean state.
Note: if the files to be removed still exist in HEAD, they must be removed in a new commit before the history can be rewritten.
Do not push the cleaned repository
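One way to reduce the risk of an accidental push is to do the rewriting in a separate clone whose remote has been removed. This is a minimal sketch, assuming such a workflow suits the repository; the function name make_scratch_clone and the destination path in the example are hypothetical.

```shell
# Hypothetical helper: clone a repository into a scratch directory and
# remove the "origin" remote so that a plain "git push" has no target.
make_scratch_clone() {
    src=$1
    dest=$2
    git clone --quiet "$src" "$dest" || return 1
    git -C "$dest" remote remove origin
}

# Example call:
# make_scratch_clone /home/adeye/adeye_temp/AD-EYE_Core /tmp/AD-EYE_Core_clean
```

Note that only the checked-out branch remains as a local branch in the scratch clone, since the remote-tracking branches disappear together with the remote.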
Removing files and folders (matches on name only, regardless of path)
BFG is a tool that rewrites history to remove files or folders based on their name.
java -jar bfg-1.14.0.jar --delete-files file_name
java -jar bfg-1.14.0.jar --delete-folders folder_name
Removing specific files (specifying the path)
The git filter-branch command can remove specific files or folders from history, given their path (rm -rf is used so that folders are removed too).
git filter-branch -f --tree-filter 'rm -rf path_to_file_or_folder' HEAD
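After a rewrite it can be useful to check that the path is really gone. The following is a minimal sketch; the function name path_in_history is hypothetical. It only looks at commits reachable from HEAD, because git filter-branch keeps a backup of the old history under refs/original.

```shell
# Hypothetical helper: succeed if any commit reachable from HEAD touches
# the given path, fail if the path no longer appears in that history.
path_in_history() {
    [ -n "$(git rev-list HEAD -- "$1")" ]
}

# Example:
# path_in_history path_to_file_or_folder && echo "still present in history"
```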
AD-EYE_Core
Folders to remove:
mjpeg_server
web_video_server
robot_gui_bridge
GUI_server
experiments
Data
Prescan_models
Files to remove:
SSMPset_2018-1-3--11-58-48.csv
KTH_3D_KTH3d_20191008.org.dae
TemplatePexFile.pex
Pex_Data_Extraction
Folders to remove:
Tests (https://gits-15.sys.kth.se/AD-EYE/Pex_Data_Extraction/commit/c8ca858ca8686b058223983b38b4b56cbecf7eed)
__pycache__ (https://gits-15.sys.kth.se/AD-EYE/Pex_Data_Extraction/commit/6465b42d8716809f129c3ef6ce89f3ef1eb74f32)
csv
Files to remove (these files were duplicated from the pex2csv folder; example command: git filter-branch -f --tree-filter 'rm -f preproc.py' HEAD):
main.py
path.py
parse.py
preproc.py
road.py
staticalobject.py
utils.py
vmap.py
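Since the tree filter is an ordinary shell command, all the files listed above can be removed in one pass instead of one rewrite per file. The following is a minimal sketch, assuming the file names contain no spaces or shell metacharacters; the function name purge_files is hypothetical.

```shell
# Hypothetical helper: remove the given paths (relative to the repository
# root) from every commit reachable from HEAD in a single filter-branch run.
# Assumes names without spaces or shell metacharacters.
purge_files() {
    git filter-branch -f --tree-filter "rm -f $*" HEAD
}

# Example with the duplicated files listed above:
# purge_files main.py path.py parse.py preproc.py road.py \
#     staticalobject.py utils.py vmap.py
```

A single pass matters mostly for long histories, because the tree filter checks out every commit once per run.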
Finding what should be removed
Plotting with labours using the resampling option can help to find out when a noisy commit happened.
The following script lists the blobs in the history sorted by size, with the biggest last (source).
git rev-list --objects --all |
git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
sed -n 's/^blob //p' |
sort --numeric-sort --key=2 |
cut -c 1-12,41- |
$(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest
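Once the script has revealed a suspiciously large file, the commits that added it can be looked up by path. The following is a minimal sketch; the function name commits_adding and the path in the example are hypothetical.

```shell
# Hypothetical helper: list the commits (on any ref) that added the given
# path, one per line, as "<short hash> <date> <subject>".
commits_adding() {
    git log --all --diff-filter=A --date=short --format='%h %ad %s' -- "$1"
}

# Example (hypothetical path):
# commits_adding path/to/large_file
```

The dates it prints can be compared with the jumps visible in the resampled labours plot.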
Using gitinspector
Install with sudo apt-get install gitinspector
gitinspector -l -r -m -T -f=",js,c,cpp,h,hpp,py,m" --format=htmlembedded > gitinspector_page.htm
Using gitstats
Install with sudo apt-get install gitstats
gitstats git_directory output_directory