Development - UCLOrengoGroup/cath-tools GitHub Wiki

Development

This page contains extra information that may be useful for anyone working on the development of cath-tools. For information on how to set up a standard build of the code, see Build.

To Do

rapidjson

Clang's UndefinedBehaviorSanitizer reports during a run of build-test:

../rapidjson/include/rapidjson/internal/stack.h:117:13: runtime error: applying non-zero offset 16 to null pointer

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../rapidjson/include/rapidjson/internal/stack.h:117:13 in 
../rapidjson/include/rapidjson/internal/stack.h:117:13: runtime error: applying non-zero offset 1 to null pointer

This smells like genuine UB (i.e. ubsan's report looks like a true positive).

This area of the code seems to have been updated in more recent versions of rapidjson, so this may already have been fixed. It would be good to upgrade, perhaps via Conan. An alternative would be to move to "JSON for Modern C++", though that would probably involve more work.

Doxygen Code Documentation

Much of the code is documented inline with Doxygen. To view it, install doxygen, run doxygen in the cath-tools root directory and then view doxygen_documentation/html/index.html in a browser.

Compiling Multiple Versions (release/relwithdebinfo/debug)

To compile multiple versions, you can put them in individual directories. First make sure that you don't have any build files in the root directory and then run something like:

mkdir release
cd release
cmake -DCMAKE_BUILD_TYPE=RELEASE ..
cd ..
make -C release

Available types are: release, relwithdebinfo and debug
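The per-directory recipe above can be repeated for all three build types with a small loop. The following is a minimal POSIX sh sketch (not tcsh); the function name `print_build_commands` is hypothetical, and it only prints the commands so that they can be reviewed before being run.

```shell
# Hypothetical helper: print the configure/build commands for each build type,
# using one build directory per type (named after the type).
print_build_commands() {
  for build_type in release relwithdebinfo debug; do
    # The CMake examples on this page write the build type in upper case
    cmake_type=$(printf '%s' "$build_type" | tr '[:lower:]' '[:upper:]')
    echo "mkdir -p $build_type"
    echo "( cd $build_type && cmake -DCMAKE_BUILD_TYPE=$cmake_type .. )"
    echo "make -C $build_type"
  done
}

print_build_commands
```

Piping the output into `sh` from the cath-tools root directory would run the commands. The debug entry is still subject to the Boost requirement in the NOTE on this page.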

NOTE: the debug build requires a debug version of Boost built with _GLIBCXX_DEBUG enabled; otherwise, the resulting code will have all sorts of horrible, non-obvious errors. If you really want to build a debug version without doing this, you can work around the issue by removing all mentions of -D_GLIBCXX_DEBUG from CMakeLists.txt.
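If you do take the workaround described above, the flag can be stripped mechanically. This is a minimal sketch, assuming the flag appears literally as `-D_GLIBCXX_DEBUG` and that a `.bak` backup next to the file is acceptable; the function name `strip_glibcxx_debug` is hypothetical.

```shell
# Hypothetical helper: remove every literal "-D_GLIBCXX_DEBUG" (plus any spaces
# directly before it) from the given file, keeping a backup in <file>.bak
strip_glibcxx_debug() {
  sed -i.bak 's/ *-D_GLIBCXX_DEBUG//g' "$1"
}
```

From the cath-tools root directory, this would be called as `strip_glibcxx_debug CMakeLists.txt`. It is worth diffing the file against the backup before configuring the build.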

Example CMake Commands

clang_debug

/usr/bin/cmake -DCMAKE_BUILD_TYPE=DEBUG          -DBUILD_SHARED_LIBS=ON -DBOOST_ROOT=/opt/boost_1_68_0_clang_build -DCMAKE_C_COMPILER=/usr/bin/clang -DCMAKE_CXX_COMPILER=/usr/bin/clang++ -DCMAKE_CXX_FLAGS="-stdlib=libc++" ..

clang_release

/usr/bin/cmake -DCMAKE_BUILD_TYPE=RELEASE                               -DBOOST_ROOT=/opt/boost_1_68_0_clang_build -DCMAKE_C_COMPILER=/usr/bin/clang -DCMAKE_CXX_COMPILER=/usr/bin/clang++ -DCMAKE_CXX_FLAGS="-stdlib=libc++" ..

clang_relwithdebinfo

/usr/bin/cmake -DCMAKE_BUILD_TYPE=RELWITHDEBINFO                        -DBOOST_ROOT=/opt/boost_1_68_0_clang_build -DCMAKE_C_COMPILER=/usr/bin/clang -DCMAKE_CXX_COMPILER=/usr/bin/clang++ -DCMAKE_CXX_FLAGS="-stdlib=libc++" ..

gcc_debug

/usr/bin/cmake -DCMAKE_BUILD_TYPE=DEBUG          -DBUILD_SHARED_LIBS=ON -DBOOST_ROOT=/opt/boost_1_68_0_gcc_build                                                                                                              ..

gcc_release

/usr/bin/cmake -DCMAKE_BUILD_TYPE=RELEASE                               -DBOOST_ROOT=/opt/boost_1_68_0_gcc_build                                                                                                              ..

gcc_relwithdebinfo

/usr/bin/cmake -DCMAKE_BUILD_TYPE=RELWITHDEBINFO                        -DBOOST_ROOT=/opt/boost_1_68_0_gcc_build                                                                                                              ..

Consider using ninja instead of make

If you're developing, consider using ninja instead of make. It's a drop-in replacement that's much quicker to get started, which can make a big difference on incremental builds. To use ninja, install a suitable package and add -GNinja to your CMake commands.
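As a rough illustration of adding -GNinja only when ninja is actually installed, here is a small POSIX sh sketch. The function name `ninja_generator_flag` is hypothetical, and the configure command is printed rather than run.

```shell
# Hypothetical helper: print "-GNinja" if a ninja executable is on the PATH,
# or nothing otherwise (leaving CMake to use its default generator)
ninja_generator_flag() {
  if command -v ninja > /dev/null 2>&1; then
    echo "-GNinja"
  fi
}

# Print the resulting configure command rather than running it
echo "cmake $(ninja_generator_flag) -DCMAKE_BUILD_TYPE=RELEASE .."
```

After configuring with -GNinja, build by running `ninja` in the build directory instead of `make`.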

Shared Versus Static

At present, the builds are all completely static. This should be made more configurable in the future so that (at least debug) builds can be built in shared mode using the CMake flag -DBUILD_SHARED_LIBS=ON.

Clang Static Analyzer

Want to search for bugs in the code? These instructions aim to help you run Clang-based static analysis. (TODO: Get this running on a CI server, such as Travis-CI.)

Ensure you have clang installed. Find the analyzer programs ccc-analyzer and c++-analyzer. For example, on Ubuntu you can do something like:

dpkg -l | grep clang | awk '{print $2}' | xargs dpkg -L | grep analyzer

Then substitute their locations into the following commands and then run the commands, starting in the root of the cath-tools project:

mkdir build-analyze && cd build-analyze
# tcsh syntax; in sh/bash, use `export CCC_CC=clang` and `export CCC_CXX=clang++` instead
setenv CCC_CC  clang
setenv CCC_CXX clang++
# For Clang 3.6
/usr/bin/cmake -DBOOST_ROOT=/opt/boost_1_68_0_clang_build -DCMAKE_C_COMPILER="/usr/share/clang/scan-build-3.6/ccc-analyzer"         -DCMAKE_CXX_COMPILER="/usr/share/clang/scan-build-3.6/c++-analyzer"         -DCMAKE_CXX_FLAGS="-stdlib=libc++" ..
# For Clang 3.8
/usr/bin/cmake -DBOOST_ROOT=/opt/boost_1_68_0_clang_build -DCMAKE_C_COMPILER="/usr/share/clang/scan-build-3.8/libexec/ccc-analyzer" -DCMAKE_CXX_COMPILER="/usr/share/clang/scan-build-3.8/libexec/c++-analyzer" -DCMAKE_CXX_FLAGS="-stdlib=libc++" ..
scan-build make

To get parallel compilation, you can append -j # to the scan-build make command (where # is the number of parallel jobs).
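One way to choose the job count is to use the number of online processors. This is a minimal POSIX sh sketch; the function name `scan_build_command` is hypothetical, and it assumes `getconf _NPROCESSORS_ONLN` is supported on your system (it falls back to 1 otherwise). The command is printed rather than run.

```shell
# Hypothetical helper: print a scan-build command that uses one job per
# online processor, falling back to a single job if the count is unavailable
scan_build_command() {
  num_jobs=$(getconf _NPROCESSORS_ONLN 2> /dev/null || echo 1)
  echo "scan-build make -j $num_jobs"
}

scan_build_command
```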

Checking headers compile independently

Clang

find source -iname '*.hpp' | sort | grep third_party_code -v | xargs -P 4 -I VAR clang++ -DBOOST_LOG -std=c++14 -stdlib=libc++ -W -Wall -Werror -Wextra -Wno-unused-const-variable -Wno-unused-local-typedef -Wsign-compare -Wcast-qual -Wconversion -Wnon-virtual-dtor -pedantic -ftemplate-backtrace-limit=0 -c -o /tmp/.comp_clang.dummy.header.o -isystem /opt/boost_1_68_0_clang_build/include -isystem rapidjson/include -I source -I source/src_clustagglom -I source/src_common -I source/src_test -I source/uni -I source/third_party_code -I ninja_clang_debug_shared/source/cath_tools_git_version VAR

GCC

find source -iname '*.hpp' | sort | grep third_party_code -v | xargs -P 4 -I VAR g++     -DBOOST_LOG -std=c++14                -W -Wall -Werror -Wextra -Wno-unused-const-variable -Wno-unused-local-typedef -Wsign-compare -Wcast-qual -Wconversion -Wnon-virtual-dtor -pedantic -ftemplate-backtrace-limit=0 -c -o /tmp/.comp_gcc.dummy.header.o   -isystem /opt/boost_1_68_0_gcc_build/include   -isystem rapidjson/include -I source -I source/src_clustagglom -I source/src_common -I source/src_test -I source/uni -I source/third_party_code -I ninja_clang_debug_shared/source/cath_tools_git_version VAR
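The xargs commands above stop being easy to read once several headers fail. As an alternative, the following POSIX sh sketch compiles the headers one at a time and prints a summary. The function name `check_headers` and the output path are hypothetical, and the flag set is deliberately reduced; the warning and include flags from the commands above would need to be added for a real run.

```shell
# Hypothetical helper: compile each header on its own and list the failures.
# Usage: check_headers <compiler> <header>...
check_headers() {
  compiler=$1
  shift
  failures=0
  for header in "$@"; do
    # Reduced flag set for illustration; add the -W/-isystem/-I flags from the
    # commands above as required
    if ! "$compiler" -std=c++14 -I source -c -o /tmp/.comp_check.dummy.header.o "$header"; then
      echo "FAILED: $header"
      failures=$((failures + 1))
    fi
  done
  echo "$failures header(s) failed to compile independently"
  # Return success only if every header compiled
  [ "$failures" -eq 0 ]
}
```

Assuming no header path contains whitespace, it could be called as `check_headers clang++ $(find source -iname '*.hpp' | grep -v third_party_code | sort)`. Unlike the xargs versions, this runs sequentially, so it trades speed for a clearer report.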

Using clang-tidy checks and fixes

Use a recent clang-tidy

It's worth using a recent version of clang-tidy because it's being improved rapidly (eg before v4.0 it didn't handle .hpp file suffixes for header guards). Consider downloading the latest from releases.llvm.org.
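To check which version you have, the major version can be pulled out of the `clang-tidy --version` output. This is a minimal sketch, assuming that output contains a line of the form "LLVM version X.Y.Z" (worth confirming on your installation); the function name `llvm_major_version` is hypothetical.

```shell
# Hypothetical helper: read `clang-tidy --version` output on stdin and print
# the major version number from the first "LLVM version X.Y.Z" line
llvm_major_version() {
  sed -n 's/.*LLVM version \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example on a hard-coded sample line (not real tool output); prints 3
echo "  LLVM version 3.6.2" | llvm_major_version
```

In practice this would be used as `clang-tidy --version | llvm_major_version`, with a result below 4 suggesting that an upgrade is worthwhile.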

Fixing trailing namespace comments

find source -iname '*.hpp' | sort | grep third_party_code -v | xargs -P 4 -I VAR /bin/tcsh -c "clang-tidy -fix -checks=llvm-namespace-comment VAR -- -std=c++14 -isystem /opt/boost_1_68_0_clang_build/include -isystem rapidjson/include -I source -I source/src_clustagglom -I source/src_common -I source/src_test -I source/uni -I source/third_party_code -I ninja_clang_debug_shared/source/cath_tools_git_version || true"

Fixing header guards

find $PWD/source -iname '*.hpp' | sort | grep -vw 'third_party_code' | xargs -I VAR -P 4 /bin/tcsh -c "clang-tidy -fix -checks=llvm-header-guard VAR -- -x c++ -std=c++14 -isystem /opt/boost_1_68_0_clang_build/include -isystem rapidjson/include -I source -I source/src_clustagglom -I source/src_common -I source/src_test -I source/uni -I source/third_party_code -I ninja_clang_debug_shared/source/cath_tools_git_version || true"

Assessing with all clang-tidy checks

Prefer a recent version of clang-tidy; it's changing fast.

Would like to use:

  • misc-use-override but (in version 3.6.2), it erroneously fires for method declarations that do use override
  • google-readability-function but (in version 3.6.2), it fires for declarations, for which I don't want to always name all params
clang-tidy '-checks=*,-llvm-header-guard,-llvm-namespace-comment,-google-readability-namespace-comments,-google-build-using-namespace,-misc-use-override,-google-readability-function' -list-checks --
clang-tidy '-checks=*,-llvm-header-guard,-llvm-namespace-comment,-google-readability-namespace-comments,-google-build-using-namespace,-misc-use-override,-google-readability-function' -dump-config --
find source -iname '*.?pp' | sort | grep third_party_code -v | xargs -P 4 -I VAR /bin/tcsh -c "clang-tidy VAR '-checks=*,-llvm-header-guard,-llvm-namespace-comment,-google-readability-namespace-comments,-google-build-using-namespace,-misc-use-override,-google-readability-function' -- -std=c++14 -isystem /opt/boost_1_68_0_clang_build/include -isystem rapidjson/include -I source -I source/src_clustagglom -I source/src_common -I source/src_test -I source/uni -I source/third_party_code -I ninja_clang_debug_shared/source/cath_tools_git_version || true"

find source -iname '*.?pp' | sort | grep third_party_code -v | xargs -P 4 -I VAR /bin/tcsh -c "clang-tidy VAR '-checks=*,-llvm-header-guard' -- -std=c++14 -isystem /opt/boost_1_68_0_clang_build/include -isystem rapidjson/include -I source -I source/src_clustagglom -I source/src_common -I source/src_test -I source/uni -I source/third_party_code -I ninja_clang_debug_shared/source/cath_tools_git_version || true"

Dumping class/struct memory layouts

clomp source/file/pdb/pdb_atom.hpp -Xclang -fdump-record-layouts > /tmp/clang_pdb_atom_layout.txt
grep pdb_atom -A60 /tmp/clang_pdb_atom_layout.txt

Simplifying hmmsearch output files for cath-resolve-hits

grep -A500 '>> O27798_698e555d8cb2c0ea3979641198e527bf' temp_0.hmmsearch.evalcoff0.001 | tr '\n' '@' | sed 's/@>>/\n>>/g' | sed 's/@Internal pipeline statistics summary/\nInternal pipeline statistics summary/g' | grep -v 'Internal pipeline statistics summary' | grep O27798_698e555d8cb2c0ea3979641198e527bf | tr '@' '\n' > bob
echo '\n\nInternal pipeline statistics summary:\n[ok]\n' >> bob
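The pipeline above can be wrapped in a function so that the query ID and input file are not hard-coded. This is a sketch only: the function name `simplify_hmmsearch` is hypothetical, it writes to stdout rather than to `bob`, and it assumes GNU grep and GNU sed (for `-A` and for `\n` in the sed replacement) and an input file that contains no `@` characters.

```shell
# Hypothetical wrapper around the pipeline above.
# Usage: simplify_hmmsearch <query_id> <hmmsearch_output_file>
simplify_hmmsearch() {
  query_id=$1
  in_file=$2
  # Join lines with '@', split again so each '>>' record (and the summary)
  # is on one line, keep only the records mentioning the query ID, then
  # restore the original line breaks
  grep -A500 ">> ${query_id}" "$in_file" \
    | tr '\n' '@' \
    | sed 's/@>>/\n>>/g' \
    | sed 's/@Internal pipeline statistics summary/\nInternal pipeline statistics summary/g' \
    | grep -v 'Internal pipeline statistics summary' \
    | grep "${query_id}" \
    | tr '@' '\n'
  # Append a stub summary in place of the one removed above
  printf '\n\nInternal pipeline statistics summary:\n[ok]\n\n'
}
```

The original commands would then correspond to `simplify_hmmsearch O27798_698e555d8cb2c0ea3979641198e527bf temp_0.hmmsearch.evalcoff0.001 > bob`.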

Investigating issues with Travis-CI CMake v3.2.2

TODO: Consider removing this now that Travis-CI has upgraded its CMake to 3.9.2, as of 12th December 2017

cd /tmp
wget "https://cmake.org/files/v3.2/cmake-3.2.2-Linux-x86_64.tar.gz"
tar -zxvf cmake-3.2.2-Linux-x86_64.tar.gz
~
rm -rf /cath-tools/cmake-3.2.2-test
mkdir  /cath-tools/cmake-3.2.2-test
cd     /cath-tools/cmake-3.2.2-test
/tmp/cmake-3.2.2-Linux-x86_64/bin/cmake -GNinja -DBOOST_ROOT=/opt/boost_1_61_0_clang_build -DCMAKE_BUILD_TYPE=RELEASE -DBUILD_EXTRA_CATH_TOOLS=ON -DBUILD_EXTRA_CATH_TESTS=ON ..
ninja mod-test-common
ninja

Graphing Build Dependencies

mkdir /cath-tools/build-deps-graph
cd    /cath-tools/build-deps-graph
echo 'set( GRAPHVIZ_GRAPH_HEADER "node [ color=navy, fontname=\"Helvetica bold\", fontcolor=white, fontsize=12, style=filled ]" )' > /cath-tools/build-deps-graph/CMakeGraphVizOptions.cmake
/usr/bin/cmake --graphviz=test.dot .. -DBOOST_ROOT=/opt/boost_1_68_0_gcc_build
ls -1 *.dot *.dot.* | xargs -I VAR dot -Teps VAR -o VAR.eps

Notes on improving module-level structure

Proposed rules for auto-generation of CMake variables:

  • fixture is part of TESTSOURCES
  • detail is separate library that (in CMake) is a private dependency of the corresponding library

Seriously consider making separate modules for gsl and rapidjson so that they needn't be targeted by everything that uses common.

Violations:

  • biocore/residue_id.hpp includes structure/structure_type_aliases.hpp

Source files that include rapidjson:

  • common/rapidjson_addenda/rapidjson_writer.hpp
  • resolve_hits/read_and_process_hits/hits_processor/write_json_hits_processor.hpp

Source files that include <gsl/[...]>:

  • common/gsl/get_determinant.hpp
  • common/gsl/gsl_matrix_wrp.hpp
  • common/gsl/gsl_permutation_wrp.hpp
  • common/gsl/gsl_vector_wrp.hpp
  • structure/geometry/orient.cpp
  • structure/geometry/pca.cpp
  • structure/geometry/superpose_fit.cpp