Using semgrep - SethBodine/audit-tools GitHub Wiki
Semgrep is a static analysis tool for finding security vulnerabilities, bugs, and policy violations in source code. It supports a wide range of languages and uses pattern-matching rules that are readable and easy to write. The container includes a set of custom rulesets in addition to the standard Semgrep registry.
- Supports 30+ languages including Python, JavaScript, Go, Java, Ruby, and C/C++
- Uses pattern-based rules that are human-readable and customisable
- Custom rulesets are bundled in the container under /opt/semgrep/custom-rules/
- Can run against a local directory or pull rules from the Semgrep registry
Semgrep runs in a Python virtual environment.
cd /opt/semgrep/
. semgrep.sh # activates the venv and downloads custom rulesets
Note: The script downloads custom rulesets and removes any invalid YAML files on each run.
This runs all bundled custom rulesets against a target directory, saving one output file per ruleset.
code_fol=<path-to-scan>
for ruleset in $(find custom-rules/ -maxdepth 2 -mindepth 2 -type d -not -path '*/.*'); do
    # quote the variables so a target path containing spaces is passed intact
    unbuffer semgrep scan -f "${ruleset}" --metrics=off "${code_fol}" \
        | tee "semgrep-$(echo "${ruleset}" | sed 's/\//-/g').txt"
done
Scan with Semgrep Registry Rules (WARNING: these commands may connect outbound and may send code to Semgrep)
semgrep scan --config auto <path> # auto-select rules for detected languages
semgrep scan --config p/owasp-top-ten <path>
semgrep scan --config p/secrets <path>
semgrep scan --config p/terraform <path>
semgrep scan --config auto <file> # scan a single file
semgrep scan --config auto <path> --json > /output/semgrep-results.json # JSON output
semgrep scan --config auto <path> --sarif > /output/semgrep-results.sarif # SARIF output
- Semgrep may exhaust available memory before completing a large scan. Breaking the work into subdirectories is a workaround.
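The subdirectory workaround can be sketched as a small shell function. This is a minimal illustration, not part of the container: the function name `scan_by_subdir`, its three arguments, the output file naming, and the `SEMGREP_CMD` override (used only for dry runs) are all assumptions made for this example. It scans each immediate subdirectory of the target in its own Semgrep process and writes one JSON file per subdirectory.

```shell
# Sketch: scan each immediate subdirectory separately so that no single
# Semgrep process has to work through the whole codebase at once.
# SEMGREP_CMD is a hypothetical override (not a Semgrep feature); it exists
# only so the loop can be dry-run, e.g. with SEMGREP_CMD=echo.
scan_by_subdir() {
    root="$1"      # directory whose subdirectories are scanned one by one
    config="$2"    # ruleset directory or registry name passed to --config
    outdir="$3"    # where the per-subdirectory JSON files are written

    mkdir -p "$outdir" || return 1
    for sub in "$root"/*/; do
        [ -d "$sub" ] || continue    # glob matched nothing: no subdirectories
        name=$(basename "$sub")
        echo "Scanning $sub" >&2
        # --metrics=off matches the custom-ruleset loop; it cannot be
        # combined with --config auto, so pass an explicit config here.
        ${SEMGREP_CMD:-semgrep} scan --config "$config" --metrics=off "$sub" --json \
            > "$outdir/semgrep-$name.json" \
            || echo "Scan of $sub exited with an error" >&2
    done
}
```

For example, `scan_by_subdir <path-to-scan> /opt/semgrep/custom-rules /output/semgrep-split` would produce files such as `/output/semgrep-split/semgrep-<subdir>.json`. Files that sit directly in the target directory, and hidden directories, are not matched by the `*/` glob and would need a separate scan.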