First contributions - Rust-GCC/gccrs GitHub Wiki

Contributing to this project might appear daunting, so here is a quick guide that will hopefully help newcomers.

Fork & Clone

Everyone is required to work on their own fork of the project, because it is impossible to create a new branch on the main repository without being part of the GCC team. To do so, fork the repository and clone your fork locally on your computer. You will then be able to push any branch you want to this forked remote.

Once you are ready to share your work, you can open a PR on the real GCCRS repository and select the branch from your fork. GitHub usually suggests creating a new pull request from your most recently pushed branch. Keep in mind that you'll need to synchronize your fork often.

Even though you can't push directly to the main repository, you can use it to read and fetch the latest commits. You can add a new remote using the following command:

git remote add upstream git@github.com:Rust-GCC/gccrs.git
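Synchronizing your fork can be sketched as a small shell function. This is only an illustration: the function name sync_fork is hypothetical, and it assumes that your fork is the remote named origin, that the main repository is the remote named upstream, and that the main branch is called master (pass another name if it differs).

```shell
# Sketch: bring a fork's main branch up to date with the main repository.
# Assumptions: remote "origin" is your fork, remote "upstream" is the main
# repository, and the branch to synchronize defaults to "master".
sync_fork() {
    branch="${1:-master}"
    git fetch upstream &&
    git checkout "$branch" &&
    # --ff-only refuses to continue if the local branch has diverged.
    git merge --ff-only "upstream/$branch" &&
    git push origin "$branch"
}
```

Running sync_fork before creating a new working branch keeps your PRs based on recent commits. The fast-forward-only merge is a deliberate choice here: it never rewrites or merges anything silently, it simply stops if your local branch contains commits that upstream does not have.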

The next steps are usually the same for everyone: you need to configure and compile the project.

Setting up a development environment

The following dependencies are required on Ubuntu:

$ apt install build-essential libgmp3-dev libmpfr-dev libmpc-dev flex bison autogen gcc-multilib dejagnu cargo

cargo is required because some components of GCC are written in Rust. In the future this will change, as GCC will be able to compile those Rust components itself.
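On other distributions the package names differ, so it can help to check for the tools themselves. The following is a minimal sketch with a hypothetical helper named missing_tools; the tool list in the example call is an assumption derived from the packages above (runtest is the program shipped by dejagnu) and is not exhaustive.

```shell
# Print every command from the argument list that cannot be found in PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example call: prints nothing when all of these tools are installed.
missing_tools gcc g++ make flex bison autogen runtest cargo
```

The helper only checks executables. The library headers (GMP, MPFR, MPC) are checked later by the configure script.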

⚠️ Every GCC commit shall compile properly.

Commits

GCC requires a specific format for commit messages. A commit message shall contain the following elements:

  • Short title
  • Description (why the commit exists)
  • Changelog

Every commit shall have clear copyright information. When using the Developer's Certificate of Origin, a commit needs a Signed-off-by line:

  • Commit sign off (git commit -s)
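To make the expected shape more concrete, here is a rough sketch that prints a sample message and checks it for the elements listed above. Both function names are hypothetical, the file name, function name and author in the sample are placeholders, and the check is far less strict than the real GCC tooling.

```shell
# Print a sample commit message (all names in it are placeholders).
sample_msg() {
    printf '%s\n' \
        'Fix example diagnostic' \
        '' \
        'Explain here why the change is needed.' \
        '' \
        'gcc/rust/ChangeLog:' \
        ''
    # ChangeLog entries are indented with a tab.
    printf '\t* %s\n' 'rust-example.cc (example_function): Describe the change.'
    printf '\n%s\n' 'Signed-off-by: Jane Doe <jane@example.com>'
}

# Read a commit message on stdin and report the first missing element.
check_commit_msg() {
    msg=$(cat)
    # The first line is the short title and must not be empty.
    [ -n "$(printf '%s\n' "$msg" | head -n 1)" ] || { echo "missing title"; return 1; }
    # A changelog section starts with a line such as "gcc/rust/ChangeLog:".
    printf '%s\n' "$msg" | grep -q 'ChangeLog:$' || { echo "missing ChangeLog"; return 1; }
    # git commit -s appends this trailer.
    printf '%s\n' "$msg" | grep -q '^Signed-off-by: ' || { echo "missing Signed-off-by"; return 1; }
    echo "ok"
}

sample_msg | check_commit_msg
```

You could run the same check on your last commit with git log -1 --format=%B | check_commit_msg.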

Changelog

The changelog is quite tricky to get right, so a git command is provided to generate most of it. This command can be installed by running the git customization shell script in contrib:

$GCC_TOP_DIR/contrib/gcc-git-customization.sh

This script provides a new git command to replace git commit.

git gcc-commit-mklog -s

Alternatively, the Python script can be invoked directly to scaffold the changelog to be used in the commit message:

$ git show | $GCC_TOP_DIR/contrib/mklog.py

Developer's certificate of origin

We require people to sign off their commits using the -s option. This option appends a Signed-off-by line to the commit message, by which the developer certifies that they have the right to submit the code. You can find more information about this on the official DCO website as well as the GNU website. Note that we also require everyone to post a message in their first pull request stating that they have understood and agree with the DCO.

There is an alternative to the DCO system, but it may be harder to set up; check the GNU website for more information.

Code format

We use clang-format to check and format the patches. The configuration can be found at $GCC_TOP_DIR/contrib/clang-format. You may copy this file to the root directory using the following command to ensure clang-format works properly.

cp $GCC_TOP_DIR/contrib/clang-format $GCC_TOP_DIR/.clang-format

Version 10 of clang-format shall be used, as newer versions are incompatible. All commits shall be properly formatted. Separate "format commits" shall be avoided at all costs because they make the commit history harder to work with.

The formatting check does not flag it, but GCC's coding style specifies that an if containing a single statement shall not use enclosing braces.
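Since the version matters, it can be worth checking which one is installed. The helper below is a hypothetical sketch that extracts the major version from the text printed by clang-format --version; it assumes that text contains "clang-format version X.Y.Z", possibly with a distribution prefix.

```shell
# Read `clang-format --version` output on stdin and print the major version.
clang_format_major() {
    sed -n 's/.*clang-format version \([0-9][0-9]*\)\..*/\1/p'
}

# Usage:
#   clang-format --version | clang_format_major
```

If the printed number is not 10, your distribution may ship the right version under another name, which you would then use instead.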

❌
if (condition)
  {
    single_statement();
  }
✔️
if (condition)
  single_statement();

Configure

The configure script checks that your computer has all the dependencies required to compile the project. To configure the project, you need to create a directory somewhere that will hold all build artifacts. Note that the root directory of the project should not be used, as GCC's build system does not cope well with in-source builds, but you may create a directory within the project's tree.

mkdir $GCC_TOP_DIR/build && cd $GCC_TOP_DIR/build

This guide will assume you created a build directory at the root of the project's directory.

../configure --disable-bootstrap --enable-multilib --enable-languages=rust

You may want to use some additional flags or specify the compiler (Yes, GCC can be compiled with clang!). Here is another example with some extra configuration:

../configure CC="ccache clang" CXX="ccache clang++" CFLAGS="-O0 -g3" \
   CXXFLAGS="-O0 -g3" \
   LDFLAGS="-fuse-ld=mold" \
   --disable-bootstrap --enable-multilib --enable-languages=rust

This configuration requires additional tools but may speed up your workflow.

  • ccache to cache build artifacts.
  • clang because its output is less convoluted and it emits more warnings.
  • Debug information is enabled (-g3) and optimizations are disabled (-O0).
  • mold is used instead of ld for the linking stage.
⚠️ Arch Linux may require further configuration:
  • CXXFLAGS -fno-pie
  • CFLAGS -fno-pie
  • LDFLAGS -no-pie

Build

You may then build using make. You can specify the number of jobs that should be used.

make

GCC is large enough that a parallel build is worthwhile: consider passing the -j flag to make.
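A common choice is one job per available CPU. The sketch below uses a hypothetical helper, build_jobs, that falls back to a single job when the CPU count cannot be determined; it only prints the resulting make command instead of running it.

```shell
# Print the number of CPUs available, or 1 if it cannot be determined.
build_jobs() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

# Show the command you would run from the build directory.
echo "make -j$(build_jobs)"
```

If your machine runs out of memory while linking, a lower job count may work better than the CPU count.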

Testing the compiler

The next step is usually to check whether the compiler works correctly and to learn how to launch the tests. This can be achieved with the following command.

make check-rust

Note that you can run the tests with multiple jobs, but the output of the different jobs will be interleaved. Each test reports one of the following statuses:

  • XFAIL - Test is expected to fail
  • XPASS - Unexpected success
  • FAIL - Test failed
  • PASS - Test passed
  • UNSUPPORTED - Test is unsupported on this target

The tests use the DejaGNU framework.

Test results and options used are available in $GCC_TOP_DIR/build/gcc/testsuite/rust/rust.sum. You can also access test output in $GCC_TOP_DIR/build/gcc/testsuite/rust/rust.log if you wish to quickly investigate why a test is failing.
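The .sum file contains one line per result, starting with the status followed by a colon, which makes it easy to summarize. The following is a sketch with two hypothetical helpers: one counts results per status, the other lists only the results that usually need attention.

```shell
# Count the results in a DejaGNU .sum file, one "STATUS COUNT" line per status.
summarize_sum() {
    awk -F': ' '/^(PASS|FAIL|XPASS|XFAIL|UNSUPPORTED): / { n[$1]++ }
                END { for (s in n) print s, n[s] }' "$1" | sort
}

# List unexpected results only (failures and unexpected successes).
unexpected_results() {
    grep -E '^(FAIL|XPASS): ' "$1"
}

# Usage:
#   summarize_sum "$GCC_TOP_DIR/build/gcc/testsuite/rust/rust.sum"
#   unexpected_results "$GCC_TOP_DIR/build/gcc/testsuite/rust/rust.sum"
```

Comparing the summary before and after your change is a quick way to spot regressions.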

You can also invoke a subset of the testsuite. For example, to only run tests that are currently known/expected to fail:

make check-rust RUNTESTFLAGS="xfail.exp"

There are the following sets of tests:

  • compile.exp - compilation tests
  • execute.exp - execution tests
  • xfail.exp - tests that are currently known/expected to fail

To invoke only a specific test:

make check-rust RUNTESTFLAGS="--all compile.exp=continue1.rs"

Running the compiler

The compiler driver is called gcc but the Rust compiler proper is crab1; the binary is located at $GCC_TOP_DIR/build/gcc/crab1.

A few options are quite handy:

  • -frust-debug - prints a lot of debug information during the compilation.
  • -frust-dump-* - dumps various internal representations (AST/HIR/GENERIC/GIMPLE/...).
  • -frust-compile-until= - allows the compiler to stop at a given step.

You may want to add a variable in your environment to avoid passing the experimental flag every time you invoke crab1.

export GCCRS_INCOMPLETE_AND_EXPERIMENTAL_COMPILER_DO_NOT_USE=1
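If you would rather not export the variable globally, a small wrapper can set it for a single invocation. This is only a sketch: crab1_run is a hypothetical name, and it assumes GCC_TOP_DIR is set and that the build directory is $GCC_TOP_DIR/build as in the rest of this guide.

```shell
# Run crab1 from the build tree with the experimental variable set
# for this invocation only. All arguments are passed through unchanged.
crab1_run() {
    crab1="${GCC_TOP_DIR:?GCC_TOP_DIR is not set}/build/gcc/crab1"
    GCCRS_INCOMPLETE_AND_EXPERIMENTAL_COMPILER_DO_NOT_USE=1 "$crab1" "$@"
}

# Usage (test.rs is a placeholder file name):
#   crab1_run -frust-debug test.rs
```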

First contribution

We usually advise people to begin with a recent issue (some may be outdated) tagged good-first-pr. You can ask a maintainer to assign the issue to you, which prevents other people from doing the same work. You may then create a branch on your fork and work on identifying and fixing the issue before opening a PR.

Add a new file

You may need to add a new file; this can be done easily by modifying $GCC_TOP_DIR/gcc/rust/Make-lang.in. Note that GCC uses the .cc extension for C++ files.

Writing a test

Do not forget to add a new test to highlight the fixed behavior. The Rust testsuite can be found in $GCC_TOP_DIR/gcc/testsuite/rust. Most tests only run the compiler and not the resulting binary. Compile-only tests can be found under compile, while tests that also run the resulting binary can be found under execute.

A test usually runs the compiler with a given set of flags on some Rust code and expects some output such as an error message. You may specify additional flags using the following directive (refer to the full documentation on DejaGNU or the GCC specifics for more information):

// { dg-additional-options "-fXXXXX -fYYYYY" }

Compiler output shall match the exact error location. This can be achieved easily when only one error is emitted on a given line. A dot matches any character; we use it here to match the quotation marks around the item name and the brackets around the error number.

use foo::bar; // { dg-error "unresolved import .foo::bar. .E0433." }

Sometimes multiple errors arise on the same line. They cannot be matched using this simple syntax, but we can move the DejaGNU directives to other lines and refer back to the error line with a relative offset.

use foo::{bar, baz};
// { dg-error "unresolved import .foo::bar. .E0433" "" { target *-*-* } .-1 }
// { dg-error "unresolved import .foo::baz. .E0433" "" { target *-*-* } .-2 }

If your test is an "execute" test, you can match its output using the following DejaGNU directive.

// { dg-output "Hello world\r*\n" }

Note the additional \r*, which prevents the test from failing to match on platforms that emit a carriage return character before the newline.

Draft PRs

You can open a PR early if you want some feedback. Please use the "draft" status until your PR is ready for review and merge.

Navigating the codebase

Most of the code you'll have to work on is in $GCC_TOP_DIR/gcc/rust, but on rare occasions you may have to modify code in $GCC_TOP_DIR/libgrust.

  • gcc/rust/
    • metadata/ - Crate metadata export
    • ast/ - Abstract syntax tree definitions/operations
    • checks/ - Checking passes on the AST/HIR
    • hir/ - High level intermediate representation
    • expand/ - Macro expansion
    • lex/ - Lexing
    • backend/ - HIR to GENERIC translation
    • typecheck/ - Typechecker
    • util/ - Misc utilities
    • parse/ - Parser code
    • resolve/ - Name resolver (both versions)
  • libgrust/
    • libformat_parser/ - Format args parser (builtin format! macro)
    • libproc_macro/ - Rust procedural macro interface
    • libproc_macro_internal/ - Procedural macro business logic

We advise newcomers to identify which part of the compiler their issue relies on before working on it. Sometimes it might not be obvious. For example, the parser lets many edge cases slip into the AST because of macro expansion, and those invalid snippets are rejected at a later stage during AST validation.

Moving around

We recommend using either a language server or ctags to easily jump to a given definition and navigate the code. They're not infallible though: some big files such as rust-parse-impl.h may resist. You may therefore need grep/rg sometimes.

In case you want to use a language server, a compilation database can be generated using bear. Older versions of bear are incompatible with multiple jobs, so you either need to compile bear from source or run it with only one job. You don't need to regenerate the database every time.
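As a fallback, a plain text search restricted to C++ sources and headers is often enough. The helper below is a hypothetical sketch: find_symbol is not part of the project, and it assumes GCC_TOP_DIR is set when no directory is given.

```shell
# Search .cc and .h files under a directory (default: gcc/rust) for a pattern
# and print matches as path:line:text.
find_symbol() {
    grep -rn --include='*.cc' --include='*.h' -- "$1" "${2:-$GCC_TOP_DIR/gcc/rust}"
}

# Usage (the pattern is only an example):
#   find_symbol 'parse_expr'
```

rg does the same job faster on a tree of this size, for example with its type filters, but grep is available everywhere.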

bear -- make -j1

Misc

Keep in mind that even if you can modify common files in GCC, other parts of the compiler may follow some unmentioned rules. For example, no common directory shall be modified during the freeze period between November and the release to prevent things from breaking.
