
# Vectorization in Rust: 10/16 status report

Our project: implement vectorized, data-parallel loops in the Rust programming language.

## What we've done so far

JP has gotten Rust and LLVM building and running, figured out basic Rust syntax, and taken a reading pass through parts of the Rust compiler, including most of the code in https://github.com/lkuper/rust/tree/vectorization/src/comp/syntax and https://github.com/lkuper/rust/tree/vectorization/src/comp/middle. JP's next immediate goal is to start hacking on the Rust compiler by trying to add a new piece of syntax and seeing what happens.

Since Lindsey was already familiar with Rust and the compiler, she's instead been doing background reading on vectorization and vector processors and has created a branch to work in (https://github.com/lkuper/rust/tree/vectorization). She has started discussing vectorization on the Rust mailing list (https://mail.mozilla.org/pipermail/rust-dev/2011-October/000872.html), without much of a response yet. She's figured out how to see LLVM output from Rust and written up some notes (https://github.com/lkuper/rust/wiki/Seeing-LLVM-output-from-Rust) on how to do this, which should come in handy soon.

We've both been reading general LLVM documentation and keeping up with the Rust mailing list and IRC channel.

## What to do next

### Shorter-term plans

- Decide on a syntax for vectorized loops (something along the lines of a #simd pragma) and write some "fantasy land" programs that use it. Check these into the test suite with the xfail-test directive turned on (that is, set them to be ignored). (Deadline: October 24)
- Implement lexer/parser support for that syntax in Rust, and figure out how the "hey, this loop ought to be vectorized!" information gets passed through the AST. Presumably some additional (optional) information should hang off the expr_for or expr_for_each AST nodes (the AST is defined in https://github.com/lkuper/rust/blob/vectorization/src/comp/syntax/ast.rs). (Deadline: October 31)
- Learn about vector types in LLVM so we can start to get a sense of what kind of LLVM codegen we ought to be doing. (Deadline: October 31)
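To make the first item concrete, here's a rough sketch of what a "fantasy land" program might look like. Nothing here is decided: the #[simd] attribute name, its placement, and the loop form are all placeholders, and this won't compile with any existing Rust compiler.

```rust
// Fantasy syntax -- none of this is implemented yet.
// #[simd] is a hypothetical pragma marking the loop as vectorizable.
fn main() {
    let xs = [1.0f, 2.0f, 3.0f, 4.0f];
    let ys = [5.0f, 6.0f, 7.0f, 8.0f];
    let mutable zs = [0.0f, 0.0f, 0.0f, 0.0f];

    #[simd]
    for each i: uint in uint::range(0u, 4u) {
        // Each iteration is independent, so the compiler should be
        // free to process several elements per machine instruction.
        zs[i] = xs[i] + ys[i];
    }
}
```

Programs like this, checked in under xfail-test, would give us a target to aim the parser and codegen work at.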

### Longer-term plans

- Actually implement this codegen in https://github.com/lkuper/rust/blob/vectorization/src/comp/middle/trans.rs. The trans_for and trans_for_each functions are the ones that translate expr_for and expr_for_each AST nodes, respectively. They're pretty hairy; this is going to be the hardest part of the project and will take most of the semester. (Tentative deadline: December 2)
  - To be determined: do we need to add another kind of AST node to deal with parallel constructs?
- Write some benchmarks that seem likely to benefit from data parallelism and implement them in Rust. Create a directory under https://github.com/lkuper/rust/blob/vectorization/src/test/ as a place for these to live. Then run them, time them, and compare vectorized and unvectorized versions. (Tentative deadline: December 9)
  - Some people (specifically, Patrick Walton on the Rust team) have done a lot of profiling of Rust code -- it would be a good idea to find out how that was done.
  - Writing these benchmarks wouldn't be a bad way for JP to get started writing Rust programs.
  - We shouldn't underestimate how long this part of the project could take -- it could easily expand to fill all the time we give it.
  - Comparing against vectorized/unvectorized code in another language (probably C/C++) would tell us whether we get a comparable speedup. (We can't necessarily expect comparable absolute performance, but a comparable relative speedup would be nice.)
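For the codegen item, LLVM's first-class vector types are the obvious target. As a rough sketch (hand-written, not anything our compiler emits yet), the IR for one vectorized step of an element-wise float addition might look something like this, using the hypothetical function name @add4:

```llvm
; Sketch: one vectorized step of zs[i] = xs[i] + ys[i],
; processing four floats per instruction.
define void @add4(<4 x float>* %xs, <4 x float>* %ys, <4 x float>* %zs) {
entry:
  %a = load <4 x float>* %xs          ; load four floats at once
  %b = load <4 x float>* %ys
  %sum = fadd <4 x float> %a, %b      ; one SIMD add covering all four lanes
  store <4 x float> %sum, <4 x float>* %zs
  ret void
}
```

The job of trans_for/trans_for_each would then be to emit loads, arithmetic, and stores over `<n x ty>` vector values like these, plus the scalar cleanup loop for trip counts that aren't a multiple of the vector width.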

## Integration into mainline Rust

For the time being, we'll both just commit our work to https://github.com/lkuper/rust/tree/vectorization. Every now and then, we'll fetch and merge from https://github.com/graydon/rust to keep our work in sync. At some point in the future, we can make a pull request to integrate what we're doing into mainline Rust; that will probably require more discussion, and we don't know yet whether the Rust team will want it. Either way, we're hoping to learn something from the experience.