Where Contributors Have Problems - oilshell/oil GitHub Wiki

This page documents where Oil contributors typically have problems. As of May 2022, the project has existed for 6 years, and GitHub says there have been 51 contributors. I would say at least 30 of them contributed code (as opposed to fixing typos, which is welcome).

Thank you for all the help!

The goal of this page is to attract the right kind of contributor, by being up front about what to expect.

1. Trying to use OS X to develop Oil.

I've noticed that many contributors try to work on OS X. I tried this at first too, but now I use a Linux VM under VirtualBox when I'm on OS X.

A primary reason is that Spec Tests run against many shells, some of which aren't built for OS X. A secondary reason is that it's not clear which versions of bash and Python are installed on OS X (although this is also an issue on Linux).
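
One quick way to see which versions are actually on a machine is a small shell function like the one below. This is only a sketch: the helper name show_version and the list of tools are assumptions for illustration, not something from the Oil repo.

```shell
# Hypothetical helper: print the first version line of a tool,
# or say that it is not installed.
show_version() {
  tool=$1
  if command -v "$tool" >/dev/null 2>&1; then
    # Some tools (e.g. Python 2) print the version on stderr, so merge it in.
    printf '%s: %s\n' "$tool" "$("$tool" --version 2>&1 | head -n 1)"
  else
    printf '%s: not found\n' "$tool"
  fi
}

# The tool names here are an assumption; adjust them for your machine.
for tool in bash python python3; do
  show_version "$tool"
done
```

Running it on both OS X and the Linux VM makes the differences visible before you start debugging a test failure.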

2. We Have a Shell-Based Workflow and Development Environment

I use vim, grep, and tmux to navigate the code. Everything is highly automated with shell scripts.

However, I've noticed that some people don't feel productive with this shell-based workflow. A lot of code is generated by custom tools, so the shell workflow is somewhat fundamental to the project. (That is, IDEs often don't work well with code generation.)

However:

  • There were times when the build scripts got messy, and I fixed them in response to contributor feedback. If you're having problems, please complain about it! :)
  • I'm not a zealot about this: I also use JetBrains CLion to debug C++, because I need a GUI to debug productively!
    • In a distant utopia, there will be a GUI that is complementary to shell ... I just don't want to lose language-oriented composition and low latency with that evolution.

I wrote Oil Dev Cheat Sheet to help with this problem. Let me know if you are confused by it!

Hint: The Code Mostly Has a 2-Level Structure

When I need to find something, I simply use grep commands like this:

$ grep 'oil:basic' */*.py

$ grep CommandParser */*.sh

ctags and vim also work well on the codebase, although I often forget to set them up! devtools/ctags.sh has some shell functions. I also use grep -n pattern */*.py > out.txt, then open out.txt in Vim and jump directly to locations. (Ctrl-])
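
The grep-to-a-file habit above can be wrapped in a function. This is a minimal sketch, assuming it is run from the root of the repo; the name grep_to_file is made up for illustration and is not one of the functions in devtools/ctags.sh.

```shell
# Hypothetical helper: search Python files one directory down
# (the 2-level layout) and save file:line:text matches to a file.
grep_to_file() {
  pattern=$1
  out=${2:-out.txt}
  # -n prefixes each match with its line number; with more than one
  # file argument, grep also prefixes the file name.
  grep -n -- "$pattern" */*.py > "$out"
}

# Usage, from the repo root:
#   grep_to_file CommandParser
#   vim out.txt
```

Vim's built-in gF command, pressed with the cursor on a file:line entry, is one way to open that location.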

3. A Very Test-Driven Workflow

The shell workflow is also test-driven. Interpreters and compilers are very intricate, so it's useful to isolate a specific piece of code with a failing test (often Spec Tests), and then make it go green.

Then make sure that nothing regressed in Soil, our CI system. These tests run on every commit, and they're published with every release on the quality.html page.

I find that this type of workflow makes collaboration easier, and progress very objective. However, I've noticed that not every contributor is comfortable with it.

I should add that I think it's valuable for any programmer to learn! It might seem slower in the short term, but it's faster in the long run.
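
To give a feel for the "failing test first" loop, here is a toy sketch that runs one snippet under several shells and compares stdout to an expected value. It is not the real Spec Tests framework; the function name compare_shells and its output format are assumptions made for this illustration.

```shell
# Toy illustration only: run a snippet under each listed shell and
# report PASS / FAIL / SKIP. Returns non-zero if any shell fails.
compare_shells() {
  code=$1
  expected=$2
  shift 2
  rc=0
  for shell_name in "$@"; do
    if ! command -v "$shell_name" >/dev/null 2>&1; then
      echo "SKIP $shell_name"   # shell is not installed on this machine
      continue
    fi
    # Capture stdout only; a non-zero exit from the snippet is tolerated.
    actual=$("$shell_name" -c "$code" 2>/dev/null) || true
    if [ "$actual" = "$expected" ]; then
      echo "PASS $shell_name"
    else
      echo "FAIL $shell_name"
      rc=1
    fi
  done
  return $rc
}

# Usage:
#   compare_shells 'echo hi' 'hi' sh bash dash
```

The idea is the same as in the real workflow: write the case so that it fails for the shell you are working on, change the code, and rerun until it passes.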

4. C and C++

There are no "guard rails" for oil-native! I had a lot of problems with the garbage collector and mysterious seg faults.

I spent a lot of time in the debugger.

I learned not to ignore certain compiler warnings! (e.g. issue 1128: -Wreturn-type)

A garbage collector is inherently unsafe code. The point of writing it is so that the rest of our code can be safe Python!

  • BUT for the initial stage of the NLNet grant project, I suggest we work on pushing the 1496 test/spec-cpp number up to the 1775 test/spec.sh number, without garbage collection (as I've been doing). This will move the project forward, and prevent us from getting stuck in the mud.

  • Update 2022-10: We now have fewer than 7,000 lines of C++ code in the whole project! This is compared to bash's 140,000+ lines of C. The goal was to keep the "unsafe core" of the project to a minimum, and I'm happy with the result! (e.g. compare with every Rust program using jemalloc -- more than 30K lines of hand-written C -- prior to 2018)

5. Abstraction and Metalanguages

This is a little fuzzy, but I've noticed that it can be a stretch for some programmers to view Python syntax as something other than Python.

We are thinking of it differently. It's a subset of statically typed Python, used as a metalanguage for Oil. This is so it can be translated to fast C++.

Oil Is Being Implemented "Middle Out" (March 2022)

  • However, if you make something cool work in Oil, in Python, you don't have to translate it to C++! (It often works with no effort, though this is not guaranteed.) You should send it to us first, and then we'll talk about the translation. I feel like not enough contributors have taken advantage of this nice property!

6. The Git Repo and Release Tarball Are Different

Several contributors had issues with this difference. We have both a "dev build" and "release build", which is explained on Contributing.

  • The dev build is very easy to make. You should be able to get bin/osh -c 'echo hi' running on a Linux machine in less than a minute.
  • Making the release build involves a big build process. It transforms the git repo into the release tarball. It's mostly automated in the Soil continuous build.
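
A quick way to confirm that a dev build works is to check the output of the command mentioned above. This is a sketch, assuming you are in the root of the git repo after making the dev build; the function name smoke_test is hypothetical.

```shell
# Hypothetical helper: succeed only if the given shell prints exactly "hi".
# The default path bin/osh is the one mentioned in the text above.
smoke_test() {
  osh=${1:-bin/osh}
  if [ ! -x "$osh" ] && ! command -v "$osh" >/dev/null 2>&1; then
    echo "not found: $osh" >&2
    return 1
  fi
  out=$("$osh" -c 'echo hi') || return 1
  [ "$out" = "hi" ]
}

# Usage, from the repo root:
#   smoke_test && echo 'dev build OK'
```

Passing a different path as the first argument lets you run the same check against another binary.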

Conclusion

If none of that fazes you, then you might be a great person to work on Oil! I'm still engaged with this project after many years because it's allowed for an extremely high pace of learning. I have become a much better programmer by working on it.

And of course the reason I started the project is that I became a much better programmer by automating everything with shell! I'm able to tackle many subprojects at once because everything is automated and written down.

When I stop working, my mind is able to forget the details of what I did that day. When I start working, my shell scripts show me where I left off. They tell me what to do next.

2022-09-06 Update

Our first "grant contributor" Jesse had problems with the build system and Soil CI. Some of that was due to needing to "unify" two separate C++ GC runtime experiments, which is inherently messy.

I made over a dozen fixes to each of them; see #oil-dev on Zulip.

  • #include paths are all relative and consistent, e.g. mycpp/common.h and _gen/frontend/syntax.asdl.h
  • Likewise, Python scripts run with PYTHONPATH=. from the root
  • Oil Ninja Build was greatly overhauled and is very incremental / parallel / fast
    • consistent _gen/ dir structure
  • {asdl,cpp,mycpp}/TEST.sh is more consistent (but could still use improvement)
  • Soil CI
    • Merge to master only on green from soil-staging. The master branch should not be broken!
    • It's more consistently containerized, and the UI is better. It points you to the command to reproduce locally.
    • Flakiness drastically reduced

2022-10 Update

The improved build system and CI seem to be working great for Melvin, who got up to speed very quickly.

The C++ unit test harnesses in */TEST.sh also run very quickly now -- under a second for an incremental run.

  • We practice thorough (and friendly) pull request review. We can do it over video conference if contributors prefer higher-bandwidth communication (and we've done it this way, although GitHub PRs are still the default).
    • Some contributors haven't worked this way before, and aren't used to "talking" so much about code. My view is that we should be able to "explain the code with a straight face" :-) If we can't, then it probably isn't right.
    • The goals are to improve the project, spread knowledge, and learn from one another!
  • We do a lot of code reading of other projects (bash, dash, CPython), and try to learn from their scars (e.g. race conditions in signal handlers)
    • I often post design ideas, references to code, and references to papers on #oil-dev on Zulip.

2023-04 Update

"There's Too Much to Read" / Zulip is Hard to Understand

I've gotten the feedback that keeping up with the project is difficult. There are many parts, and they touch many areas (the kernel, signals, terminals, parsing, data structures, GC, a boatload of test harnesses, a boatload of performance tools, etc.)

And there are many inter-related threads on Zulip about these things.

I agree this is a problem, but I would say a few things:

  1. We have ongoing lists of self-contained tasks for people to get started with. I just put some on #oil-dev > More Places to Get Started. So if a specific piece of work seems interesting, it's OK to just focus on that. Asking questions on Zulip is very welcome, and I might often point you to another thread about it!

  2. We're trying to achieve a coherent design, not one where things are bolted on haphazardly, and Zulip is actually useful in that regard. It allows deep linking and history. I sometimes dig up old threads from 2020, and they help with the project! I also get valuable feedback on Zulip.

  3. I think this problem is fundamental to many open source projects. If I were to try to follow everything going on in CPython or Rust, I would get very confused too!

Not Knowing Shell / Not Being Able to "Reverse Engineer" It

Many people agree with the idea of a new shell, YSH. We are trying to design a small and consistent language that has all the functionality of shell. It shares the same runtime as OSH, so this is possible.

But to work on the project, you'll have to know / learn some POSIX shell / bash, and be able to reverse engineer it. Many small details are not documented in the bash manual, for example. (FWIW I use a pretty restricted subset of bash myself, but I had to learn a lot more for this project. Only a few more things made it into my subset :) )


So even the experienced contributors have had problems with this. They've remarked that there are so many bash scripts in our repo!

But they also remarked that these scripts do a lot, often things that aren't available in other projects:

  • Keep the master branch green. We merge from soil-staging only when 12 different CI tasks pass -- see http://travis-ci.oilshell.org/github-actions
  • Run C++ unit tests with ASAN (memory safety), UBSAN (undefined behavior), 32-bit, etc.
  • Extensive spec tests against other shells
  • Being able to write typed Python, and getting C++ performance for free!

Recent Dev Friction

I keep track of the places where people have problems: #oil-dev > Dev Friction / Smells

And I summarized a few issues here: Recent Dev Friction