Testing Notes - HarlanH/julia GitHub Wiki
Notes on Julia's testing facility and thoughts on improvements
Now
Julia core currently has
- throw/try/catch
- error/exception
- assert function, assert macro (captures expression for output)
Julia test/runtests.jl currently has
- runtests(filename), which does a pretty ANSI prompt then a load(filename)
- assert_approx_eq(a,b) macro/fn, which confirms that a and b are equal to within a fixed tolerance of 1e-6
- timeit(expression,name) which runs the expression 5 times and prints the minimum elapsed time
- assert_fails macro that fails if the expression doesn't throw an exception at all
- if called via "julia runtests.jl file.jl", will run runtests(file.jl)
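For reference, the helpers listed above could be re-created roughly as follows. This is a minimal sketch in current Julia syntax, not the code in test/runtests.jl: the names approx_eq and fails, the keyword arguments, and the use of zero-argument functions instead of macros are all assumptions made for illustration.

```julia
# Hypothetical stand-ins for the helpers described in the notes.

# True when a and b differ by at most tol (the notes mention a fixed 1e-6).
approx_eq(a, b; tol=1e-6) = abs(a - b) <= tol

# Call the zero-argument function f n times, print and return the
# minimum elapsed wall-clock time in seconds.
function timeit(f, name; n=5)
    best = Inf
    for _ in 1:n
        t0 = time()
        f()
        best = min(best, time() - t0)
    end
    println(name, ": ", best, " s")
    return best
end

# True only if calling f throws an exception of any kind.
function fails(f)
    try
        f()
    catch
        return true
    end
    return false
end
```

A call such as timeit(() -> sum(1:1000), "sum") reports the best of five runs, which is less sensitive to one-off noise than an average.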
Julia test/Makefile (and related Makefiles) currently has:
- process for using runtests.jl to pretty-print test files that ran successfully
Goals
- ability to run a test suite outside of the Julia build environment (i.e., test functions are in core, not in a separate directory, and they don't rely on Make)
- support for setup/breakdown of temporary objects
- failures cause an error message to be printed, but not an uncaught error to be thrown
- should be inspired by PyUnit, perhaps with contributions from test_that and others
- simple
Things to perhaps use from PyUnit
- separation of test running from output
- explicit setup/cleanup
- test discovery (execute all test_*.py files below specified directory)
Things to perhaps use from test_that
- hierarchy of context(), test_that(), expect_that() (rename these, though?)
- the expect_that(2+2, equals(4)) syntax and set of built-in relations is pretty good and easy to use
Juliaish things
- probably tests should be run by a producer, and output be generated by a consumer
- probably macros would make even cleaner syntax: expect_that(2+2, @equals 4), expect_that(f(x), @is_false), expect_that(show(z), @prints_text "hi mom"), etc.
- could maybe even do: @expect_that 2+2 equals 4, @expect_that f(x) is_false, @expect_that show(z) prints_text "hi mom"?
Thinking this through
- Julia doesn't do lazy evaluation, so the outer call has to be a macro, to do things like capture output and indicate what failed. But equals, is_false, etc., probably can't be bare words, because otherwise @expect_that will have variable arguments, which is not going to work. Probably need @expect_that 2+2 equals(4). The first argument gets evaluated, with its result, stdout/stderr, and exceptions all being captured. The second argument gets evaluated, then gets parsed to determine what the result from the first argument is compared to.
- First (observed) argument evaluation results get put in a TestObserved type with slots for each type of output.
- expect_that internally generates a TestResult type
- expect_that does a produce(tr::TestResult). So, there's a Task that executes each file in a list. Within each file, along with setup/shutdown code, there are these implicit producers that generate the output to the consumer.
- context("asdf") or @context "asdf" should probably just do tls(:context, "asdf"). The process in expect_that that generates the TestResult then just does tls(:context) to fill a slot.
- I don't much like test_that's syntax for groups of expectations, although I like the idea: test_that("the sky is blue", { expect_that(...) }). What's wrong with just another labeler? @testing "the sky is blue" seems good enough to me.
- Currently thinking the labelers should be test_context and test_group, and maybe the actual test should be test_that? Jeff suggested minimizing macros, so @test_that would be the only one...
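The labelers and the observed/result types discussed above might be sketched like this. It is only an illustration under stated assumptions: the field names, the helpers observe, record and current_label, and the use of task_local_storage in place of tls are not from the notes, and capturing stdout/stderr is left out to keep the sketch short.

```julia
# Hypothetical types; the field names are assumptions.
struct TestObserved
    value::Any       # result of the expression, or nothing if it threw
    exception::Any   # the caught exception, or nothing
end

struct TestResult
    context::String
    group::String
    expr::String
    passed::Bool
end

# Labelers only store a string in task-local storage.
test_context(name) = task_local_storage(:context, name)
test_group(name) = task_local_storage(:group, name)

# Read a label back, defaulting to "" when none has been set.
current_label(key) = get(task_local_storage(), key, "")

# Run a zero-argument function, capturing either its value or its exception.
function observe(f)
    try
        return TestObserved(f(), nothing)
    catch err
        return TestObserved(nothing, err)
    end
end

# Combine an observation with a one-argument predicate into a TestResult,
# filling the context and group slots from task-local storage.
function record(expr_text, obs::TestObserved, predicate)
    passed = obs.exception === nothing && predicate(obs.value)
    return TestResult(current_label(:context), current_label(:group),
                      expr_text, passed)
end
```

Because the labels live in task-local storage, a test file run inside its own task cannot leak its context into another file's results.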
So, a simple file might look like:
test_context("String Processing")
setup = "The quick brown fox jumps over the lazy dog."
test_group("whitespace trimming functions")
@test_that strip("\t  hi   \n") equals("hi")
@test_that strip("hi") equals("hi")
@test_that strip("") equals("")
test_group("string length and size functions")
@test_that length("hi mom") equals(6)
@test_that length("") equals(0)
teardown = "noop"
Executing tests
- runtests("filename", consumer)
- runtests(["f1", "f2", "f3"], consumer)
- runtests("dirname", consumer) -- recursively find test_*.jl files within the specified directory
- runtests(consumer) -- default is as above, with the current working directory
- consumer is a function name, defaulting to a simple text-based outputter of TestResults
- julia -e "runtests()" should work
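The discovery step behind these call forms could look something like the following. The names find_test_files and resolve_targets are hypothetical, and this sketch only collects paths; it does not run anything.

```julia
# Collect every test_*.jl file below dir, sorted for a stable run order.
function find_test_files(dir::AbstractString=pwd())
    found = String[]
    for (root, _, files) in walkdir(dir)
        for f in files
            if startswith(f, "test_") && endswith(f, ".jl")
                push!(found, joinpath(root, f))
            end
        end
    end
    return sort(found)
end

# Normalise the argument forms listed above into a vector of file paths:
# a single file, a directory to search, a list of either, or nothing at all.
resolve_targets() = find_test_files(pwd())
resolve_targets(path::AbstractString) =
    isdir(path) ? find_test_files(path) : [String(path)]
function resolve_targets(paths::AbstractVector)
    out = String[]
    for p in paths
        append!(out, resolve_targets(p))
    end
    return out
end
```

Keeping discovery separate from execution means the same list of paths can be handed to whatever task ends up producing the TestResults.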
outputters
- the consumer argument should be a function that takes a Task object and consumes TestResult objects until they're gone, generating some sort of output
- the default method will display the context and one . per successful test, or an E for an unsuccessful test. At the end, all failed tests will be output, along with summary stats.
- other outputters could use graphical displays or activate lava lamps or whatever.
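A default text outputter along these lines might be sketched as below. One loud assumption: current Julia releases no longer provide produce/consume on tasks, so this sketch uses a Channel as the stand-in for the producer task. The TestResult fields, text_outputter, and demo_producer are all hypothetical.

```julia
# Hypothetical result type; the fields are assumptions.
struct TestResult
    context::String
    expr::String
    passed::Bool
end

# Consumer: drain the channel, print "." or "E" per result, then list the
# failures and print summary counts.
function text_outputter(results::Channel{TestResult}; io::IO=stdout)
    failed = TestResult[]
    total = 0
    for r in results
        total += 1
        print(io, r.passed ? "." : "E")
        r.passed || push!(failed, r)
    end
    println(io)
    for r in failed
        println(io, "FAILED [", r.context, "] ", r.expr)
    end
    println(io, total - length(failed), " passed, ", length(failed), " failed")
    return (total=total, failed=length(failed))
end

# Producer: a task bound to a channel. The results here are hard-coded for
# illustration; a real runner would execute test files instead.
function demo_producer()
    return Channel{TestResult}(32) do ch
        put!(ch, TestResult("String Processing", "strip(\"hi\") == \"hi\"", true))
        put!(ch, TestResult("String Processing", "length(\"\") == 1", false))
    end
end
```

Calling text_outputter(demo_producer()) prints one character per result followed by the failure list. Any other outputter only needs to accept the same channel, which keeps test running separate from output.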
expectations
If test_that evaluates its first argument and collects the results in a TestObserved type object, then the expectation functions return a closure/function of one argument that takes a TestObserved object and returns a TestResult.
Tentative expectations:
- is_true, is_false
- is_close_to -- tests using some numeric slop
- equals / is_identical_to -- tests using isequal
- is_a -- tests using isa
- matches -- string regex match
- prints_text -- stdout matches string (and/or regex?)
- throws_error
- takes_less_than-- performance testing
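To make the closure idea concrete, here is one way a few of these expectations could be written. This is a sketch under assumptions: the TestObserved and TestResult fields, the expectation helper, the message strings, and the default tolerance are invented for illustration, and takes_less_than is omitted.

```julia
# Hypothetical types; one slot per kind of captured output.
struct TestObserved
    value::Any
    output::String
    exception::Any
end

struct TestResult
    passed::Bool
    message::String
end

# Helper: wrap a predicate on the observed value into an expectation, i.e. a
# function of one TestObserved that returns a TestResult.
function expectation(pred, description)
    return function (obs::TestObserved)
        if obs.exception !== nothing
            return TestResult(false, "threw $(obs.exception); expected $description")
        end
        ok = pred(obs.value)
        return TestResult(ok, ok ? "ok" : "got $(repr(obs.value)); expected $description")
    end
end

equals(x) = expectation(v -> isequal(v, x), "a value equal to $(repr(x))")
is_a(T::Type) = expectation(v -> isa(v, T), "a value of type $T")
is_close_to(x; tol=1e-6) =
    expectation(v -> abs(v - x) <= tol, "a value within $tol of $(repr(x))")
matches(re::Regex) = expectation(v -> occursin(re, v), "a string matching $(repr(re))")
is_true() = expectation(v -> v === true, "true")
is_false() = expectation(v -> v === false, "false")

# These two inspect other slots of the observation rather than the value.
throws_error() = (obs::TestObserved) ->
    TestResult(obs.exception !== nothing,
               obs.exception !== nothing ? "ok" : "expected an exception")
prints_text(s::AbstractString) = (obs::TestObserved) ->
    TestResult(obs.output == s,
               obs.output == s ? "ok" : "printed $(repr(obs.output)); expected $(repr(s))")
```

With this shape, equals(4) is evaluated eagerly to a closure, and the test macro only has to apply that closure to whatever it observed.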
Note: Boy, namespaces would be useful here. In theory, I'd like it if runtests were in the core namespace, but when it runs, it imports all of the expectations, so they can be used transparently without the user having to do anything.
Parsing Expressions vs Expectations
Stefan suggests using Julia's ability to manipulate expressions as a way of improving output with the funny syntax of expectations. So @test f(x) == 7 gets processed by the macro three ways:
- It gets evaluated to see whether the test succeeds or not.
- The string is kept for possible output purposes.
- The AST is examined and compared against several built-in standard forms. These conditions allow the parts of the expression to be separately evaluated and stored for output purposes.
In this case, the logic is something like:
if ex.head == :comparison
  # evaluate each side separately so both values can be reported on failure
  result.lhs_eval = eval(ex.args[1])
  result.rhs_eval = eval(ex.args[3])
end
Other built-in cases could intelligently deal with thrown exceptions and so forth.
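A macro in this spirit might be sketched as follows. Two assumptions to flag: current Julia releases parse a single binary comparison such as f(x) == 7 as a :call expression with the operator as its first argument, so the sketch matches on that form instead of :comparison. The names @check, COMPARISONS, and the TestResult fields are hypothetical.

```julia
# Hypothetical result type holding the expression text and both sides.
struct TestResult
    expr::String
    passed::Bool
    lhs::Any
    rhs::Any
end

# Operators treated as one of the "standard forms" worth taking apart.
const COMPARISONS = (:(==), :(!=), :(<), :(<=), :(>), :(>=), :isequal)

macro check(ex)
    text = string(ex)   # kept for output purposes
    if ex isa Expr && ex.head == :call && length(ex.args) == 3 &&
       ex.args[1] in COMPARISONS
        op, lhs, rhs = ex.args
        return quote
            # evaluate each side once, in the caller's scope
            local l = $(esc(lhs))
            local r = $(esc(rhs))
            TestResult($text, $(esc(op))(l, r), l, r)
        end
    end
    # Fallback: no recognised form, so only the overall value is recorded.
    return :(TestResult($text, $(esc(ex)), nothing, nothing))
end
```

For a failing test such as @check f(1) == 7, the result carries the evaluated left-hand side as well as the expression text, so an outputter can report both the expression and the value it actually produced.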