Journal: Weekly

Feb 2021

2021-02-22

I looked some more into possible correlations between stability as I can measure it (the ratio of stable instances) and other metrics. Last time we discussed the number of jumps as a possibly correlated metric -- unfortunately, I haven't yet figured out how to analyse them (I have the code, I just need to learn how to process its output). Instead, I looked at the number of instances (a possible conjecture is that unstable methods could generate fewer specializations). Unfortunately, I didn't find anything interesting looking at a couple of packages. I probably need to learn how to generate graphs programmatically so I can look at more data easily. Right now I'm charting various things in Excel.

I stumbled across different "definitions" of stability in Julia's manual: ("just") a stable type of the output vs. all intermediaries being stable (here "stable" means concrete once the input types are concrete). I posted a question about the definition of stability on Julia's bug tracker and got somewhat confusing answers:

https://github.com/JuliaLang/julia/issues/39752

I think they suggest that @code_warntype (implementing the latter, stricter version of the "definition") is merely an over(?)-approximating tool that makes it easier for the user to get to the less strict version of the notion by showing them information about all types.

To get a better idea, I grepped a bit over the source code looking for stability-related things. No breakthrough yet: mostly short test cases related to various bug reports. One interesting pattern for testing stability concerns the @inferred macro, which, quoting the manual,

Tests that a call expression returns a value of the same type inferred by the compiler. It is useful to check for type stability.

https://docs.julialang.org/en/v1/stdlib/Test/#Test.@inferred

An example from the manual shows really well how it's supposed to be used, and it's representative of the test cases I saw:

julia> f(a) = a > 1 ? 1 : 1.0
f (generic function with 1 method)

julia> typeof(f(2))         # that's the actual return type
Int64

julia> @code_warntype f(2)  # this will show inferred type of the body
Variables
  #self#::Core.Compiler.Const(f, false)
  a::Int64

Body::Union{Float64, Int64} # here it is
1 ─ %1 = (a > 1)::Bool
└──      goto #3 if not %1
2 ─      return 1
3 ─      return 1.0

julia> @inferred f(2)
ERROR: return type Int64 does not match inferred return type Union{Float64, Int64}
[...]

julia> @inferred max(1, 2)
2

Other things I was doing mostly concern the text of the paper we started: setting up a repo; reading and commenting on your proposal for the formalism (I have some more comments in Slack); setting up CI for the repo so that we have a stable link to the current draft (required some GitHub Actions kung-fu):

https://www.ccs.neu.edu/~artem/paper.pdf

Jan 2021

2021-01-27

  1. Reviewing failures in the pipeline (including after some talk with Tim Holy over on Discourse). Result: I discard constructors now (see the first sketch after this list). This makes the numbers way cleaner (i.e. fewer failures). A small number of type inference failures remains, mostly from @generated functions. Maybe we should revisit @generated at some point.

  2. Figuring out how to restructure the stability data in relation to method sizes. Previously I keyed the data on function names, but a function can have several methods with completely unrelated implementations (including in terms of size). For now I decided to switch to keying on individual methods (see the second sketch after this list).

    Note that there are more layers in this hierarchy: every method can have several (compiled) method instances; every method instance can have several "specializations".

  3. I had some more thoughts about the formalism. In a nutshell, key components are:

    • a sound language with no static typing and dynamic dispatch on run-time types (no overloading), so every function call is either dynamically resolved or produces an error;
    • a type system that computes a static approximation of the types of arguments;
    • using the information computed by type inference, we can replace dynamic dispatches with static calls;
    • a proof that this transformation produces a program that simulates the original program.

    What remains to be worked out is how exactly "dynamic dispatch" and "static calls" look in the language.
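
As a side note on point 1, here is a minimal sketch of how constructor methods could be filtered out. This particular check is my assumption about how to detect constructors, not necessarily what the pipeline does:

# Hypothetical check: in a method signature Tuple{typeof(f), argtypes...},
# the first element is the callee; for constructors it is a Type{...}.
function is_constructor(m::Method)
    sig = Base.unwrap_unionall(m.sig)   # strip type variables to get at the Tuple type
    sig.parameters[1] <: Type
end

# Usage: keep only non-constructor methods, e.g.
# methods_to_analyze = filter(!is_constructor, all_methods)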
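
And for point 2, a sketch of what the method-keyed stats could look like; the type and field names here are made up for illustration:

# Hypothetical shape of the restructured data: stats keyed by Method objects
# rather than by function names.
struct MethodStats
    occurs :: Int   # number of compiled instances of this method
    stable :: Int   # how many of them are type stable
    size   :: Int   # some size measure of the method (e.g. number of IR statements)
end

# One record per method, so two methods of the same function no longer share an entry.
const StabilityData = Dict{Method, MethodStats}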

Notes from one-on-one with Jan

Look up how they (Julia) define type stability:

  • in the docs
  • in the impl

Do abstract types hurt stability as much as unions?

Notes from Juliette meeting

  • Ben: which kinds of types cause instability?
  • Ross: is it possible to record the frequency of stable calls? e.g. attach counters manually.
  • Ben: Maybe running times would be more useful? "real" apps?
  • Julia: how many unstable slots are there?
  • Jan: get an idea about the number of methods, not instances.
  • Jan on formalism: surface language:
    • every var is any
    • every call is dispatch
    • no ANF [Ben: having ANF there is easier; Jan: agreed]

Dec 2020

Week 15-21

Last week I've done some more work on harnessing the analysis of type stability. First, I had to figure out scoping rules with regard to modules: the MethodAnalysis package that I use to enumerate compiled methods requires you to know which module you're interested in and to have it evaluated (as a Module object). As trivial as it sounds, it wasn't completely straightforward. E.g. when writing tests for my code, I had to acknowledge that everything you define at the top level in Julia becomes part of the Main module, and if you then define a module (which I do for tests), its name becomes prefixed with Main (e.g. a module Foo becomes Main.Foo).
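
For instance (my own toy illustration, not from the actual test suite):

julia> module Foo
           f(x) = x + 1
       end
Main.Foo

julia> nameof(Foo), parentmodule(Foo)
(:Foo, Main)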

This worked out, and now I can define a module with several functions, call them so that Julia compiles them, and then call my analysis utility built on top of MethodAnalysis (and Julia's own @code_typed) to get back the stats. I designed a simple data structure for the stats: it holds the analyzed module and a map from function names to a pair of numbers: the number of compiled instances of the function and the number of stable instances among them.
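
Roughly, the structure looks like this (a sketch with made-up names, assuming the per-function counts described above):

# Sketch of the stats structure: the analyzed module plus, for every function,
# (number of compiled instances, number of stable instances).
struct ModuleStats
    modl  :: Module
    stats :: Dict{Symbol, Tuple{Int, Int}}   # function name => (instances, stable instances)
end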

Lastly, after figuring out modules, I went one level up and started designing a pipeline for processing packages. The basic scenario I'm implementing now: I have a package name as a string, and I want to:

  1. load the package in a clean environment (so that adding it to the environment will not conflict with anything else);
  2. run its tests;
  3. call the above-mentioned utility with the module named the same as the package. Steps 1)-2) went pretty smoothly using Julia's Pkg module. But 3) was tricky: first, I had to do some eval-fu again to get the module object, but that seemed to work; second, the analysis found no functions :-( The working hypothesis now is that Pkg.test (implementing step 2 above) forks a separate Julia process, and I can't get data about compiled methods out of it.
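
A sketch of this scenario (analyze_module stands in for the utility above; the exact calls are my approximation of what the pipeline does):

using Pkg

function process_package(pkgname::AbstractString)
    Pkg.activate(mktempdir())            # 1. a clean, throw-away environment
    Pkg.add(pkgname)
    Pkg.test(pkgname)                    # 2. runs the package's test suite
                                         #    (in a separate Julia process, hence the problem)
    @eval using $(Symbol(pkgname))       # bring the package's module into Main
    m = getfield(Main, Symbol(pkgname))  # 3. the Module object named after the package
    analyze_module(m)                    # placeholder for the MethodAnalysis-based utility
end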

Now, I have to figure out how to run package tests in the current process, so that I get access to the compiled methods.

Week 8-14

Last week we discussed a probable strategy to study type stability in Julia. The idea is to start by looking at how much stability there is in practice, i.e. measure how many type-stable calls we see in test suites of Julia packages. One thing we noted about performance is that it's hard to measure the impact of (in)stability (based on what I've seen with the example from a 2013 blog post).

I was adapting Julia's @code_warntype, which prints information detecting (in)stability, to return the result as a value rather than as console (color) output. This wasn't hard. I call the resulting tool is_stable_call. Just like @code_warntype, it accepts a method name and the tuple of actual arguments with which the method is being called.
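
For a rough idea of what such a check can look like, here is my own minimal sketch built on code_typed, not the actual is_stable_call implementation:

# Sketch: a call is considered stable if the inferred return type for the
# concrete argument types is itself a concrete type.
function is_stable_call_sketch(f, args...)
    argtypes = Tuple{map(typeof, args)...}
    results  = code_typed(f, argtypes)        # one (CodeInfo => return type) pair per matching method
    all(isconcretetype(rettype) for (_, rettype) in results)
end

# Example: is_stable_call_sketch(x -> x > 1 ? 1 : 1.0, 2) returns false

Note that this only checks the return type, i.e. the weaker of the two "definitions" mentioned above, whereas @code_warntype also shows the intermediate types.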

As to collecting information from packages, talking to Ben confirmed that Cassette.jl can fail on some packages (although Ben did not have a feeling for how many packages Cassette fails on). Instead he suggested working with another, potentially more robust package called MethodAnalysis.jl. It has a clean interface for querying Julia about the methods it has compiled so far. For every compiled method you can get the name and the types of arguments, so this can be connected to is_stable_call. It was a little tricky to extract that information: I had to open an issue about the Julia documentation of these things in the main Julia repository (MethodAnalysis.jl uses Julia's built-in types to represent compiled methods), got a hint from there, and submitted a fix for it.
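
A sketch of how MethodAnalysis could be hooked up to this (the field access via specTypes is my assumption about Julia internals; the real code may differ):

using MethodAnalysis

# Collect the signatures of everything Julia has compiled for a module so far.
function compiled_signatures(mod::Module)
    sigs = Type[]
    visit(mod) do item
        if item isa Core.MethodInstance
            push!(sigs, item.specTypes)   # Tuple{typeof(f), argument types...}
            return false                  # nothing to descend into below an instance
        end
        return true                       # keep descending through modules, functions, methods
    end
    return sigs
end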

My plans are to figure out how to carefully loop through packages and run all their test suites. Also, I'm thinking about which exact numbers to count (e.g. total number of methods compiled, ratio of stable ones, maybe ratio of stable methods per function, etc.).