Journal: Daily - prl-julia/julia-type-stability GitHub Wiki
Winter 2023
2023-12-04
Jan is improving his parser.
Artem added a switch to enable printing of type stability checks in julia-sts. Jan wants the output to be reformatted so that it composes with his parser. So, for a method foo being checked against input type (Int, Bool), Jan wants function foo(::Int, ::Bool).
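A minimal sketch of the requested output format, assuming a check is identified by a method name and a tuple type of inputs. The helper `signature_string` is hypothetical and not part of julia-sts.

```julia
# Hypothetical helper: render a check as a function header, one `::T` per input type.
function signature_string(name::Symbol, argtypes::Type{<:Tuple})
    params = join(("::" * string(T) for T in fieldtypes(argtypes)), ", ")
    return "function $name($params)"
end

println(signature_string(:foo, Tuple{Bool, String}))
```

Note that `string(Int)` prints the concrete alias target (`Int64` on a 64-bit machine), so a parser consuming this format would see `::Int64` rather than `::Int`.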
2023-11-27
Jan & Jan are working on new approaches to exploring Julia's subtyping hierarchy, which remains unwieldy.
Jan J repurposed julia-sts' exploration tool, originally meant to discover method definitions in a given Julia module so that a user can "check the module for type stability". The update turns it into a method and type dumping tool, dubbed The Dumper.
Jan V wrote a parser in Java for the format of The Dumper to make it handier to explore the types. Jan V thinks that it should be possible to start generating type stability checks from these data alone. Artem is sceptical because those types will still have plenty of existentials.
An interesting idea Jan V floated at the meeting for fighting under-constrained types: many possible input types are just wrong, and the type inferencer will return Union{}. Imagine you have a "valid" input type (Artem: e.g. from tests). Is there a way to leverage that to make sure that we don't go for wrong inputs? Perhaps we should look for types that are "close" to the good one in the hierarchy? A neat example with a function taking Vector{T} where T and a "known good" type T=Int is easy to construct...
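One possible construction of such an example, as a sketch. The function `first_plus_one` is made up for illustration; `Base.return_types` asks the inferencer for the return type under a given input type.

```julia
# A method that only makes sense for element types supporting `+ 1`.
first_plus_one(v::Vector{T}) where T = v[1] + 1

# "Known good" instantiation T = Int: inference yields a concrete type.
good = Base.return_types(first_plus_one, (Vector{Int},))

# A "wrong" instantiation T = String: no `+(::String, ::Int)` method applies,
# so inference concludes that the call can never return.
bad = Base.return_types(first_plus_one, (Vector{String},))

println(good)
println(bad)
```

Both inputs are accepted by the signature, but only the first one is a meaningful candidate for a stability check.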
Jan V's repository (see above) also has the beginning of a paper.
Summer 2023
Artem's graduating in early Fall and moving to Purdue, so there'll be a break.
2023-07-21: [meta] moving to the new prl-julia server
Spring 2023
2023-05-04: resurrecting types from top-10 works (incl. parametrics)
After some fiddling with the algorithm for figuring out the package/module that a type originated from, all ~10K types from top-10 packages seem to be resurrecting all right.
2023-05-01: resurrecting parametrics
When I said "hooking up the DB into enumeration engine", I forgot that currently I have an intermediate step to try out: just resurrecting types (julia-sts/scripts/resurrect-types.jl). An attempt at resurrecting a DB from 10 packages quickly showed that parametrics are not really supported there. Need to work on it.
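A rough sketch of what resurrecting could look like, assuming it means turning a recorded, fully qualified type name back into a type object. This is not the code from resurrect-types.jl; the `resurrect` helper and its evaluation-in-`Main` strategy are assumptions for illustration.

```julia
# Hypothetical helper: parse the printed form of a type and evaluate it
# in a module where the names are expected to be visible.
resurrect(name::AbstractString, mod::Module = Main) = Core.eval(mod, Meta.parse(name))

ground     = resurrect("Int64")
parametric = resurrect("Vector{Int64}")
qualified  = resurrect("Base.RefValue{Int64}")

# The module a type originated from can be recovered with `parentmodule`.
println(parentmodule(Base.RefValue))
```

Types from packages would additionally require the defining package to be loaded before evaluation, which is where knowing the originating module matters.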
2023-04-28: Types DB for 10 packages
Trying to build a DB for 10 packages: currently segfaulting on Knet while doing the analysis (after the test suite is done), in particular inside type inference (and even more precisely: in subtype.c!).
I had to update Knet and Flux because the versions pinned in the 2021 paper didn't install anymore with today's Julia. As a countermeasure now, I may need to revert the version of Julia... In the meantime, proceeding without Knet...
I really need to start hooking up the DB into StabilityCheck.
2023-04-26: Parametric Types for Types DB
Finally, collecting parametrics is shipped in 2eebf34. Figuring out the support in the resurrect script...
Currently even the collection of types is failing after I moved to the server. Some sort of environment problem (Julia version, LLVM version, etc.).
2023-04-25
Git histories project
Over at ulysses4ever/julia-sts, Honza is processing Git histories of packages to find out about changes in stability over time. He found a curious commit to Plots.jl where 3 lines "broke" stability of 21 methods:
95d1fa00019172ec598ce3f214d1e69adcc0e1a2 Set max_methods=1 (#4010)
diff --git a/src/Plots.jl b/src/Plots.jl
index fcf76991..5717a37e 100644
--- a/src/Plots.jl
+++ b/src/Plots.jl
@@ -3,6 +3,9 @@ module Plots
if isdefined(Base, :Experimental) && isdefined(Base.Experimental, Symbol("@optlevel"))
@eval Base.Experimental.@optlevel 1
end
+if isdefined(Base, :Experimental) && isdefined(Base.Experimental, Symbol("@max_methods"))
+ @eval Base.Experimental.@max_methods 1
+end
See https://github.com/JuliaPlots/Plots.jl/pull/4010 and, more importantly, the upstream PR about module-wise max_methods: https://github.com/JuliaLang/julia/pull/43370.
Types DB project
In this repo, I'm finishing filtering existentials when creating a Julia types DB. The DB is supposed to be used by julia-sts to sample subtypes of Any (or instantiations of type variables).
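For contrast, a sketch of the naive alternative to a types DB: walking the declared hierarchy with `subtypes` and keeping the concrete leaves. The helper `concrete_subtypes` is hypothetical; it does not instantiate parametric types, which is one reason a DB of types observed in practice is useful.

```julia
using InteractiveUtils: subtypes

# Hypothetical helper: collect concrete types declared below T.
# Parametric (UnionAll) types are neither concrete nor expanded here.
function concrete_subtypes(T::Type)
    isconcretetype(T) && return Type[T]
    result = Type[]
    for S in subtypes(T)
        append!(result, concrete_subtypes(S))
    end
    return result
end

signed = concrete_subtypes(Signed)
println(signed)
```

The result depends on which packages are loaded, since `subtypes` only sees types that currently exist in the session.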
2023-03-22
Spent the whole day on fixing CI. On the way:
- learned about an annoying change in the Julia API for Methods (https://github.com/JuliaLang/julia/pull/49071);
- found a Julia nightly regression on Julia's GitHub action (https://github.com/julia-actions/julia-runtest/issues/76);
- started with juliaup; annoyingly, they don't have nightly.
2023-03-21
- We just merged the collect-types branch that adds another table: types seen during tests (stability-stats-intypes.csv) 464b43. We were wondering whether we need to record the modules that host those types, but it seems like the types are always fully qualified, unless they come from the Core module (in which case there is no need for extra qualification anyway).
- I improved batch processing of packages: output logs are stored in a file if the pipeline is run in batch mode (befaec9).
Meeting
- We eyeballed the union of types collected from top-10 packages. There are relatively few groups:
  - simple ground types;
  - simple parametric types instantiated with ground types;
  - big parametric types;
  - anonymous type variables (var"#5#6" stuff, e.g.: Chain{Tuple{var"#183#217", var"#184#218", var"#185#219"}}) and generally unbounded variables in instantiations;
  - bounded abstract instantiations (e.g. AbstractArray{<:Number});
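A sketch of how the ground groups in the list above could be told apart from the existential ones mechanically. The predicate `has_existential` is hypothetical; it treats a type as existential if a `UnionAll` occurs anywhere inside it, which is one possible definition and not necessarily the one used for the DB.

```julia
# Hypothetical predicate: does a UnionAll occur anywhere in the type?
has_existential(::Any) = false                     # non-type parameters, Union{}, Vararg
has_existential(::UnionAll) = true
has_existential(T::Union) = has_existential(T.a) || has_existential(T.b)
has_existential(T::DataType) = any(has_existential, T.parameters)

for T in (Int, Vector{Int}, AbstractArray{<:Number}, Vector{Vector{<:Number}})
    println(T, " => ", has_existential(T))
end
```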
June 2021
2021-06-09
My plan is to look into JuMP functions and write a report for the past week's findings.
2021-06-08
Implemented a fix for JuMP.
Gadfly and Pluto have no data, so they're next to fix.
Implemented a fix for both: my bug in the stats logic (some methods don't have the return types metrics because we failed to type-infer).
2021-06-07
This issue about printing types from method instances seems relevant to the current problem with JuMP: "Error showing value of type: type TypeVar has no field var" - error displaying MethodInstance encountered from MathOptInterface
2021-06-02
Many interestingly stable methods are indeed parametric. The main examples are: maps, identities of a sort, e.g. smart constructors (i.e. wrapping the arguments into an instance of a polymorphic struct).
Several packages are failing since last update (added registering of input and return types): JuMP, Gadfly.
2021-06-01/02 more "interestingly" stable methods
Continue the list of Interestingly Stable Methods: Flux and DiffEquations.
May 2021
2021-05-25 "interestingly" stable methods
Observation: stable methods can be divided into two large groups: ones that always return the same type, and the rest. Same-return-type methods seem less interestingly stable. The others, more polymorphic ones, may have something in common. Light inspection suggests that they behave parametrically.
Conjecture 1: most stable methods are uninteresting (always return the same type).
Conjecture 2: most interestingly stable methods (ones that return more than one type on different calls) are "parametric" for some definition of this word.
Inspecting JSON.jl and putting observations in the list of Interestingly Stable Methods.
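To make the two groups concrete, here is a small sketch. It uses `Base.return_types` as a stand-in for the stability check (the pipeline itself works on dynamically recorded instances), and all function names are made up for illustration.

```julia
# Stand-in check: stable if inference yields a concrete return type.
is_stable(f, argtypes) = all(isconcretetype, Base.return_types(f, argtypes))

always_zero(x) = 0                                  # uninterestingly stable: same type every time
wrap(x) = Some(x)                                   # interestingly stable: a "smart constructor"
sign_or_text(x) = x > 0 ? 1 : "non-positive"        # unstable: result type depends on the value

println(Base.return_types(wrap, (Int,)))
println(Base.return_types(wrap, (String,)))
println(is_stable(sign_or_text, (Int,)))
```

Here `wrap` returns a different concrete type per input type, which is the parametric behaviour the conjecture refers to.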
Apr 2021
@nospecialize
2021-04-12 A Method stores a bitmap field nospecialize marking nospecialized parameters.
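A small sketch of reading that field. The function names are made up; `nospecialize` is an internal field of `Method`, so its exact encoding may differ between Julia versions, and the code below only checks whether any bit is set.

```julia
with_marker(@nospecialize(x), y) = 1      # marker on one parameter
without_marker(x, y) = 1                  # no marker

function all_marked(args...)
    @nospecialize                         # bare form: applies to all parameters
    return length(args)
end

# Internal field access; an assumption about the current Method layout.
mask(f) = first(methods(f)).nospecialize

println(mask(with_marker))
println(mask(without_marker))
println(mask(all_marked))
```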
I tried grepping for nospecialize in top-10 packages; there's not much, but there's some (3 packages had it). I figured connecting the results of a grep with the results of dynamic analysis would take a non-trivial amount of time, so I added this feature into the dynamic analysis. Luckily, it was easy to do. Dynamic analysis captures more than just stuff from the packages but also from their dependencies, so there's a bit more of this nospecialize business.
I took the one package that had it most (by both static and dynamic measure): Plots. I had 33 methods stable out of 75 that had the nospecialize marker. This is less stable than what we have for this package globally (76% of the methods recorded appear stable).
So far nothing very surprising... (except I learned that you can put @nospecialize as the first statement in the function body; this makes all params nospecialized. It is especially useful if you have a variable number of parameters (...); e.g. Plots uses it this way.)
Half of those 75 are either macros or codegen'ed stuff (e.g. via eval). For macros, Julia seems to default to no-specialize (maybe because it can generate a function with the most specific types at the call site?), i.e. users don't put the marker, it seems. For codegen, it's all sorts of weird stuff.
An interesting case is @nospecialize(f::Function), which makes a lot of sense. Recall that every function has a unique type. So, every higher-order function (receiving a function as an argument) will get a specialized version for every previously unseen argument!
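The uniqueness of function types is easy to observe directly; the names below are made up for illustration.

```julia
inc = x -> x + 1
dbl = x -> 2x

# Every function, named or anonymous, has its own singleton type under Function.
println(typeof(sin) === typeof(cos))
println(typeof(inc) === typeof(dbl))

# A higher-order function opting out of per-function specialization.
apply_once(@nospecialize(f::Function), x) = f(x)

println(apply_once(inc, 1), " ", apply_once(dbl, 1))
```

Without the marker, each of `inc`, `dbl`, `sin`, ... passed as `f` may trigger a separate compiled specialization of `apply_once`.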
Mar 2021
2021-03-08
- the pipeline seems to be unstable under parallel execution no matter what. Let's try sequential.
- (not) surprisingly, sequential works just fine!!!
- made histograms more granular (added more "bins").
2021-03-05
- Changing scatter-plot for histogram (I learned the hard way, i.e. reading the bug tracker, what the difference is).
- Move CSV printing out of test runner.
2021-03-01
- Figuring out what's wrong with Pluto and Gadfly. Failed to.
Feb 2021
2021-02-27
- Even more plotting: added titles for pictures (holding package names), created a collage:
  montage *.png -geometry 1200x800+0+0 -tile 4x2 all-by-size.png
  (montage comes from ImageMagick.)
2021-02-24
- More plotting. Solved yesterday's problem with :sym passed in a variable -- it requires wrapping in cols (for some reason...).
2021-02-23
Things I'd like to try at the moment:
- cleanup latex so that we don't have garbage in metadata
  Annoying. It breaks things because of a redefinition of \empty. Bisection was slow and painful :-( At least, I learned a bit how to latex in emacs. Also, how to magit there.
- generate graphs of possible correlations with stability programmatically
  Half-way through: I took the wrong package (Plots instead of Gadfly) and now it behaves weirdly: it doesn't want to accept a symbol (for a column name) in a variable, only literally (:sym).
- learn to count jumps to test the control-flow vs. stability conjecture
- work on formalism
2021-02-16
- Fixed constructors issue (again) 6fec7a2
- The source field in the Method is a compressed Julia IR, see @code_lowered.
  - Need to find a way to get to Julia's internal uncompressed_ir. This'd allow access to IR for subsequent control-flow analysis (should amount to computing the number of gotos).
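A sketch of the goto-counting idea using the public `code_lowered` reflection function as a stand-in for the internal `uncompressed_ir`. It assumes a Julia version where lowered branches are represented as `Core.GotoNode` and `Core.GotoIfNot` (1.6 or later); the helper and the sample functions are hypothetical.

```julia
# Hypothetical helper: count jump statements in the lowered IR of matching methods.
function count_gotos(f, argtypes)
    total = 0
    for ci in code_lowered(f, argtypes)
        total += count(st -> st isa Core.GotoNode || st isa Core.GotoIfNot, ci.code)
    end
    return total
end

branchy(x) = x > 0 ? 1 : 2     # one conditional jump expected
straight(x) = x + 1            # straight-line code

println(count_gotos(branchy, (Int,)))
println(count_gotos(straight, (Int,)))
```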
2021-02-03
- [Getting more detailed data] The CSV package conflicted with some packages under analysis, so I had to set up (yet another) fresh sandbox.
- [Looking into data more closely] What stands behind those numbers?
  Take Pluto.jl with 80K+ instances. Those came from ~7.5K methods, but only a handful of them were defined inside Pluto itself (most are just standard).
2021-02-02
- [@generated cleanup] finally, filtering @generated functions works.
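One way such a filter could be written, as a sketch: a method defined with `@generated` has its internal `generator` field set. This relies on the internal layout of `Method` and is an assumption, not necessarily how the pipeline does it; the function names are made up.

```julia
# A generated function: the body runs on argument *types* and returns an expression.
@generated function tuple_len(x::Tuple)
    n = length(x.parameters)
    return :( $n )
end

plain_len(x::Tuple) = length(x)

# Assumption: generated methods are exactly those with the `generator` field defined.
is_generated(m::Method) = isdefined(m, :generator)

println(is_generated(first(methods(tuple_len))))
println(is_generated(first(methods(plain_len))))
```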