testing monorepo - benclifford/text GitHub Wiki
monorepo contains "lots" of stuff, for some definition of "lots" - think "I don't want to recompile my linux kernel & reboot because I changed a typo in a doc comment in some random util". similarly, I only want to partially re-deploy components per commit.
how to separate this?
modules/packages? plenty of packaging systems exist, able to express dependencies. in my project, multiple ones are in use (python/pip, apt, cabal/stack, docker images)
how to know when a package has "changed" - what does it even mean to "change" in the context of building arbitrary commits? in the presence of package managers having their own caches? and not wanting to write "yet another" package manager?
in the case of eg apt, a particular build results in a single file, the dpkg, which we can put in some dependency-style directory; it is built "from scratch" each time. other things, like stack, want to manage their own cache (badly...) and need that cache for partial rebuilds to happen at a decent speed.
can nix deal with this? i think probably not, but I'm not sure - I think it is too much about defining its own packages rather than re-using what already exists.
given a workdir, perhaps allow each package (in dependency order) to decide if it needs rebuilding or not (due to a change), and if so force the rebuild.
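a minimal sketch of that idea, assuming a Merkle-style scheme where a module's identity is a hash of its own content ID plus the IDs of its dependencies - the `Module` shape, the `build` callable, and the hash choice are all my assumptions, not a design decision:

```python
# hypothetical sketch: walk modules in dependency order and decide rebuilds.
import hashlib
from dataclasses import dataclass, field

@dataclass
class Module:
    name: str
    content_id: str            # e.g. a git tree hash of the module's files
    deps: list = field(default_factory=list)

def module_id(m: Module) -> str:
    """Hash of this module's content plus the IDs of all its dependencies,
    so a change anywhere upstream changes this module's ID too."""
    h = hashlib.sha256(m.content_id.encode())
    for d in m.deps:
        h.update(module_id(d).encode())
    return h.hexdigest()

def rebuild_changed(modules, built_ids, build):
    """modules must already be listed in dependency order; built_ids maps
    module name -> the module_id we last built. Forces a rebuild of any
    module whose ID no longer matches."""
    for m in modules:
        mid = module_id(m)
        if built_ids.get(m.name) != mid:
            build(m)                  # force the rebuild
            built_ids[m.name] = mid
```

because the dependency IDs are folded into each module's own ID, touching an upstream module automatically invalidates everything downstream, which matches the "if we told you to build, all your downstreams rebuild" rule below.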
maybe also allow detection of "we ran a build but the output hasn't changed"? that's used in the beautilytics css generation, for example. but does that need to happen at this level? or is it sufficient to say "if we told you to build, we assume all your downstreams need rebuilding, and the individual downstream build tools can optimise rebuilding themselves"? i think the latter
also, tests are separate modules in the sense that we need to re-run the beautilytics/python-rest-server integration if either is rebuilt, not just if beautilytics is rebuilt (because the dependency is a test-time/run-time dependency)
trigger redeployment appropriately too - some stuff that is built is a deployment time dependency (eg if we rebuild the beautilytics dpkg, we need to redeploy that; and if we rebuild spork, we need to rebuild stuff that depends on it).
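one way to frame both of those: treat tests and deployments as nodes in the same graph as builds, triggered whenever any input was rebuilt. a sketch under that assumption - the edge structure and node names here are illustrative, only the component names come from these notes:

```python
# hypothetical sketch: propagate "was rebuilt" downstream to find every
# test/deploy action that must now run.
def downstream_actions(edges, rebuilt):
    """edges: mapping node -> list of nodes that depend on it.
    rebuilt: set of nodes whose build actually ran.
    Returns every downstream node (integration tests, redeploys, ...)
    that must now run."""
    to_run, frontier = set(), set(rebuilt)
    while frontier:
        node = frontier.pop()
        for dependant in edges.get(node, []):
            if dependant not in to_run:
                to_run.add(dependant)
                frontier.add(dependant)
    return to_run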
what does it look like for database stuff: if we add a migration, that goes into the beautilytics dpkg - does that mean when we redeploy beautilytics we need to redeploy everything that uses the database, because that's how we express that the database has been changed? if so, should "the database (schema)" be factored out into a separate module?
beware building an overly elaborate system here. KISS.
explicit goal for what I want to do is re-use existing build/packaging systems and drive those - because rich ecosystems are already fairly well packaged. don't want to re-package all that stuff. and the output artefacts all look different - might be a dpkg, might be a docker image ID in a local docker repo ... want a system that knows how to order these builds appropriately.
describe dependency graph of 'modules'. somehow know if those modules have 'changed' (like shake does, for example).
how do I know a module has "changed" with respect to what we already "have"?
Database which maps:
from: module content+dep ID - a hash (could be a git commit ID, or a hash of the file tree contents if we are outside git) of the current "module contents" plus the content/dep IDs for all dependencies.
to: "module build result" - which will be polymorphic, depending on the type of the module - might be a dpkg, might be a docker image ID, might be just True if it is something that isn't "built" directly (eg python source code, or a package like spork that is compiled as part of a downstream build).
- could these be shake oracles?
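a minimal sketch of that database, assuming a flat JSON file keyed by the content+dep ID - the field names, file format, and example dpkg path are all assumptions for illustration:

```python
# hypothetical sketch of the build-result database: key is the module's
# content+dep ID, value is whatever artefact that module type produces.
import json

class BuildDb:
    def __init__(self, path):
        self.path = path
        try:
            with open(path) as f:
                self.table = json.load(f)
        except FileNotFoundError:
            self.table = {}

    def lookup(self, module_id):
        """Returns the stored build result, or None if never built."""
        return self.table.get(module_id)

    def record(self, module_id, result):
        """result is polymorphic: a dpkg file path, a docker image ID,
        or just True for modules with no direct artefact."""
        self.table[module_id] = result
        with open(self.path, "w") as f:
            json.dump(self.table, f)
```

a lookup miss means "we have never built this exact content+deps combination", which is exactly the rebuild trigger described above.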
does a "deploy" look like this? it isn't "built" but when an upstream changes, the "build" action (which is 'redeploy') needs to happen...
UI should be running "make" in tld (with appropriate parameters for what the end goal is - eg deploy to production)
what stuff is written about this already?