Unopinionated Philosophy - Lambda-Mountain-Compiler-Backend/lambda-mountain GitHub Wiki

λ☶ is a fragment assembler. As such, it lets you generate whatever binary file you want. The tool exists to transform data from a provided input into a desired output. The language exists to enable transformation, not to judge implementation.

Opinionated Perspective, Neutral Objective

LM aims to provide an intersection of language features rather than a union of features. Individuals involved in the project are not discouraged from having opinions or from advocating for them. However, when seeking consensus, they should understand that others' perspectives may be rooted in different values or assumptions. LM then becomes a playground for new and more diverse ideas, a foundation to build on when reaching for new and more difficult extremes.

Abstraction As Appropriate

LM provides high-level abstractions that may be familiar from functional or object-oriented programming. However, nothing is provided that can't be removed by the programmer. Abstraction is a burden that always needs to justify its cost.

Yak-herding Not Yak-shaving

Yak-shaving is a computer science community term similar in usage to "going on a wild goose chase". The story starts with a small task, which leads to a related dependent task, and eventually ends in the Himalayas, shaving a Yak. Yak-shaving describes a dependency chain that was unintended, undesired, and effectively unrelated to the original task.

However, the Himalayas can be quite nice this time of year. Also, Yaks are quite docile creatures and maybe you want to befriend a herd. Some people may enjoy tending to Yak business; who's to deny them that line of work?

LM encourages active cultivation of dependency chains. Pay the man for some Yak hair and be on your way. Don't ask why the Yak herder chooses to lead such a humble life. Sometimes software dependencies are hairy, and we all need something to work on.

Compile-Time Abstraction > Runtime Abstraction

LM explicitly encourages the compiler to take on responsibility for generating good code objects. If LM can do more, then that often means the resulting object code can do less.
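
As a rough illustration of the trade-off, here is a minimal constant-folding sketch in Python. It is not LM code and does not reflect how the LM compiler is implemented; the `Num`, `Var`, `Add`, `fold`, and `count_adds` names are hypothetical and exist only for this example. The folder does more work ahead of time, so fewer additions are left for the generated code.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Var, Add]


def fold(expr: Expr) -> Expr:
    """Evaluate every addition whose operands are both known ahead of time."""
    if isinstance(expr, Add):
        left, right = fold(expr.left), fold(expr.right)
        if isinstance(left, Num) and isinstance(right, Num):
            return Num(left.value + right.value)
        return Add(left, right)
    return expr


def count_adds(expr: Expr) -> int:
    """Count the additions that remain for the generated code to perform."""
    if isinstance(expr, Add):
        return 1 + count_adds(expr.left) + count_adds(expr.right)
    return 0


# x + (2 + 3): the inner addition does not depend on x.
program = Add(Var("x"), Add(Num(2), Num(3)))
folded = fold(program)
```

Here `program` contains two additions, while `folded` is `Add(Var("x"), Num(5))` and contains one.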

Much Like C

C is a portable assembler, and does a pretty good job at that. LM is a portable assembler, and tries to do a better job at that.

Things removed from C

weird syntax and AST quirks
text-based preprocessor

Things added to C

specialization
hygienic macros
functionally sound type system
generics / templates
truly platform agnostic output (MP3 / HTML / etc.)

Much like Perl

LM is a living language. Learning LM should be more than just a chore; it should be mind-opening. To become a great language, however, we first need to be a good language. To become a good language, however, we first need to be a mediocre language.

LM is definitely at the "mediocre" stage of development. There are lots of rough edges, but hopefully some of the potential will also show through. This is all part of being a living language. Things change, hopefully for the better.

Unlike [Net]Script

JavaScript was the first programming language to become widely adopted in web browsers. While JS was being developed there was a corporate arms race to add features but also to break competitors' interpreters. This mine-laying and sabotage created the often unfortunate feature set that we now call ECMAScript.

If LM contributors were transported back to that time, we might not know what to do differently. However, we are here now, and on friendly terms, so let's build something nice together.

What history has taught us moving forward is that corporations like to deliberately break stuff that may be of use to their competitors. Herein lies one of the fundamental technical contributions of LM, which is an ability to "mine sweep" effectively. Diverse tools and architectures create arbitrary hurdles to software development. LM provides the ability to declare architecture-specific concerns without burdening other programmers with more tedium or pain.

Libraries Exist to Support Common Transformations

The most opinionated that λ☶ will ever be is in deciding which libraries to support and encourage. Our process is entirely metrics-driven and whenever possible we try to quantify opinions. Patterns of code manipulation are only said to be desirable or undesirable when compared against some metric. If you disagree with any design decision, please try to quantify your objection. This helps move the discussion forward.

Principle of Nominal Trust

The type system also tries to be unopinionated to a certain extent. The language is strongly rooted in the semantics of System F<: with a small extension to support ad-hoc specialization. However, there are several possible ways to circumvent the type system.

LM is an assembler, with the programmer defining the instruction set at some level. The rest of the toolchain is then built on top of these low-level building blocks. This does mean that at some level the compiler needs to trust the programmer. This basis of trust is called the "Principle of Nominal Trust" because certain definitions are believed without further verification. One design goal is to minimize the surface area of this trust; however, it would be impossible to remove this boundary completely.
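
The idea of a trust boundary can be sketched in a few lines of Python. This is an illustration only, not LM's type checker: the `TRUSTED_SIGNATURES` table, the primitive names `load_u8` and `add_u8`, and the tuple-based expression format are all hypothetical. The checker verifies how primitives are combined, but it believes each declared signature without inspecting whatever fragment would implement it.

```python
# Hypothetical trusted table: each signature is believed as declared.
# Nothing in this sketch inspects the code that would implement these names.
TRUSTED_SIGNATURES = {
    "load_u8": (("Addr",), "U8"),
    "add_u8": (("U8", "U8"), "U8"),
}


def type_of(expr):
    """Return the type of an expression, checking calls against the table.

    An expression is either ("lit", type_name) or ("call", name, [args]).
    The type named by a literal is also taken on trust.
    """
    if expr[0] == "lit":
        return expr[1]
    _, name, args = expr
    if name not in TRUSTED_SIGNATURES:
        raise TypeError(f"unknown primitive: {name}")
    param_types, result_type = TRUSTED_SIGNATURES[name]
    arg_types = tuple(type_of(arg) for arg in args)
    if arg_types != param_types:
        raise TypeError(f"{name} expects {param_types}, got {arg_types}")
    return result_type


# add_u8(<U8 literal>, load_u8(<Addr literal>)) is accepted as a U8.
well_typed = ("call", "add_u8", [("lit", "U8"), ("call", "load_u8", [("lit", "Addr")])])
result = type_of(well_typed)
```

Everything above the table is checked; the table itself is the surface area of trust, and shrinking it means shrinking the number of entries that must be believed.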

No Hidden Layers

The compiler should be 100% user defined to the capacity of System F<: with Specialization. There may be some small conveniences added (like hygienic macros), but every transformation should be transparent to the user.

Bootstrap Features Exist to Enable the Compiler Functionality

Some features are considered "untouchable" and will never be removed entirely. These features are required by the internal compiler infrastructure and can't really be removed without crippling the tool itself. These features have no runtime cost. They are only exposed at compile time.

Platform Agnostic

LM compilation should produce the same output, no matter what platform you run it on. This means that:

Windows builds should be the same as Linux builds
sudo builds should be the same as user builds

This principle only applies to hermetic builds. Non-hermetic scripts or code objects can be very platform dependent.
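
One way to check this property from the outside is to compare digests of the outputs produced in different environments. The Python sketch below assumes you have already collected the build outputs as bytes; the `digest` and `same_output` helpers are hypothetical names for this example and are not part of any LM tooling.

```python
import hashlib


def digest(artifact: bytes) -> str:
    """Return the SHA-256 hex digest of a build artifact's bytes."""
    return hashlib.sha256(artifact).hexdigest()


def same_output(artifacts: dict[str, bytes]) -> bool:
    """Return True when every environment produced byte-identical output."""
    return len({digest(data) for data in artifacts.values()}) <= 1


# Placeholder bytes standing in for real build outputs.
builds = {
    "linux-user": b"\x7fELF example",
    "linux-sudo": b"\x7fELF example",
    "windows": b"\x7fELF example",
}
reproducible = same_output(builds)
```

A mismatch between any two entries would suggest that the build is not hermetic, or that something platform dependent leaked into the output.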

Standards

A Standard is a set of semantic tests that can be run with doby validate [folder]. A Standard will simultaneously validate a compiler version and associated platform definitions. The standard library is validated by the standard/lm/v1.0 language standard. Language standards should have a major and minor version specified, but no development version (the third number).
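
The versioning rule can be expressed as a small check. The Python sketch below assumes that standard names follow the shape of the one example given here, standard/lm/v1.0; that shape, the `STANDARD_RE` pattern, and the `parse_standard` helper are assumptions made for illustration and are not part of doby or LM.

```python
import re

# Assumed shape: "standard/<language>/v<major>.<minor>", as in "standard/lm/v1.0".
STANDARD_RE = re.compile(r"standard/([a-z0-9_-]+)/v(\d+)\.(\d+)")


def parse_standard(name: str) -> tuple[str, int, int]:
    """Split a standard name into (language, major, minor).

    Names that carry a third (development) version number are rejected.
    """
    match = STANDARD_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a major.minor standard name: {name!r}")
    language, major, minor = match.groups()
    return language, int(major), int(minor)


parsed = parse_standard("standard/lm/v1.0")
```

With this sketch, a name such as standard/lm/v1.0.3 raises `ValueError` because of the third number.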

LM can be run in --standard [version] mode, which requires that every feature used be explicitly defined in the standard. This is not a 100% guarantee that no platform-specific bugs are present, but it can be a useful lint step in a larger build process.

Security

LM is fundamentally an assembler. We strongly encourage static analysis, and bundle a default Coq-based verifier, to make sure that you get what you want from your program. Beyond static analysis, LM aims to provide intuitive abstractions that transparently generate the program code that you are looking for. The LM project itself, however, will only accept security vulnerability reports when "the generated code is surprising." The surprise could come from overly complex abstractions, strange interactions, or plain old bugs. We welcome all security-related communications, publicly or privately, at your discretion.

Dog Fooding

LM wants to be supportive of all other language projects; however, we also want to build all of our tools on top of LM. The main reason is a practice called "dog fooding". When developers eat the dog-food that they serve, they learn whether it is good or bad. Then the dog-food gets better, because developers are incentivized to make it so.

Bad Habits Make Good Tools

LM explicitly encourages a certain amount of laziness, clutter, and poor developer processes when building and using the tools. This counter-intuitive philosophy encourages the creation of robust mechanical processes that help keep development on track. Automation should compensate for a certain portion of stupid developer problems.

Unprofessional

LM is a hobbyist project. If someone wants to build a product on top of LM, then that is great and we'll try to support that. However, we will not tolerate hostage-taking behavior. Different people have different priorities, so unless you are paying a retainer fee, you shouldn't expect contributors to share your sense of urgency.

Incrementalism

LM is based on an idea that is elegant in theory. That doesn't automatically mean that all subsequent engineering decisions will be perfect. The current development process can be described as

1. Write a bunch of ugly code
2. Reflect on what makes that code ugly
3. Fix the ugly code
4. Go to Step 1

Always Subtract, Rarely Add

LM design is minimal. To stay that way we will rarely add to the language. However, we are always looking for ways to downsize.

This rule doesn't apply to libraries.

Diff-based Collaboration

Development of LM uses "diff-based collaboration". Practically, LM is hosted on GitHub, and Git itself is something of a gold standard for this style of collaboration. However, wholesale support of every Git feature is frowned upon. If your integration request doesn't fit into a diff, then it doesn't belong here.

Concurrency

Not everything should be wrapped in a Mutex. If everything is wrapped in a Mutex, then something has gone wrong. The LM proof-of-correctness model does not account for unrestricted threads running amok fighting for atomic writes. Just because the physical hardware supports that model of concurrency doesn't mean that you should do it.

We would like to approach concurrency in the most unrestricted / unopinionated manner possible. However, formal proofs are hard to negotiate against. If you have any ideas on this front, please share them somewhere. Lots of people would be interested to hear more about that sort of thing.

Confidence in Correctness Creates Opportunities for Optimization

LM is meant to be like a low-level ML. ML was one of the first truly formally specified programming languages that could run on a computer. LM is meant to be a much more permissive and efficient runtime that is still formally specified.

LM achieves this by attempting to completely eliminate Undefined Behavior from an otherwise C-level language.

Performance

The fastest code should be the most idiomatic code. We have the opportunity to make a new language with new features. So if performance tuning requires spooky rituals, then the language is poorly designed. Clear readable code should be fast and the language should be designed with that in mind.

UTF-8

LM programs can use any character encoding that they want. However, LM source files must be UTF-8 encoded. If your favorite character is missing, then take that up with the Unicode Consortium.
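
If you want to check a source file before handing it to the compiler, a strict decode is enough. The Python sketch below is a hypothetical pre-flight check, not something LM itself provides; `is_utf8` is a name chosen for this example.

```python
def is_utf8(data: bytes) -> bool:
    """Return True when the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")  # strict by default: invalid bytes raise
    except UnicodeDecodeError:
        return False
    return True


# The project's own name round-trips through UTF-8.
name_ok = is_utf8("λ☶".encode("utf-8"))
# A Latin-1 encoded "é" is a lone 0xE9 byte, which is not valid UTF-8.
latin1_ok = is_utf8("é".encode("latin-1"))
```

In practice you would read the file in binary mode and pass its contents to the same function.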

Barrier Aggression

Aside from bare-minimum respect for copyleft law, LM is an anti-centralized project. If you want to participate in the LM community be prepared to work with people who only share 15% of your codebase. Look for common ground and stay positive or you might be disappointed with how work happens here.

When encountering even the smallest shared code snippet that you don't fully endorse, just fork it. Fork LM please. Fork twice if you can. When more diversity of thought is applied to consider the code interfaces, the result tends towards more mathematically precise language and design. This precise language is precisely what we strive for in LM core.