ASR Design - lcompilers/lpython GitHub Wiki

Abstract Semantic Representation (ASR) Design

How to Contribute

You can modify online at https://github.com/lcompilers/lpython/wiki/ASR-Design by pressing the "Edit" button, or you can modify locally using git and push:

git clone [email protected]:lcompilers/lpython.wiki.git
cd lpython.wiki
vi ASR-Design.md
git commit -a
git push origin master

Quick History and Motivation

The LFortran compiler was initially designed using AST (Abstract Syntax Tree) and annotating the tree with types and a symbol table. Then the AST + annotations were lowered to LLVM. The semantic analysis code that took AST and appended type annotations and symbol table, together with lowering to LLVM was very complicated, because it was not clear what exactly the output of the semantic analysis should be, and if a bug is in the AST+annotations lowering stage, or in the semantic analysis itself (or both). The code was getting complex and hard to develop. We searched for a solution and came up with an intermediate representation that we call ASR (Abstract Semantic Representation). The semantic analysis takes AST and returns ASR. The LLVM backend takes ASR and creates LLVM. Adding ASR to LFortran cleaned up the code and set firm boundaries what the input and output of each stage of the compiler is.

The previous paragraph is the basic motivation for ASR, but there is still a lot of freedom left in the details how the ASR should look like. This document systematically discusses the ASR design, and provides documentation how to modify it or add to it as the needs of LCompilers evolve.

Requirements

High Level

A simple intuitive definition of ASR:

We can think of ASR as the simplest core symbolic representation of the surface language. All the cruft is abstracted away, but all the important things stay, nothing was lost. We can then use ASR to manipulate the language. To think of the language, we can simply think of ASR. ASR was designed to be equivalent to the language. At the same time, it can immediately be lowered to machine code (and obviously anything higher level like LLVM), everythig is simple and clear how to do so.

To achieve this, we need to go deeper.

Middle Level

There are two main requirements:

We want ASR to be a standalone representation, as far from the original surface language (Python, Fortran, ...) as possible, without losing any semantic information, so that it is always possible to convert ASR back to the original language, but we do not need to deal with often complicated surface level semantics (such as scope rules, what type is 1/2, and so on).
We want ASR to have all the semantics figured out locally, so that that the backends (LLVM, WebAssembly, etc.) can generate code immediately by visiting each node without having to do extra passes over ASR to figure things out. Everything must be accessible either directly in the node, or via a few pointer dereferences (for example it is ok to go from an ASR node to its parent symbol table, then up to the module symbol table, then to the ASR node of the module, to determine its name; as long as it is just pointer hops; it would not be ok to loop over ASR to examine each module to find which one contains the given node).

Low Level

In addition to the above, there are several other requirements that guide the detailed structure of ASR, such as:

In Debug mode we want quick compilation. So we want each language feature to be faithfully represented close to the original form and not do any kind of complicated transformations, so that we can quickly generate LLVM or other code. In particular, in theory, just source code -> AST -> ASR -> LLVM passes are needed. We should not need to do any kind of rewriting of ASR after it is constructed. Everything should be in the form that we can immediately lower to LLVM (and other representations).
So as an example of the prior point: in Fortran (and possibly later in Python too) to determine the type such as real(dp) we need to know the compile time value of dp. So we always force the frontend to provide a compile time value of any expression if it exists and store it in ASR. Then we can quickly determine the value of dp when it is encountered and determine the type. But we do not want to be doing any more than that, for performance reasons in Debug mode.
In Release mode we want the best optimizations, represented as ASR -> ASR passes, so we want to be able to represent all high level optimizations in ASR. We should eventually do function inlining, code rewriting, dead code elimination, loop unrolling, moving things around, promote allocatable arrays to stack arrays, etc. The optimizations left for LLVM: promoting variables to registers, some simple cleanup of generated LLVM code (like "i + 1 - 1"), and any low level optimization that LLVM can do, as long as it plays nicely with ours. We should generate such LLVM code so that LLVM can finish the job.
We want a verify() pass that we run after we construct ASR and after each ASR -> ASR pass to check that ASR conforms to our semantic requirements for it.
We want any valid ASR (i.e., ASR that passes the verify() check) to always be possible to lower it to LLVM. In general all backends should always generate correct code for any ASR that passes verify(). The only possible error a backend can say is "Not implemented yet". In practice, a given backend might choose to call some ASR passes to rewrite some ASR nodes with some other ones (for example the LLVM backend calls an ASR pass to rewrite for loops to while loops, so that we only have to implement and maintain while loops in the LLVM backend), but that is just an implementation detail of the backend, from the user perspective the LLVM backend can accept any valid ASR code.
We do not prescribe how the given ASR node / feature is implemented in the backend. We are not tied to any ABI (Application Binary Interface). The backend can choose any representation it wants. Some nodes have a flag like abi = BindC, in which case we are requiring the node to be ABI compatible with C; this can then be used for C interoperability. We can add more abi, like GFortran, Python, etc.
Symbol table: each symbol (such as a variable) is just a pointer to the symbol "declaration" in the symbol table. The symbol table must be scoped to allow variables being declared in different scopes. The backend can thus lookup everything about each symbol, as well as maintain a separate hashtable with backend specific information for each symbol. The alternative design is to duplicate the information from the symbol table into each symbol, but that would make ASR bigger and it would be harder for the backend to figure out if two symbols are the same.
We try to not duplicate information in ASR, keep it unique (that is, have a symbol table, not duplicate it).
Within the constrains above, we try to keep ASR as simple and abstract as possible. For example we merged Subroutines/Functions into just Functions (one less node to implement and worry about in the backends).
We often evaluate a change in ASR based on if it simplifies code in the LLVM and other backends. For example, we used to have a BinOp that handled all types (real, integer, complex), and each backend had a switch based on type. So we split BinOp into IntegerBinOp, RealBinOp and ComplexBinOp. This simplified the backends.
Backends that lower (LLVM, WebAssembly, x86) can use the compile time value as well as can internally apply any ASR -> ASR simplification (such as for loops -> while loops) to make things easier and to keep the final code more optimized.
Backends that compile to surface languages (Python, C, C++, Fortran, Julia) should keep the code as readable and as close to the original surface language (that the ASR was generated from) as possible. That means, they should not apply compile time value for expressions (e.g., they should print (2+3)*5 and not 25 even though 25 is available in ASR as the value) and they should not rewrite ASR.
At a technical level, we want to represent ASR in some simple high level format (we chose ASDL) that is on the order of 100 lines long and automatically generate highly efficient C++ code to create, walk/visit, serialize/deserialize, print etc. (on the order of 10,000 lines).
See also these ASR requirements: https://github.com/lcompilers/lpython/wiki/Intrinsic-Functions-Design#asr-requirements

This still leaves many possible design decisions, such as:

Have statements and expressions vs just expressions
How arrays are handled (Array(Integer) vs Integer(dimension * ))
Should we use basic blocks (that do not contains branches).
Should we enforce canonical ASR, and let the frontend always create ASR in canonical form
How should function signatures be handled (separate type, etc.)
How to handle types? Currently types are duplicated / copied for each expression (each expr has a type) and each symbol (in the symbol table) as well as any other place that requires a type. An alternative (perhaps better?) design would be to have a (global?) type table (like a symbol table), and we would reference a given type by a pointer (just like a Var references a symbol), and appropriate serialization/deserialization.
Related: Should types by unique? Currently we compare types like (Integer ....) always by comparing all details each time. It would be much faster to just compare unique integer IDs. But how often does that happen? Often the types have to be compatible but not always exactly equal. Although we should make things so that types have to be exactly equal (otherwise just cast it first). For example a function call and function signature should then have exactly the same type IDs.
Structs (derived types) and classes could have an ID and we just use the ID everywhere. Classes and derived types are in symbol table, so we sort of already do this (half way)
Function signatures could be types, and stores in the type table
Idea: types could just be stored in the symbol table, some already are (class/derived type). Some common types like current Integer+kinds (such as i32) could be stored in the global symbol table always, custom types (function signatures, classes, derived types, etc.) could be stored in the symbol table where they first appear; if they are needed as part of a function definition, they could be stored inside the function's symbol table and referenced there; types are then equal if and only if they have the same pointer (serialization/deserialization machinery will handle this properly). If a type is declared in one module and used elsewhere, it has to be accessed via ExternalSymbol, just like any other symbol.

Types and Symbol Table

ASR should have a chained (scoped) symbol table. A symbol table (sometimes called compile-time environment) is composed of two parts:

a mapping of symbol names (strings) to the corresponding symbol information
pointer to parent symbol table/scope/environment

For our purposes, a type is a mathematical set of values and a set of allowed operations on those values. The set can be finite or infinite. Examples:

The i8 type is a finite set of 256 integer values between -128, ..., +127; the operations that are currently allowed in ASR include arithmetic operations, comparisons, casting.
Other finite sets: i16, i32, i64, f32, f64, c32, c64, bool
Function type, such as (i32, i32) -> i32, has infinite number of values (all possible function implementations that satisfy this signature)
Strings are an infinite set of all strings of all sizes
Other infinite sets: BigInt
Tuples, Lists, Dicts, Sets, Structs can be finite or infinite, depending on the element type (Tuple[i8,i8] is finite, but Tuple[str, str] is infinite)

All infinite sets are practically limited by machine limits.

The types in ASR are currently (January 2023) represented by the ttype nodes:

ttype
    = Integer(int kind, dimension* dims)
    | Real(int kind, dimension* dims)
    | Complex(int kind, dimension* dims)
    | Character(int kind, int len, expr? len_expr, dimension* dims)
    | Logical(int kind, dimension* dims)           
    | Set(ttype type)
    | List(ttype type)                                           
    | Tuple(ttype* type)
    | Struct(symbol derived_type, dimension* dims)
    | Enum(symbol enum_type, dimension *dims)
    | Union(symbol union_type, dimension *dims)
    | Class(symbol class_type, dimension* dims)
    | Dict(ttype key_type, ttype value_type)
    | Pointer(ttype type)
    | Const(ttype type)
    | CPtr()
    | TypeParameter(identifier param, dimension* dims)

Expressions

Every expression (expr) node has a type, by having the ttype type parameter.

Example:

| IntegerBinOp(expr left, binop op, expr right, ttype type, expr? value)

Function Types

A function type is represented by the types of the parameters and the type of the result, as well as various metadata (such as inline, static, ABI, side-effects-free, deterministic, etc.).

The function parameters or the result can be an array whose length is determined by some expression involving other parameters (such as integer, intent(out) :: X(size(A))).

The function type could be represented by a ttype node FunctionType:

| FunctionType(ttype* params, ttype? return_var,
    abi abi, deftype deftype, string? bindc_name, bool elemental, bool pure,
    bool module, bool inline, bool static, ttype* type_params, symbol*
    restrictions, bool is_restriction)

Other parameters can be referenced via a new expr node FunctionParam(int param_number, ttype type).

Open questions how to handle:

Optional parameters
Intent (in, out, inout, unspecified)
Allocatable

Idea: Allocatable could be part of the type; intent and optional can be represented as extra members of FunctionType.

Note: in the function definition the function has parameters. At the call site the expressions put into the parameter slots are called arguments.

Runtime Library

There are many designs possible, we initially opted to implement as much of the runtime library as possible as regular ASR code (in the surface language itself), that gets compiled (and optimized) without any special handling. Functions like search, trigonometric functions, most array functions (sum, mean, linear algebra), etc. can all be implemented as pure ASR code. See also https://github.com/lcompilers/lpython/wiki/Intrinsic-Functions-Design for more details of the current design.

If a given operation cannot be implemented using existing ASR nodes, in other words, it requires access to the implementation of the feature (say the size of an array, which depends on how the array descriptor is implemented), then we add a dedicated ASR node (such as ArraySize).

Some features (say clock_time) could be implemented using an ASR node, and maybe they should. We chose to keep ASR simple, and we implement these by calling into C, because each surface language has very specific, system and platform dependent functionality, so we simply implement this in C, then in Python/Fortran that calls into C. From ASR's perspective, the user facing function is regular ASR, that internally calls into C.

File IO currently is represented in ASR in quite Fortran specific way. We'll probably refactor it into a language independent way as much as we can. The rest will have to be handled via C calls.

The new iteration of this design is based on:

The ASR must be language independent
It must have all the semantics in
The backends and optimizer needs access to many intrinsics like sin, modulo, abs or sqrt to make optimizing decisions as well as to generate efficient instructions; it couples the backend to the frontend, we do not want that
When the user (or optimizer) specifies a high level special function like sqrt or LegendreP, we do not want to lower it in ASR, we want to represent it in a natural manner, without any coupling to the frontend, and without having to create a particular implementation in ASR (in a symbol table, etc.)

Consequently, the ASR must contain information about such intrinsics. We do not want to represent all intrinsics, especially those that are surface language dependent, or those that cannot be parallelized or used in declarations declarations due to side-effects. We essentially want to only do this for "math" functions: special functions and functions that we need in optimizers, which are our own "special functions". So for now we assume that all such intrinsics are deterministic, side-effects-free, are implemented using existing ASR (can be implemented in the surface language). We create a new expr node IntrinsicFunction which contains the arguments, function ID ("sin"), and overload ID (a particular generic overload of sin based on the signature, 0-10, or so: integer, real, complex, symbolic, whatever combinations we will support for a given special function).

The kinds of functions that will be represented as IntrinsicFunction are:

All math special functions: Sqrt, Abs, Modulo, Sin, Cos, LegendreP, BesselJ, Hypergeometric functions, Elliptic integrals, ...
Our own special functions, such as FlipSign, that the optimizer identifies in code, for which one can either apply special optimizations or the backend can generate more efficient instructions; we need a special function for every construct that the optimizer can recognize and backend efficiently implement
Symbolic algorithms such as: Limit, D (differentiate), N (numerical evaluation), Integrate, Sum, Apart, Together, Series, Expand, ExpandFunction, ExpandTrig, LeadingTerm, RewriteAsSum, Subs, Domain, Range

All these can be implemented in the surface langauge. If they cannot, then they belong into the "core ASR", having an actual ASR node (other than IntrinsicFunction).

As a bonus, this approach unifies numerical and symbolic computing in ASR.

We create ASR->ASR passes that can transform any IntrinsicFunction into the implementation that we will maintain in a given surface language. We think of these implementations as ASR implementations, but we use some particular surface language, so that we have the freedom to make changes in ASR without breaking the runtime library. The backend is free to call this pass and then it does not need to implement any IntrinsicFunction; or it can decide to implement some (or all) IntrinsicFunctions for performance.

This provides a hierarchical structure of ASR:

ASR core: a Scheme like subset that is self-contained and complete: lambda functions / closures, other such constructs
Higher level, we provide ASR passes to lower to the "core": DoLoop, Classes, Structs, Arrays, etc.
IntrinsicFunction is the highest level, uses the previous two lower levels

This is not the final design, we still leave the door open to further improvements, we can decide later where exactly the line goes, we can move more functions into IntrinsicFunction, such as BinOp, but for now we keep BinOp outside.

Here is how to decide if a given function X belongs into the surface language level runtime library only (the frontend inserts the proper calls), or into ASR's IntrinsicFunction or a dedicated ASR node:

Does the optimizer or the backend currently (or in the near future) require specific knowledge about X in order to optimize/generate code? If not -> surface language runtime
If yes, implement the function X using building blocks that are language independent: X1, X2, X3. For a given building block (function) X1, X2, X3, is it "pure" (deterministic and side-effect-free) and can it be implemented using existing ASR constructs in the surface level? If so -> IntrinsicFunction.
If not, see if it makes sense to have a dedicated ASR node, or figure out a new specific design

High Level Design

We want to delay choices for a particular API to implement something as long as possible. For example, we used to just use a particular frontend implementation for sin/cos (a generic function with a specific version that is used for single/double precision). This is committing to a specific API and/or implementation upfront, and it is effectively lowering of the AST, not raising. Rather, as with the new design, we want to represent the operation ("sin(x)") using the IntrinsicScalarFunction node, and much later during optimizations to decide how it will be implemented, and transform IntrinsicScalarFunction into something else, whether a particular C implementation, or ASR implementation, or perhaps some GPU backend implementation.

So we want to delay this choice. At the same time we usually want this choice to still happen at the ASR level using an ASR->ASR pass that transforms the IntrinsicScalarFunction node into some particular implementation, and then we can apply further ASR passes to further optimize the result.

Then in the backend ideally we do not deal with this at all. The only IntrinsicScalarFunction that we want to deal with in the backend are those that the backend can implement using some special instructions. Everything else, such as calling a C/API (Cuda, CPython, OpenMP, etc.) should be done at the ASR level, not in the backend, since the backend can't do anything special if we are just calling a C/API.

Relevant issue: https://github.com/lcompilers/lpython/issues/1996

Possible Ambiguities

Distinguishing between a pointer and a non-pointer variable

Posting the discussion I and @certik had about this,

Var can store both pointer and non-pointers.
And pointer can be used as a non-pointer variable (in Fortran, libasr is shared).
So, sometimes its not clear how many times you have load to the llvm::Value* associated with a Var.
Like say,

integer, pointer :: pi
integer :: i
integer :: a(4)
pi => i
a(pi) = 0

Now the ASR will have Assignment(ArrayItem(Var(a), Var(pi)), 0).
Now in LLVM a is a struct* (pointer to struct), pi is i32**.
Well when you visit Var(a), you can't load llvm::Value* tmp. Because you need a pointer.
But when you visit Var(pi) you need to load twice becuase you need i32 and not i32* or i32**.
For now I am using ptr_loads which specifies how many times to load a pointer when you visit_Var.
Keep using ptr_loads as the general solution for now as it works and is simple. Easy to override and restore between multiple calls.

This is one of the ambiguities which should be handled in ASR ideally but doing too much abstraction in ASR right now is not a good idea. Instead we should learn, experiment and then after some time when backends stabilise and we implement all the crucial features, we can make ASR handle these kinds of ambiguities.

Also later writing a complete C frontend (LC) that can compile any C code to ASR and LLVM will forces us to design this well, as pointers are inherent in C programs.

Ensuring deepcopy by default

There is one more thing but more concerned about the backend. We should make sure we are deep-copying objects by default. Right now there is some non-uniformity in that regard. We need to test and make this consistent. We can’t shallow copy objects because otherwise we will face the same performance issues as CPython and also our optimizations ASR->ASR will break.

Handling of lower bound, stride, offset and column/row order

Currently the lower bound is part of the type, and can be any runtime (or compile time) expression. In each function the lower bound can be set for any argument (or a local array), and the expression for it must be computable at the point when the function is called (effectively at the call site).

The stride, offset and column/row order is currently either not implemented, or only part of the runtime array descriptor, which is a backend choice.

We should investigate if it makes sense to move stride, offset (padding of each row/dimension) and column/row order into ASR just like the lower bound is. It seems these features only influence how the array is indexed, so the ArrayIndex (and ArraySection).

An idea is to store the lower index, stride, offset/padding and column/row order only when it is needed, that is, with each ArrayIndex, ArraySection (and anywhere else it is needed).

With runtime array descriptors with strides, many compilers generate two versions of the function: one that only operates on contiguous input arguments, and it can use efficient vector instructions, and another one that assumes general (runtime) stride, which is not vectorized (or not as efficiently). Whenever the backend does not know what to generate, it means our type system is not "ahead of time" enough, it leaves too many things unspecified. So one idea is to move things more to ASR (or even the frontend) to fully specify array indexing and other features so that the backend knows exactly at all times what and how to index an array, and consequently what exact code to generate for a function (it should not generate two versions ---- this should be a choice of the frontend (or ASR->ASR optimization passes) to generate two versions, of that is what it wants).

Pointers

The original ASDL itself does not allow to have pointers, so it can only represent trees (every node is represented by a "value"). We cannot have a symbol table (by value) and then multiple nodes like Var pointing to it. So we have created two special nodes:

The symbol node is always a pointer (non-owning)
The symbol_table is a symbol table dictionary of string -> symbol, and we use it two ways:
- symbol_table symtab is an owning symbol table, by value
- symbol_table parent_symtab is a non-owning pointer to a symboltable (owned by the parent node)

We then have special logic in serialization and deserialization to handle these pointers correctly.

Sometimes it would be very natural to have pointers to other nodes, but it becomes complicated to print, serialize and deserialize such ASR, so we do not allow that, besides the built-in special symbol type described above.

See this issue for more background and discussion: https://github.com/lcompilers/lpython/issues/1167

Continuations and Delimited Continuations

References:

Unsigned integers

A very common question is why not add unsigned integers (u8, u16, u32, u64) in addition to the existing signed integers (kind=1, 2, 3, 4; in LPython they are called i8, i16, i32, i64).

Many high level languages for scientific numerical computing such as Python or Fortran do not have unsigned integers in the language because there are many pitfalls with them:

Subtracting two unsigned integers can wrap around (e.g. 3 - 5 = -2 -> -2+UINT_MAX+1).
Consequently 3 - 5 < 3 is false, which can easily lead to many bugs in a code
The last condition in unsigned int x = 1; int y = -2; (x + y > 0) evaluated to true, even though with signed integers x+y=(1)+(-2)=-1.
Another variant: unsigned int a = 1000; signed int b = -1; (a > b) evaluates to false, even though with signed integers (1000 > -1 is true)
Infinite loop: for (unsigned z = 5; z >= 0; z--) { do_something(z); }
Complicated automatic casting in the frontend for things like comparisons of signed and unsigned integers
Hard to figure out for users when to use each type. For example it's safer to use signed integers, except when a function returns an unsigned integer (say the .size() function in C++), then one gets compiler warnings in loops like for (int64_t i=0; i < x.size(); i++) of comparing signed and unsigned integer, so one is forced to rewrite to for (uint64_t i=0; i < x.size(); i++).
Unsigned integers cannot be treated as a range-limited version of signed ones because their range of values is not a subset of the signed integers range. Neither signed, nor unsigned integers are subtypes of each other. For example -128 <= i8 <= 127 but 0 <= u8 <= 255, so x: i8 with the condition of x > 0 is not equal to x: u8, because for example x=200 is representable as a u8, but not i8.
Many developers (but not all) believe that unsigned integers should be avoided, such as the Google C++ guidelines:
- You should not use the unsigned integer types such as uint32_t, unless there is a valid reason such as representing a bit pattern rather than a number, or you need defined overflow modulo 2^N. In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this.
- Unsigned integers are good for representing bitfields and modular arithmetic. Because of historical accident, the C++ standard also uses unsigned integers to represent the size of containers - many members of the standards body believe this to be a mistake, but it is effectively impossible to fix at this point. The fact that unsigned arithmetic doesn't model the behavior of a simple integer, but is instead defined by the standard to model modular arithmetic (wrapping around on overflow/underflow), means that a significant class of bugs cannot be diagnosed by the compiler. In other cases, the defined behavior impedes optimization.
- That said, mixing signedness of integer types is responsible for an equally large class of problems. The best advice we can provide: try to use iterators and containers rather than pointers and sizes, try not to mix signedness, and try to avoid unsigned types (except for representing bitfields or modular arithmetic). Do not use an unsigned type merely to assert that a variable is non-negative.

From the ASR's perspective, we are trying to keep the number of types in ASR to minimum. By adding unsigned integers, it will add complexity to ASR of handling and verifying all various combinations that are allowed or not allowed and it also adds complexities to the frontends and backends.

Some use cases for unsigned integers and how to handle them using signed integers:

Interfacing with a C function that accepts or returns an unsigned integer, for example u16. Just use i16 in ASR. It has the same length in binary. If one needs to recover the full range of u16 (0..65535), one can assign the i16 to i32, then check if it is negative, and add 65535+1 if needed. The same going the other way.
The twice larger maximum integer range is required; just use one higher version, so u8->i16; u16->i32; u32->i64. For u64, approximately half the range cannot be represented as positive numbers in the type i64. However, there are still 2^63 numbers available for use! One should design scientific algorithms to not need the full range of u64, but if they're really necessary, one can use floating point numbers or arbitrary-precision integers (TODO: that we should add to ASR).
As a handle ID that's natively u64. Just use i64 instead.
Bitfield / bit pattern, say u16: use i16 and the bit manipulation ASR nodes.
Integer modular arithmetic: that is a specialized use case, typically as part of a computer-algebra system. You often need other moduli than what unsigned integers can provide (256, 65,536, etc.). One would often need other such specialized types as well, such as various polynomial classes. It is a much bigger scope than the question of unsigned integers to target the domain of a computer-algebra system. For now, we want to focus on array-oriented scientific computing.

If we are going to support unsigned integers, one clean way to do it is to introduce an opaque type, such a OpaqueType(identifier name, int size_in_bytes), where the frontend can set u8, u16, etc. for name, and the number of bytes it occupies in memory. ASR would allow casting this to signed integer type to do arithmetic, but otherwise one would not be able to do much with it. The frontend could use it for communicating with the C backend to generate unsigned integers for C interoperation.

References:

Dependencies between symbols (Variables, Functions, Derived types, etc.)

We say that a symbol x depends on a symbol y when x needs to use y in its definition. For example if a function f calls g then f depends on g. Similarly if a variable a is declared as int a = b + c then a depends on b and c. This means that dependencies of a symbol must be declared before itself otherwise it would be an error case. There are two ways to handle this,

Enforcing dependencies at the syntax level - In this approach we don't track dependencies on our own i.e., in the compiler. We just follow the order provided by the user in their code. Hence the syntax level. For example if user declares,

a: i32 = 0;
b: i32 = a;
c: i32 = b;

then the generated would look like,

int32_t a = 0;
int32_t b = a;
int32_t c = b;

So as you can see we just took the order as provided by the user. In the ASR this will be implemented by adding a vector in the SymbolTable class. This vector will store the symbols in the exact same order as they are declared in the syntax by the user.

Storing dependencies in each symbol - In this approach we store dependencies of each symbol in its node in the ASR. Variable, Function, Struct each have a dependency vector. Then in the backend we create dependency graph of these symbols, topologically sort this graph and then declare symbols in the sorted order.

One disadvantage of first approach is that we don't know how to parallelise because its a global order. That's not the case with second approach.

One disadvantage of second approach is that we need to keep track of dependencies in AST to ASR transitions. This invites more bugs. However, ASR verify is (and can be made) strong enough to raise errors in case of in-correct dependencies of a symbol.

Links

Here are some older relevant documents for reference: