GSoC 2023 Ideas - lcompilers/lpython GitHub Wiki

Below we list project ideas, ordered by priority. The "High Priority" section contains projects that we are especially interested in, as they lie on the critical path to a minimal viable product: making LPython usable for simpler projects.

However, feel free to propose any project idea of your own that improves LPython, for example by browsing the open issues:

https://github.com/lcompilers/lpython/issues

If you are interested in applying, please get in touch with us at either our Zulip chat or our mailing list:

We will help answer questions and help with finding and refining a project idea. You do not need prior experience with compilers; we will teach you. It is fun. LPython is written in C++, but we do not use many advanced features, and if you have any programming experience you will be able to pick it up.

Here are a few projects for inspiration; they contain a mix of well-developed and less developed ideas. You are welcome to propose your own idea as well.

Patch Requirement

We have a patch requirement in order to consider your application. Please send a patch (pull request) to LPython; it must be merged by the time the application period closes (April 4). You can fix or improve anything you like. If you have any questions, please contact us (for example on Zulip) and we will help.

Mentors

Potential mentors:

  • Ondřej Čertík
  • Gagandeep Singh
  • Rohit Goswami
  • Thirumalai Shaktivel
  • Smit Lunagariya
  • Ubaid Shaikh
  • Luthfan Lubis

High Priority

Compile benchmark code written in Python with LPython and improve LPython's performance on these benchmarks

https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/python.html contains benchmark programs for various problems such as n-body, spectral norm, and mandelbrot. The workflow would involve first fixing bugs so that LPython compiles the code (modifying the input code would be okay) and produces correct output, then improving LPython so that it performs as well as or better than the same benchmarks written in compiled languages such as C/C++.
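As a rough illustration of the workflow above, here is a minimal sketch of a mandelbrot-style kernel with a timing wrapper. It is written in plain, type-annotated Python so that it runs under CPython; the function names, the grid bounds, and the use of built-in `int`/`float` annotations are assumptions made for this example, and the code would likely need adjusting to the types LPython accepts before it compiles there.

```python
import time


def mandelbrot_count(size: int, max_iter: int) -> int:
    """Count grid points c in [-2, 1) x [-1.5, 1.5) whose orbit stays bounded."""
    inside: int = 0
    for row in range(size):
        ci: float = -1.5 + 3.0 * row / size
        for col in range(size):
            cr: float = -2.0 + 3.0 * col / size
            zr: float = 0.0
            zi: float = 0.0
            escaped: bool = False
            for _ in range(max_iter):
                # z = z*z + c, written out on real and imaginary parts
                zr, zi = zr * zr - zi * zi + cr, 2.0 * zr * zi + ci
                if zr * zr + zi * zi > 4.0:
                    escaped = True
                    break
            if not escaped:
                inside += 1
    return inside


def time_kernel(size: int, max_iter: int) -> tuple[int, float]:
    """Run the kernel once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = mandelbrot_count(size, max_iter)
    return result, time.perf_counter() - start
```

The returned count serves as the correctness check (it should match between CPython and the compiled binary), and the elapsed time is the number to compare across compilers.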

Expected outcomes: LPython can compile as many benchmark codes as possible. Performing better than other compilers would be an additional plus.

Skills preferred: Python and C++ programming

Difficulty: intermediate/hard, 350 hours

Mentors - Gagandeep Singh (GitHub - @czgdp1807)

Implementing and improving advanced data structures such as dict, list, set etc.

Data structures such as dict and list have been partially implemented in LPython, and some, like set, have not been touched yet. This project would involve improving support for the already implemented data structures and adding new ones. We would also benchmark our implementations against their equivalents in other languages (list vs std::vector, set vs std::set, dict vs std::unordered_map).
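To make the target concrete, below is a small sketch of the kind of typed program that exercises list, dict, and set together. The function `word_stats` is hypothetical and written for CPython; whether each operation shown here already compiles with LPython is exactly what this project would need to establish.

```python
def word_stats(words: list[str]) -> tuple[dict[str, int], set[str], list[str]]:
    """Return (count per word, set of distinct words, words in first-seen order)."""
    counts: dict[str, int] = {}
    seen: set[str] = set()
    order: list[str] = []
    for w in words:
        if w in seen:
            # set membership test plus dict update
            counts[w] += 1
        else:
            seen.add(w)
            counts[w] = 1
            order.append(w)
    return counts, seen, order
```

A program like this can double as a benchmark input once the list is made large enough for the timings to be meaningful.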

Expected outcomes: Increased support for advanced data structures in LPython. Performing better in benchmarks is an additional plus.

Skills preferred: C++ Programming, Python. LLVM familiarity would be a plus but not necessary (you can learn this on the fly).

Mentors - Gagandeep Singh (GitHub - @czgdp1807)

Relevant PRs and issues - https://github.com/lcompilers/lpython/issues/983, https://github.com/lcompilers/lpython/issues/941, https://github.com/lcompilers/lpython/pull/1111. No issue for set has been opened yet but it has the same priority as other data structures for this particular project.

Implement Generics

LPython has an initial implementation of generics. They allow writing functions that accept generic arguments and are instantiated at the call site using the actual types provided by the user. In this project we will extend generics to many more cases and add enough features that the standard library and NumPy can be implemented using them.
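For readers new to the idea, here is a minimal sketch of a generic function using the standard `typing.TypeVar` from CPython. It only illustrates the concept of one definition serving several argument types; the name `add` and the constrained type variable are assumptions for this example, and LPython's own syntax for declaring generics and their restrictions may differ.

```python
from typing import TypeVar

# A type variable constrained to int or float: each call site fixes one of them.
Number = TypeVar("Number", int, float)


def add(x: Number, y: Number) -> Number:
    """One definition, usable with either all-int or all-float arguments."""
    return x + y


int_result = add(1, 2)          # call site provides int
float_result = add(1.5, 2.25)   # call site provides float
```

In CPython the annotations are not enforced at run time; a compiler with generics support would instead generate a separate typed version of `add` for each call site.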

Expected outcomes: LPython can compile untyped code using generics.

Skills preferred: Python and C++ programming

Difficulty: intermediate, 350 hours

Mentors - Ondřej Čertík, Rohit Goswami, Gagandeep Singh, Luthfan Lubis

Implementation of features on the ASR and LLVM level

The roadmap issue https://github.com/lcompilers/lpython/issues/155 contains a list of Python features that we want implemented. To be complete, each feature should be implemented at the ASR level and in the LLVM backend. If the AST is missing for a given feature, then it has to be implemented as well.

Here you can pick a feature or a set of features from the list and propose it as a GSoC project. In other words, this project idea can accommodate multiple student projects.

List of resources for more information and background:

If you have any questions, please do not hesitate to ask, we can discuss or provide more details.

Difficulty: easy/intermediate (depending on the task), can be 175 hours or 350 hours

Mentors: Ondřej Čertík (@certik), Gagandeep Singh

Allow running LPython in the browser

We have a demo of LPython running in the browser using WASM here: https://www.ubaidshaikh.me/lcompilers_web_frontend/lpython. The goal of this project would be to improve the user interface. Here is a list of issues that the project can work on fixing: https://github.com/lfortran/lcompilers_frontend/issues

Skills preferred: Python and C++ programming

Difficulty: intermediate, 350 hours

Mentors - Ondřej Čertík, Rohit Goswami

Language Server

This project would implement language server features, such as finding a symbol's declaration, in LPython and expose them to a language server written in TypeScript that works out of the box in VSCode.
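To show what "find a symbol declaration" means in practice, here is a minimal sketch built on CPython's standard `ast` module. It is not LPython's implementation, which would work on LPython's own AST/ASR in C++; the function name `find_declaration` and the choice of which node kinds count as declarations are assumptions for this example.

```python
import ast
from typing import Optional


def find_declaration(source: str, name: str) -> Optional[tuple[int, int]]:
    """Return (line, column) of a declaration of `name`, or None if not found.

    Lines are 1-based and columns 0-based, as reported by the ast module.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        # Function and class definitions carry their name directly.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if node.name == name:
                return node.lineno, node.col_offset
        # Annotated assignments, e.g. `x: int = 1`.
        elif isinstance(node, ast.AnnAssign):
            if isinstance(node.target, ast.Name) and node.target.id == name:
                return node.target.lineno, node.target.col_offset
        # Plain assignments, e.g. `x = 1`.
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == name:
                    return target.lineno, target.col_offset
    return None
```

A language server would return such a position to the editor in response to a "go to definition" request; scoping rules, which this sketch ignores, are where most of the real work lies.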

Expected outcomes: LPython can be used as a Python language server that can be used in other software such as source code editors and IDEs.

Skills preferred: Python and C++ programming

Difficulty: intermediate, 350 hours

Mentors: Ondřej Čertík (@certik), Smit Lunagariya

Implement modules from the Python standard library

The Python standard library has a lot of modules: https://docs.python.org/3/library/index.html. However, the highest-priority module is NumPy, for enhancing LPython's array programming support. Within NumPy, reduction functions such as numpy.sum and numpy.mean are the most important. We plan to implement them via the intrinsic functions infrastructure recently added in LCompilers (see src/libasr/pass/intrinsic_function_registry.h, which will soon be a part of LPython). Every other module is low priority.
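The idea of a registry of intrinsic functions can be sketched in a few lines of Python. This is only a conceptual illustration: the names `REDUCTIONS`, `register`, and `call_intrinsic` are hypothetical and do not reflect the contents of intrinsic_function_registry.h, and raising an error for the mean of an empty sequence is a choice made for this sketch.

```python
from typing import Callable, Sequence

Reduction = Callable[[Sequence[float]], float]

# Hypothetical registry: maps an intrinsic's name to its implementation.
REDUCTIONS: dict[str, Reduction] = {}


def register(name: str) -> Callable[[Reduction], Reduction]:
    """Decorator that records a reduction under the given name."""
    def decorator(fn: Reduction) -> Reduction:
        REDUCTIONS[name] = fn
        return fn
    return decorator


@register("sum")
def reduce_sum(values: Sequence[float]) -> float:
    total = 0.0
    for v in values:
        total += v
    return total


@register("mean")
def reduce_mean(values: Sequence[float]) -> float:
    if len(values) == 0:
        raise ValueError("mean of an empty sequence")
    return reduce_sum(values) / len(values)


def call_intrinsic(name: str, values: Sequence[float]) -> float:
    """Look up a reduction by name and apply it."""
    return REDUCTIONS[name](values)
```

Keeping every reduction behind one lookup table means a new function can be added in a single place without touching the code that dispatches calls.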

The project includes discussing which modules will be needed for LPython (from a scientific computing perspective, in the beginning), creating a priority list, and then implementing each module properly. The aim of this project is to make LPython work for any Python code down the road.

See #200 as a related issue. Feel free to discuss the details with us.

Skills preferred: Python and C++ programming

Difficulty: hard, 350 hours

Mentors: Ondřej Čertík (@certik), Naman Gera, Smit Lunagariya

Medium Priority

Improve WebAssembly (WASM) backend

LPython has a very fast WASM backend that can translate large parts of ASR to WASM. This project would work on making the WASM backend work for all of ASR by adding tests and implementing missing features. Like every backend in LPython, it receives the code as ASR, recursively walks over each ASR node, and generates WASM code.
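The recursive walk described above can be sketched on a toy expression tree. The `Const` and `BinOp` classes below are hypothetical stand-ins, far simpler than real ASR nodes, and the output is a list of WASM text-format instructions rather than the binary encoding a real backend would produce.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Const:
    value: int


@dataclass
class BinOp:
    op: str          # one of "+", "-", "*"
    left: "Expr"
    right: "Expr"


Expr = Union[Const, BinOp]

# Toy operator table: source operator -> WASM i32 instruction.
WASM_OPS = {"+": "i32.add", "-": "i32.sub", "*": "i32.mul"}


def emit(node: Expr) -> list[str]:
    """Recursively visit a node and return stack-machine instructions for it."""
    if isinstance(node, Const):
        return [f"i32.const {node.value}"]
    if isinstance(node, BinOp):
        # Operands are emitted first so that they are on the stack
        # when the operator instruction executes.
        return emit(node.left) + emit(node.right) + [WASM_OPS[node.op]]
    raise TypeError(f"unsupported node: {node!r}")
```

For example, `emit(BinOp("+", Const(1), BinOp("*", Const(2), Const(3))))` yields the constants in left-to-right order followed by `i32.mul` and then `i32.add`. Covering "all of ASR" amounts to adding one such case per node kind.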

Mentors: Ondřej Čertík (@certik), Gagandeep Singh, Ubaid Shaikh

Difficulty: intermediate, 350 hours

Improve x86 code generation

LPython has a WASM to x86_64 code generation backend, which allows very fast compilation (many times faster than going via LLVM). The x86 backend does not do any optimizations, so it is meant to be used in Debug mode only. The backend receives WASM code and creates an x86_64 ELF binary.
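A non-optimizing translation of this kind can be sketched as a one-to-one mapping from stack instructions to machine stack operations. The sketch below is an assumption-laden toy: it handles only three arithmetic instructions plus constants, emits x86_64 assembly text in Intel syntax instead of encoded bytes in an ELF file, and the function name `wasm_to_x86` is hypothetical.

```python
# Toy table: WASM i32 arithmetic instruction -> x86 mnemonic.
BINARY = {"i32.add": "add", "i32.sub": "sub", "i32.mul": "imul"}


def wasm_to_x86(instructions: list[str]) -> list[str]:
    """Translate a flat list of WASM text instructions to x86_64 assembly lines."""
    asm: list[str] = []
    for inst in instructions:
        parts = inst.split()
        if parts[0] == "i32.const" and len(parts) == 2:
            # The WASM value stack is mapped directly onto the machine stack.
            asm.append(f"push {int(parts[1])}")
        elif parts[0] in BINARY and len(parts) == 1:
            asm += [
                "pop rbx",                          # right operand
                "pop rax",                          # left operand
                f"{BINARY[parts[0]]} eax, ebx",     # 32-bit arithmetic
                "push rax",                         # result back on the stack
            ]
        else:
            raise ValueError(f"unsupported instruction: {inst!r}")
    return asm
```

Every operation goes through memory via push and pop, which is slow at run time but makes the translation itself a single linear pass; that trade-off is what makes such a backend suitable for Debug builds.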

The purpose of this project would be to extend this backend to cover more LPython/WASM features.

If you have any questions, please do not hesitate to ask, we can discuss or provide more details.

Mentors: Ondřej Čertík (@certik), Gagandeep Singh, Ubaid Shaikh

Difficulty: intermediate, 350 hours

Create WASM->Apple/M1 backend

This project would create an initial WASM to Apple M1 backend. It would work similarly to the WASM->x86 backend, but it would generate ARM code in the Mach-O binary format, which works on Apple M1.

If you have any questions, please do not hesitate to ask, we can discuss or provide more details.

Mentors: Ondřej Čertík (@certik), Gagandeep Singh, Ubaid Shaikh

Difficulty: intermediate, 350 hours

Low Priority

Automatic Python wrapping

Add a backend to LPython that automatically exposes (eventually all) LPython module contents to Python. That will allow LPython-compiled code to be used from CPython itself.

Related issues:

Mentors: Ondřej Čertík (@certik), Rohit Goswami

Difficulty: intermediate, 350 hours