llvm - CourseReps/ECEN489-Fall2015 GitHub Wiki
LLVM (Low Level Virtual Machine)
What is LLVM?
The LLVM project is a collection of modular, reusable compiler and toolchain technologies. It is an open-source project with many contributors and collaborators. It is written primarily in C++ and provides tools and components for optimization as well as other stages of compilation. The LLVM project was started in 2000 at the University of Illinois at Urbana–Champaign, under the direction of Vikram Adve and Chris Lattner.
Compilers
A compiler is a program that takes source code as input and outputs compiled code. The compiled code could be a binary executable or an intermediate low-level language. In essence, a compiler is a translator that turns human-readable source code into a form the computer can execute or process further.
Issues with compilers (reasons for LLVM)
There are many issues with (non-LLVM) compilers that the LLVM project attempts to solve:
- Compilers are relatively difficult to develop, and certain parts of a compiler contain little reusable code.
- Many compilers are old and built on old technology (GCC was originally released in 1987!).
- Most compilers are platform dependent. This is understandable: they inherently produce platform-dependent output, so some parts of a compiler cannot be cross-platform.
- Compilers do not share code, which creates a lot of duplicate work.
What is LLVM? (revisited)
- Primary mission: build a set of modular compiler components
  - Reduces the time & cost to construct a particular compiler
  - Components are shared across different compilers
  - Allows choice of the right component for the job
- Secondary mission: build compilers out of these components, specifically very fast and optimized compilers
LLVM Components
Programming language support
Although it was originally intended to be used with C/C++, LLVM's flexible infrastructure has led to more widespread usage. Through various front ends, LLVM currently supports compiling Ada, C, C++, D, Delphi, Fortran, and Objective-C.
Clang is a compiler front end for the C-family languages (C, C++, Objective-C and Objective-C++) that uses LLVM as its back end. It is designed to offer an alternative to GCC, with a smaller memory footprint and shorter compile times.
LLVM Intermediate Representation
The LLVM project includes an intermediate representation (IR) language. Source code is first compiled to the IR, and the IR is then compiled to executable code. Using an intermediate representation allows more of a compiler to be reused. This is at the heart of LLVM: once source code has been translated to the IR, it can be handed to the LLVM back-end tools regardless of the original language.
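As a rough illustration of why a shared IR lets more of a compiler be reused, the toy sketch below has two front ends for two made-up notations (postfix and prefix arithmetic) that lower to the same list of three-address instructions, and a single back end that consumes that list. None of this is LLVM's actual IR or API: the names `lower_postfix`, `lower_prefix`, and `run`, and the tuple-based instruction format, are assumptions invented for this example.

```python
# Toy illustration only: this is NOT LLVM IR and uses no LLVM API.
# An instruction is a tuple (dest, op, left, right); operands are either
# integer constants or the names of earlier results ("t0", "t1", ...).

OPS = {"+": "add", "*": "mul"}


def lower_postfix(source):
    """Hypothetical front end A: postfix input such as '2 3 + 4 *'."""
    ir, stack = [], []
    for token in source.split():
        if token in OPS:
            right, left = stack.pop(), stack.pop()
            dest = f"t{len(ir)}"
            ir.append((dest, OPS[token], left, right))
            stack.append(dest)
        else:
            stack.append(int(token))
    return ir


def lower_prefix(source):
    """Hypothetical front end B: prefix input such as '* + 2 3 4'."""
    tokens = iter(source.split())
    ir = []

    def parse():
        token = next(tokens)
        if token in OPS:
            left, right = parse(), parse()  # operands are parsed left to right
            dest = f"t{len(ir)}"
            ir.append((dest, OPS[token], left, right))
            return dest
        return int(token)

    parse()
    return ir


def run(ir):
    """Shared back end: here it simply interprets the toy IR."""
    values = {}

    def operand(x):
        return values[x] if isinstance(x, str) else x

    result = None
    for dest, op, left, right in ir:
        a, b = operand(left), operand(right)
        values[dest] = a + b if op == "add" else a * b
        result = values[dest]
    return result


ir_a = lower_postfix("2 3 + 4 *")
ir_b = lower_prefix("* + 2 3 4")
print(ir_a == ir_b)  # compares the IR produced by the two front ends
print(run(ir_a))     # evaluates (2 + 3) * 4 through the shared back end
```

In this sketch, adding a third notation would only require a new `lower_*` function; `run` stays untouched. That is the kind of reuse the IR is meant to enable, shown here at a very small scale.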
LLVM Backend
LLVM includes a target-independent code generator to translate the LLVM IR into machine code. This code generator supports many popular instruction set architectures. LLVM performs optimization on the IR, which means the same optimizations apply to code from any language that is compiled to IR. In addition, LLVM provides link-time optimization, a further level of optimization that is available to all languages that use LLVM.
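To make the idea of language-independent optimization concrete, here is a minimal sketch of one classic optimization, constant folding, written against a made-up three-address IR. It is not LLVM IR and not how LLVM implements its passes; the function name `fold_constants` and the tuple format are assumptions for illustration. Because the pass only looks at IR instructions, it does not need to know which source language produced them.

```python
import operator

# Toy illustration only: a made-up IR, not LLVM IR.
# An instruction is (dest, op, left, right); operands are ints or names.
OPERATIONS = {"add": operator.add, "mul": operator.mul}


def fold_constants(ir):
    """Evaluate instructions whose operands are all known constants.

    Returns (remaining, known): the instructions that still have to run,
    and a dict mapping folded result names to their constant values.
    """
    known = {}
    remaining = []
    for dest, op, left, right in ir:
        # Substitute operands that an earlier fold turned into constants.
        left = known.get(left, left)
        right = known.get(right, right)
        if isinstance(left, int) and isinstance(right, int):
            known[dest] = OPERATIONS[op](left, right)
        else:
            remaining.append((dest, op, left, right))
    return remaining, known


# "x" stands for a value that is only available at run time.
program = [
    ("t0", "add", 2, 3),
    ("t1", "mul", "t0", "x"),
    ("t2", "mul", 4, 5),
    ("t3", "add", "t1", "t2"),
]
remaining, known = fold_constants(program)
print(remaining)
print(known)
```

In this example the two instructions with constant operands are computed ahead of time, and the two that depend on `x` remain, with the folded constants substituted into their operands.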
The lld (LLVM linker) project attempts to create a platform-independent linker for LLVM. At the time of writing, LLVM and Clang depend on a third-party linker for the final step of producing an executable from compiled object code. lld aims to remove this dependency.
LLDB (Debugger)
LLDB is a debugger built from many of the tools in the LLVM project. It attempts to apply LLVM's principles of modular, reusable components to debuggers. It is still in early development, but LLDB can be used to debug programs written in C, Objective-C, C++ and Swift.
Pros and Cons
Pros
- Open source project with many collaborators and contributors
- More flexible because of modular design
- Novel optimization and performance improvements
- Platform independence, language independence
- Faster compile times
- Better error reporting
- Source code more accessible
Cons
- Runtime performance. LLVM is closing the performance gap between itself and established compilers (GCC, for example), but it is not quite there. In particular, GCC-produced code outperforms LLVM-produced code on x86/x86-64 architectures.
- Not the de facto standard compiler; established compilers such as GCC are more widely used
- Though LLVM has features that other compilers do not, it also lacks some features that other compilers provide (for example, support for OpenMP).
References
- LLVM Tutorial slides
- [LLVM Code Generator](http://llvm.org/docs/CodeGenerator.html)
- [LLVM Optimization](http://llvm.org/docs/Passes.html)
- [LLVM Project Blog](http://blog.llvm.org/)