Compiler - newlife-js/Wiki GitHub Wiki
Based on lectures by Prof. Hanjun Kim (Yonsei University)
Translates a source program into a target program, plus an optimizer
- Increase Parallelism
# of cores (FLOPs) ↑
- High communication bandwidth (CPU-GPU, GPU-GPU)
- Domain specific customization
Compared with CNNs, RNN-family models have a low compute-to-data ratio -> develop customized HW (TPU, etc.)
- Approximate computing
Quantization, lower precision
Compile a DL model and load it onto an FPGA
When the whole model does not fit on a single FPGA, partition it so that it can be loaded onto multiple FPGAs
End-to-end optimizing compiler for DNN
- Nvidia Diesel, Google XLA, MLIR
- Front end: literal (direct) translation of the code
- Back end: optimization to improve performance
A process of breaking a sequence of ASCII characters (source) into a sequence of tokens
- recognize words by finite automata(Finite State Machine)
- Lexer(Lexical Analyzer): program that performs lexical analysis
- Deterministic Finite Automata (DFA)
Edges leaving a node are uniquely labeled
- Non-deterministic Finite Automata (NFA)
Two or more edges leaving a node can be identically labeled
An edge can be labeled with ε
- Computers can understand only DFA, but directly transforming RE to DFA is difficult
Use NFA as an intermediate step
RE -> NFA -> DFA
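The NFA -> DFA step can be sketched with the subset construction. This is a minimal illustration, not the lecture's implementation: the NFA below is hand-written for the regular expression (a|b)*abb, and the names `EPS`, `eps_closure`, `nfa_to_dfa`, and `dfa_accepts` are hypothetical helpers chosen for this example.

```python
from collections import deque

EPS = None  # label for epsilon edges (an assumption of this sketch)

# Hypothetical hand-built NFA for the regular expression (a|b)*abb.
# NFA[state][label] is the set of states reachable over that label.
NFA = {
    0: {EPS: {1}},
    1: {"a": {1, 2}, "b": {1}},  # two targets for "a": non-deterministic
    2: {"b": {3}},
    3: {"b": {4}},
    4: {},
}
NFA_START, NFA_ACCEPT = 0, {4}


def eps_closure(nfa, states):
    """All NFA states reachable from `states` using only epsilon edges."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in nfa[s].get(EPS, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def nfa_to_dfa(nfa, start, accepts, alphabet):
    """Subset construction: each DFA state is a set of NFA states."""
    d_start = eps_closure(nfa, {start})
    dfa, work = {}, deque([d_start])
    while work:
        S = work.popleft()
        if S in dfa:
            continue
        dfa[S] = {}
        for c in alphabet:
            moved = set()
            for s in S:
                moved |= nfa[s].get(c, set())
            if moved:
                T = eps_closure(nfa, moved)
                dfa[S][c] = T  # exactly one edge per label: deterministic
                work.append(T)
    d_accepts = {S for S in dfa if S & accepts}
    return dfa, d_start, d_accepts


def dfa_accepts(dfa, start, accepts, text):
    """Run the DFA over `text`; a missing edge means rejection."""
    state = start
    for c in text:
        if c not in dfa[state]:
            return False
        state = dfa[state][c]
    return state in accepts


DFA, DFA_START, DFA_ACCEPT = nfa_to_dfa(NFA, NFA_START, NFA_ACCEPT, "ab")
```

Each DFA state stands for the set of NFA states the machine could be in, which is why every label leaves a DFA state at most once.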
Parses the phrase structure of the program
A lexer cannot handle recursive structures well, so a parser is needed
※ Context-free grammars
※ Parse tree
-> Abstract Syntax Tree (to use memory efficiently)
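To make the parse tree versus AST distinction concrete, here is a small recursive-descent parser sketch. The grammar (expr, term, factor over integers) and the tuple-based AST shape are assumptions made for this example, not something taken from the lecture.

```python
import re

# Assumed grammar for this sketch:
#   expr   -> term (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUMBER | '(' expr ')'
TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(src):
    """Lexer: split the source into number and operator tokens."""
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())  # left-associative
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        tok = self.advance()
        if tok == "(":
            node = self.expr()  # recursion handles nested structure
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node  # parentheses are not stored in the AST
        if tok is not None and tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(src):
    parser = Parser(tokenize(src))
    ast = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return ast
```

The AST keeps only operators and operands. Grammar-only nodes such as parentheses and the expr/term/factor chain are dropped, which is where the memory saving comes from.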
input: a set of context-free grammars specifying a parser
output: parser in target language, description of state machine
rules: pattern and action
checks if each expression is correct
- All identifiers (variable, class, functions, methods, …) are declared only once
- Inheritance relationship
- Types are well defined and related
- Reserved identifiers are not misused
specifies which operations are valid for which types
- Type checking: ensures that operations are used with the correct types
- Type Inference: fills in missing type information
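A toy type checker can illustrate both points above. The expression encoding (nested tuples), the two types `int` and `bool`, and the function name `type_of` are all assumptions of this sketch. It infers the type of every subexpression and rejects operations applied to the wrong types.

```python
def type_of(expr, env):
    """Return the type of `expr`, or raise TypeError if it is ill-typed.

    expr is a literal (int/bool), a variable name (str), or a tuple
    (op, lhs, rhs). env maps declared variable names to their types.
    """
    if isinstance(expr, bool):  # check bool first: bool is a subclass of int
        return "bool"
    if isinstance(expr, int):
        return "int"
    if isinstance(expr, str):
        if expr not in env:
            raise TypeError(f"undeclared identifier {expr!r}")
        return env[expr]

    op, lhs, rhs = expr
    lt, rt = type_of(lhs, env), type_of(rhs, env)  # infer operand types
    if op in ("+", "-", "*") and lt == rt == "int":
        return "int"
    if op == "<" and lt == rt == "int":
        return "bool"
    if op == "==" and lt == rt:
        return "bool"
    if op in ("and", "or") and lt == rt == "bool":
        return "bool"
    raise TypeError(f"operator {op!r} is not valid for {lt} and {rt}")


def is_well_typed(expr, env):
    """Convenience wrapper: True if type checking succeeds."""
    try:
        type_of(expr, env)
        return True
    except TypeError:
        return False
```

For example, `type_of(("<", ("+", "x", 1), 10), {"x": "int"})` infers `bool` without any annotation on the subexpressions, while `("+", True, 1)` is rejected.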
abstract machine language
The process of finding a set of machine instructions that implement the operations specified in the IR tree
- instruction tree patterns
Instruction tree patterns also make it possible to optimize the selected instructions
Example: 9 reg, 10 inst -> 5 reg, 6 inst
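One common way to pick tree patterns is greedy top-down tiling (often called maximal munch). It is used here as an illustration and is not necessarily the method from the lecture. The IR node kinds (`MEM`, `ADD`, `CONST`, `TEMP`) and the mnemonics (`LOAD`, `ADDI`, `ADD`, `LI`) are hypothetical.

```python
import itertools


def select(tree):
    """Greedy tiling: at each node, try the largest matching pattern first.

    IR nodes are tuples: ("CONST", k), ("TEMP", name),
    ("ADD", left, right), ("MEM", address).
    Returns (list of instructions, register holding the result).
    """
    counter = itertools.count(1)
    code = []

    def new_reg():
        return f"r{next(counter)}"

    def munch(t):
        kind = t[0]
        if kind == "TEMP":
            return t[1]  # already in a register, no instruction needed
        if kind == "CONST":
            rd = new_reg()
            code.append(f"LI {rd}, {t[1]}")
            return rd
        if kind == "ADD":
            _, left, right = t
            if right[0] == "CONST":  # pattern ADD(e, CONST k)
                rs, rd = munch(left), new_reg()
                code.append(f"ADDI {rd}, {rs}, {right[1]}")
                return rd
            rs1, rs2 = munch(left), munch(right)
            rd = new_reg()
            code.append(f"ADD {rd}, {rs1}, {rs2}")
            return rd
        if kind == "MEM":
            addr = t[1]
            if addr[0] == "ADD" and addr[2][0] == "CONST":
                # pattern MEM(ADD(e, CONST k)): one load with an offset
                rs, rd = munch(addr[1]), new_reg()
                code.append(f"LOAD {rd}, {addr[2][1]}({rs})")
                return rd
            rs, rd = munch(addr), new_reg()
            code.append(f"LOAD {rd}, 0({rs})")
            return rd
        raise ValueError(f"unknown IR node {kind!r}")

    result = munch(tree)
    return code, result


# MEM(ADD(MEM(ADD(TEMP fp, CONST 8)), CONST 4))
TREE = ("MEM", ("ADD", ("MEM", ("ADD", ("TEMP", "fp"), ("CONST", 8))),
                ("CONST", 4)))
CODE, RESULT = select(TREE)
```

Because the large `MEM(ADD(e, CONST k))` tile covers three IR nodes at once, the whole tree is covered by two loads instead of separate constant, add, and load instructions. This is the same effect as the register and instruction reduction noted above.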
Determine how instructions are fetched during execution
Analysis of the ordering (before/after) relationships among the nodes of the control flow graph (CFG)
Node d dominates node n if every path of directed edges from s0 to n must go through d (the dominator)
- Dominator Tree: efficient representation of dominator information
- Immediate Dominator: last dominator of n on any path from s0 to n
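The definition above translates into a simple iterative computation: D[s0] = {s0}, and for every other node D[n] = {n} ∪ (intersection of D[p] over all predecessors p). The sketch below assumes every node is reachable from the entry and appears as a key of `succ`. The example CFG and the function names are made up for illustration.

```python
def dominators(succ, entry):
    """Iterate D[n] = {n} | intersection(D[p] for p in pred[n]) to a fixpoint."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for s in targets:
            pred[s].add(n)

    dom = {n: set(nodes) for n in nodes}  # start from "everything dominates n"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in pred[n]]
            new = {n} | (set.intersection(*pred_doms) if pred_doms else set())
            if new != dom[n]:
                dom[n], changed = new, True
    return dom


def immediate_dominators(dom, entry):
    """The strict dominators of n form a chain; idom(n) is the closest one,
    i.e. the strict dominator that itself has the most dominators."""
    idom = {}
    for n, ds in dom.items():
        if n != entry:
            idom[n] = max(ds - {n}, key=lambda d: len(dom[d]))
    return idom


# Hypothetical CFG: 1 is the entry (s0), 5 -> 2 closes a loop.
CFG = {1: [2], 2: [3, 4], 3: [5], 4: [5], 5: [2, 6], 6: []}
DOM = dominators(CFG, 1)
IDOM = immediate_dominators(DOM, 1)
```

The `IDOM` mapping is exactly the parent relation of the dominator tree: each node points to its immediate dominator.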
Iteratively finds the registers that each node must keep (the live registers), for use in register allocation
• uses of node n or register t: source (RHS) registers of node n; nodes where t is used as a source register
• definition of node n or register t: destination (LHS) register of node n; node where t is defined
• A register t is live on edge e: if there exists a path from e to a use of t that does not go through a definition of t
• Register t is live-in at CFG node n: if t is live on any in-edge of n
• Register t is live-out at CFG node n: if t is live on any out-edge of n
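The definitions above lead to the standard dataflow equations in[n] = use[n] ∪ (out[n] − def[n]) and out[n] = ∪ in[s] over successors s, solved by iterating until nothing changes. The six-instruction program in the comments is a made-up example, and `liveness` is a hypothetical helper name.

```python
def liveness(succ, use, defs):
    """Solve the liveness equations by iteration to a fixpoint.

    in[n]  = use[n] | (out[n] - def[n])
    out[n] = union of in[s] for every successor s of n
    """
    live_in = {n: set() for n in succ}
    live_out = {n: set() for n in succ}
    changed = True
    while changed:
        changed = False
        for n in succ:
            out = set().union(*(live_in[s] for s in succ[n]))
            inn = use[n] | (out - defs[n])
            if out != live_out[n] or inn != live_in[n]:
                live_out[n], live_in[n] = out, inn
                changed = True
    return live_in, live_out


# Example program (one CFG node per instruction):
#   1: a = 0
#   2: b = a + 1
#   3: c = c + b
#   4: a = b * 2
#   5: if a < N goto 2     (N is treated as a constant)
#   6: return c
SUCC = {1: [2], 2: [3], 3: [4], 4: [5], 5: [2, 6], 6: []}
USE = {1: set(), 2: {"a"}, 3: {"b", "c"}, 4: {"b"}, 5: {"a"}, 6: {"c"}}
DEFS = {1: {"a"}, 2: {"b"}, 3: {"c"}, 4: {"a"}, 5: set(), 6: set()}
LIVE_IN, LIVE_OUT = liveness(SUCC, USE, DEFS)
```

In this example `c` is live-in at node 1 because it is read at node 3 before any definition, while `a` and `b` are never live at the same time.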
-> Interference Graph
Edge <ti, tj> exists if registers ti and tj have overlapping live ranges
For some node n, if DEF[n] = {a} and OUT[n] = {b1, b2, …, bk}, then add interference edges (a, b1), (a, b2), …, (a, bk)
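The edge rule above can be sketched directly. The `DEFS` and `LIVE_OUT` tables are hard-coded assumptions: they describe a made-up six-instruction program (a = 0; b = a + 1; c = c + b; a = b * 2; if a < N goto 2; return c) whose live-out sets are taken as given.

```python
def build_interference(defs, live_out):
    """For each node n that defines a, add an edge (a, b) for every
    b in live-out[n] with b != a. Edges are undirected (frozensets)."""
    edges = set()
    for n, defined in defs.items():
        for a in defined:
            for b in live_out[n]:
                if b != a:
                    edges.add(frozenset((a, b)))
    return edges


# Assumed DEF and live-out sets for the six-instruction example program.
DEFS = {1: {"a"}, 2: {"b"}, 3: {"c"}, 4: {"a"}, 5: set(), 6: set()}
LIVE_OUT = {1: {"a", "c"}, 2: {"b", "c"}, 3: {"b", "c"},
            4: {"a", "c"}, 5: {"a", "c"}, 6: set()}

EDGES = build_interference(DEFS, LIVE_OUT)
```

Here `a` and `b` each interfere with `c` but not with each other, so under these assumptions two physical registers would be enough: one shared by `a` and `b`, one for `c`.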
-> Dead Code Elimination
Improves performance by removing code that has no side effects and whose result is never used
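For straight-line code, dead code elimination can be sketched as a single backward pass that tracks the live set. The instruction encoding `(dest, sources, has_side_effect)` and the function name are assumptions of this example. A real compiler would work on the CFG with the full liveness results.

```python
def eliminate_dead_code(instrs, live_at_exit=()):
    """Backward pass over straight-line code.

    Each instruction is (dest, sources, has_side_effect). An instruction is
    dead if its dest is not live afterwards and it has no side effect.
    """
    live = set(live_at_exit)
    kept = []
    for dest, srcs, side_effect in reversed(instrs):
        if dest is not None and dest not in live and not side_effect:
            continue  # dead: result never used, nothing observable happens
        if dest is not None:
            live.discard(dest)  # this definition ends the earlier live range
        live.update(srcs)  # sources must be live before this point
        kept.append((dest, srcs, side_effect))
    kept.reverse()
    return kept


PROGRAM = [
    ("a", [], False),      # a = 1
    ("b", ["a"], False),   # b = a + 2
    ("c", ["a"], False),   # c = a * 3     (only used by d)
    (None, ["b"], True),   # print(b)      (side effect: always kept)
    ("d", ["c"], False),   # d = c + 1     (never used)
]
OPTIMIZED = eliminate_dead_code(PROGRAM)
```

Removing `d = c + 1` makes `c = a * 3` dead as well. Walking backwards handles this chain in one pass because `c` is never added to the live set.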
A process to determine whether a definition of register t can affect a use of t
• GEN[n]: the set of definition IDs that n creates
• KILL[n]: the set of definition IDs that n kills
-> makes constant propagation, constant folding, and copy propagation applicable
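With GEN and KILL in hand, reaching definitions are usually computed with the forward equations in[n] = ∪ out[p] over predecessors p, and out[n] = GEN[n] ∪ (in[n] − KILL[n]). The seven-statement program below is a made-up example in which each definition's ID is its statement number. `reaching_definitions` is a hypothetical helper name.

```python
def reaching_definitions(pred, gen, kill):
    """Forward dataflow to a fixpoint.

    in[n]  = union of out[p] for every predecessor p of n
    out[n] = gen[n] | (in[n] - kill[n])
    """
    r_in = {n: set() for n in pred}
    r_out = {n: set() for n in pred}
    changed = True
    while changed:
        changed = False
        for n in pred:
            new_in = set().union(*(r_out[p] for p in pred[n]))
            new_out = gen[n] | (new_in - kill[n])
            if new_in != r_in[n] or new_out != r_out[n]:
                r_in[n], r_out[n] = new_in, new_out
                changed = True
    return r_in, r_out


# Example program (definition ID = statement number):
#   1: a = 5
#   2: c = 1
#   3: L1: if c > a goto L2
#   4: c = c + c
#   5: goto L1
#   6: L2: a = c - a
#   7: c = 0
PRED = {1: [], 2: [1], 3: [2, 5], 4: [3], 5: [4], 6: [3], 7: [6]}
GEN = {1: {1}, 2: {2}, 3: set(), 4: {4}, 5: set(), 6: {6}, 7: {7}}
KILL = {1: {6}, 2: {4, 7}, 3: set(), 4: {2, 7}, 5: set(), 6: {1}, 7: {2, 4}}
R_IN, R_OUT = reaching_definitions(PRED, GEN, KILL)
```

At statement 3 the only definition of `a` that reaches is definition 1 (`a = 5`), so constant propagation could replace `a` with 5 there. `c` has two reaching definitions (2 and 4), so it cannot be replaced.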
What is a loop?
A set of CFG nodes S such that:
There exists a header node h in S that dominates all nodes in S
From any node in S, there exists a path of directed edges to h
An edge that goes back to the header h is called a back-edge, and each back-edge identifies one natural loop
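Given a back-edge n -> h, the natural loop can be collected by walking predecessors backwards from n until h is reached. The sketch below takes the back-edge as given (finding it requires dominator information, since h must dominate n). The CFG and the function name are assumptions of this example.

```python
def natural_loop(pred, tail, header):
    """Nodes of the natural loop of back-edge tail -> header:
    the header plus every node that can reach `tail` without passing
    through the header."""
    loop = {header, tail}
    stack = [tail] if tail != header else []
    while stack:
        n = stack.pop()
        for p in pred[n]:
            if p not in loop:  # the header is already in `loop`, so the
                loop.add(p)    # walk never goes above it
                stack.append(p)
    return loop


# Hypothetical CFG: 1 -> 2, 2 -> {3, 4}, 3 -> 5, 4 -> 5, 5 -> {2, 6}.
# Node 2 dominates node 5, so 5 -> 2 is a back-edge with header 2.
PRED = {1: [], 2: [1, 5], 3: [2], 4: [2], 5: [3, 4], 6: [5]}
LOOP = natural_loop(PRED, tail=5, header=2)
```

Nodes 1 and 6 are left out: node 1 is only reachable by going above the header, and node 6 cannot reach the back-edge at all.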
※ Loop Invariant Code Motion (LICM): moving code out of the loop without affecting the result
※ Induction Variable: a variable that increases or decreases by a loop-invariant value on every loop iteration
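Both ideas can be shown as hand-applied before/after pairs. This is only an illustration of what a compiler might do to its IR, not an actual optimizer pass. The function names and the element size of 4 are made up for the example.

```python
def scale_before(xs, a, b):
    out = []
    for x in xs:
        k = a * b  # loop-invariant: a and b never change inside the loop
        out.append(x * k)
    return out


def scale_after(xs, a, b):
    k = a * b  # LICM: computed once, outside the loop
    out = []
    for x in xs:
        out.append(x * k)
    return out


def addresses_before(n, base):
    out = []
    for i in range(n):  # i is an induction variable (step 1)
        out.append(base + i * 4)  # a multiplication on every iteration
    return out


def addresses_after(n, base):
    out = []
    addr = base  # derived induction variable: grows by the invariant 4
    for _ in range(n):
        out.append(addr)
        addr += 4  # the multiplication is replaced by an addition
    return out
```

The second pair shows why induction variables matter: once `base + i * 4` is recognized as changing by a fixed amount per iteration, the multiplication can be replaced by a running addition (strength reduction).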