LR(0) Parser - Oya-Learning-Notes/Compilers GitHub Wiki

LR(0)

When doing LR(0) analysis, one of the way to determine action table is directly calculating the Viable Prefix.

Viable & Deductible Prefixes

To calculate $LC(A)$, we could use the following rule:

If we have $A \to \alpha X \beta$, and $\alpha \in V_t$ (which means $\alpha$ is a single terminal), then:

$$ LC(X) = LC(A) \cdot {\alpha} $$


Check out the following example of how to calculate $LC(N)$ recursively for all $N \in V_N$:

Then we can calculate the derivable prefix $LR(0)Context$ for any production $N \to \alpha$ using the rule below (Here $\alpha \in V^*$):

$LR(0)Context(N) := LC(N) \cdot \alpha$

Determine Actions

Once we have the deductible prefix, we could use the following rule to determine the action:

  • If the current prefix is one of the deductible prefix, perform deduce operation.
  • Perform shift operation.

Construct Parse Table

Items

Just the most simple items, with no any lookahead (since we are discussing about LR(0) which just essentially means we do not care about any lookahead)

The processing could be concluded as:

  1. Find start Item set (core).
  2. Discover new Item set recursively, then expand the set using closure operation.
  3. Finish when no more new Item set found.

After the steps above, we could get a NFA which’s states contain a set of Item. If we want, we could convert it to DFA then simplify it.

Notice when simply the DFA, we should think about if a DFA node that points to nothing should be considered as equivalent.

In $LR(0)$, it's fine to merge the states that both has no outgoing edges(that is, has no transition), but it’s not the case in some other LR algorithm, for example in $CLR$.

Note: Need further discussion about if we can simplify this NFA, if we could, how?

LR Parse Controller

Before go into the discussion about Parse Table, we should know the basic structure of the LR parse controller. The controller use State Stack and Symbol Stack to achieve the similar functionality of the DFA.

Definitions

$$ Action(CurrentState, terminal) = NewState | Production $$

This means, if the top of the state stack is $S_i$, and we have current input $x \in V_T$, then there could be two possible actions:

  • $Action(S_i, x) = S_j$: Push $x$ into symbol stack. Push $S_j$ into stack.
  • $Action(S_i, x) = R_j$: Use a production $P_j$ to perform reduce ($j$ is the index of a certain production $A \to \alpha x$).

$$ Goto(CurrentState, nonterminal) = NewState $$

This means the next input $x \in V_N$, what we need to do is put $x$ into symbol stack, put what $Goto()$ returns into the State Stack.

Examples

When performing reduction, we remove the same amount of elements from both Symbol Stack and State Stack.

Let's considering the following example:

$$ \begin{aligned} StateStack &= S_0S_1S_4S_6 \ SymbolStack &= Xabc \ Waiting &=x_{n+1} \dots \end{aligned} $$

Current input is $x_n$, and we found that $Action(S_6, x_n) = R_i$, so we use $P_i$ to instruct our reduction, which is $B \to abc$. After this operation, we will get:

$$ \begin{aligned} StateStack &= S_0 \ SymbolStack &= X \ Waiting &= Bx_{n+1}\dots \end{aligned} $$

Now, the top of $StateStack$ is $S_0$, and next input is non-Terminal $B$, so we just look up if there is any valid $Goto(S_0, B)$

Note after $abc$ is deduced to $B$, the non-terminal $B$ is not directly put into symbol stack, however, it has been pushed to the waiting list.

Checkout TSU book PDFPage 150 for more info.

LR(0) DFA To Parse Table

Each DFA node would become a state $S_i$, with state index $i$

Terminal Transition -> Action(Shift)

For $S_i$ and $S_j$, if $S_i \to^{x} S_j$, $x \in V_T$, then:

$$ Action(S_i, x) = S_j $$

$S_j$ means shift to state $j$

Non-Terminal Transition -> Goto

For $S_i$ and $S_j$, if $S_i \to^{X} S_j$, $X \in V_N$, then:

$$ Goto(S_i, X) = S_j $$

Fulfilled Item -> Action(Reduce)

For $S_i$, if Item $(P_j \to \alpha \beta \cdot) \in S_i$:

$$ Action(S_i, *) = R_j $$

Note: $R_j$ means reduce using Production $P_j$

Here note that because we are dealing with LR(0) parser, so reduce action applys for ALL input characters (including ending symbol $\#$), we use $*$ to represent set $Action(S_i, x) = R_j$ for ALL terminal $x$ in this language.

However, we should keep in mind that in this case we only set the ACTION part of the table, and the GOTO part of the table is left unchanged.

Parse Table Examples

parse_table_example

Something worth noticing:

  • Add acc for augmented non-terminal with # lookahead.
  • No any reduce operation in GOTO part.