Getting Started - adamcrossland/pcomb GitHub Wiki
So, you are interested in using PComb? Doing so is very easy. Start by taking a good look at the demo app in the app/
subfolder, and when you are ready, here are the steps:
Include the file pcomb.ts in your project. If it is a TypeScript project, adjust your build script to make sure that it gets compiled along with any other dependencies.
In any source file that is going to use PComb for parsing, add a reference or an import statement to include it, like this:
import * as pcomb from "../src/pcomb"
Of course, the path will be different depending on how you have your project files laid out.
You will need to provide two classes: one that implements the pcomb.ParserInput interface and one that implements the pcomb.ParserOutput interface. The class that implements pcomb.ParserInput doesn't need to do much: it just provides the one property and one method required by the interface. The class that implements pcomb.ParserOutput, on the other hand, must implement the required fields and also store whatever state your application needs to do its work.
Here is the ParserOutput from the demo app:
    class ChattyData implements pcomb.ParserOutput {
        matched: string[];
        leftOperand: number;
        rightOperand: number;
        operator: MathOps;
        accumulator: number = 0;

        // Returns an independent copy so that parsers never mutate shared state.
        copy(): ChattyData {
            let newData = new ChattyData();
            // Check the source object's array; the new object's is not set yet.
            if (this.matched) {
                newData.matched = this.matched.slice();
            } else {
                newData.matched = [];
            }
            newData.leftOperand = this.leftOperand;
            newData.rightOperand = this.rightOperand;
            newData.operator = this.operator;
            newData.accumulator = this.accumulator;
            return newData;
        }
    }
As you can see, the ParserOutput implementing class adds a number of fields (leftOperand, rightOperand, operator and accumulator) that are used to store the data collected by the parsers that the app defines. The mechanism for getting data into the object is covered later in this tutorial.
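To illustrate why the copy method matters, here is a minimal, self-contained sketch. It does not import pcomb: OutputLike is a local stand-in that assumes only the matched field and copy method visible in the demo class, and TallyData is a hypothetical, cut-down class invented for this example.

```typescript
// Local stand-in for pcomb.ParserOutput (assumption: only the members
// visible in the demo class are modelled here).
interface OutputLike {
    matched: string[];
    copy(): OutputLike;
}

// Hypothetical, reduced version of an output class.
class TallyData implements OutputLike {
    matched: string[] = [];
    accumulator: number = 0;

    copy(): TallyData {
        const next = new TallyData();
        next.matched = this.matched.slice(); // new array holding the same strings
        next.accumulator = this.accumulator;
        return next;
    }
}

const tallyBefore = new TallyData();
tallyBefore.matched.push("2");

// Changes made to the copy do not leak back into the original.
const tallyAfter = tallyBefore.copy();
tallyAfter.matched.push("+");
tallyAfter.accumulator = 2;

console.log(tallyBefore.matched.length, tallyAfter.matched.length); // 1 2
console.log(tallyBefore.accumulator, tallyAfter.accumulator);       // 0 2
```

Because each copy owns its own matched array, a parser that fails partway through can simply be discarded without leaving stray data behind.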
Now it is time to actually create a parser. Most begin with one or more lit parsers; each matches a fixed, case-insensitive string of one or more characters. Again taking an example from the demo app, let's create parsers that represent the arithmetic operators for addition, subtraction, multiplication and division.
let operandPlus: pcomb.Parser = pcomb.lit("+");
Here we declare a variable of type pcomb.Parser and call the lit function to create a parser that will match a plus sign. That is all well and good, but when the parser matches, nothing happens except that parsing continues. What we want is for the operator field in our ChattyData object to be set correctly. To do that, we need to define a ParserAction.
Add a ParserAction function that can be called by the parser that matches math operators:
    private operandSet(matched: string, output: pcomb.ParserOutput): pcomb.ParserOutput {
        let result: ChattyData = <ChattyData>output.copy();
        switch (matched) {
            case "+":
            case "plus":
                result.operator = MathOps.Add;
                break;
            case "-":
            case "minus":
                result.operator = MathOps.Subtract;
                break;
            case "*":
            case "times":
                result.operator = MathOps.Multiply;
                break;
            case "/":
            case "divided by":
                result.operator = MathOps.Divide;
                break;
        }
        return result;
    }
Note that this function matches the signature of the ParserAction type as defined in pcomb.ts:
export type ParserAction = (matchedText: string, output: ParserOutput) => ParserOutput;
It is a function that takes two parameters: matchedText and output. The first holds the actual text that was matched by the parser. This may seem redundant; after all, you are attaching the function to a parser, and you know what that parser matches, right? However, a ParserAction can be attached to a parser of arbitrary complexity, and any one of many different patterns may have been matched, so knowing what was matched is necessary for writing anything other than the most trivial function.
The output parameter is the current state of the parse: all of the texts matched to date as well as whatever application-specific data has been collected. You can see that the first thing the function does is create a copy of the object that is passed in; it then makes all of its changes to that copy and returns it.
The function handles our expected value of +, but it also knows about the synonym plus and all the other operators and their synonyms. However, we haven't yet written a parser that will match any of those.
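The copy-then-modify pattern can be tried in isolation by calling an action by hand. This is only a sketch: ActionOutput and ActionFn are local stand-ins shaped like the ParserOutput and ParserAction types quoted above, and OperatorData and recordOperator are hypothetical names invented for this example, not part of PComb or the demo app.

```typescript
// Local stand-ins shaped like pcomb.ParserOutput and pcomb.ParserAction.
interface ActionOutput {
    matched: string[];
    copy(): ActionOutput;
}
type ActionFn = (matchedText: string, output: ActionOutput) => ActionOutput;

// Hypothetical output class that records a single operator name.
class OperatorData implements ActionOutput {
    matched: string[] = [];
    operator: string | undefined = undefined;

    copy(): OperatorData {
        const next = new OperatorData();
        next.matched = this.matched.slice();
        next.operator = this.operator;
        return next;
    }
}

// The action never touches its argument: it copies, edits the copy, returns it.
const recordOperator: ActionFn = (matchedText, output) => {
    const result = output.copy() as OperatorData;
    if (matchedText === "+" || matchedText === "plus") {
        result.operator = "add";
    }
    return result;
};

const initialData = new OperatorData();
const updatedData = recordOperator("plus", initialData) as OperatorData;

console.log(initialData.operator); // undefined
console.log(updatedData.operator); // add
```

Calling the action directly like this is also a convenient way to unit test it before attaching it to any parser.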
First, let's rewrite our definition of operandPlus so the action will be called:
let operandPlus: pcomb.Parser = pcomb.lit("+", this.operandSet);
Now we can rewrite our parser so that it will match either + or plus:
let operandPlus: pcomb.Parser = pcomb.or([pcomb.lit("+", this.operandSet), pcomb.lit("plus", this.operandSet)]);
Since the or combinator can take a ParserAction, we can rewrite the above to eliminate redundancy:
let operandPlus: pcomb.Parser = pcomb.or([pcomb.lit("+"), pcomb.lit("plus")], this.operandSet);
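For readers who want to see how a literal parser and an or combinator with an attached action can fit together, here is a toy model. It is emphatically not PComb's implementation: ToyState, ToyAction, ToyParser, toyLit, toyOr and setOperator are all hypothetical names, and the state is a plain object rather than separate input and output classes.

```typescript
// Toy model only; none of these names come from pcomb.ts.
type ToyState = { rest: string; matched: string[]; operator?: string };
type ToyAction = (matchedText: string, state: ToyState) => ToyState;
type ToyParser = (state: ToyState) => ToyState | null;

// Matches a fixed string, ignoring case, at the start of the remaining input.
function toyLit(literal: string, action?: ToyAction): ToyParser {
    return (state) => {
        const head = state.rest.slice(0, literal.length);
        if (head.toLowerCase() !== literal.toLowerCase()) {
            return null; // no match: the caller's state is left untouched
        }
        const next: ToyState = {
            ...state,
            rest: state.rest.slice(literal.length),
            matched: [...state.matched, head],
        };
        return action ? action(head, next) : next;
    };
}

// Tries each alternative in order; the optional action runs once, on
// whichever alternative succeeded first.
function toyOr(alternatives: ToyParser[], action?: ToyAction): ToyParser {
    return (state) => {
        for (const alternative of alternatives) {
            const next = alternative(state);
            if (next !== null) {
                const consumed = state.rest.slice(0, state.rest.length - next.rest.length);
                return action ? action(consumed, next) : next;
            }
        }
        return null;
    };
}

const setOperator: ToyAction = (matchedText, state) => {
    const text = matchedText.toLowerCase();
    const operator = text === "+" || text === "plus" ? "add" : "unknown";
    return { ...state, operator };
};

const toyPlus = toyOr([toyLit("+"), toyLit("plus")], setOperator);

console.log(toyPlus({ rest: "plus 2", matched: [] })?.operator); // add
console.log(toyPlus({ rest: "minus 2", matched: [] }));          // null
```

Attaching the action to the combinator rather than to each literal means the mapping from text to operator lives in one place, which is the redundancy the rewrite above removes.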
Run the parser against some text. While our currently defined parser is trivial and not very useful, we can still give it some text to parse. The pcomb.Parse method is called like this:
let result = pcomb.Parse(operandPlus, text, new ChattyInput(), new ChattyData());
It returns a value of type ParseResult:
export type ParseResult = [boolean, ParserInput, ParserOutput];
We can tell whether the parser matched the input text by examining result[0], which will be true if it did and false otherwise. The second returned value, a ParserInput, holds the state of the input data once the parser is done consuming it, and the third value, a ParserOutput, has all the distinct matched values from all of the parsers that ran, as well as all of the application-specific information that was retained.
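A three-element tuple like this is convenient to unpack with array destructuring. The sketch below is self-contained and does not call pcomb.Parse: ResultTuple mirrors the shape of ParseResult quoted above, ResultInput and ResultOutput are reduced stand-ins, the remaining property is a hypothetical name, and both sample tuples are built by hand purely for illustration.

```typescript
// Stand-ins shaped like the tuple quoted above. The `remaining` property
// is hypothetical; check pcomb.ts for the real ParserInput members.
interface ResultInput { remaining: string; }
interface ResultOutput { matched: string[]; }
type ResultTuple = [boolean, ResultInput, ResultOutput];

function describeResult(result: ResultTuple): string {
    // Destructure instead of indexing with result[0], result[1], result[2].
    const [success, input, output] = result;
    if (!success) {
        return "no match";
    }
    return `matched ${JSON.stringify(output.matched)}, ${input.remaining.length} characters left`;
}

// Hand-built sample values, not real parser output.
const sampleHit: ResultTuple = [true, { remaining: " 2" }, { matched: ["plus"] }];
const sampleMiss: ResultTuple = [false, { remaining: "minus 2" }, { matched: [] }];

console.log(describeResult(sampleHit));  // matched ["plus"], 2 characters left
console.log(describeResult(sampleMiss)); // no match
```

With the real library, the third element would be cast back to your own output class (ChattyData in the demo app) to read the application-specific fields.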
And now? Parse, parse, parse.