GraphQL Lexer Technical Documentation - slashmo/graphql-swift-repl GitHub Wiki
On this Wiki page, you can learn how I've implemented my GraphQL Lexer.
Lexing
Lexing is the process of turning a sequence of characters, better known as a String, into a sequence of tokens. A token represents a meaningful substring of the original String with a start and an end position, and an optional value.
Overall structure
I split the Lexer into two types: Lexer and Lexer.Token, both of which are structs.
Lexer.Token
As there are multiple types of tokens, I started by implementing a Lexer.Token.Kind enum with a case for each type. The Token struct uses this enum as one of its properties. The other properties hold the position and an optional value, which is only set for these kinds: .string, .comment, .int, .float, and .name. I decided to name the types Kinds, because naming a type Type sounds like a bad thing to do 😁
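To make the shape of a token concrete, here is a minimal sketch of how such a struct could look. It is not copied from the repository: the property names (kind, start, end, value), the Int positions, and the punctuator cases shown are assumptions for illustration.

```swift
// Hypothetical sketch of Lexer.Token; names and types are assumptions.
struct Lexer {
    struct Token: Equatable {
        enum Kind: Equatable {
            // A few punctuators (the real enum has a case for each token type)
            case braceL, braceR
            // Kinds that carry a value
            case string, comment, int, float, name
        }

        let kind: Kind
        let start: Int      // assumed: offset of the token's first character
        let end: Int        // assumed: offset just past the token's last character
        let value: String?  // only set for .string, .comment, .int, .float, and .name
    }
}

// Tokens for the first two lexemes of "{ hello }"
let brace = Lexer.Token(kind: .braceL, start: 0, end: 1, value: nil)
let name = Lexer.Token(kind: .name, start: 2, end: 7, value: "hello")
```

Keeping value optional means punctuators don't have to carry an empty string around.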
Lexer
The Lexer struct is itself split into a private and a public API.
Private API
Cont
The internals use another struct called Cont, which holds a single property called run. Its type is (inout Substring) throws -> (Token, Cont)? and its implementation is up to the creator of the instance. It's basically a way to describe the lexing of a single token, together with a reference to the Cont instance that should be used to lex the next token. That's why the return value is a tuple of both Token & Cont. A call to run can also return nil to indicate that no more tokens are to be lexed.
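The idea is easier to see in code. The following is a minimal sketch of the pattern, not the actual implementation: the simplified Token, the singleCharacter() helper, and lexAll(_:) are hypothetical and only exist to show how run hands back a token plus the next Cont.

```swift
// Simplified stand-in for Lexer.Token (assumption for this sketch)
struct Token: Equatable {
    let text: String
}

// A continuation: lexes one token and says how to lex the next one
struct Cont {
    let run: (inout Substring) throws -> (Token, Cont)?
}

// Hypothetical Cont that turns every character into a token and continues with itself
func singleCharacter() -> Cont {
    Cont { remaining in
        // Nothing left to lex: returning nil ends the chain
        guard let character = remaining.popFirst() else { return nil }
        return (Token(text: String(character)), singleCharacter())
    }
}

// Drives the chain of Conts until one of them returns nil
func lexAll(_ source: String) throws -> [String] {
    var remaining = Substring(source)
    var cont = singleCharacter()
    var texts: [String] = []
    while let (token, next) = try cont.run(&remaining) {
        texts.append(token.text)
        cont = next
    }
    return texts
}
```

Because every call returns the next Cont, the "state" of the lexer is just whichever Cont is stored at the moment.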
Besides holding a mutable instance of a Cont, the Lexer holds a mutable instance of the remaining substring.
startState & consumeToken
The result of the startState method is the initial value of the cont property. The method returns a Cont whose run property removes the first character from the remaining substring, calls consume(_:startingAt:), and returns the result of running the Cont that call returns. From then on it's basically just calling other private methods that return Conts based on the character's UTF-8 codepoint.
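Here is a rough sketch of that dispatch, under a few assumptions: consume(_:) is a simplified stand-in for consume(_:startingAt:) without position tracking, and punctuator(_:), name(startingWith:), UnexpectedCharacter, and lexAll(_:) are hypothetical names. Only braces and names are handled.

```swift
struct Token: Equatable {
    enum Kind: Equatable { case braceL, braceR, name }
    let kind: Kind
    let value: String?
}

struct Cont {
    let run: (inout Substring) throws -> (Token, Cont)?
}

// Hypothetical error type for this sketch
struct UnexpectedCharacter: Error {
    let character: Character
}

func startState() -> Cont {
    Cont { remaining in
        // Remove the first character; nil means the input is exhausted
        guard let first = remaining.popFirst() else { return nil }
        return try consume(first).run(&remaining)
    }
}

// Picks the next Cont based on the first UTF-8 byte of the character
func consume(_ character: Character) -> Cont {
    // A Character always encodes to at least one byte, so the fallback is never used
    let byte = String(character).utf8.first ?? 0
    switch byte {
    case UInt8(ascii: " "), UInt8(ascii: "\n"), UInt8(ascii: "\t"), UInt8(ascii: ","):
        return startState() // ignored characters: just start over
    case UInt8(ascii: "{"):
        return punctuator(.braceL)
    case UInt8(ascii: "}"):
        return punctuator(.braceR)
    case UInt8(ascii: "a")...UInt8(ascii: "z"), UInt8(ascii: "A")...UInt8(ascii: "Z"), UInt8(ascii: "_"):
        return name(startingWith: character)
    default:
        return Cont { _ in throw UnexpectedCharacter(character: character) }
    }
}

func punctuator(_ kind: Token.Kind) -> Cont {
    Cont { _ in (Token(kind: kind, value: nil), startState()) }
}

func name(startingWith first: Character) -> Cont {
    Cont { remaining in
        // Take the rest of the name: ASCII letters, digits, and underscores
        let rest = remaining.prefix(while: { $0 == "_" || ($0.isASCII && ($0.isLetter || $0.isNumber)) })
        remaining = remaining[rest.endIndex...]
        return (Token(kind: .name, value: String(first) + String(rest)), startState())
    }
}

// Runs the chain of Conts to completion
func lexAll(_ source: String) throws -> [Token] {
    var remaining = Substring(source)
    var cont = startState()
    var tokens: [Token] = []
    while let (token, next) = try cont.run(&remaining) {
        tokens.append(token)
        cont = next
    }
    return tokens
}
```

Every Cont in this sketch hands control back to startState() once its token is complete, so the dispatch on the first byte happens once per token.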
Public API
The public API allows users to initialize new Lexers and to get the next lexed token by calling the advance() method. More information on these two can be found in their respective doc-blocks. Internally, the advance() method calls cont's run property to lex the next token. If it finds one, it assigns the Cont contained in the return value of the run call to cont and returns the token.
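A minimal sketch of how advance() could tie cont and the remaining substring together. The simplified Token, the one-token-per-character startState(), the non-throwing initializer, and the texts(in:) helper are assumptions made to keep the example short; the real start state dispatches to many more Conts.

```swift
// Simplified stand-in for Lexer.Token (assumption for this sketch)
struct Token: Equatable {
    let text: String
}

struct Cont {
    let run: (inout Substring) throws -> (Token, Cont)?
}

// Hypothetical start state: skips spaces, then emits one token per character
func startState() -> Cont {
    Cont { remaining in
        remaining = remaining.drop(while: { $0 == " " })
        guard let character = remaining.popFirst() else { return nil }
        return (Token(text: String(character)), startState())
    }
}

struct Lexer {
    private var cont: Cont
    private var remaining: Substring

    init(lexing source: String) {
        cont = startState()
        remaining = Substring(source)
    }

    // Lexes the next token, or returns nil once the input is exhausted
    mutating func advance() throws -> Token? {
        guard let (token, next) = try cont.run(&remaining) else { return nil }
        cont = next // remember how to lex the token after this one
        return token
    }
}

// Collects the text of every token by calling advance() until it returns nil
func texts(in source: String) throws -> [String] {
    var lexer = Lexer(lexing: source)
    var result: [String] = []
    while let token = try lexer.advance() {
        result.append(token.text)
    }
    return result
}
```

Since advance() mutates both cont and remaining, the lexer has to be stored in a var at the call site.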
Error handling
Errors are handled by throwing instances of GraphQLError. These errors are exposed to the user through the advance() method and hold the failure's location and a human-readable message.
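As a rough illustration, an error type with a location and a message could look like the sketch below. The start and message property names match how the error is read in the example on this page, but the Int type of start, the validate(_:) function, its "no non-ASCII characters" rule, and describe(_:) are assumptions made up for this sketch.

```swift
// Hypothetical shape of the error; the Int offset is an assumption
struct GraphQLError: Error, Equatable {
    let start: Int      // where the failure happened
    let message: String // human-readable description
}

// Stand-in check: rejects the first non-ASCII character it finds
func validate(_ source: String) throws {
    for (offset, character) in source.enumerated() where !character.isASCII {
        throw GraphQLError(start: offset, message: "Unexpected character \"\(character)\"")
    }
}

// Turns the outcome into a printable line, the way a caller might report it
func describe(_ source: String) -> String {
    do {
        try validate(source)
        return "ok"
    } catch let error as GraphQLError {
        return "\(error.start): \(error.message)"
    } catch {
        return "\(error)"
    }
}
```

Carrying the location in the error itself lets the caller point at the offending character without re-scanning the input.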
Example
Let's say you want to lex a very simple query like this:
{
  hello
}
You'd start by constructing a new Lexer, passing in the query, and calling advance() as long as its return value is not nil:
do {
    var lexer = try Lexer(lexing: "{ hello }")
    // Keep lexing until advance() returns nil
    while let token = try lexer.advance() {
        print(token)
    }
} catch let error as GraphQLError {
    // Lexing failures carry a location and a message
    print("\(error.start): \(error.message)")
} catch {
    print(error)
}