In this post I present prim-parser, a Lean 4 total monadic parser combinator library in the parsec tradition. To the best of my knowledge, it is the first of its kind. To ensure totality, every parser is tagged with a grade recording whether it may, must, or cannot consume input, and whether it may, must, or cannot fail; the Parser type is a graded monad over these grades. The graded monad laws are proven as propositional equalities.
Introduction
Monadic parser combinators are a popular tool in functional programming. With a small set of functions called combinators and a monadic interface, you can assemble parsers for complex grammars, and since parsers are ordinary values in the host language, you get its abstractions and familiar syntax.
A typical example is parsing an identifier: a letter followed by alphanumeric characters:
def ident : Parser String := do
let c ← letter
let cs ← many alphanum
return String.ofList (c :: cs)
many is combinator that parses one or more occurrences of its argument. We could define it like this:
def many (p : Parser α) : Parser (List α) := do
let m ← optional p
match m with
| some x => List.cons x <$> many p
| none => return []
But the definition of many fails Lean's termination check because it calls itself on the same input (in the some x branch), so there is no structurally decreasing argument. Two obvious options that may come to mind: bound the number of recursive calls, or mark the definition partial and skip the check (what lean4-parser does). Neither is satisfying. The first is too restrictive because there may not be a safe bound to pick. The second drops the totality guarantee (if p accepts the empty string, many p loops forever). A better solution comes from enriching the type of the parser. In prim-parser, every parser carries a grade in its type (a pair tracking error and consumption behavior). Look at many's signature:
def many (p : Parser ⟨ge, always⟩ α) : Parser ⟨never, possibly⟩ (List α)
The left component tracks errors; the right tracks consumption. The key fact is the always in p's grade: p is guaranteed to consume input on every success, and that is what makes the recursion safe. The ge on the left is unconstrained. The result grade ⟨never, possibly⟩ says many p itself never fails and may or may not consume input. Each parser type carries a grade of this kind, and the Parser type is a graded monad over these grades.
To the best of my knowledge, the only practical implementation of a total parser combinator library is agdarsec, by Guillaume Allais (paper, 2018). The original library is in Agda but has been ported to Rocq (parseque) and Idris (tparsec). Earlier, Danielsson 2010 introduced the first total parser combinator library based on Brzozowski derivatives. I'll compare prim-parser to both in the Related work section.
Original contributions
To the best of my knowledge:
- Graded monads as a totality approach for parser combinators. The first use of graded monads in a parser combinator library. The grade tracks error and consumption necessity in the type.
- First total monadic parsec-style parser combinator library (in any total language). Parsec-style means a shallow embedding: a parser is essentially an ordinary function from input to result. agdarsec is total and uses a shallow embedding but not monadic. Danielsson is total and monadic but uses a deep embedding via Brzozowski derivatives, which is known to have exponential complexity.
- First total parser combinator library in Lean 4.
lean4-parserusespartial. agdarsec's approach could be ported but hasn't been. Danielsson's approach uses sized types and mixed induction/coinduction, which Lean does not support.
The rest of the post.
- The grade Introduces the grade prim-parser uses for the parser type.
- Graded monad shows how grades compose, making
Parsera graded monad. - Combinators is a tour of the standard parser combinators with graded types.
- Guarded recursion via
fixexplains recursive parsers. - The Parser type explains the parser type in more detail.
- The
gdomacro describes the graded version ofdo-notation. - Examples presents implementations of S-expressions and CSV parsers.
- Related work compares prim-parser to lean4-parser, Danielsson 2010, and agdarsec.
- Future work.
The grade
The grade tracks two pieces of static information about a parser:
- Errors. Can it fail?
- Consumption. Does it consume input on success?
Each axis admits three answers: never, possibly, always.
inductive Necessity where
| never
| possibly
| always
We order them never < possibly < always, so ⊔ is max over this order. Note that ⟨Necessity, ⊔, never⟩ forms a monoid.
A Grade is just a pair:
structure Grade where
errors : Necessity
consumes : Necessity
There are nine grades; seven of them are useful to name:
| Name | errors | consumes | Reading |
|---|---|---|---|
| pure | never | never | always succeeds, no input read |
| lookahead | possibly | never | may fail, no consumption |
| flexible | never | possibly | infallible, may consume |
| fallible | possibly | possibly | the most permissive grade |
| conditional | possibly | always | may fail, must consume on success |
| empty | always | never | always fails |
| impossible | never | always | uninhabited |
Grade is a monoid:
instance : Monoid Grade where
mul ⟨e, c⟩ ⟨e', c'⟩ := ⟨e ⊔ e', c ⊔ c'⟩
one := ⟨never, never⟩
I'll write g * g' for the monoid product (in this post the same as g ⊔ g'). It expresses how grades combine when one parser runs after another. For example, suppose we run a parser that never fails and may consume, followed by one that may fail and always consumes:
⟨never, possibly⟩ * ⟨possibly, always⟩
= ⟨never ⊔ possibly, possibly ⊔ always⟩
= ⟨possibly, always⟩
The result may fail (the second parser might) and always consumes (the second parser does).
The two unnamed grades (⟨always, possibly⟩ and ⟨always, always⟩) describe parsers that always fail; the consumption component is irrelevant when the result is always an error. I keep them so the monoid product is closed on Grade.
As shown by both agdarsec and Danielsson, to ensure termination it is enough to track in the type whether a parser accepts the empty string. The grade I chose is more expressive than that. As we will see in the Combinators section, this allows giving precise types to combinators that a simpler grade could not.
Graded monad
We just saw how grades multiply when parsers run in sequence. That's what gbind does, and gpure carries the unit grade. With g g' : Grade:
def gpure : α → Parser pure α -- pure = ⟨never, never⟩, the monoid unit
def gbind : Parser g α → (α → Parser g' β) → Parser (g * g') β
This is a graded monad (Katsumata's parametric effect monads). Briefly, a graded monad is like a standard monad indexed by an element g of some monoid (the grade). gbind multiplies the indices; gpure uses the monoid unit. In the table below I compare the signatures:
| Standard Monad | Graded Monad |
|---|---|
| pure : α → m α | gpure : α → m 1 α |
| bind : m α → (α → m β) → m β | gbind : m i α → (α → m j β) → m (i * j) β |
Note that graded monads are a strict generalisation of standard monads (the graded monad with the trivial monoid is exactly the standard monad). No Lean library ships the graded hierarchy, so prim-parser provides GradedFunctor, GradedApplicative, GradedMonad, and their respective Lawful* variants itself. I've proved the functor, applicative, and monad laws for Parser as Lean theorems. With i j k : Grade, the two sides of each law carry different grade indices syntactically; they unify only after we prove the grade identity in the right column, and those identities follow from the monoid laws on Grade:
| Property | Equation | Grade equation |
|---|---|---|
| Left unit | gpure x >>= f = f x | 1 * j = j |
| Right unit | m >>= gpure = m | i * 1 = i |
| Associativity | (m >>= f) >>= g = m >>= λa. f a >>= g | (i * j) * k = i * (j * k) |
Combinators
This section is a tour of the common parser combinators you'd expect to find in any parser combinator library. For each, we'll look at the signature, describe the behaviour, and relate it back to the grade. By the end I hope you'll see that grades do more than enforce totality; they also help document the combinator's behaviour in the type. fix, the recursion combinator, has its own section.
anyChar consumes and returns a single character, or fails if the input is empty:
anyChar : Parser conditional Char -- conditional = ⟨possibly, always⟩
It may fail (on empty input) and always consumes on success.
lookahead runs p but never consumes:
lookahead : Parser ⟨ge, gc⟩ α → Parser ⟨ge, never⟩ α
The error grade is preserved; the consumption grade is set to never.
eof succeeds only when there is no more input:
eof : Parser lookahead PUnit -- lookahead = ⟨possibly, never⟩
notFollowedBy succeeds exactly when p fails, without consuming:
notFollowedBy : Parser ⟨ge, gc⟩ α → Parser ⟨ge.complement, never⟩ PUnit
complement is a Necessity operation defined by these equations:
complement never = always
complement possibly = possibly
complement always = never
If p always fails, notFollowedBy p never fails; if p never fails, notFollowedBy p always fails. In short, the error grade is flipped.
many applies a parser zero or more times, collecting results:
many : Parser ⟨ge, always⟩ α → Parser flexible (List α)
The argument must always consume (otherwise recursion wouldn't terminate); the result is flexible (never fails, may consume) because zero repetitions is allowed.
many1 is many with at least one repetition. The argument's grade is preserved exactly:
many1 : Parser ⟨ge, always⟩ α → Parser ⟨ge, always⟩ (NonEmptyList α)
optional tries p; on failure, returns none. The result never fails:
optional : Parser ⟨ge, gc⟩ α → Parser ⟨never, ge.complement ⊓ gc⟩ (Option α)
optional p only consumes input when p succeeds. ge.complement flips the error grade to capture how often that happens. Thatis, never if p always fails, always if p never fails. Taking the meet (⊓, min) with gc caps that by what p consumes when it succeeds.
choice tries the first parser; on failure, the second. The error grade is ge ⊓ ge' (the combined parser only always-fails if both branches do); the consumption grade uses a small helper ite that cases on the first branch's failure pattern:
def ite (sel a b : Necessity) : Necessity :=
match sel with
| never => b -- first never fails: take b
| always => a -- first always fails: take a
| possibly => if a = b then a else possibly -- both might run
choice : Parser ⟨ge, gc⟩ α → Parser ⟨ge', gc'⟩ α
→ Parser ⟨ge ⊓ ge', ge.ite gc' gc⟩ α
So in ge.ite gc' gc: if ge = never (first branch never fails), the second is unreachable and consumption is gc. If ge = always, only the second branch runs and consumption is gc'. Otherwise consumption is gc when gc = gc', and possibly otherwise.
oneOf runs the parsers in order, returning the first success.
oneOf : NonEmptyList (Parser g α) → Parser g α
count parses exactly n occurrences, returning a length-indexed vector. Both grade components are relaxed to possibly, since count 0 p succeeds without consuming:
count : (n : Nat) → Parser ⟨ge, gc⟩ α
→ Parser ⟨ge ⊓ possibly, gc ⊓ possibly⟩ (Vector α n)
count1 is the specialisation to n + 1. With at least one repetition guaranteed, the argument's grade is preserved exactly:
count1 : (n : Nat) → Parser ⟨ge, gc⟩ α
→ Parser ⟨ge, gc⟩ (Vector α (n + 1))
sepBy parses zero or more occurrences of p separated by sep:
sepBy : (sep : Parser ⟨ge', gc'⟩ β) → (p : Parser ⟨ge, gc⟩ α)
→ gc' ⊔ gc = always
→ Parser flexible (List α)
The constraint gc' ⊔ gc = always says the separator and the element must consume together.
Guarded recursion via fix
The fix combinator enables recursive parsers:
def fix : ((self : Parser ⟨ge, always⟩ α) → Parser ⟨ge, always⟩ α) → Parser ⟨ge, always⟩ α
The argument to fix is a function. This function must produce a parser that alwas consumes on success. For recursive calls, it uses the self argument parser.
Let's look at an example: a parser for a balanced parenthesis group (e.g. (), (()), (()())):
def group : Parser conditional Unit :=
fix fun self => gdo
char '(' -- always consumes
many self -- possibly consumes
char ')'
We use fix to introduce a recursive parser. The type of fix mandates that in the body of the lambda we must build an always consuming parser. We do so by sequencing three simple parsers with gdo. gdo is do-notation for graded monads (covered later). The first parser char '(' always consumes; the second parser many self possibly consumes (because many accepts zero occurrences), the third parser char ')' always consumes. It obviously follows that the sequence of the parsers is always consuming and thus we've provided a valid argument for fix.
The Parser type
The parser type is parameterised by an error type ε, a grade g, and a result type α. Its single field run is like a parsec parser, but the input is sized so that consumption can be tracked in the type. Text n is List.Vector Char n.
structure Parser (ε : Type) (g : Grade) (α : Type) where
run : ∀ {n}, Text n → Outcome ε n g α
The result type Outcome is computed from the error component of the grade:
abbrev Outcome (ε : Type) (n : Nat) (g : Grade) (α : Type) :=
match g.errors with
| never => Success n g.consumes α
| possibly => Failure n ε ⊕ Success n g.consumes α
| always => Failure n ε
Both non-error branches return a Success, which bundles the parsed value, the leftover input, and a consumption witness:
structure Success (n : Nat) (consumes : Necessity) (α : Type) where
result : α
{restSize : Nat}
restText : Text restSize
witness : consumptionWitness restSize n consumes
The witness relates the input length n to the remaining input length restSize:
abbrev consumptionWitness (rest n : Nat) : Necessity → Prop
| never => rest = n
| possibly => rest ≤ n
| always => rest < n
The < n case for consumes = always is what enables the termination proof for the fix combinator.
The error branches return a Failure:
structure Failure (n : Nat) (ε : Type) where
error : ε
{restSize : Nat}
restText : Text restSize
witness : restSize ≤ n
The gdo macro
do-notation is a well-established syntax sugar for writing monadic programs. Analogously, I have implemented a gdo macro that adapts it to graded monads. The gdo macro included in prim-parser works for any graded monad, not just the one implemented by the parser. Eventually the gdo macro should be published independently of prim-parser.
The gdo macro just sequences statements with gbind. As we saw, the return type of gbind is graded by the product of its arguments grades. For instance, sequencing two parsers of grade g yields g * g:
def twice (p : Parser ε g α) : Parser ε (g * g) α := gdo
p
p
This type-checks, but the inferred grade g * g isn't the one we want to write in the signature. We want to have:
def twice (p : Parser ε g α) : Parser ε g α := ...
This no longer type-checks on its own because g * g is not definitionally equal to g. To solve this, prim-parser provides gcast; a specialised substitution for the grade of the monad:
def gcast (h : i = j) (x : m i α) : m j α := h ▸ x
We can apply gcast directly to the gdo block:
def twice (p : Parser ε g α) : Parser ε g α := gcast (by simp) <| gdo
p
p
by simp proves g * g = g by using a simp lemma included in the library. This is a very common pattern, so I included some syntax sygar that makes it more pleasant to read and write. It's called grade_by and it optionally comes at the end of a gdo block. The definition below is the same as the definition above:
def twice (p : Parser ε g α) : Parser ε g α := gdo
p
p
grade_by by simp
Examples
This section presents two examples: S-expressions and CSV. The Examples/ directory in the repo has a few more parsers in the same style: arithmetic expressions (with operator precedence), JSON, and the untyped lambda calculus.
S-expressions
Let's parse the usual Lispy syntax: alphanumeric atoms and parenthesised lists, e.g.
hello
(a b)
(a b c)
(a (b c))
with non-empty lists folded into right-associative pairs: (a b c) parses to pair a (pair b c).:
inductive SExp where
| atom (str : String)
| pair (l r : SExp)
def patom : Parser Error conditional SExp :=
.atom <$>ᵍ takeWhile1 (·.isAlphanum)
def sexp : Parser Error conditional SExp :=
fix (fun sexp_rec =>
let plist : Parser Error conditional SExp := gdo
lexeme (char '(')
let first ← sexp_rec
let rest ← many (gdo whitespace; sexp_rec)
lexeme (char ')')
return listToPairs (first :: rest)
grade_by by simp
choice patom plist)
CSV
Let's parse a tiny subset of CSV: a header row of column names followed by data rows whose cells are integers or strings, e.g.
name,age
Alice,30
Bob,25
The header determines the column count statically and every subsequent row is required to have exactly that many fields.
inductive Value where
| int (n : Int)
| str (s : String)
structure Table (n : Nat) where
columns : List.Vector String n
rows : List (List.Vector Value n)
def row : Parser Error flexible (List String) :=
sepBy comma field
def exactRow (n : Nat) : Parser Error fallible (List.Vector Value n) :=
sepByN comma cell n
def table : Parser Error conditional ((n : Nat) × Table n) := gdo
let headers ← row
newline
let n := headers.length
let rows ← sepBy newline (exactRow n)
let t : Table n := { columns := ⟨headers, rfl⟩, rows }
return ⟨n, t⟩
The Examples/ directory has a few more parsers in the same style: arithmetic expressions (with operator precedence), JSON, and the untyped lambda calculus.
Related work
In this section I compare prim-parser to three closely related libraries.
| Library | Total | Monadic | Parsec-style | Left-recursion | Coinduction |
|---|---|---|---|---|---|
| prim-parser | ✓ | ✓* | ✓ | ✗ | not needed |
| lean4-parser | ✗ | ✓ | ✓ | ✗ | not needed |
| agdarsec | ✓ | ✗ | ✓ | ✗ | not needed |
| Danielsson 2010 | ✓ | ✓* | ✗ | ✓ | needed |
* Neither has a standard Monad typeclass instance; see the Danielsson section.
lean4-parser
lean4-parser is the most popular Lean 4 parsing library. Both prim-parser and lean4-parser are parsec-style and use the same shallow embedding (a parser is just a function from the input to a result), run by applying it. In that sense the two are essentially the same under the hood. The main differences are:
- Totality. lean4-parser uses
partialfor unchecked recursion; prim-parser is total. - Recursion. lean4-parser writes recursive parsers as ordinary Lean recursive definitions (
partiallets Lean accept them without a termination proof). prim-parser uses an explicit guarded-recursion combinatorfix, whose type forces the recursive call to happen only after consumption. - Monad vs graded monad. lean4-parser is a standard
Monad(and a monad transformer, so it can run on top ofState, etc.) and works with the built-indonotation. prim-parser is a graded monad. Instead, it uses agdonotation that works for any graded monad. - Stream type. lean4-parser is generic in the input stream; prim-parser currently only supports
List.Vector Char n, but it could easily be generalised to any type that has its length in the type.
Danielsson 2010
Monadic. Both libraries are monadic but neither provides a standard
Monadinstance forParser. prim-parser'sParseris aGradedMonadinstance with monad laws proven as propositional equalities. Danielsson'sParserdefines abindand proves the monad laws up to bag equality of parse results. It cannot be an instance of the standardMonadtypeclass because coinduction appears inbind's argument types.Implementation. Danielsson uses a deep embedding based on Brzozowski derivatives: a parser is a data structure that is differentiated one token at a time, and running it returns all successful parses. prim-parser is a shallow embedding (a function from input to output).
Danielsson needs the language to support mixed induction and coinduction. Thus, it cannot be ported to Lean.
Recursion. Danielsson uses regular recursion. In prim-parser recursion can only be expressed through the fix combinator.
Errors. Since Danielsson's parsers return the list of all successful parses, failure is just the empty list, with no way to report why a parse failed. prim-parser, by contrast, supports custom errors.
Practical use. The code is published as the artifact accompanying three publications. It seems to me like the main purpose of the repo is to implement the theory presented in the papers, rather than to be a library for mainstream use.
Left recursion. A rule like
term ::= term '+' factoris left recursive because thetermappears at the start of its own definition. A direct translation loops forever in most parser libraries. prim-parser prevents the loop, but the user must unfold the left recursion into an iterative form, as shown below. This is the case in all parsec-style libraries. Danielsson expresses left-recursive grammars much more directly, but the coinductive arguments must be marked with♯. The grammar below is taken directly from Danielsson's paper.mutual -- term ::= factor | term '+' factor term = factor ∣ ♯ term >>= λ n₁ → tok '+' >>= λ _ → factor >>= λ n₂ → return (n₁ + n₂) -- factor ::= atom | factor '*' atom factor = atom ∣ ♯ factor >>= λ n₁ → tok '*' >>= λ _ → atom >>= λ n₂ → return (n₁ * n₂) -- atom ::= number | '(' term ')' atom = number ∣ tok '(' >>= λ _ → ♯ term >>= λ n → tok ')' >>= λ _ → return nIn prim-parser the same grammar is expressed with the
chainl1combinator.def addOp : Parser Error conditional (Nat → Nat → Nat) := (· + ·) <$ᵍ char '+' def mulOp : Parser Error conditional (Nat → Nat → Nat) := (· * ·) <$ᵍ char '*' def term : Parser Error conditional Nat := fix (fun term_rec => let atom := nat <|> parens term_rec let factor := chainl1 mulOp atom chainl1 addOp factor)
agdarsec
To the best of my knowledge, agdarsec and its ports tparsec (Idris) and parseque (Rocq) are the only implementations of a total parser combinator that have seen some practical adoption.
TODO remove paragraph below
Recursion is guarded by a modal operator □. The type □ Parser encapsulates a parser which can only be called on input strictly smaller than the input the enclosing parser was given. Recursive parsers are built with a fix combinator whose recursive handle is boxed in □. Full details can be found in the agdarsec paper.
In agdarsec every parser must consume input to succeed. This means that non-consuming parsers cannot be defined. In particular, the pure parser that returns a constant value and never consumes cannot be defined, which blocks a monad instance. As we will see in the connectors section, this degrades usability considerably.
Other parsers that cannot be defined in agdarsec include:
- zero or more repetition:
many,skipMany,sepBy,manyTill, etc. optional.lookahead,notFollowedBy.eof(end of input),getPosition.
agdarsec connectors vs prim-parser's gdo notation
The lack of a monadic interface forces agdarsec to provide a big collection of connectors to work around the limitation. These connectors add no meaning to the parser, and the user must remember all of them (use of mnemonic names makes it a bit simpler, but still non-trivial).
Below I list most of agdarsec's connectors, each paired with its prim-parser gdo equivalent. Although gdo is more verbose and needs explicit local bindings, it replaces the whole collection of connectors with one notation that functional programmers are already familiar with, making it far more ergonomic. Moreover, gdo is a generic macro that works for any graded monad, not just prim-parser's.
_>>=_ : Parser a → (a → Parser b) → Parser b:
agdarsec:
p >>= f
prim-parser:
gdo
let x ← p
f x
_&>>=_ : Parser a → (a → Parser (b a)) → Parser (Σ a b):
agdarsec:
p &>>= f
prim-parser:
gdo
let x ← p
let y ← f x
return ⟨x, y⟩
_&>>=′_ : Parser a → (a → Parser b) → Parser (a × b):
agdarsec:
p &>>=′ f
prim-parser:
gdo
let x ← p
let y ← f x
return (x, y)
_&?>>=_ : Parser a → (a → Parser (b a)) → Parser (Σ a (Maybe ∘ b)) makes the second parser optional:
agdarsec:
p &?>>= f
prim-parser:
gdo
let x ← p
let y ← optional (f x)
return ⟨x, y⟩
_?&>>=_ : Parser a → (Maybe a → Parser b) → Parser (Maybe a × b) makes the first parser optional:
agdarsec:
p ?&>>= f
prim-parser:
gdo
let x ← optional p
let y ← f x
return (x, y)
_<&?>_ : Parser a → Parser b → Parser (a × Maybe b):
agdarsec:
p <&?> q
prim-parser:
gdo
let x ← p
let y ← optional q
return (x, y)
_<&?_ : Parser a → Parser b → Parser a:
agdarsec:
p <&? q
prim-parser:
gdo
let x ← p
optional q
return x
_&?>_ : Parser a → Parser b → Parser (Maybe b):
agdarsec:
p &?> q
prim-parser:
gdo
p
optional q
_<?&>_ : Parser a → Parser b → Parser (Maybe a × b):
agdarsec:
p <?&> q
prim-parser:
gdo
let x ← optional p
let y ← q
return (x, y)
_<?&_ : Parser a → Parser b → Parser (Maybe a):
agdarsec:
p <?& q
prim-parser:
gdo
let x ← optional p
q
return x
_?&>_ : Parser a → Parser b → Parser b:
agdarsec:
p ?&> q
prim-parser:
gdo
optional p
q
_<⊎>_ : Parser a → Parser b → Parser (a ⊎ b):
agdarsec:
p <⊎> q
prim-parser:
(Sum.inl <$>ᵍ p) <|> (Sum.inr <$>ᵍ q)
<[_,_]> : (a → r) → (b → Parser r) → Parser (a ⊎ b) → Parser r:
agdarsec:
<[ f , k ]> pab
prim-parser:
gdo
let x ← pab
match x with
| .inl a => return (f a)
| .inr b => k b
A full example: S-expressions
To see the difference on a real example, here is a complete S-expression parser in each library, taken almost verbatim (comments have been stripped) from the respective repositories.
agdarsec (examples/SExp.agda):
sexp : ∀[ Parser SExp ]
sexp = fix (Parser SExp) $ λ rec →
let atom = Atom ∘ String.fromList ∘ List⁺.toList <$> list⁺ alpha
sexp = (λ (a , mb) → maybe (Pair a) a mb)
<$> parens (lift2 (λ p q → withSpaces p <&?> box (q <&? box spaces))
rec rec)
in atom <|> sexp
prim-parser (Examples/SExp.lean):
def patom : Parser Error conditional SExp :=
.atom <$>ᵍ takeWhile1 (·.isAlphanum)
def sexp : Parser Error conditional SExp :=
fix (fun sexp_rec =>
let plist : Parser Error conditional SExp := gdo
lexeme (char '(')
let first ← sexp_rec
let rest ← many (gdo whitespace; sexp_rec)
lexeme (char ')')
return listToPairs (first :: rest)
grade_by by simp
choice patom plist)
Future work
- Better error messages. The error type is currently just
String. The grade machinery is independent of the error representation, so swapping in a richer diagnostic ADT (with source positions and labelled expectations,parsec-style) is doable. - Generic input type. The input is currently fixed to
List.Vector Char n. Generalising to an arbitrary sized type is straightforward. - Expore monad transformers. Perhaps it is possible to introduce an underlying standard monad to the parser type.
- Split out graded monads. The
GradedFunctor/GradedApplicative/GradedMonadhierarchy and their lawful counterparts have nothing to do with parsers; they belong either in mathlib or in their own library.