From d160ffd8f52eb5840bbd215ef30bf72da2dbdeea Mon Sep 17 00:00:00 2001
From: Gregor Kleen
Date: Tue, 12 Jan 2016 14:05:16 +0100
Subject: first draft of thermoprint-4

---
 provider/posts/thermoprint-4.md | 48 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)
 create mode 100644 provider/posts/thermoprint-4.md

diff --git a/provider/posts/thermoprint-4.md b/provider/posts/thermoprint-4.md
new file mode 100644
index 0000000..6b2a022
--- /dev/null
+++ b/provider/posts/thermoprint-4.md
@@ -0,0 +1,48 @@
+---
+title: On the Design of a Parser
+published: 2016-01-12
+tags: Thermoprint
+---
+
+The concrete application we’ll be walking through is a naive parser for [bbcode](https://en.wikipedia.org/wiki/BBCode)
+-- more specifically the contents of the directory `bbcode` in the
+[git repo](git://git.yggdrasil.li/thermoprint#rewrite).
+
+In a manner consistent with designing software as
+[compositions of simple morphisms](https://en.wikipedia.org/wiki/Tacit_programming) we start by determining the type of
+our solution (as illustrated by the following mockup):
+
+~~~ {.haskell}
+-- | Our target structure -- a rose tree with an explicit terminal constructor
+data DomTree = Element Text (Map Text Text) [DomTree]
+             | Content Text
+  deriving (Show, Eq)
+
+type Errors = ()
+
+bbcode :: Text -> Either Errors DomTree
+-- ^ Parse BBCode
+~~~
+
+Writing a parser capable of dealing with `Text` directly from scratch would be unnecessarily abstruse; we’ll be using
+the [attoparsec](https://hackage.haskell.org/package/attoparsec/docs/Data-Attoparsec-Text.html) family of parser
+combinators instead (the exact usage of attoparsec is out of scope -- we'll just mock up the types).
+
+~~~ {.haskell}
+data Token = BBOpen Text
+           | BBClose Text
+           | BBStr Text
+
+tokenize :: Parser [Token]
+
+type Error' = ()
+
+runTokenizer :: Text -> Either Error' [Token]
+~~~
+
+We have now reduced the problem to `[Token] -> DomTree`.
+We quickly see that the structure of the problem is that of a
+[fold](https://hackage.haskell.org/package/base/docs/Data-Foldable.html).
+
+Having realised this we now require an object of type `DomTree -> Token -> DomTree` to recursively build up our target
+structure.
-- 
cgit v1.2.3
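
A note on the last paragraph of the draft: one possible way to realise the fold, sketched under assumptions rather than taken from the repository. The accumulator here is not a bare `DomTree` as in `DomTree -> Token -> DomTree` but a hypothetical `Acc` that keeps a stack of open tags. The step function is wrapped in `Either String` so that a stray closing tag can be reported, and the result is a list of trees rather than a single one. `Acc`, `step` and `treeify` are names made up for this sketch. Attributes stay empty because `BBOpen` in the mockup carries only a tag name.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Control.Monad (foldM)
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Text (Text)

-- Types copied from the post's mockups (Show/Eq added to Token for the demo).
data DomTree = Element Text (Map Text Text) [DomTree]
             | Content Text
  deriving (Show, Eq)

data Token = BBOpen Text
           | BBClose Text
           | BBStr Text
  deriving (Show, Eq)

-- | Hypothetical accumulator: the finished nodes of the current level
-- (newest first) and a stack of open tags, each paired with the finished
-- nodes of its enclosing level.
data Acc = Acc [DomTree] [(Text, [DomTree])]

-- | Hypothetical step function; 'Left' carries an error message.
step :: Acc -> Token -> Either String Acc
step (Acc cur stack) (BBStr t)    = Right (Acc (Content t : cur) stack)
step (Acc cur stack) (BBOpen tag) = Right (Acc [] ((tag, cur) : stack))
step (Acc cur ((open, parent) : stack)) (BBClose tag)
  | open == tag = Right (Acc (Element open Map.empty (reverse cur) : parent) stack)
step _ (BBClose tag) = Left ("unexpected closing tag: " ++ show tag)

-- | Left fold over the tokens; fails if a tag is still open at the end.
treeify :: [Token] -> Either String [DomTree]
treeify ts = foldM step (Acc [] []) ts >>= finish
  where
    finish (Acc cur [])            = Right (reverse cur)
    finish (Acc _ ((open, _) : _)) = Left ("unclosed tag: " ++ show open)

main :: IO ()
main = print (treeify [BBStr "a ", BBOpen "b", BBStr "bold", BBClose "b"])
```

Running `main` should print a `Right` value holding a `Content` node followed by an `Element "b"` node that wraps the bold text. `foldM` in the `Either` monad is still a left fold; it simply stops at the first error, which a plain `foldl` over `DomTree` cannot express.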