| -rw-r--r-- | provider/posts/thermoprint-4.md | 48 |
1 files changed, 48 insertions, 0 deletions
diff --git a/provider/posts/thermoprint-4.md b/provider/posts/thermoprint-4.md new file mode 100644 index 0000000..6b2a022 --- /dev/null +++ b/provider/posts/thermoprint-4.md | |||
| @@ -0,0 +1,48 @@ | |||
| 1 | --- | ||
| 2 | title: On the Design of a Parser | ||
| 3 | published: 2016-01-12 | ||
| 4 | tags: Thermoprint | ||
| 5 | --- | ||
| 6 | |||
| 7 | The concrete application we’ll be walking through is a naive parser for [bbcode](https://en.wikipedia.org/wiki/BBCode) | ||
| 8 | -- more specifically the contents of the directory `bbcode` in the | ||
| 9 | [git repo](git://git.yggdrasil.li/thermoprint#rewrite). | ||
| 10 | |||
| 11 | In a manner consistent with designing software as | ||
| 12 | [compositions of simple morphisms](https://en.wikipedia.org/wiki/Tacit_programming) we start by determining the type of | ||
| 13 | our solution (as illustrated by the following mockup): | ||
| 14 | |||
| 15 | ~~~ {.haskell} | ||
| 16 | -- | Our target structure -- a rose tree with an explicit terminal constructor | ||
| 17 | data DomTree = Element Text (Map Text Text) [DomTree] | ||
| 18 | | Content Text | ||
| 19 | deriving (Show, Eq) | ||
| 20 | |||
| 21 | type Errors = () | ||
| 22 | |||
| 23 | bbcode :: Text -> Either Errors DomTree | ||
| 24 | -- ^ Parse BBCode | ||
| 25 | ~~~ | ||
| 26 | |||
| 27 | Writing a parser capable of dealing with `Text` directly from scratch would be unnecessarily abstruse; we’ll be using | ||
| 28 | the [attoparsec](https://hackage.haskell.org/package/attoparsec/docs/Data-Attoparsec-Text.html) family of parser | ||
| 29 | combinators instead (the exact usage of attoparsec is out of scope -- we'll just mock up the types). | ||
| 30 | |||
| 31 | ~~~ {.haskell} | ||
| 32 | data Token = BBOpen Text | ||
| 33 | | BBClose Text | ||
| 34 | | BBStr Text | ||
| 35 | |||
| 36 | tokenize :: Parser [Token] | ||
| 37 | |||
| 38 | type Error' = () | ||
| 39 | |||
| 40 | runTokenizer :: Text -> Either Error' [Token] | ||
| 41 | ~~~ | ||
| 42 | |||
| 43 | We have now reduced the problem to `[Token] -> DomTree`. | ||
| 44 | We quickly see that the structure of the problem is that of a | ||
| 45 | [fold](https://hackage.haskell.org/package/base/docs/Data-Foldable.html). | ||
| 46 | |||
| 47 | Having realised this, we now require an object of type `DomTree -> Token -> DomTree` to recursively build up our target | ||
| 48 | structure. | ||
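
To make the shape of such a fold concrete, here is a minimal sketch and not the implementation from the repository. It rests on several assumptions: a bare `DomTree` does not record which element is currently open, so the accumulator is a hypothetical stack of open elements (`Stack`) rather than a `DomTree`; the names `Frame`, `Stack`, `step`, `rootTag` and `treeFromTokens` are invented for illustration; errors are plain `String`s instead of `()`; and since `Token` carries no attributes, every `Element` gets an empty attribute map. Because a step can fail, the fold used is the monadic left fold `foldM` from `Control.Monad`.

```haskell
import           Control.Monad (foldM)
import           Data.Map      (Map)
import qualified Data.Map      as Map
import           Data.Text     (Text)
import qualified Data.Text     as T

-- Types repeated from the mockups above so the sketch is self-contained.
data DomTree = Element Text (Map Text Text) [DomTree]
             | Content Text
             deriving (Show, Eq)

data Token = BBOpen Text
           | BBClose Text
           | BBStr Text
           deriving (Show, Eq)

-- | Hypothetical accumulator: one frame per open element, innermost first.
-- Children are collected in reverse order and flipped when a frame closes.
type Frame = (Text, [DomTree])
type Stack = [Frame]

-- | Hypothetical name for the implicit element wrapping the whole document.
rootTag :: Text
rootTag = T.pack "root"

-- | One step of the fold: consume a single token, or fail with a message.
step :: Stack -> Token -> Either String Stack
step ((tag, kids) : rest) (BBStr t) =
  Right ((tag, Content t : kids) : rest)
step stack (BBOpen t) =
  Right ((t, []) : stack)
step ((tag, kids) : (ptag, pkids) : rest) (BBClose t)
  | t == tag  = Right ((ptag, Element tag Map.empty (reverse kids) : pkids) : rest)
  | otherwise = Left ("unexpected closing tag: " ++ T.unpack t)
step _ (BBClose t) = Left ("unmatched closing tag: " ++ T.unpack t)
step [] _          = Left "empty stack"

-- | Fold the token stream into a tree rooted at 'rootTag'.
treeFromTokens :: [Token] -> Either String DomTree
treeFromTokens ts = do
  final <- foldM step [(rootTag, [])] ts
  case final of
    [(tag, kids)]  -> Right (Element tag Map.empty (reverse kids))
    ((tag, _) : _) -> Left ("unclosed tag: " ++ T.unpack tag)
    []             -> Left "empty stack"
```

Under these assumptions, the tokens `BBStr "a "`, `BBOpen "b"`, `BBStr "bold"`, `BBClose "b"` fold into a `root` element whose children are `Content "a "` followed by an `Element "b"` containing `Content "bold"`, while a stray `BBClose` yields a `Left`. Keeping the open elements on an explicit stack is only one way to remember the insertion point; a tree zipper would serve the same purpose.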
