-rw-r--r-- | provider/posts/thermoprint-4.md | 48 |
1 files changed, 48 insertions, 0 deletions
diff --git a/provider/posts/thermoprint-4.md b/provider/posts/thermoprint-4.md
new file mode 100644
index 0000000..6b2a022
--- /dev/null
+++ b/provider/posts/thermoprint-4.md
@@ -0,0 +1,48 @@
1 | --- | ||
2 | title: On the Design of a Parser | ||
3 | published: 2016-01-12 | ||
4 | tags: Thermoprint | ||
5 | --- | ||
6 | |||
7 | The concrete application we’ll be walking through is a naive parser for [BBCode](https://en.wikipedia.org/wiki/BBCode) | ||
8 | -- more specifically, the contents of the directory `bbcode` in the | ||
9 | [git repo](git://git.yggdrasil.li/thermoprint#rewrite). | ||
10 | |||
11 | In a manner consistent with designing software as | ||
12 | [compositions of simple morphisms](https://en.wikipedia.org/wiki/Tacit_programming), we start by determining the type of | ||
13 | our solution (as illustrated by the following mockup): | ||
14 | |||
15 | ~~~ {.haskell} | ||
16 | -- | Our target structure -- a rose tree with an explicit terminal constructor | ||
17 | data DomTree = Element Text (Map Text Text) [DomTree] | ||
18 | | Content Text | ||
19 | deriving (Show, Eq) | ||
20 | |||
21 | type Errors = () | ||
22 | |||
23 | bbcode :: Text -> Either Errors DomTree | ||
24 | -- ^ Parse BBCode | ||
25 | ~~~ | ||
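To make the target structure concrete, here is a minimal sketch of the value one might expect for the input `[b]Hello[/b] world`. The wrapping `"root"` element, the empty attribute maps, and the helper `textOf` are assumptions made for illustration; none of them come from the thermoprint repository.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Text (Text)
import qualified Data.Text as T

-- Same shape as the DomTree mockup above
data DomTree = Element Text (Map Text Text) [DomTree]
             | Content Text
             deriving (Show, Eq)

-- Hypothetical result for the input "[b]Hello[/b] world".
-- The "root" wrapper element is an assumption of this sketch.
example :: DomTree
example = Element "root" Map.empty
  [ Element "b" Map.empty [Content "Hello"]
  , Content " world"
  ]

-- Hypothetical helper: concatenate the text of all Content leaves, left to right
textOf :: DomTree -> Text
textOf (Content t)            = t
textOf (Element _ _ children) = T.concat (map textOf children)
```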
26 | |||
27 | Writing a parser capable of dealing with `Text` directly from scratch would be unnecessarily abstruse, so we’ll be using | ||
28 | the [attoparsec](https://hackage.haskell.org/package/attoparsec/docs/Data-Attoparsec-Text.html) family of parser | ||
29 | combinators instead (the exact usage of attoparsec is out of scope -- we’ll just mock up the types). | ||
30 | |||
31 | ~~~ {.haskell} | ||
32 | data Token = BBOpen Text | ||
33 | | BBClose Text | ||
34 | | BBStr Text | ||
35 | |||
36 | tokenize :: Parser [Token] | ||
37 | |||
38 | type Error' = () | ||
39 | |||
40 | runTokenizer :: Text -> Either Error' [Token] | ||
41 | ~~~ | ||
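Since the attoparsec details are left out of scope, the following is only a hand-rolled stand-in that shows what a token stream could look like. It does not use attoparsec, it returns `[Token]` directly instead of `Either Error' [Token]`, and it ignores attributes and escaping. The names `tokenizeNaive` and `toTag`, as well as the added `deriving` clause, are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import qualified Data.Text as T

-- Same constructors as the Token mockup above; the deriving clause is added for this sketch
data Token = BBOpen Text
           | BBClose Text
           | BBStr Text
           deriving (Show, Eq)

-- Hypothetical helper: "[/b]" yields BBClose "b", "[b]" yields BBOpen "b"
toTag :: Text -> Token
toTag tag = maybe (BBOpen tag) BBClose (T.stripPrefix "/" tag)

-- Naive stand-in for runTokenizer (an assumption, not the attoparsec version)
tokenizeNaive :: Text -> [Token]
tokenizeNaive input = case T.uncons input of
  Nothing -> []
  -- A '[' that is eventually followed by ']' delimits a tag
  Just ('[', rest)
    | (tag, after) <- T.breakOn "]" rest
    , Just (_, remaining) <- T.uncons after
    -> toTag tag : tokenizeNaive remaining
  -- Anything else (including an unmatched '[') is plain text up to the next '['
  Just (c, rest) ->
    let (str, remaining) = T.break (== '[') rest
    in BBStr (T.cons c str) : tokenizeNaive remaining
```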
42 | |||
43 | We have now reduced the problem to `[Token] -> DomTree`. | ||
44 | We quickly see that the structure of the problem is that of a | ||
45 | [fold](https://hackage.haskell.org/package/base/docs/Data-Foldable.html). | ||
46 | |||
47 | Having realised this, we now require a function of type `DomTree -> Token -> DomTree` to recursively build up our target | ||
48 | structure. | ||
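The shape of such a fold can be sketched as follows. This is a deliberately naive illustration, not the implementation from the repository: the hypothetical `step` only ever appends to the children of the root, so nothing is nested under an opened element, because a bare `DomTree` accumulator carries no notion of a current position. The names `step` and `build` and the `"root"` seed are assumptions.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.List (foldl')
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Text (Text)

data DomTree = Element Text (Map Text Text) [DomTree]
             | Content Text
             deriving (Show, Eq)

data Token = BBOpen Text
           | BBClose Text
           | BBStr Text
           deriving (Show, Eq)

-- Naive step function with the type named in the text.
-- It appends every new node to the root's children and ignores closing tags.
step :: DomTree -> Token -> DomTree
step (Element name attrs children) (BBOpen tag) =
  Element name attrs (children ++ [Element tag Map.empty []])
step (Element name attrs children) (BBStr s) =
  Element name attrs (children ++ [Content s])
step tree (BBClose _) = tree
step leaf@(Content _) _ = leaf  -- a Content leaf cannot hold children

-- The fold itself: foldl' threads the accumulator through the token list
build :: [Token] -> DomTree
build = foldl' step (Element "root" Map.empty [])
```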