author | Gregor Kleen <gkleen@yggdrasil.li> | 2016-01-12 14:05:16 +0100 |
---|---|---|
committer | Gregor Kleen <gkleen@yggdrasil.li> | 2016-01-12 14:05:16 +0100 |
commit | d160ffd8f52eb5840bbd215ef30bf72da2dbdeea (patch) | |
tree | 6dc3b0c0b800631f28c6848f467495fe50e94d90 /provider/posts | |
parent | 49dad7900ed4ceb1c5a7d3b30632af3575baa1b0 (diff) | |
first draft of thermoprint-4
Diffstat (limited to 'provider/posts')
-rw-r--r-- | provider/posts/thermoprint-4.md | 48 |
1 files changed, 48 insertions, 0 deletions
diff --git a/provider/posts/thermoprint-4.md b/provider/posts/thermoprint-4.md
new file mode 100644
index 0000000..6b2a022
--- /dev/null
+++ b/provider/posts/thermoprint-4.md
@@ -0,0 +1,48 @@
1 | --- | ||
2 | title: On the Design of a Parser | ||
3 | published: 2016-01-12 | ||
4 | tags: Thermoprint | ||
5 | --- | ||
6 | |||
7 | The concrete application we’ll be walking through is a naive parser for [BBCode](https://en.wikipedia.org/wiki/BBCode) | ||
8 | -- more specifically the contents of the directory `bbcode` in the | ||
9 | [git repo](git://git.yggdrasil.li/thermoprint#rewrite). | ||
10 | |||
11 | In a manner consistent with designing software as | ||
12 | [compositions of simple morphisms](https://en.wikipedia.org/wiki/Tacit_programming) we start by determining the type of | ||
13 | our solution (as illustrated by the following mockup): | ||
14 | |||
15 | ~~~ {.haskell} | ||
16 | -- | Our target structure -- a rose tree with an explicit terminal constructor | ||
17 | data DomTree = Element Text (Map Text Text) [DomTree] | ||
18 | | Content Text | ||
19 | deriving (Show, Eq) | ||
20 | |||
21 | type Errors = () | ||
22 | |||
23 | bbcode :: Text -> Either Errors DomTree | ||
24 | -- ^ Parse BBCode | ||
25 | ~~~ | ||
26 | |||
27 | Writing a parser that deals with `Text` directly from scratch would be unnecessarily abstruse, so we’ll use | ||
28 | the [attoparsec](https://hackage.haskell.org/package/attoparsec/docs/Data-Attoparsec-Text.html) family of parser | ||
29 | combinators instead (the exact usage of attoparsec is out of scope -- we'll just mock up the types). | ||
30 | |||
31 | ~~~ {.haskell} | ||
32 | data Token = BBOpen Text | ||
33 | | BBClose Text | ||
34 | | BBStr Text | ||
35 | |||
36 | tokenize :: Parser [Token] | ||
37 | |||
38 | type Error' = () | ||
39 | |||
40 | runTokenizer :: Text -> Either Error' [Token] | ||
41 | ~~~ | ||
42 | |||
43 | We have now reduced the problem to `[Token] -> DomTree`. | ||
44 | We quickly see that the structure of the problem is that of a | ||
45 | [fold](https://hackage.haskell.org/package/base/docs/Data-Foldable.html). | ||
46 | |||
47 | Having realised this, we now require a function of type `DomTree -> Token -> DomTree` to recursively build up our target | ||
48 | structure. | ||
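
The draft stops at the point where a step function for the fold is needed. As a minimal sketch of what such a fold might look like (not the author's implementation): a bare `DomTree` does not record which element is still open, so this sketch swaps the `DomTree` accumulator for a stack of open elements and only produces a `DomTree` at the end. The names `Open`, `rootTag`, `attach`, `close`, `step`, and `buildTree` are hypothetical, as is the choice to wrap the whole input in an implicit root element and to silently ignore closing tags that do not match.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.List (foldl')
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Text (Text)

-- Types repeated from the post so the sketch compiles on its own.
data DomTree = Element Text (Map Text Text) [DomTree]
             | Content Text
             deriving (Show, Eq)

data Token = BBOpen Text
           | BBClose Text
           | BBStr Text
           deriving (Show, Eq)

-- Hypothetical accumulator entry: an element that is still open, with the
-- children collected so far in reverse order. The stack is innermost first.
type Open = (Text, [DomTree])

-- Hypothetical name for the implicit element wrapping the whole input.
rootTag :: Text
rootTag = "root"

-- Attach a finished node to the innermost open element.
attach :: DomTree -> [Open] -> [Open]
attach node ((tag, kids) : rest) = (tag, node : kids) : rest
attach _    []                   = []  -- unreachable: the root is never popped

-- Turn an open element into a finished one (the tokens carry no attributes).
close :: Open -> DomTree
close (tag, kids) = Element tag Map.empty (reverse kids)

-- The step function of the fold.
step :: [Open] -> Token -> [Open]
step stack (BBStr t)  = attach (Content t) stack
step stack (BBOpen t) = (t, []) : stack
step (top@(tag, _) : rest@(_ : _)) (BBClose t)
  | tag == t = attach (close top) rest
step stack (BBClose _) = stack  -- naive: ignore closing tags that do not match

-- Fold over the tokens, then close whatever is still open.
buildTree :: [Token] -> DomTree
buildTree = finish . foldl' step [(rootTag, [])]
  where
    finish [root]       = close root
    finish (top : rest) = finish (attach (close top) rest)
    finish []           = close (rootTag, [])  -- unreachable
```

With these definitions, `buildTree [BBOpen "b", BBStr "bold", BBClose "b"]` equals `Element "root" Map.empty [Element "b" Map.empty [Content "bold"]]`. Keeping the children reversed while an element is open makes each step constant-time; a real parser would presumably report mismatched tags through the `Errors` type instead of dropping them.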