From 5b063193f389ef472366e4355a683f1843f29733 Mon Sep 17 00:00:00 2001 From: Gregor Kleen Date: Fri, 27 May 2016 19:40:18 +0200 Subject: structure --- provider/posts/thermoprint/1.md | 130 +++++++++++++++++++ provider/posts/thermoprint/2.lhs | 262 +++++++++++++++++++++++++++++++++++++++ provider/posts/thermoprint/3.lhs | 92 ++++++++++++++ provider/posts/thermoprint/4.md | 116 +++++++++++++++++ provider/posts/thermoprint/5.md | 198 +++++++++++++++++++++++++++++ provider/posts/thermoprint/6.lhs | 142 +++++++++++++++++++++ 6 files changed, 940 insertions(+) create mode 100644 provider/posts/thermoprint/1.md create mode 100644 provider/posts/thermoprint/2.lhs create mode 100644 provider/posts/thermoprint/3.lhs create mode 100644 provider/posts/thermoprint/4.md create mode 100644 provider/posts/thermoprint/5.md create mode 100644 provider/posts/thermoprint/6.lhs (limited to 'provider/posts/thermoprint') diff --git a/provider/posts/thermoprint/1.md b/provider/posts/thermoprint/1.md new file mode 100644 index 0000000..032e2f6 --- /dev/null +++ b/provider/posts/thermoprint/1.md @@ -0,0 +1,130 @@ +--- +title: On the Architecture of a tool-set for interacting with character-oriented printers +published: 2015-12-25 +tags: Thermoprint +--- + +# Motivation + +Some time ago I bought a cheap Chinese +[thermoprinter](https://en.wikipedia.org/wiki/Thermal_printing) off eBay. +As expected the printers firmware is really awkward to use (including binary +control codes used to switch between char sets such as bold, italic, underlined, +etc.). +The obvious solution was to write a library to parse a more sensible +representation and send it to be printed. + +Since there might, at some point, be other users wanting to print to my +acquisition the architecture is intended to be present a somewhat usable +interface to the uninitiated. + +# Implementation + +## Location + +Recently I created a new branch in +[thermoprint](https://git.yggdrasil.li/thermoprint) called +[rewrite](https://git.yggdrasil.li/thermoprint?h=rewrite). + +## Architecture Overview + +The new macroscopic architecture I´m currently aiming for is quite similar to +the old one: + + * A server intended to run on the machine connected to my cheap printer talking + directly to the printer on one end and serving a + [json api](https://hackage.haskell.org/package/servant) on the other. + * A (hopefully) tiny cli tool for debugging and personal use. + * A website (it will probably end up being based on + [yesod](https://hackage.haskell.org/package/yesod)) presenting a web interface + similar to the cli tool. + +## Features + +Features I intend to implement include: + + * A parser for a bbcode-dialect which should be used in both the cli tool and the + website (it will probably end up using + [attoparsec](https://hackage.haskell.org/package/attoparsec)) -- bbcode as + presented on [Wikipedia](https://en.wikipedia.org/wiki/BBCode) is a proper + superset of the feature-set of my cheap Chinese printer. + * Reasonable test coverage using + [QuickCheck](https://hackage.haskell.org/package/QuickCheck), + [HUnit](http://hackage.haskell.org/package/HUnit). + + Automatic testing with [cabal](https://www.haskell.org/cabal/) facilitated by + [hspec](https://hackage.haskell.org/package/hspec). + * Support and server-side storage for drafts. + * The Website should provide some richer formats than bbcode which will + probably find inclusion in the payload datastructure such as lists, + checklists, tables, etc. + + The cli-tool should be able to use these too (the input will probably end up + being json-formatted). + +## Work so far + +### Prototype + +I already have a prototype. +It's quite bug-ridden and has recently developed serious problems actually +printing after working satisfactorily for a few weeks. + +It also does not include a web-interface and I am quite unsatisfied with the +overall code quality. + +The [685 lines of code](http://cloc.sourceforge.net/) can be found in the +[repo](https://git.yggdrasil.li/thermoprint?h=master) as well. + +### Rewrite + +Currently the [rewrite](https://git.yggdrasil.li/thermoprint?h=rewrite) contains a +single file of moment -- spec/src/Thermoprint/Printout.hs -- wherein we define +the payload for the api -- our take on a structured document format (somewhat +inspired by +[pandoc](http://hackage.haskell.org/package/pandoc-types/docs/Text-Pandoc-Definition.html)): + +~~~ {.haskell} +-- | A 'Printout' is a sequence of visually seperated 'Paragraph's +type Printout = Seq Paragraph + +-- | A 'Paragraph' is a non-seperated sequence of 'Chunk's +type Paragraph = Seq Chunk + +-- | We introduce both 'Chunk' and 'Paragraph' mainly to allow 'Raw'. +-- +-- Were we to disallow 'Raw', 'Block' would be identical to 'Paragraph' +data Chunk = Cooked Block -- ^ text semantically structured to be rendered in accordance with the display format of printer + | Raw ByteString -- ^ direct instructions to the printer + deriving (Generic, NFData, Show, CoArbitrary) + +-- | 'Block' is the entry point for our structured document format +data Block = Line Line -- ^ a single 'Line' of text + | VSpace Integer -- ^ vertical space of height equivalent to 'Integer' lines + | NewlSep (Seq Block) -- ^ A sequence of 'Block's seperated by newlines + deriving (Generic, NFData, Show, CoArbitrary) + +{- | A 'Line' is one of: + + * a single word + * horizontal space equivalent to the width of 'Integer' `em`. + * a sequence of words seperated by spaces + +We don't export all constructors and instead encourage the use of 'text'. +-} +data Line = Word Text + | HSpace Integer + | SpaceSep (Seq Line) + deriving (Generic, NFData, Show, CoArbitrary) +~~~ + +(The code is verbatim as of 8307d7e). + + + + + diff --git a/provider/posts/thermoprint/2.lhs b/provider/posts/thermoprint/2.lhs new file mode 100644 index 0000000..a144fb5 --- /dev/null +++ b/provider/posts/thermoprint/2.lhs @@ -0,0 +1,262 @@ +--- +title: On the design of a structured document format compatible with character oriented printers +published: 2016-01-11 +tags: Thermoprint +--- + +This post is an annotated version of the file [spec/src/Thermoprint/Printout.hs](https://git.yggdrasil.li/thermoprint/tree/spec/src/Thermoprint/Printout.hs?h=rewrite&id=f6dc3d1) as of commit `f6dc3d1`. + +> {-# LANGUAGE DeriveGeneric, DeriveAnyClass #-} +> {-# LANGUAGE OverloadedStrings #-} +> {-# OPTIONS_HADDOCK show-extensions #-} + +Motivation +---------- + +We want our codebase to be compatible with as many different models of printers as we are willing to implement. +It is therefore desirable to maintain a structured document format which we can transform into a printer-specific representation of the payload to be printed with minimal effort. + +In this post we present one such format. + +Contents +-------- + +> -- | This module contains the definition of the structure -- 'Printout' -- used to represent the content of a job +> module Thermoprint.Printout +> ( Printout(..) +> , Paragraph(..) +> , Chunk(..) +> , Block(..) +> , Line( HSpace +> , SpaceSep +> ) +> , text, cotext +> , prop_text +> ) where + +Preliminaries +------------- + +> import Data.Sequence (Seq, (|>), (<|)) + +A Sequence represents the same structure as the linked lists common in haskell but supports $O(1)$ `snoc`, which is desirable since we intend to iteratively build up the structure when parsing input formats. + +> import Data.Text.Lazy (Text) +> +> import Data.ByteString.Lazy (ByteString) + +The entire structure will be lazy by default but an instance of `NFData`, thus the lazy variants of `Text` and `ByteString`. + +> import GHC.Generics (Generic) + +We will use derived instances of `Generic` to get handed suitable instances of rather complicated classes such as `Arbitrary` and `FromJSON` + +> import Control.DeepSeq (NFData) + +Instances of `NFData` allow us to strictly evaluate our document structure when needed + +> import Data.Aeson (FromJSON(..), ToJSON(..), Value(..)) +> import qualified Data.Aeson as JSON (encode, decode) +> import Data.Aeson.Types (typeMismatch) + +We will encode the document as a [json](https://en.wikipedia.org/wiki/JSON) object during transport + +> import Test.QuickCheck.Arbitrary (Arbitrary(..), CoArbitrary, genericShrink) +> import Test.QuickCheck.Modifiers (NonNegative(..)) +> import Test.QuickCheck.Gen (oneof, suchThat, scale) +> import Test.QuickCheck.Instances +> import Test.QuickCheck (forAll, Property) + +We will use [QuickCheck](https://hackage.haskell.org/package/QuickCheck) for automatic test generation. + +> import qualified Data.Text.Lazy as TL (split, null, pack, filter, intercalate, map) +> import qualified Data.Text as T (pack) +> import Data.Char (isSpace) +> +> import Data.Monoid (Monoid(..), (<>)) +> +> import Data.List (dropWhile, dropWhileEnd, groupBy, genericLength, genericReplicate) +> +> import Data.Sequence as Seq (fromList, null, singleton) +> +> import Data.Function (on) +> +> import Data.Foldable (toList, fold) + +We will need to do some parsing and pretty-printing to implement `text` and `cotext`, respectively. + +> import Data.Encoding (encodeLazyByteStringExplicit, decodeLazyByteString) +> import Data.Encoding.UTF8 +> import qualified Data.ByteString.Base64.Lazy as Base64 (encode, decode) + +Since we want end users to be able to include direct instructions the printer in the form of a lazy [`ByteString`](https://hackage.haskell.org/package/bytestring) we need some way to encode `ByteString`s in JSON. +We chose [base64](https://hackage.haskell.org/package/base64-bytestring). + +> import Prelude hiding (fold) +> +> +> -- | A 'Printout' is a sequence of visually seperated 'Paragraph's +> type Printout = Seq Paragraph + +"visually seperated" will most likely end up meaning "seperated by a single blank line" + +> -- | A 'Paragraph' is a non-seperated sequence of 'Chunk's +> type Paragraph = Seq Chunk +> +> -- | We introduce both 'Chunk' and 'Paragraph' mainly to allow 'Raw'. +> -- +> -- Were we to disallow 'Raw', 'Block' would be identical to 'Paragraph' +> data Chunk = Cooked Block -- ^ text semantically structured to be rendered in accordance with the display format of printer +> | Raw ByteString -- ^ direct instructions to the printer +> deriving (Generic, NFData, Show, CoArbitrary) +> +> instance FromJSON Chunk where +> parseJSON s@(String _) = Raw <$> ((either fail return . decodeBase64) =<< parseJSON s) +> where +> decodeBase64 :: String -> Either String ByteString +> decodeBase64 s = (either (Left . show) Right . encodeLazyByteStringExplicit UTF8Strict $ s) >>= Base64.decode +> parseJSON o@(Object _) = Cooked <$> parseJSON o +> parseJSON v = typeMismatch "Chunk" v +> +> instance ToJSON Chunk where +> toJSON (Raw bs) = String . T.pack . decodeLazyByteString UTF8Strict . Base64.encode $ bs +> toJSON (Cooked block) = toJSON block + +We provide custom instances of `FromJSON Chunk` and `ToJSON Chunk` so that we might reduce the sice of the resulting JSON somewhat (this is an opportune target since disambiguaty is simple) + +> -- | 'Block' is the entry point for our structured document format +> data Block = Line Line -- ^ a single 'Line' of text +> | VSpace Integer -- ^ vertical space of height equivalent to 'Integer' lines +> | NewlSep (Seq Block) -- ^ A sequence of 'Block's seperated by newlines +> deriving (Generic, NFData, Show, CoArbitrary, FromJSON, ToJSON) +> +> {- | A 'Line' is one of: +> +> * a single word +> * horizontal space equivalent to the width of 'Integer' `em`. +> * a sequence of words seperated by spaces +> +> We don't export all constructors and instead encourage the use of 'text'. +> -} +> data Line = Word Text +> | HSpace Integer +> | SpaceSep (Seq Line) +> deriving (Generic, NFData, Show, CoArbitrary, FromJSON, ToJSON) +> +> instance Monoid Block where +> mempty = NewlSep mempty +> x@(NewlSep xs) `mappend` y@(NewlSep ys) +> | Seq.null xs = y +> | Seq.null ys = x +> | otherwise = NewlSep (xs <> ys) +> (NewlSep xs) `mappend` y +> | Seq.null xs = y +> | otherwise = NewlSep (xs |> y) +> x `mappend` (NewlSep ys) +> | Seq.null ys = x +> | otherwise = NewlSep (x <| ys) +> x `mappend` y = NewlSep $ Seq.fromList [x, y] +> +> instance Monoid Line where +> mempty = SpaceSep mempty +> x@(SpaceSep xs) `mappend` y@(SpaceSep ys) +> | Seq.null xs = y +> | Seq.null ys = x +> | otherwise = SpaceSep (xs <> ys) +> (SpaceSep xs) `mappend` y +> | Seq.null xs = y +> | otherwise = SpaceSep (xs |> y) +> x `mappend` (SpaceSep ys) +> | Seq.null ys = x +> | otherwise = SpaceSep (x <| ys) +> x `mappend` y = SpaceSep $ Seq.fromList [x, y] + +The Monoid instances for `Block` and `Line` are somewhat unwieldy since we want to guarantee minimum overhead by reducing expressions such as `SpaceSep (fromList [x])` to `x` whenever possible. + +The same effect would have been possible by introducing the monoid structure *one level higher* -- we could have introduced constructors such as `Line :: Seq Word -> Block`. +This was deemed undesirable since we would not have been able to implement instances such as `Monoid Line` which allow the use of more generic functions during parsing. + +> text :: Text -> Either Block Line +> -- ^ Smart constructor for 'Line'/'Block' which maps word and line boundaries (as determined by 'isSpace' and '(== '\n')' respectively) to the structure of 'Block' and 'Line'. +> -- +> -- Since we are unwilling to duplicate the list of chars from 'isSpace' we cannot reasonably determine a width for the various whitespace 'Char's. +> -- Thus they are all weighted equally as having width 1 `em`. +> text t = case splitLines t of +> [] -> Right mempty +> [Line x] -> Right x +> xs -> Left $ mconcat xs +> where +> splitLines :: Text -> [Block] +> splitLines t = map toBlock +> . groupBy ((==) `on` TL.null) +> $ TL.split (== '\n') t +> splitWords :: Text -> [Line] +> splitWords t = map toLine +> . groupBy ((==) `on` TL.null) +> $ TL.split isSpace t +> toBlock [] = mempty +> toBlock xs@(x:_) +> | TL.null x = VSpace $ genericLength xs - 1 +> | otherwise = mconcat . map (Line . mconcat . splitWords) $ xs +> toLine [] = mempty +> toLine xs@(x:_) +> | TL.null x = HSpace $ genericLength xs - 1 +> | otherwise = mconcat . map Word $ xs +> list :: b -> (a -> [a] -> b) -> [a] -> b +> list c _ [] = c +> list _ f (x:xs) = f x xs + +Implementations using `TL.lines` and `TL.words` were tested. +We chose to use `TL.split`-based solutions instead because the more specific splitting functions provided by [text](https://hackage.haskell.org/package/text) drop information concerning the exact amount of whitespace. + +> cotext :: Block -> Text +> -- ^ inverse of +> -- @ +> -- either id Line . `text` +> -- @ +> cotext (VSpace n) = TL.pack . genericReplicate n $ '\n' +> cotext (NewlSep xs) = TL.intercalate "\n" . map cotext . toList $ xs +> cotext (Line x) = cotext' x +> where +> cotext' (Word x) = x +> cotext' (HSpace n) = TL.pack . genericReplicate n $ ' ' +> cotext' (SpaceSep xs) = TL.intercalate " " . map cotext' . toList $ xs + +We provide cotext for testing `text` and to enable determining semantic equality of `Printout`s at a later date + +> prop_text :: Text -> Bool +> -- ^ prop> (`cotext` . either id Line . `text` $ x) == x +> -- +> -- Where 'x' is restricted to those `TL.Text` which do not contain whitespace besides ' ' and '\n'. +> prop_text x = (cotext . either id Line . text $ x') == x' +> where +> x' = TL.map normSpace x +> normSpace c +> | isSpace c +> , c `elem` keep = c +> | isSpace c = ' ' -- We have to do this because all whitespace gets interpreted as width 1 +> | otherwise = c +> keep = [' ', '\n'] +> +> -- | We don't test 'Raw' 'Chunk's +> instance Arbitrary Chunk where +> shrink = genericShrink +> arbitrary = Cooked <$> arbitrary +> +> instance Arbitrary Block where +> shrink = genericShrink +> arbitrary = oneof [ Line <$> arbitrary +> , VSpace . getNonNegative <$> arbitrary +> , NewlSep <$> scale' arbitrary +> ] +> +> instance Arbitrary Line where +> shrink = genericShrink +> arbitrary = oneof [ Word . TL.filter (not . isSpace) <$> arbitrary -- 'isSpace '\n' == True' +> , HSpace . getNonNegative <$> arbitrary +> , SpaceSep <$> scale' arbitrary +> ] +> +> scale' = scale (round . sqrt . fromInteger . toInteger) + +Failing to properly scale the tested structures was shown to use more than 8GiB of RAM during testing diff --git a/provider/posts/thermoprint/3.lhs b/provider/posts/thermoprint/3.lhs new file mode 100644 index 0000000..5e7eee9 --- /dev/null +++ b/provider/posts/thermoprint/3.lhs @@ -0,0 +1,92 @@ +--- +title: Thoughts on a network protocol for a toolset for interacting with character-oriented printers +published: 2016-01-11 +tags: Thermoprint +--- + +This post is an annotated version of the file [spec/src/Thermoprint/API.hs](https://git.yggdrasil.li/thermoprint/tree/spec/src/Thermoprint/API.hs?h=rewrite&id=3ad700c) as of commit `3ad700c`. + +> {-# LANGUAGE DeriveGeneric, DeriveAnyClass #-} +> {-# LANGUAGE TypeOperators, DataKinds #-} +> {-# LANGUAGE OverloadedStrings #-} +> +> module Thermoprint.API +> ( PrinterStatus(..) +> , JobStatus(..) +> , ThermoprintAPI +> , thermoprintAPI +> , module Thermoprint.Identifiers +> , module Thermoprint.Printout +> ) where +> +> import Thermoprint.Printout + +See [a previous post](https://dirty-haskell.org/posts/thermoprint-2.html). + +> import Thermoprint.Identifiers + +`Thermoprint.Identifiers` provides some newtypes of `Integer` to add some typesafety to dealing with objects identified by autoincremented numbers + +> import Servant.API +> import Servant.Docs +> import Data.Aeson + +We will define our API to be compatible with [servant](https://hackage.haskell.org/package/servant) + +> import Data.Set (Set) +> import Data.Sequence (Seq) + +Higher performance versions of lists for our various applications + +> import GHC.Generics (Generic) +> +> import Data.Proxy (Proxy(..)) +> +> import Control.Exception (Exception) +> import Data.Typeable (Typeable) +> +> data PrinterStatus = Busy JobId +> | Available +> deriving (Generic, Show, FromJSON, ToJSON) +> +> data JobStatus = Queued +> | Printing +> | Done +> | Failed PrintingError +> deriving (Generic, Show, FromJSON, ToJSON) +> +> data PrintingError = UnknownError +> deriving (Typeable, Generic, Show, FromJSON, ToJSON, Exception) + +We expect the definiton of `PrintingError` to grow considerably while implementing a server for this API + +We support the following actions through our API: + +> type ThermoprintAPI = "printers" :> Get '[JSON] (Set PrinterId) -- List the identifiers of all available printers (/printers) +> :<|> "printer" :> Capture "printerId" PrinterId :> ( +> ReqBody '[JSON] Printout :> Post '[JSON] JobId -- Add a new job to the bottom of the queue by sending its content (/printer:printerId) +> :<|> "status" :> Get '[JSON] PrinterStatus -- Query the current status of a printer (/printer:printerId/status) +> ) +> :<|> "jobs" :> ( +> QueryParam "printer" PrinterId :> QueryParam "min" JobId :> QueryParam "max" JobId :> Get '[JSON] (Seq JobId) -- List all jobs allowing for selection by printerId and pagination (/jobs?printer=*&min=*&max=*) +> ) +> :<|> "job" :> Capture "jobId" JobId :> ( +> Get '[JSON] Printout -- Get the contents of a job currently known to the server (/job:jobId) +> :<|> "status" :> Get '[JSON] JobStatus -- Get the status of a job (/job:jobId/status) +> :<|> "printer" :> Get '[JSON] PrinterId -- Find the printer a job was queued for (/job:jobId/printer) +> :<|> Delete '[] () -- Abort a job (which we expect to make it unknown to the server) (/job:jobId) +> ) +> :<|> "drafts" :> ( +> Get '[JSON] (Set DraftId) -- List the identifiers of all drafts known to the server (/drafts) +> :<|> ReqBody '[JSON] Printout :> Post '[JSON] DraftId -- Make a draft known to the server by submitting its contents (/drafts) +> ) +> :<|> "draft" :> Capture "draftId" DraftId :> ( +> ReqBody '[JSON] Printout :> Put '[] () -- Update a draft by replacing its contents (/draft:draftId) +> :<|> Get '[JSON] Printout -- Get the contents of a draft (/draft:draftId) +> :<|> Delete '[] () -- Delete a draft (/draft:draftId) +> ) +> +> thermoprintAPI :: Proxy ThermoprintAPI +> thermoprintAPI = Proxy + +servant needs an object of type `Proxy ThermoprintAPI` in various places diff --git a/provider/posts/thermoprint/4.md b/provider/posts/thermoprint/4.md new file mode 100644 index 0000000..756c166 --- /dev/null +++ b/provider/posts/thermoprint/4.md @@ -0,0 +1,116 @@ +--- +title: On the Design of a Parser +published: 2016-01-12 +tags: Thermoprint +--- + +The concrete application we’ll be walking through is a naive parser for [bbcode](https://en.wikipedia.org/wiki/BBCode) +-- more specifically the contents of the directory `bbcode` in the +[git repo](https://git.yggdrasil.li/thermoprint/tree/bbcode?h=rewrite&id=dc99dae). + +In a manner consistent with designing software as +[compositions of simple morphisms](https://en.wikipedia.org/wiki/Tacit_programming) we start by determining the type of +our solution (as illustrated by the following mockup): + +~~~ {.haskell} +-- | Our target structure -- a rose tree with an explicit terminal constructor +data DomTree = Element Text (Map Text Text) [DomTree] + | Content Text + deriving (Show, Eq) + +bbcode :: Text -> Maybe DomTree +-- ^ Parse BBCode +~~~ + +Writing a parser capable of dealing with `Text` directly from scratch would be unnecessarily abstruse, we’ll be using +the [attoparsec](https://hackage.haskell.org/package/attoparsec/docs/Data-Attoparsec-Text.html) family of parser +combinators instead. + +We reproduce an incomplete version of the lexer below (it’s missing tag attributes and self-closing tags). + +We introduce `escapedText`, a helper function for extracting text until we reach one of a set of delimiting characters +(exclusive). +While doing this we also parse any delimiting character iff it's prefixed with an escape character (we use `\`) -- the +escape character itself needs only be escaped if encountered directly before one of the delimiting characters. + +~~~ {.haskell} +data Token = BBOpen Text -- ^ "[open]" + | BBClose Text -- ^ "[/close]" + | BBStr Text -- ^ "text" + +token :: Parser [Token] +token = BBClose <$ "[/" <*> escapedText' [']'] <* "]" + <|> BBOpen <$ "[" <*> escapedText' [']'] <* "]" + <|> BBStr <$> escapedText ['['] + +escapedText' :: [Char] -> Parser Text +escapedText' = option "" . escapedText + +escapedText :: [Char] -> Parser Text +escapedText [] = takeText -- No delimiting characters -- parse all remaining input +escapedText cs = recurse $ choice [ takeWhile1 (not . special) -- a series of characters we don't treat as special + , escapeSeq -- an escaped delimiter + , escapeChar' -- the escape character + ] + where + escapeChar = '\\' + special = inClass $ escapeChar : cs + escapeChar' = string $ T.singleton escapeChar + escapeSeq = escapeChar' *> (T.singleton <$> satisfy special) -- escape character followed by a special character (which encludes the escape character) + recurse p = mappend <$> p <*> escapedText' cs -- parse a prefix and optionally append another chunk of escapedText + +runTokenizer :: Text -> Maybe [Token] +runTokenizer = either (const Nothing) Just . parseOnly (many token <* endOfInput) +~~~ + +We have now reduced the Problem to `[Token] -> DomTree`. +We quickly see that the structure of the problem is that of a +[fold](https://hackage.haskell.org/package/base/docs/Data-Foldable.html). + +Having realised this we require a function of type `Token -> DomTree -> DomTree` to recursively build up our target +structure. + +In general we’ll want to not only keep track of the `DomTree` during recursion but also maintain a reference to the +position at which we’ll be inserting new tokens. +This kind of problem is well understood and solved idiomatically by using a +[zipper](https://en.wikipedia.org/wiki/Zipper_(data_structure)) +([a cursory introduction](http://learnyouahaskell.com/zippers)). + +Writing zippers tends to be tedious. We’ll therefore introduce an +[additional intermediate structure](https://hackage.haskell.org/package/containers/docs/Data-Tree.html) for which an +[implementation](https://hackage.haskell.org/package/rosezipper) is available readily. +The morphism from this new structure (`Forest BBLabel`) to our `DomTree` will be almost trivial. + +~~~ {.haskell} +import Data.Tree.Zipper (TreePos, Empty, Full) +import qualified Data.Tree.Zipper as Z + +data BBLabel = BBTag Text + | BBPlain Text + +rose :: [BBToken] -> Maybe (Forest BBLabel) +rose = Z.toForest <$> foldM (flip rose') (Z.fromForest []) + +rose' :: BBToken -> TreePos Empty BBLabel -> Maybe (TreePos Empty BBLabel) +rose' (BBStr t) = return . Z.nextSpace . Z.insert (Node (BBPlain t) []) -- insert a node with no children and move one step to the right in the forest we’re currently viewing +rose' (BBOpen t) = return . Z.children . Z.insert (Node (BBTag t) []) -- insert the node and move into position to insert it's first child +rose' (BBClose t) = close t -- haskell complains if multiple equations for the same function have a differing number of arguments, therefore: 'close' + where + close :: Text -> TreePos Empty BBLabel -> Maybe (TreePos Empty BBLabel) + close tag pos = do + pos' <- Z.parent pos -- fail if we're trying to close a tag that does not have a parent (this indicates imbalanced tags) + let + pTag = (\(BBTag t) -> t) $ Z.label pos' -- yes, this will fail unceremoniously if the parent is not a tag, this poses no problem since we're constructing the structure ourselves. The proof that this failure mode does not occur is left as an exercise for the reader. + guard (pTag == tag) -- The structure shows that this mode of failure (opening tags content does not match the closing tags) is not logically required -- it only serves as a *notification* to the user + return $ Z.nextSpace pos' -- move one level up and to point at the next sibling of the parent +~~~ + +All that is left to do now is present our final morphism: + +~~~ {.haskell} +dom :: Forest BBLabel -> [DomTree] +dom = map dom' + where + dom' (Node (BBPlain t) []) = Content t + dom' (Node (BBTag t) ts = Element t $ map dom' ts +~~~ diff --git a/provider/posts/thermoprint/5.md b/provider/posts/thermoprint/5.md new file mode 100644 index 0000000..0249734 --- /dev/null +++ b/provider/posts/thermoprint/5.md @@ -0,0 +1,198 @@ +--- +title: Building an Extensible Framework for Specifying Compile-Time Configuration using Universal Quantification +tags: Thermoprint +published: 2016-02-18 +--- + +When I write *Universal Quantification* I mean what is commonly referred to as +[existential quantification](https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/data-type-extensions.html#existential-quantification), +which I think is a misnomer. To wit: + +$( \exists x \ldotp f(x) ) \to y$ is isomorphic to $\forall x \ldotp (f(x) \to y)$ (I +won´t try to back this claim up with actual category theory just now. You might want to +nag me occasionally if this bothers you -- I really should invest some more time into +category theory). Since haskell does not support `exists` we´re required to use the +`forall`-version, which really is universally quantified. + +## Printer Configuration + +What we want is to have the user provide us with a set of specifications of how to +interact with one printer each. +Something like the following: + +~~~ {.haskell} +newtype PrinterMethod = PM { unPM :: Printout -> IO (Maybe PrintingError) } + +data Printer = Printer + { print :: PrinterMethod + , queue :: TVar Queue + } +~~~ + +The first step in refining this is necessitated by having the user provide the +[monad-transformer-stack](http://book.realworldhaskell.org/read/monad-transformers.html) +to use at compile time. +Thus we introduce our first universal quantification (in conjunction with +[polymorphic components](https://prime.haskell.org/wiki/PolymorphicComponents)) -- this +one is not isomorphic to an existential one: + +~~~ {.haskell} +newtype PrinterMethod = PM { unPm :: forall m. MonadResource m => Printout -> m (Maybe PrintingError) } +~~~ + +Since we don´t want to *burden* the user with the details of setting up `TVar Queue`{.haskell} we +also introduce function to help with that: + +~~~ {.haskell} +printer :: MonadResource m => PrinterMethod -> m Printer +printer p = Printer p <$> liftIO (newTVarIO def) +~~~ + +We could at this point provide ways to set up `PrinterMethod`{.haskell}s and have the user +provide us with a list of them. + +We, however, have numerous examples of printers which require some setup (such opening a +file descriptor). The idiomatic way to handle this is to decorate that setup with some +constraints and construct our list of printers in an +[`Applicative`{.haskell}](https://hackage.haskell.org/package/base/docs/Control-Applicative.html#t:Applicative) +fashion: + +~~~ {.haskell} +printer :: MonadResource m => m PrinterMethod -> m Printer +printer p = Printer <$> p <*> liftIO (newTVarIO def) +~~~ + +At this point a toy implementation of a printer we might provide looks like this: + +~~~ {.haskell} +debugPrint :: Applicative m => m PrinterMethod +debugPrint = pure . PM $ const return Nothing <=< liftIO . putStrLn . toString + +toString :: Printout -> String +toString = undefined +~~~ + +## Management of Printer Queues + +We would like the user to be able to modify the printer queues we maintain in arbitrary +ways. +The motivation for this being various cleanup operations such as pruning all successful +jobs older than a few minutes or limiting the size of history to an arbitrary number of +entries. + +A pattern for this type of modification of a value residing in a `TVar`{.haskell} might +look like this: + +~~~ {.haskell} +modify :: TVar a -> StateT a STM () -> IO () +modify q f = atomically $ writeTVar =<< runStateT f =<< readTVar q +~~~ + +A rather natural extension of this is to allow what we will henceforth call a +`QueueManager`{.haskell} (currently `StateT a STM ()`{.haskell}) to return an indication +of when it wants to be run again: + +~~~ {.haskell} +type QueueManager = StateT Queue STM Micro + +runQM :: QueueManager -> TVar Queue -> IO () +runQM qm q = sleep << qm' + where + qm' = atomically $ (\(a, s) -> a <$ writeTVar q s) =<< runStateT qm =<< readTVar q + sleep (abs -> delay) = threadDelay (fromEnum delay) >> runQM qm q +~~~ + +It stands to reason that sometimes we don't want to run the `QueueManager`{.haskell} ever +again (probably causing the thread running it to terminate). +For doing so we +[extend the real numbers](https://en.wikipedia.org/wiki/Extended_real_number_line) as +represented by `Micro`{.haskell} to +[`Extended Micro`{.haskell}](https://hackage.haskell.org/package/extended-reals): + +~~~ {.haskell} +type QueueManager = StateT Queue STM (Extended Micro) + +runQM … + where + … + sleep (abs -> delay) + | (Finite d) <- delay = threadDelay (fromEnum d) >> runQM qm q + | otherwise = return () +~~~ + +`QueueManager`{.haskell}s whose type effectively is `Queue -> STM (Queue, Extended Micro)`{.haskell} +are certainly useful but can carry no state between invocations (which would be useful +e.g. for limiting the rate at which we prune jobs). + +Therefore we allow the user to provide an arbitrary monad functor (we use +`MFunctor`{.haskell} from +[mmorph](https://hackage.haskell.org/package/mmorph-1.0.6/docs/Control-Monad-Morph.html#t:MFunctor) +instead of `Servant.Server.Internal.Enter` because +[servant-server](https://hackage.haskell.org/package/servant-server-0.4.4.6/docs/Servant-Server-Internal-Enter.html#v:Nat) +doesn't provide all the tools we require for this) which can carry all the state we could +ever want: + +~~~ {.haskell} +type QueueManager t = QueueManagerM t (Extended Micro) +type QueueManagerM t = ComposeT (StateT Queue) t STM -- 'ComposeT' is required since we need 'QueueManagerM' to have the form 't' STM' for some 't'' in order to be able to use 'lift' + +runQM :: (MFunctor t, MonadTrans t, MonadIO (t IO), Monad (t STM)) => QueueManager t -> TVar Queue -> t IO () +runQM … -- nearly identical except for a sprinkling of 'lift' +~~~ + +The final touches are to introduce a typeclass `HasQueue`{.haskell} for convenience: + +~~~ {.haskell} +class HasQueue a where + extractQueue :: a -> TVar Queue + +instance HasQueue (TVar Queue) where + extractQueue = id + +instance HasQueue Printer where + extractQueue = queue +~~~ + +and provide some utility functions for composing `QueueManager`{.haskell}s: + +~~~ {.haskell} +intersection :: (Foldable f, MonadState Queue (QueueManagerM t)) => f (QueueManager t) -> QueueManager t +-- ^ Combine two 'QueueManager's keeping only 'QueueEntry's both managers decide to keep +-- +-- Side effects propagate left to right + +idQM :: Monad (QueueManagerM t) => QueueManager t +-- ^ Identity of 'intersect' + +union :: (Foldable f, MonadState Queue (QueueManagerM t)) => f (QueueManager t) -> QueueManager t +-- ^ Combine two 'QueueManager's keeping all 'QueueEntry's either of the managers decides to keep +-- +-- Side effects propagate left to right + +nullQM :: MonadState Queue (QueueManagerM t) => QueueManager t +-- ^ Identity of 'union' +~~~ + +We merge the effects of two `QueueManager`{.haskell}s by converting the resulting +`Queue`{.haskell}s to `Set`{.haskell}s and using `Set.union`{.haskell} and +`Set.intersection`{.haskell} with appropriate `Ord`{.haskell} and `Eq`{.haskell} +instances. + +### Configuration of `QueueManager`{.haskell}s + +A `QueueManager`{.haskell}s configuration shall be a `QueueManager t`{.haskell} associated +with a specification of how to collapse its monad transformer `t`{.haskell}. +Using universal quantification this is straightforward: + +~~~ {.haskell} +data QMConfig m = forall t. ( MonadTrans t + , MFunctor t + , Monad (t STM) + , MonadIO (t IO) + ) => QMConfig { manager :: QueueManager t + , collapse :: (t IO) :~> m + } + +runQM' :: Printer -> QMConfig m -> m () +runQM' printer (QMConfig qm nat) = unNat nat $ runQM qm printer +~~~ diff --git a/provider/posts/thermoprint/6.lhs b/provider/posts/thermoprint/6.lhs new file mode 100644 index 0000000..9182427 --- /dev/null +++ b/provider/posts/thermoprint/6.lhs @@ -0,0 +1,142 @@ +--- +title: Deriving a Client Library for Interacting with Character-Oriented Printers +tags: Thermoprint +published: 2016-02-18 +--- + +> {-# LANGUAGE DataKinds #-} +> {-# LANGUAGE TypeOperators #-} +> {-# LANGUAGE ViewPatterns #-} +> {-# LANGUAGE RecordWildCards #-} +> +> -- | A client library for 'Thermoprint.API' +> module Thermoprint.Client +> ( Client(..) +> , mkClient, mkClient' +> , throwNat, ioNat +> -- = Reexports +> , ServantError(..) +> , module Servant.Common.BaseUrl +> , module Control.Monad.Trans.Either +> , module Servant.Server.Internal.Enter +> ) where +> +> import Thermoprint.API +> import Data.Map (Map) +> import Data.Sequence (Seq) +> import Data.Time (UTCTime) +> +> import Servant.Client hiding (HasClient(..)) +> import qualified Servant.Client as S +> import Servant.Common.BaseUrl +> import Servant.API +> import Servant.Server.Internal.Enter +> import Control.Monad.Trans.Either +> +> import Control.Monad.Catch (Exception, MonadThrow(..)) +> import Control.Monad.IO.Class (MonadIO(..)) +> +> import Control.Monad +> import Control.Category +> import Prelude hiding (id, (.)) +> +> instance Exception ServantError + +We encapsulate all api operations in a single record parametrized over the monad we intend to use +them in. +Construction of such a record is pure since all we require to do so is a `BaseUrl`{.haskell}. +Using RecordWildCards we can bring all operations into scope with extreme ease. + +> -- | All 'ThermoprintAPI'-functions as a record +> -- +> -- Use like this: +> -- +> -- > {-# LANGUAGE RecordWildCards #-} +> -- > +> -- > main :: IO () +> -- > -- ^ Display a list of printers with their status +> -- > main = print =<< printers +> -- > where Client{..} = mkClient' $ Http "localhost" 3000 +> data Client m = Client +> { printers :: m (Map PrinterId PrinterStatus) +> -- ^ List all printers +> , jobs :: Maybe PrinterId +> -> Maybe (Range (JobId)) +> -> Maybe (Range (UTCTime)) +> -> m (Seq (JobId, UTCTime, JobStatus)) +> -- ^ List a selection of jobs +> , jobCreate :: Maybe PrinterId -> Printout -> m JobId +> -- ^ Send a 'Printout' to be queued +> , job :: JobId -> m Printout +> -- ^ Retrieve the contents of a job +> , jobStatus :: JobId -> m JobStatus +> -- ^ Query a jobs status +> , jobDelete :: JobId -> m () +> -- ^ Delete a job from the queue (not from history or while it is being printed) +> , drafts :: m (Map DraftId (Maybe DraftTitle)) +> -- ^ List all saved drafts +> , draftCreate :: Maybe DraftTitle +> -> Printout +> -> m DraftId +> -- ^ Create a new draft +> , draftReplace :: DraftId +> -> Maybe DraftTitle +> -> Printout +> -> m () +> -- ^ Replace the contents and title of an existing draft +> , draft :: DraftId -> m (Maybe DraftTitle, Printout) +> -- ^ Retrieve the contents and title of a draft +> , draftDelete :: DraftId -> m () +> -- ^ Delete a draft +> , draftPrint :: DraftId -> Maybe PrinterId -> m JobId +> -- ^ Send a draft to be printed +> } + +[servant documentation](https://haskell-servant.github.io/tutorial/server.html#nested-apis) +advises factoring out apis to make the specification more concise. +We are rightly advised that doing so has an effect on the types of the +corresponding `Server`{.haskell}s and `Client`{.haskell}s. +To cope with this we introduce a helper function that allows us, when +used with ViewPatterns, to nontheless simply pattern match on `client`{.haskell}. + +> withArgs :: (x -> a :<|> b) -> ((x -> a) :<|> (x -> b)) +> -- ^ Undo factoring of APIs +> withArgs subAPI = (\(a :<|> _) -> a) . subAPI :<|> (\(_ :<|> b) -> b) . subAPI + +`withArgs`{.haskell} as presented here does not recurse and thus +doesn't handle more than one occurence of `:<|>`{.haskell}. +We have to to so ourselves using nested ViewPatterns (see +`mkClient`{.haskell}). + +> mkClient :: (EitherT ServantError IO :~> m) -- ^ A monad functor ('Nat') used to make the api functions work in any monad which can do 'IO' and handle errors +> -> BaseUrl +> -> Client m +> -- ^ Generate a 'Client' + +RecordWildCards also allows us to construct a record from components +in scope. + +> mkClient n url = Client{..} +> where +> printers +> :<|> (jobs :<|> jobCreate) +> :<|> (withArgs -> job :<|> (withArgs -> jobStatus :<|> jobDelete)) +> :<|> (drafts :<|> draftCreate) +> :<|> (withArgs -> draftReplace :<|> (withArgs -> draft :<|> (withArgs -> draftDelete :<|> draftPrint))) +> = enter n $ client thermoprintAPI url + +We also provide some additional convenience functions so the user +doesn't have to construct their own `Nat`{.haskell}ural +transformations. + +> mkClient' :: (MonadThrow m, MonadIO m) => BaseUrl -> Client m +> -- ^ @mkClient' = mkClient $ ioNat . throwNat@ +> mkClient' = mkClient $ ioNat . throwNat +> +> throwNat :: (Exception e, MonadThrow m) => EitherT e m :~> m +> -- ^ Squash a layer of 'EitherT' into the underlying monad supporting 'throwM' +> throwNat = Nat $ either throwM return <=< runEitherT +> +> ioNat :: MonadIO m => IO :~> m +> -- ^ @ioNat = Nat liftIO@ +> ioNat = Nat liftIO -- cgit v1.2.3