From 5b063193f389ef472366e4355a683f1843f29733 Mon Sep 17 00:00:00 2001 From: Gregor Kleen Date: Fri, 27 May 2016 19:40:18 +0200 Subject: structure --- provider/posts/beuteltier-1.lhs | 327 ----------------------------------- provider/posts/beuteltier-2.lhs | 179 ------------------- provider/posts/beuteltier-3.md | 19 -- provider/posts/beuteltier-4.lhs | 220 ----------------------- provider/posts/beuteltier/1.lhs | 327 +++++++++++++++++++++++++++++++++++ provider/posts/beuteltier/2.lhs | 179 +++++++++++++++++++ provider/posts/beuteltier/3.md | 19 ++ provider/posts/beuteltier/4.lhs | 220 +++++++++++++++++++++++ provider/posts/blog-documentation.md | 9 - provider/posts/blog-rss.md | 43 ----- provider/posts/blog/hakyll.md | 131 ++++++++++++++ provider/posts/blog/origin.md | 9 + provider/posts/blog/rss.md | 43 +++++ provider/posts/blog/tex-support.md | 243 ++++++++++++++++++++++++++ provider/posts/blog/ymir.md | 7 + provider/posts/hakyll.md | 131 -------------- provider/posts/tex-support.md | 243 -------------------------- provider/posts/thermoprint-1.md | 130 -------------- provider/posts/thermoprint-2.lhs | 262 ---------------------------- provider/posts/thermoprint-3.lhs | 92 ---------- provider/posts/thermoprint-4.md | 116 ------------- provider/posts/thermoprint-5.md | 198 --------------------- provider/posts/thermoprint-6.lhs | 142 --------------- provider/posts/thermoprint/1.md | 130 ++++++++++++++ provider/posts/thermoprint/2.lhs | 262 ++++++++++++++++++++++++++++ provider/posts/thermoprint/3.lhs | 92 ++++++++++ provider/posts/thermoprint/4.md | 116 +++++++++++++ provider/posts/thermoprint/5.md | 198 +++++++++++++++++++++ provider/posts/thermoprint/6.lhs | 142 +++++++++++++++ provider/posts/ymir.md | 7 - 30 files changed, 2118 insertions(+), 2118 deletions(-) delete mode 100644 provider/posts/beuteltier-1.lhs delete mode 100644 provider/posts/beuteltier-2.lhs delete mode 100644 provider/posts/beuteltier-3.md delete mode 100644 provider/posts/beuteltier-4.lhs create mode 100644 provider/posts/beuteltier/1.lhs create mode 100644 provider/posts/beuteltier/2.lhs create mode 100644 provider/posts/beuteltier/3.md create mode 100644 provider/posts/beuteltier/4.lhs delete mode 100644 provider/posts/blog-documentation.md delete mode 100644 provider/posts/blog-rss.md create mode 100644 provider/posts/blog/hakyll.md create mode 100644 provider/posts/blog/origin.md create mode 100644 provider/posts/blog/rss.md create mode 100644 provider/posts/blog/tex-support.md create mode 100644 provider/posts/blog/ymir.md delete mode 100644 provider/posts/hakyll.md delete mode 100644 provider/posts/tex-support.md delete mode 100644 provider/posts/thermoprint-1.md delete mode 100644 provider/posts/thermoprint-2.lhs delete mode 100644 provider/posts/thermoprint-3.lhs delete mode 100644 provider/posts/thermoprint-4.md delete mode 100644 provider/posts/thermoprint-5.md delete mode 100644 provider/posts/thermoprint-6.lhs create mode 100644 provider/posts/thermoprint/1.md create mode 100644 provider/posts/thermoprint/2.lhs create mode 100644 provider/posts/thermoprint/3.lhs create mode 100644 provider/posts/thermoprint/4.md create mode 100644 provider/posts/thermoprint/5.md create mode 100644 provider/posts/thermoprint/6.lhs delete mode 100644 provider/posts/ymir.md (limited to 'provider') diff --git a/provider/posts/beuteltier-1.lhs b/provider/posts/beuteltier-1.lhs deleted file mode 100644 index 7789f40..0000000 --- a/provider/posts/beuteltier-1.lhs +++ /dev/null @@ -1,327 +0,0 @@ ---- -title: On the Design of Overly Complicated Feedreaders -published: 2015-08-04 -tags: Beuteltier ---- - -I like feedreaders. -Thus, of course, I had to implement my own, because, as always, all existing software does -not fullfill my exceedingly unrealistic expectations with respect to customizability and -extendability. - -This post marks the start of a series describing and documenting the design of the current -iteration of `Beuteltier` (`Beutel` kind of sounds like [Beuter](https://newsbeuter.org) -and is german for bag, which is what we call our backstore since it is held at such an -universal and unstructured level that the analogy is fitting. `Tier` is german for animal -and taken to mean "Thing that does stuff". In conjunction `Beuteltier` means -[Marsupial](https://en.wikipedia.org/wiki/Marsupial)). - -It should be noted that the library described here is not finished or ready for use in any -sense of the word (at the time of writing a "trivial" implementation of a `Beutel` shipped -with the library supports only `run`, `search`, and `delete`). Searching a way to -procrastinate implementing the more arduous `insert` (it requires nubbing—deduplication in -the backstore) I decided to, instead, start this series of posts and put the thought that -went into the library so far in a form that I can read again for later reference. - -We begin, as is to be expected for a haskell project, with type definitions and, thus, -design philosophy. - -This post in particular reproduces the file `beuteltier/Beuteltier/Types.hs` from the -git repo with annotiations to provide some motivation. - -The `Beuteltier` library itself only provides primitives for (and a default implementation -of) access to what we call a backstore. A backstore is, to us, an instance of the -typeclass `Beutel` which contains the most primitive of primitives for storing, searching -for and deleting representations of the objects we care about from the store. - -It is recommended that the reader not try to follow the rest of this post linearly but start -at the end with the definition of the `Beutel` class and work their way backwards. - -> {-# LANGUAGE FlexibleInstances, StandaloneDeriving, KindSignatures, MultiParamTypeClasses, TypeFamilies #-} -> -> module Beuteltier.Types -> ( -- * Types -> Object -> , ObjectGen(..) -> , SubObject(..) -> , MetaData(..) -> , Thunk(..) -> , ThunkState(..) -> , ThunkResult(..) -> , Tag -> , Flag(..) -> , SubObjectName -> , ThunkName -> , SearchQuery -> , Predicate -> , Beutel(..) -> ) where - -`Flag` ends up being a [sum type](https://en.wikipedia.org/wiki/Sum_type) holding values -such as `Seen`, `Old`, or `Hidden`. -We define it externally. - -> import Beuteltier.Types.Flags - -The `Identity` functor serves as basis for many a Monadtransformer-stack. - -> import Data.Functor.Identity -> import Data.Functor.Classes () - -Binary contents are encoded as `ByteStrings` - -> import qualified Data.ByteString.Lazy as Lazy (ByteString) -> import qualified Data.ByteString.Lazy as LBS - -Unicode text as `Text` - -> import Data.Text (Text) - -Long unicode text as lazy `Text` - -> import qualified Data.Text.Lazy as Lazy (Text) -> import qualified Data.Text.Lazy as LT -> -> import Data.Set (Set) -> -> import Data.Map (Map) -> -> import Data.Time (UTCTime) -> -> import Data.Function (on) -> import Data.Ord (comparing) -> import Control.Applicative - -`Data.Default` provides some convenience when constructing extensive record structures. - -> import Data.Default - -The `boolexpr` package provides us with a structure for representing boolean expressions -supporting functor operations and evaluation. - -> import Data.BoolExpr - -Previous iterations of Beuteltier acted on Objects that were kept completely in RAM during -all operations. -This proved to be unsustainable, not only because nubbing (deduplication in the store of -all objects) tended to exceed all RAM constraints (>4GiB for a few hundred objects), but -also because cheaper operations on objects, like presentation to the user, got painfully -slow once large `SubObject`s (like videos) were introduced into the store. - -The straight forward solution was to enrich the `Object` structure with provisions for -explicit lazyness and partial construction. - -> -- | We deal in, at runtime, partially retrieved Objects -> data ObjectGen (f :: * -> *) = ObjectGen -> { _oMeta :: f MetaData -> -- ^ An undetermined set of Metainformation -> , _oContent :: f (Map SubObjectName (f SubObject)) -> -- ^ A list of undetermined length of undetermined -> -- 'SubObject's with guaranteed unique 'SubObjectName's -> , _oThunks :: f [f Thunk] -> -- ^ A list of undetermined length of undetermined Thunks. -> -- There is such a thing as thunk colissions (i.e.: two -> -- thunks promise or even create 'SubObject's with the -> -- same name). -> -- Precedence in such a case is to be as suggested by -> -- the list structure (later thunks override earlier ones). -> } -> -> instance Monad f => Default (ObjectGen f) where -> def = ObjectGen { _oContent = return def -> , _oThunks = return def -> , _oMeta = return def -> } - -It is straight forward to collapse the more advanced representation of `Object`s back to -the old behaviour by parametrising over the Identity functor, which is simply a newtype -wrapper over the contained structure. - -> -- | An entirely retrieved Object -> type Object = ObjectGen Identity -> -> -- -- | The default 'Object' is empty except for metadata -> -- instance Default Object where -> -- def = ObjectGen { _oContent = return def -> -- , _oThunks = return def -> -- , _oMeta = return def -> -- } -> -> -- | Equality simply gets deferred to all subcomponents -> deriving instance Eq Object -> -> -- | 'Object's compare as their 'MetaData' -> instance Ord Object where -> compare = comparing _oMeta - -We would like to associate some set of meta information with all objects. -Therefore, we do. - -> -- | Metadata associated with an Object -> data MetaData = MetaData -> { _mRetrieved :: UTCTime -- ^ Time of creation -> , _mTags :: Set Tag -- ^ Tags such as the name of the author, -> -- the title of the work represented in -> -- the 'Object', …. -> -- We use something like @show . _mTags@ -> -- to identify an 'Object' to the user -> , _mFlags :: Set Flag -- ^ Flags such as \"Read\" or \"Spam\" -> } deriving (Show, Ord) -> -- | Tags are unicode text -> type Tag = Text -> -> -- | 'MetaData' equates as the contained tags -> instance Eq MetaData where -> (==) = (==) `on` _mTags -> -> -- | The default MetaData has no tags, no flags, and an undefined timestamp -> instance Default MetaData where -> def = MetaData { _mFlags = def -> , _mTags = def -> , _mRetrieved = undefined -- There really is no such thing as a default time -> } - -Objects are no fun if they don´t contain anything of interest in the end. - -Below we see a remnant of an older model of associating names to `SubObject`s. We switched -to using a `Map` for reasons of deduplication. Inserting into a `Map` carries some -guarantees that keys end up being unique. - -Note below: creation of a `SubObject` is an update. It is thus expected, that `SubObject`s -created at the same time as the `Object` they are associated to encode an update -time that matches the `Object`s creation time. - -> -- | Contents of an object -> data SubObject = SubObject -> -- { _sId :: SubObjectName -> -- ^ We associate a name to every chunk of content to determine -> -- how to present an object to the user -> { _sContent :: Lazy.ByteString -> , _sUpdates :: [UTCTime] -> -- ^ Times of witnessed updates to this 'SubObject' -> } deriving (Show) -> -> -- | No content, no witnessed updates -> instance Default SubObject where -> def = SubObject { _sContent = def -> , _sUpdates = def -> } -> -> -- | Extensionality for 'SubObject's: -> -- -> -- > (==) = (==) `on` _sContent -> instance Eq SubObject where -> (==) = (==) `on` _sContent - -The distinguishing feature of Beuteltier is it´s support for `Thunk`s. They are, as the -name suggests, loosly based on the concept of lazy evaluation. They are, however, less -transparent and thus more explicit than implementations as they are used in, for example -haskell. - -As far as Beuteltier is concerned `Thunk`s are executables that are expected to produce -files in the directory they are executed in in a pure manner. That is to say they do not -access external resources, where possible. A `Thunk` that downloads a video from the -internet will, of course, access the internet and can thus fail. We expect it, however, to -not to try and access the users home directory to look for e.g. credentials for -authentication it intends to use to its primary job. - -When a `Thunk`s executable gets executed the files it creates (excluding itself) get -translated to `SubObject`s with the filenames (directories stripped of course) as their -`SubObjectName`s and the file contents as their… well, their contents. It is understood, -that not all possible `SubObjectName`s can be created thus (we restrict ourselves to valid -filenames on whatever system we happen to be on). We do not consider this to be a great -loss. - -The advanced equality checks mentioned below are, in fact, implemented and will be explained -in more detail in a later post concerned with the file `beuteltier/Beuteltier/Types/Util.hs`. - -> -- | Thunks are at runtime not yet known parts of an object -> data Thunk = Thunk -> { _tId :: ThunkName -- ^ For debugging -> , _tScript :: Lazy.ByteString -> -- ^ A Thunk is, in the end, a shell script that is expected to generate -> -- 'SubObject's -> , _tPromises :: Maybe [SubObjectName] -> -- ^ Maybe we already know what our script is going to generate? -> -- This would enable us to do some more advanced equality checks under -> -- the assumption that scripts are pure -> , _tState :: ThunkState -> } -> deriving (Show) -> -> -- | Empty id, empty script, promises nothing, and with default state -> instance Default Thunk where -> def = Thunk { _tId = def -> , _tScript = def -> , _tPromises = def -> , _tState = def -> } -> -> -- | Equality on 'Thunk's ignores '_tState' and '_tId' -> instance Eq Thunk where -> a == b = and $ [ (==) `on` _tScript -> , (==) `on` _tPromises -> ] <*> pure a <*> pure b -> -> -- | The states in which a 'Thunk' can be encountered. -> data ThunkState = NotExecuted -> | Executed [SubObjectName] ThunkResult -> deriving (Show) -> -> -- | Return the default 'ThunkResult' upon forcing -> instance Default ThunkState where -> def = NotExecuted -> -> -- | Thunks generate some data during execution -> data ThunkResult = ThunkResult -> { _rOutErr, _rOutStd :: Lazy.Text -> , _rExit :: Integer -- ^ Numerical exit code (0 usually means success) -> } -> deriving (Show) -> -> -- | Empty output, and with undefined exit code (no execution took place and we can´t -> -- encode this in a numerical exit code) -> instance Default ThunkResult where -> def = ThunkResult { _rOutErr = LT.empty, _rOutStd = LT.empty -> , _rExit = undefined -> } -> -> -- | We expect identifiers for 'SubObject's to be short, thus 'String' -> type SubObjectName = String -> -- | We expect identifiers for 'Thunk's to be short, thus 'String' -> type ThunkName = String -> -> -- | @LBS.empty@ -> instance Default (Lazy.ByteString) where -> def = LBS.empty - -What good is a library for managing a backstore if it does not support search operations? -We consider the answer to be "very little" and, thus, support searches. - -> type SearchQuery f = BoolExpr (Predicate f) -> -- data Predicate f = Prim (ObjectGen f -> f Bool) -> -- | Meta (MetaData -> Bool) -> type Predicate f = ObjectGen f -> f Bool - -The heart of the `Beuteltier` library is the typeclass reproduced below. We expect -implementations of backstores to be `Monad`s so that we may be able to construct -complicated actions that act on the backstore in question. -Once we have constructed such an action using the three primitives `search`, `insert`, and -`delete` we additionally require a way to execute that action from within the `IO` -`Monad`. - -Additional primitives, such as those for "forcing" and resetting thunks, are provided in -additional libraries and, thus, later posts. - -> -- | We have the user supply the functions we use to interact with whatever backstore -> -- she uses -> class Monad functor => Beutel (functor :: * -> *) where -> data Config :: * -> run :: Config -> functor a -> IO a -> -- ^ Actually run whatever action we constructed against the backstore -> search :: SearchQuery functor -> functor [ObjectGen functor] -> -- ^ Perform a search -> insert :: Object -> functor () -> -- ^ Insert an object -> delete :: SearchQuery functor -> functor () -> -- ^ Delete the results of a search diff --git a/provider/posts/beuteltier-2.lhs b/provider/posts/beuteltier-2.lhs deleted file mode 100644 index 8da4711..0000000 --- a/provider/posts/beuteltier-2.lhs +++ /dev/null @@ -1,179 +0,0 @@ ---- -title: "\"Type level\" utilities for an overly complicated feedreader" -published: 2015-08-05 -tags: Beuteltier ---- - -By popular (n=1) demand we will, in this post, be taking a look at -`beuteltier/Beuteltier/Types/Util.hs` the, creatively named, module providing some "type -level" utilities. - -What I mean when I say "type level" is: additional instances (placed here when they -contain major design decisions and are not "Ord" or "Eq"), utilities not connected to -beuteltier itself (like the different flavours of `alter` below) - -In contrast to the first, this post is straightforward enough to be read linearly. - -> {-# LANGUAGE TypeSynonymInstances, FlexibleInstances #-} -> -> module Beuteltier.Types.Util -> ( -- * Constructing structures -> construct -> , construct' -> , alter -> , alter' -> -- * Dealing with 'ObjectGen's (here be dragons) -> , generateObject -> , liftGen -> -- * Equivalence on 'Object's (for nubbing) -> , Equivalent(..) -> -- * Operations on 'SearchQuery's -> , runQuery -> -- , runExpr -> ) where -> -> import Beuteltier.Types -> import Beuteltier.Types.Lenses - -We make use of lenses (as provided by [lens](http://hackage.haskell.org/package/lens)) -extensively. -We won´t dedicate a post to `beuteltier/Beuteltier/Types/Lenses.hs` because it consists -mostly of the canonical invocations of -[makeLenses](http://hackage.haskell.org/package/lens-4.12.3/docs/Control-Lens-TH.html#v:makeLenses). - -> import Data.Default -> -> import Prelude hiding (sequence) -> import Data.Traversable (sequence) -> -> import Control.Lens -> -> import Control.Monad.State.Lazy hiding (sequence) -- Why is this exported? -> -> import Data.Map (Map) -> import qualified Data.Map as Map -> -> import Data.Set (Set) -> import qualified Data.Set as Set -> -> import Data.Hashable (Hashable(..), hashUsing) -> -> import Data.Monoid ((<>)) -> -> import Data.Function (on) -> import Data.Maybe (mapMaybe) -> -> import Data.BoolExpr - -Quite often we find ourselves in the position that we want to alter some small parts of a -complicated structure. We would therefore like to write the following: - -~~~ {.haskell} -updateFoo :: Foo -> Monad Foo -updateFoo x = alter x $ do - bar <~ (constructNewBar :: Monad Bar) - buz .= (makeConstantBuz :: Buz) -~~~ - -The definitions below allow us not only to do so, but also provide some convenience -functions for constructing entirely new values and performing both operations in a pure -context. - -> alter :: Monad m => s -> StateT s m a -> m s -> -- ^ Alter a complex structure monodically -> alter = flip execStateT -> -> alter' :: s -> State s a -> s -> -- ^ Specialization of 'alter' to 'Identity' -> alter' s = runIdentity . alter s -> -> construct :: (Monad m, Default s) => StateT s m a -> m s -> -- ^ Compute a complex structure monadically -> construct = alter def -> -> construct' :: Default s => State s a -> s -> -- ^ Specialization of 'construct' to 'Identity' -> construct' = runIdentity . construct - -Sometimes we just really want to translate an `ObjectGen` to an `Object`. - -> generateObject :: Monad f => ObjectGen f -> f Object -> -- ^ Run an object generator. -> -- Use iff /all/ components of an object are needed /in RAM now/. -> generateObject gen = construct $ do -> content <- lift $ gen ^. oContent >>= sequence -> thunks <- lift $ gen ^. oThunks >>= sequence -> meta <- lift $ gen ^. oMeta -> oContent .= return (fmap return content) -> oThunks .= return (fmap return thunks) -> oMeta .= return meta -> -> liftGen :: Monad f => Object -> ObjectGen f -> -- ^ Lift an 'Object' to be an 'ObjectGen' in any 'Monad' by the power of 'return' -> liftGen obj = construct' $ do -> oContent .= return (Map.map return $ obj ^. oContent') -> oThunks .= return (map return $ obj ^. oThunks') -> oMeta .= return (obj ^. oMeta') - -We expect implementations of `insert` to perform what we call nubbing. That is removal of -`Object`s that are, in some sense, `Equivalent` to the new one we´re currently -inserting. Thus we provide a definition of what we mean, when we say `Equivalent`. - -> class Equivalent a where -> (~~) :: a -> a -> Bool -> -> -- | Two 'Object's are equivalent iff their content is identical as follows: -> -- the set of 'SubObjectName's both promised and actually occurring is identical -> -- and all 'SubObject's that actually occurr and share a 'SubObjectName' are -> -- identical (as per '(==)') -> -- -> -- Additionally we expect their 'Metadata' to be identical (as per '(==)') -> instance Equivalent Object where -> a ~~ b = (contentCompare `on` content) a b && ((==) `on` (^. oMeta')) a b -> where -> contentCompare :: (Ord k, Eq v) => Map k (Maybe v) -> Map k (Maybe v) -> Bool -> contentCompare a b = Map.foldl (&&) True $ Map.mergeWithKey combine setFalse setFalse a b -> combine _ a b = Just $ cmpMaybes a b -> setFalse = Map.map $ const False -> -> cmpMaybes Nothing _ = True -> cmpMaybes _ Nothing = True -> cmpMaybes (Just a) (Just b) = a == b - -To speed up nubbing we also provide a quick way to "cache results". To make caching -meaningful we of course expect the following to hold: - -~~~ -a ~~ b ⇒ (hash a) == (hash b) -~~~ - -Note that we do not expect the converse to hold. We will thus require a second pass over -all objects sharing a hash to determine true equivalency. - -> -- | Two 'Object's´ hashes are a first indication of whether they are 'Equivalent' -> instance Hashable Object where -> hashWithSalt = hashUsing $ \a -> (a ^. oMeta', Map.keys $ content a) -> -> instance Hashable MetaData where -> hashWithSalt = hashUsing $ Set.toList . (^. mTags) -> -> content :: Object -> Map SubObjectName (Maybe SubObject) -> content obj = promised obj <> actual obj -> actual :: Object -> Map SubObjectName (Maybe SubObject) -> actual = fmap Just . (^. oContent') -> promised :: Object -> Map SubObjectName (Maybe SubObject) -> promised = Map.fromList . map (\n -> (n, Nothing)) . concat . promises -> promises :: Object -> [[SubObjectName]] -> promises = mapMaybe (^. tPromises) . (^. oThunks') - -Evaluating a `SearchQuery` against an `ObjectGen` is, due to the structure of elementary -`SearchQuery`s quite straightforward. - -> runQuery :: Monad f => SearchQuery f -> ObjectGen f -> f Bool -> -- ^ Run a 'SearchQuery' against an 'ObjectGen' -> runQuery query obj = liftM reduceBoolExpr $ sequence $ fmap ($ obj) query -> -> -- runExpr :: Monad f => ObjectGen f -> Predicate f -> f Bool -> -- -- ^ Run a 'Predicate' (»an atomic 'SearchQuery'«) against an 'ObjectGen' -> -- runExpr obj (Prim f) = f obj -> -- runExpr obj (Meta f) = liftM f (obj ^. oMeta) diff --git a/provider/posts/beuteltier-3.md b/provider/posts/beuteltier-3.md deleted file mode 100644 index 9b699c1..0000000 --- a/provider/posts/beuteltier-3.md +++ /dev/null @@ -1,19 +0,0 @@ ---- -title: An Update to the Type Level Utilities of an Overly Complicated Feedreader -published: 2015-08-12 -tags: Beuteltier ---- - -I commited a change to `beuteltier/Beuteltier/Types/Util.hs` ("[…] Hashable update"). I -replaced the `Hashable` instances for `Object` and `MetaData` with a single, better -optimized function: - -~~~ {.haskell} -objHash :: Applicative f => ObjectGen f -> f Int --- ^ Two 'ObjectGen's hashes are a first indication of whether they are 'Equivalent' -objHash o = fmap hash $ (,) <$> (Set.toList . (^. mTags) <$> o ^. oMeta) <*> (Map.keys <$> o ^. oContent) -~~~ - -The new implementation allows computation of hashes without calling `generateObject` (that -function is evil — it makes sure the entire `Object` is "in RAM" (it isn´t actually, of -course (because haskell is lazy)—but I have no guarantee of that)). diff --git a/provider/posts/beuteltier-4.lhs b/provider/posts/beuteltier-4.lhs deleted file mode 100644 index 478cbac..0000000 --- a/provider/posts/beuteltier-4.lhs +++ /dev/null @@ -1,220 +0,0 @@ ---- -title: Business Logic for an Overly Complicated Feedreader -published: 2015-08-12 -tags: Beuteltier ---- - -It turns out I don´t have to write much in the way of comments—the source file is already -quite well documented. - -> {-# LANGUAGE KindSignatures #-} -> -> module Beuteltier -> ( -- * Forcing (Executing) 'Thunk's -> forceAllThunks -> , WithObjects -> , forceThunk -> , resetAllThunks -> , resetThunk -> -- * Higher level interactions with a 'Beutel' -> , replace -> , eqTo -> , update -> , forceAllThunks' -> , resetAllThunks' -> , module Beuteltier.Util -> , module Beuteltier.Types.Common -> ) where -> -> import Beuteltier.Types.Common -> import Beuteltier.Util -> -> import Data.Map (Map) -> import qualified Data.Map as Map -> -> import Control.Lens -> -> import qualified Data.ByteString.Lazy as LBS -> import qualified Data.ByteString as BS -> -> import qualified Data.Text as T -> import qualified Data.Text.IO as T -> import qualified Data.Text.Lazy as TL -> -> import System.FilePath -> import System.Directory -> import System.Posix.Files -> import System.Posix.Temp -> -> import System.Environment -> import System.IO -> import GHC.IO.Handle -> import System.Process as P -> import System.Exit -> -> import Control.Concurrent -> import Control.Applicative -> import Control.Monad.Morph -> import Control.Monad.Trans.State -> import Control.Monad.Writer -> import Control.Monad.Trans.Resource -> import Control.Monad (liftM) -> -> import Data.Time.Clock -> -> import Data.BoolExpr - -The distinguishing feature of our Overly Complicated Feedreader™ is it´s support for -`Thunk`s. It is thus reasonable to expect, that we have some functions to actually -interact with them. Most striking in that hypothetical set of functions would be one that -executes all thunks associated with a single `ObjectGen` and return a new one with the -content generated by `Thunk`s filled in. - -Enter `forceAllThunks`. - -> forceAllThunks :: (MonadIO f, MonadResource f) -> => (Thunk -> Bool) -- ^ Select which 'Thunk's to force -> -> ObjectGen f -> f (ObjectGen f) -> -- ^ Force all thunks in place and update '_oContent' -> -- -> -- The reason we require a 'MonadResource' instance is that we would like to store our expensive to hold in RAM 'SubObject' contents in temporary files. -> forceAllThunks pred = flip alter $ do -> pureThunks <- use oThunks >>= lift >>= mapM lift -> (newThunks, objectResults) <- mapAndUnzipM forceThunk' pureThunks -> assign oThunks $ return $ map return newThunks -> oContent %= liftM (<> mconcat objectResults) -> where -> forceThunk' thunk -> | pred thunk = forceThunk thunk -> | otherwise = return (thunk, Map.empty) - -`forceAllThunks'` (`resetAllThunks'` below, too) is tainted by the evil of -`generateObject` but included for convenience. - -> forceAllThunks' :: (MonadIO f, MonadResource f) => (Thunk -> Bool) -> StateT Object f () -> -- ^ Version of 'forceAllThunks' suitable for use with 'update' -> forceAllThunks' pred = get >>= lift . forceAllThunks pred . liftGen >>= lift . generateObject >>= put -> -> -- | Internal helper to track computations creating 'SubObject's -> type WithObjects (n :: * -> *) (m :: * -> *) = WriterT (Map SubObjectName (n SubObject)) m -> -> forceThunk_ :: (MonadIO m, MonadResource m) => Thunk -> WithObjects (ResourceT IO) m Thunk -> -- ^ Make sure the 'ThunkState' contained within a 'Thunk' is 'Executed' -> forceThunk_ = flip alter $ do -> -- (result, subObjects) <- liftIO $ runWriterT $ parseThunk thunk -> -- writer ((), subObjects) -> (result, subObjects) <- listen . hoist (hoist liftResourceT) . lift . parseThunk =<< get -> tState .= Executed (Map.keys subObjects) result -> -> forceThunk :: (MonadResource m, MonadResource n) => Thunk -> m (Thunk, Map SubObjectName (n SubObject)) -> -- ^ Force a 'Thunk' and return it in 'Executed' state together with the 'SubObject's it created during execution -> forceThunk thunk = liftM (_2 %~ fmap liftResourceT) $ (runWriterT . forceThunk_) thunk - -Quite often we want to undue the harm done by `forceAllThunks` (to save space, usually). - -> resetAllThunks :: Monad f -> => (Thunk -> Bool) -- ^ Select which 'Thunk's to reset -> -> ObjectGen f -> f (ObjectGen f) -> -- ^ Undoes 'forceAllThunks': -> -- -> -- prop> forceAllThunks (const True) obj >>= resetAllThunks (const True) >>= forceAllThunks (const True) = forceAllThunks (const True) obj -> -- -> -- This inevitably drops information ('ThunkResult's for one). -> -- -> -- In the case where 'forceAllThunks' does not drop information (i.e.: no 'SubObjectName' collisions ocurr) the following, stronger property holds: -> -- -> -- prop> forceAllThunks (const True) obj >>= resetAllThunks (const True) = return obj -> resetAllThunks pred = flip alter $ do -> thunks <- liftM (map lift) (use oThunks >>= lift) >>= sequence -> let -> (subObjectNames, newThunks) = over _1 concat $ unzip $ map resetThunk' thunks -> oThunks .= return (map return newThunks) -> oContent %= (>>= return . Map.filterWithKey (\k _ -> k `notElem` subObjectNames)) -> where -> resetThunk' thunk -> | pred thunk = resetThunk thunk -> | otherwise = ([], thunk) -> -> resetAllThunks' :: (MonadIO f, MonadResource f) => (Thunk -> Bool) -> StateT Object f () -> -- ^ Version of 'resetAllThunks' suitable for use with 'update' -> resetAllThunks' pred = get >>= lift . resetAllThunks pred . liftGen >>= lift . generateObject >>= put -> -> resetThunk :: Thunk -> ([SubObjectName], Thunk) -> -- ^ Reset a thunk and return the 'SubObjectName's of the 'SubObject's it once created. -> -- This forgets information. -> resetThunk thunk = case thunk ^. tState of -> NotExecuted -> ([], thunk) -> Executed created _ -> (created, set tState NotExecuted thunk) -> -> parseThunk :: Thunk -> WithObjects (ResourceT IO) (ResourceT IO) ThunkResult -> -- ^ Generate a runnable action from a 'Thunk' -> -- -> -- Regarding the "inner" and "outer" 'Monad' here being 'IO': We have not, at time of forcing, a neccessary connection to our backstore and thus cannot expect the monads to be anything else. -> parseThunk thunk = do -> tmpDirName <- liftIO getTemporaryDirectory -> progName <- liftIO getProgName -> let -> tmpDirName' = tmpDirName progName -> (_, tmpDir) <- allocate (mkdtemp tmpDirName') removeDirectoryRecursive -> let exec = tmpDir "exec" -> out = tmpDir "out" -> result <- liftIO $ do -> createDirectory out -> LBS.writeFile exec script -> setFileMode exec $ foldl unionFileModes nullFileMode [ownerReadMode, ownerExecuteMode] -> (Just std_in, Just std_out, Just std_err, ph) <- createProcess $ (P.proc exec []) { cwd = Just out, std_in = CreatePipe, std_out = CreatePipe, std_err = CreatePipe } -> hClose std_in -> hSetBinaryMode std_out True -> hSetBinaryMode std_err True -> std_out `sendTo` stdout -> std_err `sendTo` stderr -> construct $ do -> rOutStd <~ TL.fromStrict <$> liftIO (T.hGetContents std_out) -- Yes, sadly we have to be strict here -> rOutErr <~ TL.fromStrict <$> liftIO (T.hGetContents std_err) -> rExit <~ toNum <$> liftIO (waitForProcess ph) -> outputFiles <- liftIO $ getDirectoryContents out -> let -> outputFiles' = filter fileFilter outputFiles -> fileFilter = and . (<*>) [(/=) ".", (/=) ".."] . pure . takeFileName -> mapM_ tell =<< mapM (liftResourceT . toSubObject) outputFiles' -> return result -> where -> script = thunk ^. tScript -> toSubObject :: FilePath -> ResIO (Map SubObjectName (ResIO SubObject)) -> -- ^ Using 'ResourceT' provides us with the guarantee, that the 'FilePath' we´re referring to should still exist when we actually try to get the 'SubObject'´s contents -> toSubObject name = fmap (Map.singleton name' . return) $ construct $ do -> sContent <~ liftIO (LBS.readFile name) -> sUpdates <~ pure <$> liftIO getCurrentTime -> where -> name' = takeFileName name -> sendTo input output = do -> input' <- hDuplicate input -> forkIO $ do -> hSetBuffering input' NoBuffering -> LBS.hGetContents input' >>= LBS.hPutStr output -> return () -> toNum :: Num a => ExitCode -> a -> toNum ExitSuccess = 0 -> toNum (ExitFailure i) = fromInteger $ toInteger i - -We provide `update`, a convenience function for high-level interactions (though costly on -large sets of equivalent objects (which should not exist due to nubbing)). - -> eqTo :: Monad f => Object -> SearchQuery f -> -- ^ @eqTo o@ constructs a 'SearchQuery' that matches all 'Object's 'Equivalent' to @o@ -> -- -> -- This is costly because it calls 'generateObject' on the contents of the entire 'Beutel'. -> eqTo o = BConst ((>>= return . (~~) o) . generateObject) -> -> update :: Beutel f => SearchQuery f -> StateT Object f a -> f () -> -- ^ @update search action@ replaces /all/ 'Object's matching @search@ within the 'Beutel' by versions of themselves modified by applying @action@. -> -- -> -- Does not handle '_sUpdates'. -> -- -> -- This is costly because it calls 'generateObject' on the contents of the entire 'Beutel' /and/ all results of @search@ (in order to use 'Eq' on 'Object's to delete the results of the initial 'search'). -> update query alteration = do -> matches <- search query -> matches' <- mapM generateObject matches -> delete $ BConst ((>>= return . (`elem` matches')) . generateObject) -> mapM_ (\o -> alter o alteration >>= insert) matches' -> return () diff --git a/provider/posts/beuteltier/1.lhs b/provider/posts/beuteltier/1.lhs new file mode 100644 index 0000000..7789f40 --- /dev/null +++ b/provider/posts/beuteltier/1.lhs @@ -0,0 +1,327 @@ +--- +title: On the Design of Overly Complicated Feedreaders +published: 2015-08-04 +tags: Beuteltier +--- + +I like feedreaders. +Thus, of course, I had to implement my own, because, as always, all existing software does +not fullfill my exceedingly unrealistic expectations with respect to customizability and +extendability. + +This post marks the start of a series describing and documenting the design of the current +iteration of `Beuteltier` (`Beutel` kind of sounds like [Beuter](https://newsbeuter.org) +and is german for bag, which is what we call our backstore since it is held at such an +universal and unstructured level that the analogy is fitting. `Tier` is german for animal +and taken to mean "Thing that does stuff". In conjunction `Beuteltier` means +[Marsupial](https://en.wikipedia.org/wiki/Marsupial)). + +It should be noted that the library described here is not finished or ready for use in any +sense of the word (at the time of writing a "trivial" implementation of a `Beutel` shipped +with the library supports only `run`, `search`, and `delete`). Searching a way to +procrastinate implementing the more arduous `insert` (it requires nubbing—deduplication in +the backstore) I decided to, instead, start this series of posts and put the thought that +went into the library so far in a form that I can read again for later reference. + +We begin, as is to be expected for a haskell project, with type definitions and, thus, +design philosophy. + +This post in particular reproduces the file `beuteltier/Beuteltier/Types.hs` from the +git repo with annotiations to provide some motivation. + +The `Beuteltier` library itself only provides primitives for (and a default implementation +of) access to what we call a backstore. A backstore is, to us, an instance of the +typeclass `Beutel` which contains the most primitive of primitives for storing, searching +for and deleting representations of the objects we care about from the store. + +It is recommended that the reader not try to follow the rest of this post linearly but start +at the end with the definition of the `Beutel` class and work their way backwards. + +> {-# LANGUAGE FlexibleInstances, StandaloneDeriving, KindSignatures, MultiParamTypeClasses, TypeFamilies #-} +> +> module Beuteltier.Types +> ( -- * Types +> Object +> , ObjectGen(..) +> , SubObject(..) +> , MetaData(..) +> , Thunk(..) +> , ThunkState(..) +> , ThunkResult(..) +> , Tag +> , Flag(..) +> , SubObjectName +> , ThunkName +> , SearchQuery +> , Predicate +> , Beutel(..) +> ) where + +`Flag` ends up being a [sum type](https://en.wikipedia.org/wiki/Sum_type) holding values +such as `Seen`, `Old`, or `Hidden`. +We define it externally. + +> import Beuteltier.Types.Flags + +The `Identity` functor serves as basis for many a Monadtransformer-stack. + +> import Data.Functor.Identity +> import Data.Functor.Classes () + +Binary contents are encoded as `ByteStrings` + +> import qualified Data.ByteString.Lazy as Lazy (ByteString) +> import qualified Data.ByteString.Lazy as LBS + +Unicode text as `Text` + +> import Data.Text (Text) + +Long unicode text as lazy `Text` + +> import qualified Data.Text.Lazy as Lazy (Text) +> import qualified Data.Text.Lazy as LT +> +> import Data.Set (Set) +> +> import Data.Map (Map) +> +> import Data.Time (UTCTime) +> +> import Data.Function (on) +> import Data.Ord (comparing) +> import Control.Applicative + +`Data.Default` provides some convenience when constructing extensive record structures. + +> import Data.Default + +The `boolexpr` package provides us with a structure for representing boolean expressions +supporting functor operations and evaluation. + +> import Data.BoolExpr + +Previous iterations of Beuteltier acted on Objects that were kept completely in RAM during +all operations. +This proved to be unsustainable, not only because nubbing (deduplication in the store of +all objects) tended to exceed all RAM constraints (>4GiB for a few hundred objects), but +also because cheaper operations on objects, like presentation to the user, got painfully +slow once large `SubObject`s (like videos) were introduced into the store. + +The straight forward solution was to enrich the `Object` structure with provisions for +explicit lazyness and partial construction. + +> -- | We deal in, at runtime, partially retrieved Objects +> data ObjectGen (f :: * -> *) = ObjectGen +> { _oMeta :: f MetaData +> -- ^ An undetermined set of Metainformation +> , _oContent :: f (Map SubObjectName (f SubObject)) +> -- ^ A list of undetermined length of undetermined +> -- 'SubObject's with guaranteed unique 'SubObjectName's +> , _oThunks :: f [f Thunk] +> -- ^ A list of undetermined length of undetermined Thunks. +> -- There is such a thing as thunk colissions (i.e.: two +> -- thunks promise or even create 'SubObject's with the +> -- same name). +> -- Precedence in such a case is to be as suggested by +> -- the list structure (later thunks override earlier ones). +> } +> +> instance Monad f => Default (ObjectGen f) where +> def = ObjectGen { _oContent = return def +> , _oThunks = return def +> , _oMeta = return def +> } + +It is straight forward to collapse the more advanced representation of `Object`s back to +the old behaviour by parametrising over the Identity functor, which is simply a newtype +wrapper over the contained structure. + +> -- | An entirely retrieved Object +> type Object = ObjectGen Identity +> +> -- -- | The default 'Object' is empty except for metadata +> -- instance Default Object where +> -- def = ObjectGen { _oContent = return def +> -- , _oThunks = return def +> -- , _oMeta = return def +> -- } +> +> -- | Equality simply gets deferred to all subcomponents +> deriving instance Eq Object +> +> -- | 'Object's compare as their 'MetaData' +> instance Ord Object where +> compare = comparing _oMeta + +We would like to associate some set of meta information with all objects. +Therefore, we do. + +> -- | Metadata associated with an Object +> data MetaData = MetaData +> { _mRetrieved :: UTCTime -- ^ Time of creation +> , _mTags :: Set Tag -- ^ Tags such as the name of the author, +> -- the title of the work represented in +> -- the 'Object', …. +> -- We use something like @show . _mTags@ +> -- to identify an 'Object' to the user +> , _mFlags :: Set Flag -- ^ Flags such as \"Read\" or \"Spam\" +> } deriving (Show, Ord) +> -- | Tags are unicode text +> type Tag = Text +> +> -- | 'MetaData' equates as the contained tags +> instance Eq MetaData where +> (==) = (==) `on` _mTags +> +> -- | The default MetaData has no tags, no flags, and an undefined timestamp +> instance Default MetaData where +> def = MetaData { _mFlags = def +> , _mTags = def +> , _mRetrieved = undefined -- There really is no such thing as a default time +> } + +Objects are no fun if they don´t contain anything of interest in the end. + +Below we see a remnant of an older model of associating names to `SubObject`s. We switched +to using a `Map` for reasons of deduplication. Inserting into a `Map` carries some +guarantees that keys end up being unique. + +Note below: creation of a `SubObject` is an update. It is thus expected, that `SubObject`s +created at the same time as the `Object` they are associated to encode an update +time that matches the `Object`s creation time. + +> -- | Contents of an object +> data SubObject = SubObject +> -- { _sId :: SubObjectName +> -- ^ We associate a name to every chunk of content to determine +> -- how to present an object to the user +> { _sContent :: Lazy.ByteString +> , _sUpdates :: [UTCTime] +> -- ^ Times of witnessed updates to this 'SubObject' +> } deriving (Show) +> +> -- | No content, no witnessed updates +> instance Default SubObject where +> def = SubObject { _sContent = def +> , _sUpdates = def +> } +> +> -- | Extensionality for 'SubObject's: +> -- +> -- > (==) = (==) `on` _sContent +> instance Eq SubObject where +> (==) = (==) `on` _sContent + +The distinguishing feature of Beuteltier is it´s support for `Thunk`s. They are, as the +name suggests, loosly based on the concept of lazy evaluation. They are, however, less +transparent and thus more explicit than implementations as they are used in, for example +haskell. + +As far as Beuteltier is concerned `Thunk`s are executables that are expected to produce +files in the directory they are executed in in a pure manner. That is to say they do not +access external resources, where possible. A `Thunk` that downloads a video from the +internet will, of course, access the internet and can thus fail. We expect it, however, to +not to try and access the users home directory to look for e.g. credentials for +authentication it intends to use to its primary job. + +When a `Thunk`s executable gets executed the files it creates (excluding itself) get +translated to `SubObject`s with the filenames (directories stripped of course) as their +`SubObjectName`s and the file contents as their… well, their contents. It is understood, +that not all possible `SubObjectName`s can be created thus (we restrict ourselves to valid +filenames on whatever system we happen to be on). We do not consider this to be a great +loss. + +The advanced equality checks mentioned below are, in fact, implemented and will be explained +in more detail in a later post concerned with the file `beuteltier/Beuteltier/Types/Util.hs`. + +> -- | Thunks are at runtime not yet known parts of an object +> data Thunk = Thunk +> { _tId :: ThunkName -- ^ For debugging +> , _tScript :: Lazy.ByteString +> -- ^ A Thunk is, in the end, a shell script that is expected to generate +> -- 'SubObject's +> , _tPromises :: Maybe [SubObjectName] +> -- ^ Maybe we already know what our script is going to generate? +> -- This would enable us to do some more advanced equality checks under +> -- the assumption that scripts are pure +> , _tState :: ThunkState +> } +> deriving (Show) +> +> -- | Empty id, empty script, promises nothing, and with default state +> instance Default Thunk where +> def = Thunk { _tId = def +> , _tScript = def +> , _tPromises = def +> , _tState = def +> } +> +> -- | Equality on 'Thunk's ignores '_tState' and '_tId' +> instance Eq Thunk where +> a == b = and $ [ (==) `on` _tScript +> , (==) `on` _tPromises +> ] <*> pure a <*> pure b +> +> -- | The states in which a 'Thunk' can be encountered. +> data ThunkState = NotExecuted +> | Executed [SubObjectName] ThunkResult +> deriving (Show) +> +> -- | Return the default 'ThunkResult' upon forcing +> instance Default ThunkState where +> def = NotExecuted +> +> -- | Thunks generate some data during execution +> data ThunkResult = ThunkResult +> { _rOutErr, _rOutStd :: Lazy.Text +> , _rExit :: Integer -- ^ Numerical exit code (0 usually means success) +> } +> deriving (Show) +> +> -- | Empty output, and with undefined exit code (no execution took place and we can´t +> -- encode this in a numerical exit code) +> instance Default ThunkResult where +> def = ThunkResult { _rOutErr = LT.empty, _rOutStd = LT.empty +> , _rExit = undefined +> } +> +> -- | We expect identifiers for 'SubObject's to be short, thus 'String' +> type SubObjectName = String +> -- | We expect identifiers for 'Thunk's to be short, thus 'String' +> type ThunkName = String +> +> -- | @LBS.empty@ +> instance Default (Lazy.ByteString) where +> def = LBS.empty + +What good is a library for managing a backstore if it does not support search operations? +We consider the answer to be "very little" and, thus, support searches. + +> type SearchQuery f = BoolExpr (Predicate f) +> -- data Predicate f = Prim (ObjectGen f -> f Bool) +> -- | Meta (MetaData -> Bool) +> type Predicate f = ObjectGen f -> f Bool + +The heart of the `Beuteltier` library is the typeclass reproduced below. We expect +implementations of backstores to be `Monad`s so that we may be able to construct +complicated actions that act on the backstore in question. +Once we have constructed such an action using the three primitives `search`, `insert`, and +`delete` we additionally require a way to execute that action from within the `IO` +`Monad`. + +Additional primitives, such as those for "forcing" and resetting thunks, are provided in +additional libraries and, thus, later posts. + +> -- | We have the user supply the functions we use to interact with whatever backstore +> -- she uses +> class Monad functor => Beutel (functor :: * -> *) where +> data Config :: * +> run :: Config -> functor a -> IO a +> -- ^ Actually run whatever action we constructed against the backstore +> search :: SearchQuery functor -> functor [ObjectGen functor] +> -- ^ Perform a search +> insert :: Object -> functor () +> -- ^ Insert an object +> delete :: SearchQuery functor -> functor () +> -- ^ Delete the results of a search diff --git a/provider/posts/beuteltier/2.lhs b/provider/posts/beuteltier/2.lhs new file mode 100644 index 0000000..8da4711 --- /dev/null +++ b/provider/posts/beuteltier/2.lhs @@ -0,0 +1,179 @@ +--- +title: "\"Type level\" utilities for an overly complicated feedreader" +published: 2015-08-05 +tags: Beuteltier +--- + +By popular (n=1) demand we will, in this post, be taking a look at +`beuteltier/Beuteltier/Types/Util.hs` the, creatively named, module providing some "type +level" utilities. + +What I mean when I say "type level" is: additional instances (placed here when they +contain major design decisions and are not "Ord" or "Eq"), utilities not connected to +beuteltier itself (like the different flavours of `alter` below) + +In contrast to the first, this post is straightforward enough to be read linearly. + +> {-# LANGUAGE TypeSynonymInstances, FlexibleInstances #-} +> +> module Beuteltier.Types.Util +> ( -- * Constructing structures +> construct +> , construct' +> , alter +> , alter' +> -- * Dealing with 'ObjectGen's (here be dragons) +> , generateObject +> , liftGen +> -- * Equivalence on 'Object's (for nubbing) +> , Equivalent(..) +> -- * Operations on 'SearchQuery's +> , runQuery +> -- , runExpr +> ) where +> +> import Beuteltier.Types +> import Beuteltier.Types.Lenses + +We make use of lenses (as provided by [lens](http://hackage.haskell.org/package/lens)) +extensively. +We won´t dedicate a post to `beuteltier/Beuteltier/Types/Lenses.hs` because it consists +mostly of the canonical invocations of +[makeLenses](http://hackage.haskell.org/package/lens-4.12.3/docs/Control-Lens-TH.html#v:makeLenses). + +> import Data.Default +> +> import Prelude hiding (sequence) +> import Data.Traversable (sequence) +> +> import Control.Lens +> +> import Control.Monad.State.Lazy hiding (sequence) -- Why is this exported? +> +> import Data.Map (Map) +> import qualified Data.Map as Map +> +> import Data.Set (Set) +> import qualified Data.Set as Set +> +> import Data.Hashable (Hashable(..), hashUsing) +> +> import Data.Monoid ((<>)) +> +> import Data.Function (on) +> import Data.Maybe (mapMaybe) +> +> import Data.BoolExpr + +Quite often we find ourselves in the position that we want to alter some small parts of a +complicated structure. We would therefore like to write the following: + +~~~ {.haskell} +updateFoo :: Foo -> Monad Foo +updateFoo x = alter x $ do + bar <~ (constructNewBar :: Monad Bar) + buz .= (makeConstantBuz :: Buz) +~~~ + +The definitions below allow us not only to do so, but also provide some convenience +functions for constructing entirely new values and performing both operations in a pure +context. + +> alter :: Monad m => s -> StateT s m a -> m s +> -- ^ Alter a complex structure monodically +> alter = flip execStateT +> +> alter' :: s -> State s a -> s +> -- ^ Specialization of 'alter' to 'Identity' +> alter' s = runIdentity . alter s +> +> construct :: (Monad m, Default s) => StateT s m a -> m s +> -- ^ Compute a complex structure monadically +> construct = alter def +> +> construct' :: Default s => State s a -> s +> -- ^ Specialization of 'construct' to 'Identity' +> construct' = runIdentity . construct + +Sometimes we just really want to translate an `ObjectGen` to an `Object`. + +> generateObject :: Monad f => ObjectGen f -> f Object +> -- ^ Run an object generator. +> -- Use iff /all/ components of an object are needed /in RAM now/. +> generateObject gen = construct $ do +> content <- lift $ gen ^. oContent >>= sequence +> thunks <- lift $ gen ^. oThunks >>= sequence +> meta <- lift $ gen ^. oMeta +> oContent .= return (fmap return content) +> oThunks .= return (fmap return thunks) +> oMeta .= return meta +> +> liftGen :: Monad f => Object -> ObjectGen f +> -- ^ Lift an 'Object' to be an 'ObjectGen' in any 'Monad' by the power of 'return' +> liftGen obj = construct' $ do +> oContent .= return (Map.map return $ obj ^. oContent') +> oThunks .= return (map return $ obj ^. oThunks') +> oMeta .= return (obj ^. oMeta') + +We expect implementations of `insert` to perform what we call nubbing. That is removal of +`Object`s that are, in some sense, `Equivalent` to the new one we´re currently +inserting. Thus we provide a definition of what we mean, when we say `Equivalent`. + +> class Equivalent a where +> (~~) :: a -> a -> Bool +> +> -- | Two 'Object's are equivalent iff their content is identical as follows: +> -- the set of 'SubObjectName's both promised and actually occurring is identical +> -- and all 'SubObject's that actually occurr and share a 'SubObjectName' are +> -- identical (as per '(==)') +> -- +> -- Additionally we expect their 'Metadata' to be identical (as per '(==)') +> instance Equivalent Object where +> a ~~ b = (contentCompare `on` content) a b && ((==) `on` (^. oMeta')) a b +> where +> contentCompare :: (Ord k, Eq v) => Map k (Maybe v) -> Map k (Maybe v) -> Bool +> contentCompare a b = Map.foldl (&&) True $ Map.mergeWithKey combine setFalse setFalse a b +> combine _ a b = Just $ cmpMaybes a b +> setFalse = Map.map $ const False +> +> cmpMaybes Nothing _ = True +> cmpMaybes _ Nothing = True +> cmpMaybes (Just a) (Just b) = a == b + +To speed up nubbing we also provide a quick way to "cache results". To make caching +meaningful we of course expect the following to hold: + +~~~ +a ~~ b ⇒ (hash a) == (hash b) +~~~ + +Note that we do not expect the converse to hold. We will thus require a second pass over +all objects sharing a hash to determine true equivalency. + +> -- | Two 'Object's´ hashes are a first indication of whether they are 'Equivalent' +> instance Hashable Object where +> hashWithSalt = hashUsing $ \a -> (a ^. oMeta', Map.keys $ content a) +> +> instance Hashable MetaData where +> hashWithSalt = hashUsing $ Set.toList . (^. mTags) +> +> content :: Object -> Map SubObjectName (Maybe SubObject) +> content obj = promised obj <> actual obj +> actual :: Object -> Map SubObjectName (Maybe SubObject) +> actual = fmap Just . (^. oContent') +> promised :: Object -> Map SubObjectName (Maybe SubObject) +> promised = Map.fromList . map (\n -> (n, Nothing)) . concat . promises +> promises :: Object -> [[SubObjectName]] +> promises = mapMaybe (^. tPromises) . (^. oThunks') + +Evaluating a `SearchQuery` against an `ObjectGen` is, due to the structure of elementary +`SearchQuery`s quite straightforward. + +> runQuery :: Monad f => SearchQuery f -> ObjectGen f -> f Bool +> -- ^ Run a 'SearchQuery' against an 'ObjectGen' +> runQuery query obj = liftM reduceBoolExpr $ sequence $ fmap ($ obj) query +> +> -- runExpr :: Monad f => ObjectGen f -> Predicate f -> f Bool +> -- -- ^ Run a 'Predicate' (»an atomic 'SearchQuery'«) against an 'ObjectGen' +> -- runExpr obj (Prim f) = f obj +> -- runExpr obj (Meta f) = liftM f (obj ^. oMeta) diff --git a/provider/posts/beuteltier/3.md b/provider/posts/beuteltier/3.md new file mode 100644 index 0000000..9b699c1 --- /dev/null +++ b/provider/posts/beuteltier/3.md @@ -0,0 +1,19 @@ +--- +title: An Update to the Type Level Utilities of an Overly Complicated Feedreader +published: 2015-08-12 +tags: Beuteltier +--- + +I commited a change to `beuteltier/Beuteltier/Types/Util.hs` ("[…] Hashable update"). I +replaced the `Hashable` instances for `Object` and `MetaData` with a single, better +optimized function: + +~~~ {.haskell} +objHash :: Applicative f => ObjectGen f -> f Int +-- ^ Two 'ObjectGen's hashes are a first indication of whether they are 'Equivalent' +objHash o = fmap hash $ (,) <$> (Set.toList . (^. mTags) <$> o ^. oMeta) <*> (Map.keys <$> o ^. oContent) +~~~ + +The new implementation allows computation of hashes without calling `generateObject` (that +function is evil — it makes sure the entire `Object` is "in RAM" (it isn´t actually, of +course (because haskell is lazy)—but I have no guarantee of that)). diff --git a/provider/posts/beuteltier/4.lhs b/provider/posts/beuteltier/4.lhs new file mode 100644 index 0000000..478cbac --- /dev/null +++ b/provider/posts/beuteltier/4.lhs @@ -0,0 +1,220 @@ +--- +title: Business Logic for an Overly Complicated Feedreader +published: 2015-08-12 +tags: Beuteltier +--- + +It turns out I don´t have to write much in the way of comments—the source file is already +quite well documented. + +> {-# LANGUAGE KindSignatures #-} +> +> module Beuteltier +> ( -- * Forcing (Executing) 'Thunk's +> forceAllThunks +> , WithObjects +> , forceThunk +> , resetAllThunks +> , resetThunk +> -- * Higher level interactions with a 'Beutel' +> , replace +> , eqTo +> , update +> , forceAllThunks' +> , resetAllThunks' +> , module Beuteltier.Util +> , module Beuteltier.Types.Common +> ) where +> +> import Beuteltier.Types.Common +> import Beuteltier.Util +> +> import Data.Map (Map) +> import qualified Data.Map as Map +> +> import Control.Lens +> +> import qualified Data.ByteString.Lazy as LBS +> import qualified Data.ByteString as BS +> +> import qualified Data.Text as T +> import qualified Data.Text.IO as T +> import qualified Data.Text.Lazy as TL +> +> import System.FilePath +> import System.Directory +> import System.Posix.Files +> import System.Posix.Temp +> +> import System.Environment +> import System.IO +> import GHC.IO.Handle +> import System.Process as P +> import System.Exit +> +> import Control.Concurrent +> import Control.Applicative +> import Control.Monad.Morph +> import Control.Monad.Trans.State +> import Control.Monad.Writer +> import Control.Monad.Trans.Resource +> import Control.Monad (liftM) +> +> import Data.Time.Clock +> +> import Data.BoolExpr + +The distinguishing feature of our Overly Complicated Feedreader™ is it´s support for +`Thunk`s. It is thus reasonable to expect, that we have some functions to actually +interact with them. Most striking in that hypothetical set of functions would be one that +executes all thunks associated with a single `ObjectGen` and return a new one with the +content generated by `Thunk`s filled in. + +Enter `forceAllThunks`. + +> forceAllThunks :: (MonadIO f, MonadResource f) +> => (Thunk -> Bool) -- ^ Select which 'Thunk's to force +> -> ObjectGen f -> f (ObjectGen f) +> -- ^ Force all thunks in place and update '_oContent' +> -- +> -- The reason we require a 'MonadResource' instance is that we would like to store our expensive to hold in RAM 'SubObject' contents in temporary files. +> forceAllThunks pred = flip alter $ do +> pureThunks <- use oThunks >>= lift >>= mapM lift +> (newThunks, objectResults) <- mapAndUnzipM forceThunk' pureThunks +> assign oThunks $ return $ map return newThunks +> oContent %= liftM (<> mconcat objectResults) +> where +> forceThunk' thunk +> | pred thunk = forceThunk thunk +> | otherwise = return (thunk, Map.empty) + +`forceAllThunks'` (`resetAllThunks'` below, too) is tainted by the evil of +`generateObject` but included for convenience. + +> forceAllThunks' :: (MonadIO f, MonadResource f) => (Thunk -> Bool) -> StateT Object f () +> -- ^ Version of 'forceAllThunks' suitable for use with 'update' +> forceAllThunks' pred = get >>= lift . forceAllThunks pred . liftGen >>= lift . generateObject >>= put +> +> -- | Internal helper to track computations creating 'SubObject's +> type WithObjects (n :: * -> *) (m :: * -> *) = WriterT (Map SubObjectName (n SubObject)) m +> +> forceThunk_ :: (MonadIO m, MonadResource m) => Thunk -> WithObjects (ResourceT IO) m Thunk +> -- ^ Make sure the 'ThunkState' contained within a 'Thunk' is 'Executed' +> forceThunk_ = flip alter $ do +> -- (result, subObjects) <- liftIO $ runWriterT $ parseThunk thunk +> -- writer ((), subObjects) +> (result, subObjects) <- listen . hoist (hoist liftResourceT) . lift . parseThunk =<< get +> tState .= Executed (Map.keys subObjects) result +> +> forceThunk :: (MonadResource m, MonadResource n) => Thunk -> m (Thunk, Map SubObjectName (n SubObject)) +> -- ^ Force a 'Thunk' and return it in 'Executed' state together with the 'SubObject's it created during execution +> forceThunk thunk = liftM (_2 %~ fmap liftResourceT) $ (runWriterT . forceThunk_) thunk + +Quite often we want to undue the harm done by `forceAllThunks` (to save space, usually). + +> resetAllThunks :: Monad f +> => (Thunk -> Bool) -- ^ Select which 'Thunk's to reset +> -> ObjectGen f -> f (ObjectGen f) +> -- ^ Undoes 'forceAllThunks': +> -- +> -- prop> forceAllThunks (const True) obj >>= resetAllThunks (const True) >>= forceAllThunks (const True) = forceAllThunks (const True) obj +> -- +> -- This inevitably drops information ('ThunkResult's for one). +> -- +> -- In the case where 'forceAllThunks' does not drop information (i.e.: no 'SubObjectName' collisions ocurr) the following, stronger property holds: +> -- +> -- prop> forceAllThunks (const True) obj >>= resetAllThunks (const True) = return obj +> resetAllThunks pred = flip alter $ do +> thunks <- liftM (map lift) (use oThunks >>= lift) >>= sequence +> let +> (subObjectNames, newThunks) = over _1 concat $ unzip $ map resetThunk' thunks +> oThunks .= return (map return newThunks) +> oContent %= (>>= return . Map.filterWithKey (\k _ -> k `notElem` subObjectNames)) +> where +> resetThunk' thunk +> | pred thunk = resetThunk thunk +> | otherwise = ([], thunk) +> +> resetAllThunks' :: (MonadIO f, MonadResource f) => (Thunk -> Bool) -> StateT Object f () +> -- ^ Version of 'resetAllThunks' suitable for use with 'update' +> resetAllThunks' pred = get >>= lift . resetAllThunks pred . liftGen >>= lift . generateObject >>= put +> +> resetThunk :: Thunk -> ([SubObjectName], Thunk) +> -- ^ Reset a thunk and return the 'SubObjectName's of the 'SubObject's it once created. +> -- This forgets information. +> resetThunk thunk = case thunk ^. tState of +> NotExecuted -> ([], thunk) +> Executed created _ -> (created, set tState NotExecuted thunk) +> +> parseThunk :: Thunk -> WithObjects (ResourceT IO) (ResourceT IO) ThunkResult +> -- ^ Generate a runnable action from a 'Thunk' +> -- +> -- Regarding the "inner" and "outer" 'Monad' here being 'IO': We have not, at time of forcing, a neccessary connection to our backstore and thus cannot expect the monads to be anything else. +> parseThunk thunk = do +> tmpDirName <- liftIO getTemporaryDirectory +> progName <- liftIO getProgName +> let +> tmpDirName' = tmpDirName progName +> (_, tmpDir) <- allocate (mkdtemp tmpDirName') removeDirectoryRecursive +> let exec = tmpDir "exec" +> out = tmpDir "out" +> result <- liftIO $ do +> createDirectory out +> LBS.writeFile exec script +> setFileMode exec $ foldl unionFileModes nullFileMode [ownerReadMode, ownerExecuteMode] +> (Just std_in, Just std_out, Just std_err, ph) <- createProcess $ (P.proc exec []) { cwd = Just out, std_in = CreatePipe, std_out = CreatePipe, std_err = CreatePipe } +> hClose std_in +> hSetBinaryMode std_out True +> hSetBinaryMode std_err True +> std_out `sendTo` stdout +> std_err `sendTo` stderr +> construct $ do +> rOutStd <~ TL.fromStrict <$> liftIO (T.hGetContents std_out) -- Yes, sadly we have to be strict here +> rOutErr <~ TL.fromStrict <$> liftIO (T.hGetContents std_err) +> rExit <~ toNum <$> liftIO (waitForProcess ph) +> outputFiles <- liftIO $ getDirectoryContents out +> let +> outputFiles' = filter fileFilter outputFiles +> fileFilter = and . (<*>) [(/=) ".", (/=) ".."] . pure . takeFileName +> mapM_ tell =<< mapM (liftResourceT . toSubObject) outputFiles' +> return result +> where +> script = thunk ^. tScript +> toSubObject :: FilePath -> ResIO (Map SubObjectName (ResIO SubObject)) +> -- ^ Using 'ResourceT' provides us with the guarantee, that the 'FilePath' we´re referring to should still exist when we actually try to get the 'SubObject'´s contents +> toSubObject name = fmap (Map.singleton name' . return) $ construct $ do +> sContent <~ liftIO (LBS.readFile name) +> sUpdates <~ pure <$> liftIO getCurrentTime +> where +> name' = takeFileName name +> sendTo input output = do +> input' <- hDuplicate input +> forkIO $ do +> hSetBuffering input' NoBuffering +> LBS.hGetContents input' >>= LBS.hPutStr output +> return () +> toNum :: Num a => ExitCode -> a +> toNum ExitSuccess = 0 +> toNum (ExitFailure i) = fromInteger $ toInteger i + +We provide `update`, a convenience function for high-level interactions (though costly on +large sets of equivalent objects (which should not exist due to nubbing)). + +> eqTo :: Monad f => Object -> SearchQuery f +> -- ^ @eqTo o@ constructs a 'SearchQuery' that matches all 'Object's 'Equivalent' to @o@ +> -- +> -- This is costly because it calls 'generateObject' on the contents of the entire 'Beutel'. +> eqTo o = BConst ((>>= return . (~~) o) . generateObject) +> +> update :: Beutel f => SearchQuery f -> StateT Object f a -> f () +> -- ^ @update search action@ replaces /all/ 'Object's matching @search@ within the 'Beutel' by versions of themselves modified by applying @action@. +> -- +> -- Does not handle '_sUpdates'. +> -- +> -- This is costly because it calls 'generateObject' on the contents of the entire 'Beutel' /and/ all results of @search@ (in order to use 'Eq' on 'Object's to delete the results of the initial 'search'). +> update query alteration = do +> matches <- search query +> matches' <- mapM generateObject matches +> delete $ BConst ((>>= return . (`elem` matches')) . generateObject) +> mapM_ (\o -> alter o alteration >>= insert) matches' +> return () diff --git a/provider/posts/blog-documentation.md b/provider/posts/blog-documentation.md deleted file mode 100644 index b0d4af6..0000000 --- a/provider/posts/blog-documentation.md +++ /dev/null @@ -1,9 +0,0 @@ ---- -title: On the Origin of dirty-haskell.org -published: 2015-03-12 -tags: Blog Software ---- - -The software used is a trivially modified version of the one powering [math.kleen.org](http://math.kleen.org/lists/blog.html). - -The title is without deeper meaning. diff --git a/provider/posts/blog-rss.md b/provider/posts/blog-rss.md deleted file mode 100644 index 095ff56..0000000 --- a/provider/posts/blog-rss.md +++ /dev/null @@ -1,43 +0,0 @@ ---- -title: dirty-haskell.org´s rss feeds -published: 2015-03-29 -tags: Blog Software ---- - -I extended the software suite inherited from [math.kleen.org](http://math.kleen.org) to include support for rss feeds. -The heart of the issue is a ~80 line haskell script I chose to call, in a bout of creativity, "generate-rss.hs". -The script uses the [feed](http://hackage.haskell.org/package/feed-0.3.9.2) package. - -generate-rss.hs gets passed a title and a list of paths below ./lists to incorporate as items. -It generates an empty feed structure, adds title and a (hardcoded) base url for RSS metadata, and iterates over the given paths — generating for each path an item to be included in the finished feed. -This procedure makes use of a state monad (StateT (Feed, Maybe ClockTime) IO ()) to sequentially add items to the feed and keep track of the modification/change time of the newest path examined. -Each item carries a title, an url, a date, and contents as follows: - -- The date used is the modification/change time of the path supplied as a command line argument at the beginning of the program (usually a symbolic link in ./lists) — as such it is the time the post was linked into the particular list we´re generating a RSS feed for (this was not a deliberate design choice but a side effect of the canonical implementation — it was later decided that this behaviour was in fact the one expected all along). -- The url is generated by following, recursively, the trail of symbolic links starting in ./lists, assuming the final target is indeed in ./posts, and forming the filename of that target into a (hopefully) functional url in a hardcoded fashion. -- The title is extracted from the markdown file using a function shamelessly copied from extract-title.hs (The author wrote that one too, after all). -- The contents are read into Pandoc and rendered into [AsciiDoc](http://en.wikipedia.org/wiki/AsciiDoc) format (it seemed convenient at the time). - -Along the way two helper functions were introduced — if an implementation of those already exists in Prelude or somewhere else common please mail in a comment: - -~~~ {.haskell} -(<->) :: [(a -> b)] -> a -> [b] -[] <-> _ = [] -(f:fs) <-> x = (f x:fs <-> x) - -(<-->) :: [(a -> a)] -> a -> a -[] <--> x = x -(f:fs) <--> x = fs <--> (f x) -~~~ - -## Update ## - -~~~ {.haskell} -import Control.Applicative ((<*>), pure) - -(<->) fs = (<*>) fs . pure - -(<-->) = flip $ foldl (.) id -~~~ - -Thanks, viktor. diff --git a/provider/posts/blog/hakyll.md b/provider/posts/blog/hakyll.md new file mode 100644 index 0000000..be3bc1b --- /dev/null +++ b/provider/posts/blog/hakyll.md @@ -0,0 +1,131 @@ +--- +title: Switch to Hakyll +published: 2015-08-03 +tags: Blog Software +--- + +I stopped using the software suite inherited from +[math.kleen.org](http://math.kleen.org) and switched over to using +[hakyll](http://jaspervdj.be/hakyll/) instead, since I realised that the two +were doing essentially the same job and keeping my mess in one haskell file +(`src/Site.hs`, for those of you who are willing to checkout the +[git repo](https://git.yggdrasil.li/gkleen/pub/dirty-haskell.org)) instead of spread over a +large number of interlocking zsh and haskell scripts. + +I expect nothing to be seriously broken (Only the filepaths of lists have +changed), but some feed readers might have stopped working (hakyll´s +deceptively named `renderRss` actually renders atom). + +## Implementation Details + +I´m using this post to document some of the more involved things I had to do +during migration in no particular order. + +### Lists → Tags + +I´m using hakyll´s implementation of +[tags](http://hackage.haskell.org/package/hakyll-4.7.2.2/docs/Hakyll-Web-Tags.html) +instead of the [math.kleen.org](http://math.kleen.org) concept of lists, now. + +This required some tweaking. + +In order to retain the [All Posts](/tags/all-posts.html) list I introduced a +function to add new tags to an already existing +[Tags](http://hackage.haskell.org/package/hakyll-4.7.2.2/docs/Hakyll-Web-Tags.html#t:Tags) +structure and used it to add my desired pseudo-tag. + +~~~ {.haskell} +main = hakyllWith config $ do + … + tags <- buildTags "posts/*" tagTranslation' >>= addTag "All Posts" "posts/*" + … + +addTag :: MonadMetadata m => String -> Pattern -> Tags -> m Tags +addTag name pattern tags = do + ids <- getMatches pattern + return $ tags { tagsMap = tagsMap tags ++ [(name, ids)] } +~~~ + +### Printing lists is an involved affair + +I wanted to keep the layout of the site including the lists of posts on the +[index page](/). + +Generating those lists turned out to be a hassle. + +The `Rule` for `index.md` adds to the context of the templates used in it´s +creation a list field which contains verbatim HTML as produced by renderTag for +each tag. +A trick I used to implement the desired behaviour of replacing old posts with +"…" is to introduce a pseudo post-item which has a flag in it´s context to tell +the corresponding template to only print "…". +Trimming the list of posts is straightforward. + +~~~ {.haskell} +renderTag :: String -- ^ Tag name + -> Tags + -> Compiler (Item String) +renderTag tag tags = do + ellipsisItem <- makeItem "" + let + ids = fromMaybe [] $ lookup tag $ tagsMap tags + postCtx = mconcat [ listField "posts" (ellipsisContext ellipsisItem) $ + liftM (withEllipsis ellipsisItem) $ chronological =<< mapM load ids + , constField "title" tag + , constField "rss" ("tags/" ++ tagTranslation tag ++ ".rss") + , constField "url" ("tags/" ++ tagTranslation tag ++ ".html") + , defaultContext + ] + makeItem "" + >>= loadAndApplyTemplate "templates/post-list.html" postCtx + >>= loadAndApplyTemplate "templates/tag.html" postCtx + where + ellipsisContext item = mconcat [ boolField "ellipsis" (== item) + , defaultContext + ] + boolField name f = field name (\i -> if f i + then pure (error $ unwords ["no string value for bool field:",name]) + else empty) + withEllipsis ellipsisItem xs + | length xs > max = [ellipsisItem] ++ takeEnd (max - 1) xs + | otherwise = xs + takeEnd i = reverse . take i . reverse + max = 4 +~~~ + +### Everything needs a [Rule](http://hackage.haskell.org/package/hakyll-4.7.2.2/docs/Hakyll-Core-Rules.html#t:Rules) + +I was stumped for a while when my templates wouldn´t +[load](http://hackage.haskell.org/package/hakyll-4.7.2.2/docs/Hakyll-Web-Template.html#v:loadAndApplyTemplate). + +This was easily rectified by realising, that even templates need (of course) a +declaration of how to compile them: + +~~~ {.haskell} +main = hakyllWith config $ do + match "templates/*" $ compile templateCompiler + … +~~~ + +### Duplicate Rules are duplicate + +Hakyll tracks dependencies. +Therefore it seems to keep a list of +[Identifier](http://hackage.haskell.org/package/hakyll-4.7.2.2/docs/Hakyll-Core-Identifier.html#t:Identifier)s +it has encountered with priority given to the more early ones. + +It was thus necessary to tweak the function that does `Identifier`/`String` +conversion for tags contained within a +[Tags](http://hackage.haskell.org/package/hakyll-4.7.2.2/docs/Hakyll-Web-Tags.html#v:Tags) +structure if I wanted to use (the very convenient) +[tagsRules](http://hackage.haskell.org/package/hakyll-4.7.2.2/docs/Hakyll-Web-Tags.html#v:tagsRules) +twice. + +So I did: + +~~~ {.haskell} +main = hakyllWith config $ do + tags <- buildTags "posts/*" tagTranslation' … + let + tags' = tags { tagsMakeId = fromFilePath . (\b -> "rss" b <.> "rss") . takeBaseName . toFilePath . tagsMakeId tags} +~~~ diff --git a/provider/posts/blog/origin.md b/provider/posts/blog/origin.md new file mode 100644 index 0000000..b0d4af6 --- /dev/null +++ b/provider/posts/blog/origin.md @@ -0,0 +1,9 @@ +--- +title: On the Origin of dirty-haskell.org +published: 2015-03-12 +tags: Blog Software +--- + +The software used is a trivially modified version of the one powering [math.kleen.org](http://math.kleen.org/lists/blog.html). + +The title is without deeper meaning. diff --git a/provider/posts/blog/rss.md b/provider/posts/blog/rss.md new file mode 100644 index 0000000..095ff56 --- /dev/null +++ b/provider/posts/blog/rss.md @@ -0,0 +1,43 @@ +--- +title: dirty-haskell.org´s rss feeds +published: 2015-03-29 +tags: Blog Software +--- + +I extended the software suite inherited from [math.kleen.org](http://math.kleen.org) to include support for rss feeds. +The heart of the issue is a ~80 line haskell script I chose to call, in a bout of creativity, "generate-rss.hs". +The script uses the [feed](http://hackage.haskell.org/package/feed-0.3.9.2) package. + +generate-rss.hs gets passed a title and a list of paths below ./lists to incorporate as items. +It generates an empty feed structure, adds title and a (hardcoded) base url for RSS metadata, and iterates over the given paths — generating for each path an item to be included in the finished feed. +This procedure makes use of a state monad (StateT (Feed, Maybe ClockTime) IO ()) to sequentially add items to the feed and keep track of the modification/change time of the newest path examined. +Each item carries a title, an url, a date, and contents as follows: + +- The date used is the modification/change time of the path supplied as a command line argument at the beginning of the program (usually a symbolic link in ./lists) — as such it is the time the post was linked into the particular list we´re generating a RSS feed for (this was not a deliberate design choice but a side effect of the canonical implementation — it was later decided that this behaviour was in fact the one expected all along). +- The url is generated by following, recursively, the trail of symbolic links starting in ./lists, assuming the final target is indeed in ./posts, and forming the filename of that target into a (hopefully) functional url in a hardcoded fashion. +- The title is extracted from the markdown file using a function shamelessly copied from extract-title.hs (The author wrote that one too, after all). +- The contents are read into Pandoc and rendered into [AsciiDoc](http://en.wikipedia.org/wiki/AsciiDoc) format (it seemed convenient at the time). + +Along the way two helper functions were introduced — if an implementation of those already exists in Prelude or somewhere else common please mail in a comment: + +~~~ {.haskell} +(<->) :: [(a -> b)] -> a -> [b] +[] <-> _ = [] +(f:fs) <-> x = (f x:fs <-> x) + +(<-->) :: [(a -> a)] -> a -> a +[] <--> x = x +(f:fs) <--> x = fs <--> (f x) +~~~ + +## Update ## + +~~~ {.haskell} +import Control.Applicative ((<*>), pure) + +(<->) fs = (<*>) fs . pure + +(<-->) = flip $ foldl (.) id +~~~ + +Thanks, viktor. diff --git a/provider/posts/blog/tex-support.md b/provider/posts/blog/tex-support.md new file mode 100644 index 0000000..7f43eb6 --- /dev/null +++ b/provider/posts/blog/tex-support.md @@ -0,0 +1,243 @@ +--- +title: Cursory Math-Support +published: 2015-11-05 +tags: Blog Software +--- + +## Demonstration + +I added some cursory support for math as shown below: + +
+ +Formulae are rendered with $\text{\LaTeX}$ and included as [SVG](https://en.wikipedia.org/wiki/SVG). + +
+$$e^{ix} =\text{cos}(x) + i \text{sin} (x)$$ +
+
+ +Inline formulae get correctly aligned to match the baseline of the surrounding text. + +
+$\text{tan}(\phi) = \frac{\text{sin}(\phi)}{\text{cos}(\phi)}$ +
+
+
+ +## Implementation + +Theorem environments are written using [pandoc](http://pandoc.org)s support for block environments: + +~~~ {.markdown} +
+ +Formulae are rendered with $\text{\LaTeX}$ and included as [SVG](https://en.wikipedia.org/wiki/SVG). + +
+$$e^{ix} =\text{cos}(x) + i \text{sin} (x)$$ +
+
+ +Inline formulae get correctly aligned to match the baseline of the surrounding text. + +
+$\text{tan}(\phi) = \frac{\text{sin}(\phi)}{\text{cos}(\phi)}$ +
+
+
+~~~ + +Combined with a smattering of CSS this works nicely. +$\text{\LaTeX}$ support is, however, lacking as I opted not to patch pandoc ([math.kleen.org](https://math.kleen.org) did). + +### `Math.hs` + +The actual compilation happens in a new module I named `Math.hs`. We´ll start there. +For your reading pleasure I added some comments to the reproduction below. + +~~~ {.haskell} +module Math + ( compileMath + ) where + +import System.IO (stdout, stderr, hPutStrLn, writeFile, readFile) +import System.IO.Temp (withSystemTempDirectory) +import System.Process (callProcess, readProcessWithExitCode) +import System.Directory (copyFile, getCurrentDirectory, setCurrentDirectory) +import System.FilePath (takeFileName, FilePath(..), ()) +import System.Exit (ExitCode(..)) + +import Control.Monad (when) +import Control.Exception (bracket, throwIO) +import Data.Maybe (fromMaybe, listToMaybe) + +import Control.Monad.Writer.Strict (WriterT(..), execWriterT, tell) +import Control.Monad.Trans (liftIO) + +import Control.DeepSeq (($!!)) + +import Text.Regex.TDFA ((=~)) + +-- We built a monoid instance for `ExitCode` so we can easily collect failure using a `MonadWriter` +instance Monoid ExitCode where + mempty = ExitSuccess + (ExitFailure a) `mappend` _ = ExitFailure a + ExitSuccess `mappend` x@(ExitFailure _) = x + ExitSuccess `mappend` ExitSuccess = ExitSuccess + + +compileMath :: String -> IO (String, String) +compileMath = withSystemTempDirectory "math" . compileMath' -- Create a temporary directory, run `compileMath'`, and make sure the directory get's deleted + +compileMath' :: String -> FilePath -> IO (String, String) +compileMath' input tmpDir = do + mapM_ (copyToTmp . ("tex" )) [ "preamble.tex" + , "preview.dtx" + , "preview.ins" + ] + (exitCode, out, err) <- withCurrentDirectory tmpDir $ execWriterT $ do -- Collect stdout, stderr, and exitCode of all subprocesses (stdout and stderr simply get appended to one another) + run "latex" [ "-interaction=batchmode" + , "preview.ins" + ] "" + liftIO $ writeFile (tmpDir "image.tex") input + run "latex" [ "-interaction=batchmode" + , "image.tex" + ] "" + run "dvisvgm" [ "--exact" + , "--no-fonts" + , tmpDir "image.dvi" + ] "" + when (exitCode /= ExitSuccess) $ do -- Fail with maximum noise if any of the latex passes fail -- otherwise be silent + hPutStrLn stdout out + hPutStrLn stderr err + throwIO exitCode + (\x -> return $!! (x, extractAlignment err)) =<< (readFile $ tmpDir "image.svg") -- Note the call to `($!!)` -- since we'll be deleting `tmpDir` we need to make sure the entire generated output resides in memory before we leave this block + where + copyToTmp fp = copyFile fp (tmpDir takeFileName fp) + run :: String -> [String] -> String -> WriterT (ExitCode, String, String) IO () + run bin args stdin = tell =<< liftIO (readProcessWithExitCode bin args stdin) + +withCurrentDirectory :: FilePath -- ^ Directory to execute in + -> IO a -- ^ Action to be executed + -> IO a +-- ^ This is provided in newer versions of temporary +withCurrentDirectory dir action = + bracket getCurrentDirectory setCurrentDirectory $ \ _ -> do + setCurrentDirectory dir + action + +extractAlignment :: String -> String +extractAlignment = fromMaybe "0pt" . extract . (=~ "depth=([^\\s]+)") -- One of the few places where regular expressions really prove usefull + where + extract :: (String, String, String, [String]) -> Maybe String + extract (_, _, _, xs) = listToMaybe xs +~~~ + +### `Site.hs` + +The more trick part proved to be integration into the framework as provided by [Hakyll](http://jaspervdj.be/hakyll/). + +~~~ {.haskell} +… + +import qualified Crypto.Hash.SHA256 as SHA256 (hash) +import qualified Data.ByteString.Char8 as CBS +import Data.Hex (hex) +import Data.Char (toLower) + +import Math (compileMath) +import Text.Printf (printf) + +main :: IO () +main = hakyllWith config $ do + … + + math <- getMath "posts/*" mathTranslation' + forM_ math $ \(_, mathStr) -> create [mathTranslation' mathStr] $ do + route idRoute + compile $ do + item <- makeItem mathStr + >>= loadAndApplyTemplate "templates/math.tex" defaultContext + >>= withItemBody (unsafeCompiler . compileMath) -- unsafeCompiler :: IO a -> Compiler a + saveSnapshot "alignment" $ fmap snd item + return $ fmap fst item + + match "posts/*" $ do + route $ setExtension ".html" + compile $ do + getResourceBody >>= saveSnapshot "content" + pandocCompilerWithTransformM defaultHakyllReaderOptions defaultHakyllWriterOptions mathTransform -- pandocCompilerWithTransformM :: ReaderOptions -> WriterOptions -> (Pandoc -> Compiler Pandoc) -> Item String + >>= loadAndApplyTemplate "templates/default.html" defaultContext + >>= relativizeUrls + … + +… + +mathTranslation' :: String -> Identifier +-- ^ This generates the filename for a svg file given the TeX-source +mathTranslation' = fromCapture "math/*.svg" . map toLower . CBS.unpack . hex . SHA256.hash . CBS.pack + +getMath :: Pattern -> (String -> Identifier) -> Rules [([Identifier], String)] +-- ^ We scrape all posts for math, calls `readPandoc'` +getMath pattern makeId = do + ids <- getMatches pattern + mathStrs <- concat `liftM` mapM (\id -> map ((,) [id]) `liftM` getMath' (toFilePath' id)) ids + return $ mergeGroups $ groupBy ((==) `on` snd) $ mathStrs + where + getMath' :: FilePath -> Rules [String] + getMath' path = preprocess (query extractMath `liftM` readPandoc' path) + extractMath :: Inline -> [String] + extractMath (Math _ str) = [str] + extractMath _ = [] + mergeGroups :: [[([Identifier], String)]] -> [([Identifier], String)] + mergeGroups = map mergeGroups' . filter (not . null) + mergeGroups' :: [([Identifier], String)] -> ([Identifier], String) + mergeGroups' xs@((_, str):_) = (concatMap fst xs, str) + +readPandoc' :: FilePath -> IO Pandoc +-- ^ This is copied, almost verbatim, from Hakyll source -- Does what it says on the tin +readPandoc' path = readFile path >>= either fail return . result' + where + result' str = case result str of + Left (ParseFailure err) -> Left $ + "parse failed: " ++ err + Left (ParsecError _ err) -> Left $ + "parse failed: " ++ show err + Right item' -> Right item' + result str = reader defaultHakyllReaderOptions (fileType path) str + reader ro t = case t of + DocBook -> readDocBook ro + Html -> readHtml ro + LaTeX -> readLaTeX ro + LiterateHaskell t' -> reader (addExt ro Ext_literate_haskell) t' + Markdown -> readMarkdown ro + MediaWiki -> readMediaWiki ro + OrgMode -> readOrg ro + Rst -> readRST ro + Textile -> readTextile ro + _ -> error $ + "I don't know how to read a file of " ++ + "the type " ++ show t ++ " for: " ++ path + + addExt ro e = ro {readerExtensions = Set.insert e $ readerExtensions ro} + +mathTransform :: Pandoc -> Compiler Pandoc +-- ^ We replace math by raw html includes of the respective svg files here +mathTransform = walkM mathTransform' + where + mathTransform' :: Inline -> Compiler Inline + mathTransform' (Math mathType tex) = do + alignment <- loadSnapshotBody texId "alignment" + let + html = printf "%s" + (toFilePath texId) (alignment :: String) tex + return $ Span ("", [classOf mathType], []) [RawInline (Format "html") html] + where + texId = mathTranslation' tex + classOf DisplayMath = "display-math" + classOf InlineMath = "inline-math" + mathTransform' x = return x + +… +~~~ diff --git a/provider/posts/blog/ymir.md b/provider/posts/blog/ymir.md new file mode 100644 index 0000000..83e5811 --- /dev/null +++ b/provider/posts/blog/ymir.md @@ -0,0 +1,7 @@ +--- +title: Moved servers +published: 2015-11-05 +tags: Blog Software +--- + +dirty-haskell.org now lives on ymir.yggdrasil.li. diff --git a/provider/posts/hakyll.md b/provider/posts/hakyll.md deleted file mode 100644 index be3bc1b..0000000 --- a/provider/posts/hakyll.md +++ /dev/null @@ -1,131 +0,0 @@ ---- -title: Switch to Hakyll -published: 2015-08-03 -tags: Blog Software ---- - -I stopped using the software suite inherited from -[math.kleen.org](http://math.kleen.org) and switched over to using -[hakyll](http://jaspervdj.be/hakyll/) instead, since I realised that the two -were doing essentially the same job and keeping my mess in one haskell file -(`src/Site.hs`, for those of you who are willing to checkout the -[git repo](https://git.yggdrasil.li/gkleen/pub/dirty-haskell.org)) instead of spread over a -large number of interlocking zsh and haskell scripts. - -I expect nothing to be seriously broken (Only the filepaths of lists have -changed), but some feed readers might have stopped working (hakyll´s -deceptively named `renderRss` actually renders atom). - -## Implementation Details - -I´m using this post to document some of the more involved things I had to do -during migration in no particular order. - -### Lists → Tags - -I´m using hakyll´s implementation of -[tags](http://hackage.haskell.org/package/hakyll-4.7.2.2/docs/Hakyll-Web-Tags.html) -instead of the [math.kleen.org](http://math.kleen.org) concept of lists, now. - -This required some tweaking. - -In order to retain the [All Posts](/tags/all-posts.html) list I introduced a -function to add new tags to an already existing -[Tags](http://hackage.haskell.org/package/hakyll-4.7.2.2/docs/Hakyll-Web-Tags.html#t:Tags) -structure and used it to add my desired pseudo-tag. - -~~~ {.haskell} -main = hakyllWith config $ do - … - tags <- buildTags "posts/*" tagTranslation' >>= addTag "All Posts" "posts/*" - … - -addTag :: MonadMetadata m => String -> Pattern -> Tags -> m Tags -addTag name pattern tags = do - ids <- getMatches pattern - return $ tags { tagsMap = tagsMap tags ++ [(name, ids)] } -~~~ - -### Printing lists is an involved affair - -I wanted to keep the layout of the site including the lists of posts on the -[index page](/). - -Generating those lists turned out to be a hassle. - -The `Rule` for `index.md` adds to the context of the templates used in it´s -creation a list field which contains verbatim HTML as produced by renderTag for -each tag. -A trick I used to implement the desired behaviour of replacing old posts with -"…" is to introduce a pseudo post-item which has a flag in it´s context to tell -the corresponding template to only print "…". -Trimming the list of posts is straightforward. - -~~~ {.haskell} -renderTag :: String -- ^ Tag name - -> Tags - -> Compiler (Item String) -renderTag tag tags = do - ellipsisItem <- makeItem "" - let - ids = fromMaybe [] $ lookup tag $ tagsMap tags - postCtx = mconcat [ listField "posts" (ellipsisContext ellipsisItem) $ - liftM (withEllipsis ellipsisItem) $ chronological =<< mapM load ids - , constField "title" tag - , constField "rss" ("tags/" ++ tagTranslation tag ++ ".rss") - , constField "url" ("tags/" ++ tagTranslation tag ++ ".html") - , defaultContext - ] - makeItem "" - >>= loadAndApplyTemplate "templates/post-list.html" postCtx - >>= loadAndApplyTemplate "templates/tag.html" postCtx - where - ellipsisContext item = mconcat [ boolField "ellipsis" (== item) - , defaultContext - ] - boolField name f = field name (\i -> if f i - then pure (error $ unwords ["no string value for bool field:",name]) - else empty) - withEllipsis ellipsisItem xs - | length xs > max = [ellipsisItem] ++ takeEnd (max - 1) xs - | otherwise = xs - takeEnd i = reverse . take i . reverse - max = 4 -~~~ - -### Everything needs a [Rule](http://hackage.haskell.org/package/hakyll-4.7.2.2/docs/Hakyll-Core-Rules.html#t:Rules) - -I was stumped for a while when my templates wouldn´t -[load](http://hackage.haskell.org/package/hakyll-4.7.2.2/docs/Hakyll-Web-Template.html#v:loadAndApplyTemplate). - -This was easily rectified by realising, that even templates need (of course) a -declaration of how to compile them: - -~~~ {.haskell} -main = hakyllWith config $ do - match "templates/*" $ compile templateCompiler - … -~~~ - -### Duplicate Rules are duplicate - -Hakyll tracks dependencies. -Therefore it seems to keep a list of -[Identifier](http://hackage.haskell.org/package/hakyll-4.7.2.2/docs/Hakyll-Core-Identifier.html#t:Identifier)s -it has encountered with priority given to the more early ones. - -It was thus necessary to tweak the function that does `Identifier`/`String` -conversion for tags contained within a -[Tags](http://hackage.haskell.org/package/hakyll-4.7.2.2/docs/Hakyll-Web-Tags.html#v:Tags) -structure if I wanted to use (the very convenient) -[tagsRules](http://hackage.haskell.org/package/hakyll-4.7.2.2/docs/Hakyll-Web-Tags.html#v:tagsRules) -twice. - -So I did: - -~~~ {.haskell} -main = hakyllWith config $ do - tags <- buildTags "posts/*" tagTranslation' … - let - tags' = tags { tagsMakeId = fromFilePath . (\b -> "rss" b <.> "rss") . takeBaseName . toFilePath . tagsMakeId tags} -~~~ diff --git a/provider/posts/tex-support.md b/provider/posts/tex-support.md deleted file mode 100644 index 7f43eb6..0000000 --- a/provider/posts/tex-support.md +++ /dev/null @@ -1,243 +0,0 @@ ---- -title: Cursory Math-Support -published: 2015-11-05 -tags: Blog Software ---- - -## Demonstration - -I added some cursory support for math as shown below: - -
- -Formulae are rendered with $\text{\LaTeX}$ and included as [SVG](https://en.wikipedia.org/wiki/SVG). - -
-$$e^{ix} =\text{cos}(x) + i \text{sin} (x)$$ -
-
- -Inline formulae get correctly aligned to match the baseline of the surrounding text. - -
-$\text{tan}(\phi) = \frac{\text{sin}(\phi)}{\text{cos}(\phi)}$ -
-
-
- -## Implementation - -Theorem environments are written using [pandoc](http://pandoc.org)s support for block environments: - -~~~ {.markdown} -
- -Formulae are rendered with $\text{\LaTeX}$ and included as [SVG](https://en.wikipedia.org/wiki/SVG). - -
-$$e^{ix} =\text{cos}(x) + i \text{sin} (x)$$ -
-
- -Inline formulae get correctly aligned to match the baseline of the surrounding text. - -
-$\text{tan}(\phi) = \frac{\text{sin}(\phi)}{\text{cos}(\phi)}$ -
-
-
-~~~ - -Combined with a smattering of CSS this works nicely. -$\text{\LaTeX}$ support is, however, lacking as I opted not to patch pandoc ([math.kleen.org](https://math.kleen.org) did). - -### `Math.hs` - -The actual compilation happens in a new module I named `Math.hs`. We´ll start there. -For your reading pleasure I added some comments to the reproduction below. - -~~~ {.haskell} -module Math - ( compileMath - ) where - -import System.IO (stdout, stderr, hPutStrLn, writeFile, readFile) -import System.IO.Temp (withSystemTempDirectory) -import System.Process (callProcess, readProcessWithExitCode) -import System.Directory (copyFile, getCurrentDirectory, setCurrentDirectory) -import System.FilePath (takeFileName, FilePath(..), ()) -import System.Exit (ExitCode(..)) - -import Control.Monad (when) -import Control.Exception (bracket, throwIO) -import Data.Maybe (fromMaybe, listToMaybe) - -import Control.Monad.Writer.Strict (WriterT(..), execWriterT, tell) -import Control.Monad.Trans (liftIO) - -import Control.DeepSeq (($!!)) - -import Text.Regex.TDFA ((=~)) - --- We built a monoid instance for `ExitCode` so we can easily collect failure using a `MonadWriter` -instance Monoid ExitCode where - mempty = ExitSuccess - (ExitFailure a) `mappend` _ = ExitFailure a - ExitSuccess `mappend` x@(ExitFailure _) = x - ExitSuccess `mappend` ExitSuccess = ExitSuccess - - -compileMath :: String -> IO (String, String) -compileMath = withSystemTempDirectory "math" . compileMath' -- Create a temporary directory, run `compileMath'`, and make sure the directory get's deleted - -compileMath' :: String -> FilePath -> IO (String, String) -compileMath' input tmpDir = do - mapM_ (copyToTmp . ("tex" )) [ "preamble.tex" - , "preview.dtx" - , "preview.ins" - ] - (exitCode, out, err) <- withCurrentDirectory tmpDir $ execWriterT $ do -- Collect stdout, stderr, and exitCode of all subprocesses (stdout and stderr simply get appended to one another) - run "latex" [ "-interaction=batchmode" - , "preview.ins" - ] "" - liftIO $ writeFile (tmpDir "image.tex") input - run "latex" [ "-interaction=batchmode" - , "image.tex" - ] "" - run "dvisvgm" [ "--exact" - , "--no-fonts" - , tmpDir "image.dvi" - ] "" - when (exitCode /= ExitSuccess) $ do -- Fail with maximum noise if any of the latex passes fail -- otherwise be silent - hPutStrLn stdout out - hPutStrLn stderr err - throwIO exitCode - (\x -> return $!! (x, extractAlignment err)) =<< (readFile $ tmpDir "image.svg") -- Note the call to `($!!)` -- since we'll be deleting `tmpDir` we need to make sure the entire generated output resides in memory before we leave this block - where - copyToTmp fp = copyFile fp (tmpDir takeFileName fp) - run :: String -> [String] -> String -> WriterT (ExitCode, String, String) IO () - run bin args stdin = tell =<< liftIO (readProcessWithExitCode bin args stdin) - -withCurrentDirectory :: FilePath -- ^ Directory to execute in - -> IO a -- ^ Action to be executed - -> IO a --- ^ This is provided in newer versions of temporary -withCurrentDirectory dir action = - bracket getCurrentDirectory setCurrentDirectory $ \ _ -> do - setCurrentDirectory dir - action - -extractAlignment :: String -> String -extractAlignment = fromMaybe "0pt" . extract . (=~ "depth=([^\\s]+)") -- One of the few places where regular expressions really prove usefull - where - extract :: (String, String, String, [String]) -> Maybe String - extract (_, _, _, xs) = listToMaybe xs -~~~ - -### `Site.hs` - -The more trick part proved to be integration into the framework as provided by [Hakyll](http://jaspervdj.be/hakyll/). - -~~~ {.haskell} -… - -import qualified Crypto.Hash.SHA256 as SHA256 (hash) -import qualified Data.ByteString.Char8 as CBS -import Data.Hex (hex) -import Data.Char (toLower) - -import Math (compileMath) -import Text.Printf (printf) - -main :: IO () -main = hakyllWith config $ do - … - - math <- getMath "posts/*" mathTranslation' - forM_ math $ \(_, mathStr) -> create [mathTranslation' mathStr] $ do - route idRoute - compile $ do - item <- makeItem mathStr - >>= loadAndApplyTemplate "templates/math.tex" defaultContext - >>= withItemBody (unsafeCompiler . compileMath) -- unsafeCompiler :: IO a -> Compiler a - saveSnapshot "alignment" $ fmap snd item - return $ fmap fst item - - match "posts/*" $ do - route $ setExtension ".html" - compile $ do - getResourceBody >>= saveSnapshot "content" - pandocCompilerWithTransformM defaultHakyllReaderOptions defaultHakyllWriterOptions mathTransform -- pandocCompilerWithTransformM :: ReaderOptions -> WriterOptions -> (Pandoc -> Compiler Pandoc) -> Item String - >>= loadAndApplyTemplate "templates/default.html" defaultContext - >>= relativizeUrls - … - -… - -mathTranslation' :: String -> Identifier --- ^ This generates the filename for a svg file given the TeX-source -mathTranslation' = fromCapture "math/*.svg" . map toLower . CBS.unpack . hex . SHA256.hash . CBS.pack - -getMath :: Pattern -> (String -> Identifier) -> Rules [([Identifier], String)] --- ^ We scrape all posts for math, calls `readPandoc'` -getMath pattern makeId = do - ids <- getMatches pattern - mathStrs <- concat `liftM` mapM (\id -> map ((,) [id]) `liftM` getMath' (toFilePath' id)) ids - return $ mergeGroups $ groupBy ((==) `on` snd) $ mathStrs - where - getMath' :: FilePath -> Rules [String] - getMath' path = preprocess (query extractMath `liftM` readPandoc' path) - extractMath :: Inline -> [String] - extractMath (Math _ str) = [str] - extractMath _ = [] - mergeGroups :: [[([Identifier], String)]] -> [([Identifier], String)] - mergeGroups = map mergeGroups' . filter (not . null) - mergeGroups' :: [([Identifier], String)] -> ([Identifier], String) - mergeGroups' xs@((_, str):_) = (concatMap fst xs, str) - -readPandoc' :: FilePath -> IO Pandoc --- ^ This is copied, almost verbatim, from Hakyll source -- Does what it says on the tin -readPandoc' path = readFile path >>= either fail return . result' - where - result' str = case result str of - Left (ParseFailure err) -> Left $ - "parse failed: " ++ err - Left (ParsecError _ err) -> Left $ - "parse failed: " ++ show err - Right item' -> Right item' - result str = reader defaultHakyllReaderOptions (fileType path) str - reader ro t = case t of - DocBook -> readDocBook ro - Html -> readHtml ro - LaTeX -> readLaTeX ro - LiterateHaskell t' -> reader (addExt ro Ext_literate_haskell) t' - Markdown -> readMarkdown ro - MediaWiki -> readMediaWiki ro - OrgMode -> readOrg ro - Rst -> readRST ro - Textile -> readTextile ro - _ -> error $ - "I don't know how to read a file of " ++ - "the type " ++ show t ++ " for: " ++ path - - addExt ro e = ro {readerExtensions = Set.insert e $ readerExtensions ro} - -mathTransform :: Pandoc -> Compiler Pandoc --- ^ We replace math by raw html includes of the respective svg files here -mathTransform = walkM mathTransform' - where - mathTransform' :: Inline -> Compiler Inline - mathTransform' (Math mathType tex) = do - alignment <- loadSnapshotBody texId "alignment" - let - html = printf "%s" - (toFilePath texId) (alignment :: String) tex - return $ Span ("", [classOf mathType], []) [RawInline (Format "html") html] - where - texId = mathTranslation' tex - classOf DisplayMath = "display-math" - classOf InlineMath = "inline-math" - mathTransform' x = return x - -… -~~~ diff --git a/provider/posts/thermoprint-1.md b/provider/posts/thermoprint-1.md deleted file mode 100644 index 032e2f6..0000000 --- a/provider/posts/thermoprint-1.md +++ /dev/null @@ -1,130 +0,0 @@ ---- -title: On the Architecture of a tool-set for interacting with character-oriented printers -published: 2015-12-25 -tags: Thermoprint ---- - -# Motivation - -Some time ago I bought a cheap Chinese -[thermoprinter](https://en.wikipedia.org/wiki/Thermal_printing) off eBay. -As expected the printers firmware is really awkward to use (including binary -control codes used to switch between char sets such as bold, italic, underlined, -etc.). -The obvious solution was to write a library to parse a more sensible -representation and send it to be printed. - -Since there might, at some point, be other users wanting to print to my -acquisition the architecture is intended to be present a somewhat usable -interface to the uninitiated. - -# Implementation - -## Location - -Recently I created a new branch in -[thermoprint](https://git.yggdrasil.li/thermoprint) called -[rewrite](https://git.yggdrasil.li/thermoprint?h=rewrite). - -## Architecture Overview - -The new macroscopic architecture I´m currently aiming for is quite similar to -the old one: - - * A server intended to run on the machine connected to my cheap printer talking - directly to the printer on one end and serving a - [json api](https://hackage.haskell.org/package/servant) on the other. - * A (hopefully) tiny cli tool for debugging and personal use. - * A website (it will probably end up being based on - [yesod](https://hackage.haskell.org/package/yesod)) presenting a web interface - similar to the cli tool. - -## Features - -Features I intend to implement include: - - * A parser for a bbcode-dialect which should be used in both the cli tool and the - website (it will probably end up using - [attoparsec](https://hackage.haskell.org/package/attoparsec)) -- bbcode as - presented on [Wikipedia](https://en.wikipedia.org/wiki/BBCode) is a proper - superset of the feature-set of my cheap Chinese printer. - * Reasonable test coverage using - [QuickCheck](https://hackage.haskell.org/package/QuickCheck), - [HUnit](http://hackage.haskell.org/package/HUnit). - - Automatic testing with [cabal](https://www.haskell.org/cabal/) facilitated by - [hspec](https://hackage.haskell.org/package/hspec). - * Support and server-side storage for drafts. - * The Website should provide some richer formats than bbcode which will - probably find inclusion in the payload datastructure such as lists, - checklists, tables, etc. - - The cli-tool should be able to use these too (the input will probably end up - being json-formatted). - -## Work so far - -### Prototype - -I already have a prototype. -It's quite bug-ridden and has recently developed serious problems actually -printing after working satisfactorily for a few weeks. - -It also does not include a web-interface and I am quite unsatisfied with the -overall code quality. - -The [685 lines of code](http://cloc.sourceforge.net/) can be found in the -[repo](https://git.yggdrasil.li/thermoprint?h=master) as well. - -### Rewrite - -Currently the [rewrite](https://git.yggdrasil.li/thermoprint?h=rewrite) contains a -single file of moment -- spec/src/Thermoprint/Printout.hs -- wherein we define -the payload for the api -- our take on a structured document format (somewhat -inspired by -[pandoc](http://hackage.haskell.org/package/pandoc-types/docs/Text-Pandoc-Definition.html)): - -~~~ {.haskell} --- | A 'Printout' is a sequence of visually seperated 'Paragraph's -type Printout = Seq Paragraph - --- | A 'Paragraph' is a non-seperated sequence of 'Chunk's -type Paragraph = Seq Chunk - --- | We introduce both 'Chunk' and 'Paragraph' mainly to allow 'Raw'. --- --- Were we to disallow 'Raw', 'Block' would be identical to 'Paragraph' -data Chunk = Cooked Block -- ^ text semantically structured to be rendered in accordance with the display format of printer - | Raw ByteString -- ^ direct instructions to the printer - deriving (Generic, NFData, Show, CoArbitrary) - --- | 'Block' is the entry point for our structured document format -data Block = Line Line -- ^ a single 'Line' of text - | VSpace Integer -- ^ vertical space of height equivalent to 'Integer' lines - | NewlSep (Seq Block) -- ^ A sequence of 'Block's seperated by newlines - deriving (Generic, NFData, Show, CoArbitrary) - -{- | A 'Line' is one of: - - * a single word - * horizontal space equivalent to the width of 'Integer' `em`. - * a sequence of words seperated by spaces - -We don't export all constructors and instead encourage the use of 'text'. --} -data Line = Word Text - | HSpace Integer - | SpaceSep (Seq Line) - deriving (Generic, NFData, Show, CoArbitrary) -~~~ - -(The code is verbatim as of 8307d7e). - - - - - diff --git a/provider/posts/thermoprint-2.lhs b/provider/posts/thermoprint-2.lhs deleted file mode 100644 index a144fb5..0000000 --- a/provider/posts/thermoprint-2.lhs +++ /dev/null @@ -1,262 +0,0 @@ ---- -title: On the design of a structured document format compatible with character oriented printers -published: 2016-01-11 -tags: Thermoprint ---- - -This post is an annotated version of the file [spec/src/Thermoprint/Printout.hs](https://git.yggdrasil.li/thermoprint/tree/spec/src/Thermoprint/Printout.hs?h=rewrite&id=f6dc3d1) as of commit `f6dc3d1`. - -> {-# LANGUAGE DeriveGeneric, DeriveAnyClass #-} -> {-# LANGUAGE OverloadedStrings #-} -> {-# OPTIONS_HADDOCK show-extensions #-} - -Motivation ----------- - -We want our codebase to be compatible with as many different models of printers as we are willing to implement. -It is therefore desirable to maintain a structured document format which we can transform into a printer-specific representation of the payload to be printed with minimal effort. - -In this post we present one such format. - -Contents --------- - -> -- | This module contains the definition of the structure -- 'Printout' -- used to represent the content of a job -> module Thermoprint.Printout -> ( Printout(..) -> , Paragraph(..) -> , Chunk(..) -> , Block(..) -> , Line( HSpace -> , SpaceSep -> ) -> , text, cotext -> , prop_text -> ) where - -Preliminaries -------------- - -> import Data.Sequence (Seq, (|>), (<|)) - -A Sequence represents the same structure as the linked lists common in haskell but supports $O(1)$ `snoc`, which is desirable since we intend to iteratively build up the structure when parsing input formats. - -> import Data.Text.Lazy (Text) -> -> import Data.ByteString.Lazy (ByteString) - -The entire structure will be lazy by default but an instance of `NFData`, thus the lazy variants of `Text` and `ByteString`. - -> import GHC.Generics (Generic) - -We will use derived instances of `Generic` to get handed suitable instances of rather complicated classes such as `Arbitrary` and `FromJSON` - -> import Control.DeepSeq (NFData) - -Instances of `NFData` allow us to strictly evaluate our document structure when needed - -> import Data.Aeson (FromJSON(..), ToJSON(..), Value(..)) -> import qualified Data.Aeson as JSON (encode, decode) -> import Data.Aeson.Types (typeMismatch) - -We will encode the document as a [json](https://en.wikipedia.org/wiki/JSON) object during transport - -> import Test.QuickCheck.Arbitrary (Arbitrary(..), CoArbitrary, genericShrink) -> import Test.QuickCheck.Modifiers (NonNegative(..)) -> import Test.QuickCheck.Gen (oneof, suchThat, scale) -> import Test.QuickCheck.Instances -> import Test.QuickCheck (forAll, Property) - -We will use [QuickCheck](https://hackage.haskell.org/package/QuickCheck) for automatic test generation. - -> import qualified Data.Text.Lazy as TL (split, null, pack, filter, intercalate, map) -> import qualified Data.Text as T (pack) -> import Data.Char (isSpace) -> -> import Data.Monoid (Monoid(..), (<>)) -> -> import Data.List (dropWhile, dropWhileEnd, groupBy, genericLength, genericReplicate) -> -> import Data.Sequence as Seq (fromList, null, singleton) -> -> import Data.Function (on) -> -> import Data.Foldable (toList, fold) - -We will need to do some parsing and pretty-printing to implement `text` and `cotext`, respectively. - -> import Data.Encoding (encodeLazyByteStringExplicit, decodeLazyByteString) -> import Data.Encoding.UTF8 -> import qualified Data.ByteString.Base64.Lazy as Base64 (encode, decode) - -Since we want end users to be able to include direct instructions the printer in the form of a lazy [`ByteString`](https://hackage.haskell.org/package/bytestring) we need some way to encode `ByteString`s in JSON. -We chose [base64](https://hackage.haskell.org/package/base64-bytestring). - -> import Prelude hiding (fold) -> -> -> -- | A 'Printout' is a sequence of visually seperated 'Paragraph's -> type Printout = Seq Paragraph - -"visually seperated" will most likely end up meaning "seperated by a single blank line" - -> -- | A 'Paragraph' is a non-seperated sequence of 'Chunk's -> type Paragraph = Seq Chunk -> -> -- | We introduce both 'Chunk' and 'Paragraph' mainly to allow 'Raw'. -> -- -> -- Were we to disallow 'Raw', 'Block' would be identical to 'Paragraph' -> data Chunk = Cooked Block -- ^ text semantically structured to be rendered in accordance with the display format of printer -> | Raw ByteString -- ^ direct instructions to the printer -> deriving (Generic, NFData, Show, CoArbitrary) -> -> instance FromJSON Chunk where -> parseJSON s@(String _) = Raw <$> ((either fail return . decodeBase64) =<< parseJSON s) -> where -> decodeBase64 :: String -> Either String ByteString -> decodeBase64 s = (either (Left . show) Right . encodeLazyByteStringExplicit UTF8Strict $ s) >>= Base64.decode -> parseJSON o@(Object _) = Cooked <$> parseJSON o -> parseJSON v = typeMismatch "Chunk" v -> -> instance ToJSON Chunk where -> toJSON (Raw bs) = String . T.pack . decodeLazyByteString UTF8Strict . Base64.encode $ bs -> toJSON (Cooked block) = toJSON block - -We provide custom instances of `FromJSON Chunk` and `ToJSON Chunk` so that we might reduce the sice of the resulting JSON somewhat (this is an opportune target since disambiguaty is simple) - -> -- | 'Block' is the entry point for our structured document format -> data Block = Line Line -- ^ a single 'Line' of text -> | VSpace Integer -- ^ vertical space of height equivalent to 'Integer' lines -> | NewlSep (Seq Block) -- ^ A sequence of 'Block's seperated by newlines -> deriving (Generic, NFData, Show, CoArbitrary, FromJSON, ToJSON) -> -> {- | A 'Line' is one of: -> -> * a single word -> * horizontal space equivalent to the width of 'Integer' `em`. -> * a sequence of words seperated by spaces -> -> We don't export all constructors and instead encourage the use of 'text'. -> -} -> data Line = Word Text -> | HSpace Integer -> | SpaceSep (Seq Line) -> deriving (Generic, NFData, Show, CoArbitrary, FromJSON, ToJSON) -> -> instance Monoid Block where -> mempty = NewlSep mempty -> x@(NewlSep xs) `mappend` y@(NewlSep ys) -> | Seq.null xs = y -> | Seq.null ys = x -> | otherwise = NewlSep (xs <> ys) -> (NewlSep xs) `mappend` y -> | Seq.null xs = y -> | otherwise = NewlSep (xs |> y) -> x `mappend` (NewlSep ys) -> | Seq.null ys = x -> | otherwise = NewlSep (x <| ys) -> x `mappend` y = NewlSep $ Seq.fromList [x, y] -> -> instance Monoid Line where -> mempty = SpaceSep mempty -> x@(SpaceSep xs) `mappend` y@(SpaceSep ys) -> | Seq.null xs = y -> | Seq.null ys = x -> | otherwise = SpaceSep (xs <> ys) -> (SpaceSep xs) `mappend` y -> | Seq.null xs = y -> | otherwise = SpaceSep (xs |> y) -> x `mappend` (SpaceSep ys) -> | Seq.null ys = x -> | otherwise = SpaceSep (x <| ys) -> x `mappend` y = SpaceSep $ Seq.fromList [x, y] - -The Monoid instances for `Block` and `Line` are somewhat unwieldy since we want to guarantee minimum overhead by reducing expressions such as `SpaceSep (fromList [x])` to `x` whenever possible. - -The same effect would have been possible by introducing the monoid structure *one level higher* -- we could have introduced constructors such as `Line :: Seq Word -> Block`. -This was deemed undesirable since we would not have been able to implement instances such as `Monoid Line` which allow the use of more generic functions during parsing. - -> text :: Text -> Either Block Line -> -- ^ Smart constructor for 'Line'/'Block' which maps word and line boundaries (as determined by 'isSpace' and '(== '\n')' respectively) to the structure of 'Block' and 'Line'. -> -- -> -- Since we are unwilling to duplicate the list of chars from 'isSpace' we cannot reasonably determine a width for the various whitespace 'Char's. -> -- Thus they are all weighted equally as having width 1 `em`. -> text t = case splitLines t of -> [] -> Right mempty -> [Line x] -> Right x -> xs -> Left $ mconcat xs -> where -> splitLines :: Text -> [Block] -> splitLines t = map toBlock -> . groupBy ((==) `on` TL.null) -> $ TL.split (== '\n') t -> splitWords :: Text -> [Line] -> splitWords t = map toLine -> . groupBy ((==) `on` TL.null) -> $ TL.split isSpace t -> toBlock [] = mempty -> toBlock xs@(x:_) -> | TL.null x = VSpace $ genericLength xs - 1 -> | otherwise = mconcat . map (Line . mconcat . splitWords) $ xs -> toLine [] = mempty -> toLine xs@(x:_) -> | TL.null x = HSpace $ genericLength xs - 1 -> | otherwise = mconcat . map Word $ xs -> list :: b -> (a -> [a] -> b) -> [a] -> b -> list c _ [] = c -> list _ f (x:xs) = f x xs - -Implementations using `TL.lines` and `TL.words` were tested. -We chose to use `TL.split`-based solutions instead because the more specific splitting functions provided by [text](https://hackage.haskell.org/package/text) drop information concerning the exact amount of whitespace. - -> cotext :: Block -> Text -> -- ^ inverse of -> -- @ -> -- either id Line . `text` -> -- @ -> cotext (VSpace n) = TL.pack . genericReplicate n $ '\n' -> cotext (NewlSep xs) = TL.intercalate "\n" . map cotext . toList $ xs -> cotext (Line x) = cotext' x -> where -> cotext' (Word x) = x -> cotext' (HSpace n) = TL.pack . genericReplicate n $ ' ' -> cotext' (SpaceSep xs) = TL.intercalate " " . map cotext' . toList $ xs - -We provide cotext for testing `text` and to enable determining semantic equality of `Printout`s at a later date - -> prop_text :: Text -> Bool -> -- ^ prop> (`cotext` . either id Line . `text` $ x) == x -> -- -> -- Where 'x' is restricted to those `TL.Text` which do not contain whitespace besides ' ' and '\n'. -> prop_text x = (cotext . either id Line . text $ x') == x' -> where -> x' = TL.map normSpace x -> normSpace c -> | isSpace c -> , c `elem` keep = c -> | isSpace c = ' ' -- We have to do this because all whitespace gets interpreted as width 1 -> | otherwise = c -> keep = [' ', '\n'] -> -> -- | We don't test 'Raw' 'Chunk's -> instance Arbitrary Chunk where -> shrink = genericShrink -> arbitrary = Cooked <$> arbitrary -> -> instance Arbitrary Block where -> shrink = genericShrink -> arbitrary = oneof [ Line <$> arbitrary -> , VSpace . getNonNegative <$> arbitrary -> , NewlSep <$> scale' arbitrary -> ] -> -> instance Arbitrary Line where -> shrink = genericShrink -> arbitrary = oneof [ Word . TL.filter (not . isSpace) <$> arbitrary -- 'isSpace '\n' == True' -> , HSpace . getNonNegative <$> arbitrary -> , SpaceSep <$> scale' arbitrary -> ] -> -> scale' = scale (round . sqrt . fromInteger . toInteger) - -Failing to properly scale the tested structures was shown to use more than 8GiB of RAM during testing diff --git a/provider/posts/thermoprint-3.lhs b/provider/posts/thermoprint-3.lhs deleted file mode 100644 index 5e7eee9..0000000 --- a/provider/posts/thermoprint-3.lhs +++ /dev/null @@ -1,92 +0,0 @@ ---- -title: Thoughts on a network protocol for a toolset for interacting with character-oriented printers -published: 2016-01-11 -tags: Thermoprint ---- - -This post is an annotated version of the file [spec/src/Thermoprint/API.hs](https://git.yggdrasil.li/thermoprint/tree/spec/src/Thermoprint/API.hs?h=rewrite&id=3ad700c) as of commit `3ad700c`. - -> {-# LANGUAGE DeriveGeneric, DeriveAnyClass #-} -> {-# LANGUAGE TypeOperators, DataKinds #-} -> {-# LANGUAGE OverloadedStrings #-} -> -> module Thermoprint.API -> ( PrinterStatus(..) -> , JobStatus(..) -> , ThermoprintAPI -> , thermoprintAPI -> , module Thermoprint.Identifiers -> , module Thermoprint.Printout -> ) where -> -> import Thermoprint.Printout - -See [a previous post](https://dirty-haskell.org/posts/thermoprint-2.html). - -> import Thermoprint.Identifiers - -`Thermoprint.Identifiers` provides some newtypes of `Integer` to add some typesafety to dealing with objects identified by autoincremented numbers - -> import Servant.API -> import Servant.Docs -> import Data.Aeson - -We will define our API to be compatible with [servant](https://hackage.haskell.org/package/servant) - -> import Data.Set (Set) -> import Data.Sequence (Seq) - -Higher performance versions of lists for our various applications - -> import GHC.Generics (Generic) -> -> import Data.Proxy (Proxy(..)) -> -> import Control.Exception (Exception) -> import Data.Typeable (Typeable) -> -> data PrinterStatus = Busy JobId -> | Available -> deriving (Generic, Show, FromJSON, ToJSON) -> -> data JobStatus = Queued -> | Printing -> | Done -> | Failed PrintingError -> deriving (Generic, Show, FromJSON, ToJSON) -> -> data PrintingError = UnknownError -> deriving (Typeable, Generic, Show, FromJSON, ToJSON, Exception) - -We expect the definiton of `PrintingError` to grow considerably while implementing a server for this API - -We support the following actions through our API: - -> type ThermoprintAPI = "printers" :> Get '[JSON] (Set PrinterId) -- List the identifiers of all available printers (/printers) -> :<|> "printer" :> Capture "printerId" PrinterId :> ( -> ReqBody '[JSON] Printout :> Post '[JSON] JobId -- Add a new job to the bottom of the queue by sending its content (/printer:printerId) -> :<|> "status" :> Get '[JSON] PrinterStatus -- Query the current status of a printer (/printer:printerId/status) -> ) -> :<|> "jobs" :> ( -> QueryParam "printer" PrinterId :> QueryParam "min" JobId :> QueryParam "max" JobId :> Get '[JSON] (Seq JobId) -- List all jobs allowing for selection by printerId and pagination (/jobs?printer=*&min=*&max=*) -> ) -> :<|> "job" :> Capture "jobId" JobId :> ( -> Get '[JSON] Printout -- Get the contents of a job currently known to the server (/job:jobId) -> :<|> "status" :> Get '[JSON] JobStatus -- Get the status of a job (/job:jobId/status) -> :<|> "printer" :> Get '[JSON] PrinterId -- Find the printer a job was queued for (/job:jobId/printer) -> :<|> Delete '[] () -- Abort a job (which we expect to make it unknown to the server) (/job:jobId) -> ) -> :<|> "drafts" :> ( -> Get '[JSON] (Set DraftId) -- List the identifiers of all drafts known to the server (/drafts) -> :<|> ReqBody '[JSON] Printout :> Post '[JSON] DraftId -- Make a draft known to the server by submitting its contents (/drafts) -> ) -> :<|> "draft" :> Capture "draftId" DraftId :> ( -> ReqBody '[JSON] Printout :> Put '[] () -- Update a draft by replacing its contents (/draft:draftId) -> :<|> Get '[JSON] Printout -- Get the contents of a draft (/draft:draftId) -> :<|> Delete '[] () -- Delete a draft (/draft:draftId) -> ) -> -> thermoprintAPI :: Proxy ThermoprintAPI -> thermoprintAPI = Proxy - -servant needs an object of type `Proxy ThermoprintAPI` in various places diff --git a/provider/posts/thermoprint-4.md b/provider/posts/thermoprint-4.md deleted file mode 100644 index 756c166..0000000 --- a/provider/posts/thermoprint-4.md +++ /dev/null @@ -1,116 +0,0 @@ ---- -title: On the Design of a Parser -published: 2016-01-12 -tags: Thermoprint ---- - -The concrete application we’ll be walking through is a naive parser for [bbcode](https://en.wikipedia.org/wiki/BBCode) --- more specifically the contents of the directory `bbcode` in the -[git repo](https://git.yggdrasil.li/thermoprint/tree/bbcode?h=rewrite&id=dc99dae). - -In a manner consistent with designing software as -[compositions of simple morphisms](https://en.wikipedia.org/wiki/Tacit_programming) we start by determining the type of -our solution (as illustrated by the following mockup): - -~~~ {.haskell} --- | Our target structure -- a rose tree with an explicit terminal constructor -data DomTree = Element Text (Map Text Text) [DomTree] - | Content Text - deriving (Show, Eq) - -bbcode :: Text -> Maybe DomTree --- ^ Parse BBCode -~~~ - -Writing a parser capable of dealing with `Text` directly from scratch would be unnecessarily abstruse, we’ll be using -the [attoparsec](https://hackage.haskell.org/package/attoparsec/docs/Data-Attoparsec-Text.html) family of parser -combinators instead. - -We reproduce an incomplete version of the lexer below (it’s missing tag attributes and self-closing tags). - -We introduce `escapedText`, a helper function for extracting text until we reach one of a set of delimiting characters -(exclusive). -While doing this we also parse any delimiting character iff it's prefixed with an escape character (we use `\`) -- the -escape character itself needs only be escaped if encountered directly before one of the delimiting characters. - -~~~ {.haskell} -data Token = BBOpen Text -- ^ "[open]" - | BBClose Text -- ^ "[/close]" - | BBStr Text -- ^ "text" - -token :: Parser [Token] -token = BBClose <$ "[/" <*> escapedText' [']'] <* "]" - <|> BBOpen <$ "[" <*> escapedText' [']'] <* "]" - <|> BBStr <$> escapedText ['['] - -escapedText' :: [Char] -> Parser Text -escapedText' = option "" . escapedText - -escapedText :: [Char] -> Parser Text -escapedText [] = takeText -- No delimiting characters -- parse all remaining input -escapedText cs = recurse $ choice [ takeWhile1 (not . special) -- a series of characters we don't treat as special - , escapeSeq -- an escaped delimiter - , escapeChar' -- the escape character - ] - where - escapeChar = '\\' - special = inClass $ escapeChar : cs - escapeChar' = string $ T.singleton escapeChar - escapeSeq = escapeChar' *> (T.singleton <$> satisfy special) -- escape character followed by a special character (which encludes the escape character) - recurse p = mappend <$> p <*> escapedText' cs -- parse a prefix and optionally append another chunk of escapedText - -runTokenizer :: Text -> Maybe [Token] -runTokenizer = either (const Nothing) Just . parseOnly (many token <* endOfInput) -~~~ - -We have now reduced the Problem to `[Token] -> DomTree`. -We quickly see that the structure of the problem is that of a -[fold](https://hackage.haskell.org/package/base/docs/Data-Foldable.html). - -Having realised this we require a function of type `Token -> DomTree -> DomTree` to recursively build up our target -structure. - -In general we’ll want to not only keep track of the `DomTree` during recursion but also maintain a reference to the -position at which we’ll be inserting new tokens. -This kind of problem is well understood and solved idiomatically by using a -[zipper](https://en.wikipedia.org/wiki/Zipper_(data_structure)) -([a cursory introduction](http://learnyouahaskell.com/zippers)). - -Writing zippers tends to be tedious. We’ll therefore introduce an -[additional intermediate structure](https://hackage.haskell.org/package/containers/docs/Data-Tree.html) for which an -[implementation](https://hackage.haskell.org/package/rosezipper) is available readily. -The morphism from this new structure (`Forest BBLabel`) to our `DomTree` will be almost trivial. - -~~~ {.haskell} -import Data.Tree.Zipper (TreePos, Empty, Full) -import qualified Data.Tree.Zipper as Z - -data BBLabel = BBTag Text - | BBPlain Text - -rose :: [BBToken] -> Maybe (Forest BBLabel) -rose = Z.toForest <$> foldM (flip rose') (Z.fromForest []) - -rose' :: BBToken -> TreePos Empty BBLabel -> Maybe (TreePos Empty BBLabel) -rose' (BBStr t) = return . Z.nextSpace . Z.insert (Node (BBPlain t) []) -- insert a node with no children and move one step to the right in the forest we’re currently viewing -rose' (BBOpen t) = return . Z.children . Z.insert (Node (BBTag t) []) -- insert the node and move into position to insert it's first child -rose' (BBClose t) = close t -- haskell complains if multiple equations for the same function have a differing number of arguments, therefore: 'close' - where - close :: Text -> TreePos Empty BBLabel -> Maybe (TreePos Empty BBLabel) - close tag pos = do - pos' <- Z.parent pos -- fail if we're trying to close a tag that does not have a parent (this indicates imbalanced tags) - let - pTag = (\(BBTag t) -> t) $ Z.label pos' -- yes, this will fail unceremoniously if the parent is not a tag, this poses no problem since we're constructing the structure ourselves. The proof that this failure mode does not occur is left as an exercise for the reader. - guard (pTag == tag) -- The structure shows that this mode of failure (opening tags content does not match the closing tags) is not logically required -- it only serves as a *notification* to the user - return $ Z.nextSpace pos' -- move one level up and to point at the next sibling of the parent -~~~ - -All that is left to do now is present our final morphism: - -~~~ {.haskell} -dom :: Forest BBLabel -> [DomTree] -dom = map dom' - where - dom' (Node (BBPlain t) []) = Content t - dom' (Node (BBTag t) ts = Element t $ map dom' ts -~~~ diff --git a/provider/posts/thermoprint-5.md b/provider/posts/thermoprint-5.md deleted file mode 100644 index 0249734..0000000 --- a/provider/posts/thermoprint-5.md +++ /dev/null @@ -1,198 +0,0 @@ ---- -title: Building an Extensible Framework for Specifying Compile-Time Configuration using Universal Quantification -tags: Thermoprint -published: 2016-02-18 ---- - -When I write *Universal Quantification* I mean what is commonly referred to as -[existential quantification](https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/data-type-extensions.html#existential-quantification), -which I think is a misnomer. To wit: - -$( \exists x \ldotp f(x) ) \to y$ is isomorphic to $\forall x \ldotp (f(x) \to y)$ (I -won´t try to back this claim up with actual category theory just now. You might want to -nag me occasionally if this bothers you -- I really should invest some more time into -category theory). Since haskell does not support `exists` we´re required to use the -`forall`-version, which really is universally quantified. - -## Printer Configuration - -What we want is to have the user provide us with a set of specifications of how to -interact with one printer each. -Something like the following: - -~~~ {.haskell} -newtype PrinterMethod = PM { unPM :: Printout -> IO (Maybe PrintingError) } - -data Printer = Printer - { print :: PrinterMethod - , queue :: TVar Queue - } -~~~ - -The first step in refining this is necessitated by having the user provide the -[monad-transformer-stack](http://book.realworldhaskell.org/read/monad-transformers.html) -to use at compile time. -Thus we introduce our first universal quantification (in conjunction with -[polymorphic components](https://prime.haskell.org/wiki/PolymorphicComponents)) -- this -one is not isomorphic to an existential one: - -~~~ {.haskell} -newtype PrinterMethod = PM { unPm :: forall m. MonadResource m => Printout -> m (Maybe PrintingError) } -~~~ - -Since we don´t want to *burden* the user with the details of setting up `TVar Queue`{.haskell} we -also introduce function to help with that: - -~~~ {.haskell} -printer :: MonadResource m => PrinterMethod -> m Printer -printer p = Printer p <$> liftIO (newTVarIO def) -~~~ - -We could at this point provide ways to set up `PrinterMethod`{.haskell}s and have the user -provide us with a list of them. - -We, however, have numerous examples of printers which require some setup (such opening a -file descriptor). The idiomatic way to handle this is to decorate that setup with some -constraints and construct our list of printers in an -[`Applicative`{.haskell}](https://hackage.haskell.org/package/base/docs/Control-Applicative.html#t:Applicative) -fashion: - -~~~ {.haskell} -printer :: MonadResource m => m PrinterMethod -> m Printer -printer p = Printer <$> p <*> liftIO (newTVarIO def) -~~~ - -At this point a toy implementation of a printer we might provide looks like this: - -~~~ {.haskell} -debugPrint :: Applicative m => m PrinterMethod -debugPrint = pure . PM $ const return Nothing <=< liftIO . putStrLn . toString - -toString :: Printout -> String -toString = undefined -~~~ - -## Management of Printer Queues - -We would like the user to be able to modify the printer queues we maintain in arbitrary -ways. -The motivation for this being various cleanup operations such as pruning all successful -jobs older than a few minutes or limiting the size of history to an arbitrary number of -entries. - -A pattern for this type of modification of a value residing in a `TVar`{.haskell} might -look like this: - -~~~ {.haskell} -modify :: TVar a -> StateT a STM () -> IO () -modify q f = atomically $ writeTVar =<< runStateT f =<< readTVar q -~~~ - -A rather natural extension of this is to allow what we will henceforth call a -`QueueManager`{.haskell} (currently `StateT a STM ()`{.haskell}) to return an indication -of when it wants to be run again: - -~~~ {.haskell} -type QueueManager = StateT Queue STM Micro - -runQM :: QueueManager -> TVar Queue -> IO () -runQM qm q = sleep << qm' - where - qm' = atomically $ (\(a, s) -> a <$ writeTVar q s) =<< runStateT qm =<< readTVar q - sleep (abs -> delay) = threadDelay (fromEnum delay) >> runQM qm q -~~~ - -It stands to reason that sometimes we don't want to run the `QueueManager`{.haskell} ever -again (probably causing the thread running it to terminate). -For doing so we -[extend the real numbers](https://en.wikipedia.org/wiki/Extended_real_number_line) as -represented by `Micro`{.haskell} to -[`Extended Micro`{.haskell}](https://hackage.haskell.org/package/extended-reals): - -~~~ {.haskell} -type QueueManager = StateT Queue STM (Extended Micro) - -runQM … - where - … - sleep (abs -> delay) - | (Finite d) <- delay = threadDelay (fromEnum d) >> runQM qm q - | otherwise = return () -~~~ - -`QueueManager`{.haskell}s whose type effectively is `Queue -> STM (Queue, Extended Micro)`{.haskell} -are certainly useful but can carry no state between invocations (which would be useful -e.g. for limiting the rate at which we prune jobs). - -Therefore we allow the user to provide an arbitrary monad functor (we use -`MFunctor`{.haskell} from -[mmorph](https://hackage.haskell.org/package/mmorph-1.0.6/docs/Control-Monad-Morph.html#t:MFunctor) -instead of `Servant.Server.Internal.Enter` because -[servant-server](https://hackage.haskell.org/package/servant-server-0.4.4.6/docs/Servant-Server-Internal-Enter.html#v:Nat) -doesn't provide all the tools we require for this) which can carry all the state we could -ever want: - -~~~ {.haskell} -type QueueManager t = QueueManagerM t (Extended Micro) -type QueueManagerM t = ComposeT (StateT Queue) t STM -- 'ComposeT' is required since we need 'QueueManagerM' to have the form 't' STM' for some 't'' in order to be able to use 'lift' - -runQM :: (MFunctor t, MonadTrans t, MonadIO (t IO), Monad (t STM)) => QueueManager t -> TVar Queue -> t IO () -runQM … -- nearly identical except for a sprinkling of 'lift' -~~~ - -The final touches are to introduce a typeclass `HasQueue`{.haskell} for convenience: - -~~~ {.haskell} -class HasQueue a where - extractQueue :: a -> TVar Queue - -instance HasQueue (TVar Queue) where - extractQueue = id - -instance HasQueue Printer where - extractQueue = queue -~~~ - -and provide some utility functions for composing `QueueManager`{.haskell}s: - -~~~ {.haskell} -intersection :: (Foldable f, MonadState Queue (QueueManagerM t)) => f (QueueManager t) -> QueueManager t --- ^ Combine two 'QueueManager's keeping only 'QueueEntry's both managers decide to keep --- --- Side effects propagate left to right - -idQM :: Monad (QueueManagerM t) => QueueManager t --- ^ Identity of 'intersect' - -union :: (Foldable f, MonadState Queue (QueueManagerM t)) => f (QueueManager t) -> QueueManager t --- ^ Combine two 'QueueManager's keeping all 'QueueEntry's either of the managers decides to keep --- --- Side effects propagate left to right - -nullQM :: MonadState Queue (QueueManagerM t) => QueueManager t --- ^ Identity of 'union' -~~~ - -We merge the effects of two `QueueManager`{.haskell}s by converting the resulting -`Queue`{.haskell}s to `Set`{.haskell}s and using `Set.union`{.haskell} and -`Set.intersection`{.haskell} with appropriate `Ord`{.haskell} and `Eq`{.haskell} -instances. - -### Configuration of `QueueManager`{.haskell}s - -A `QueueManager`{.haskell}s configuration shall be a `QueueManager t`{.haskell} associated -with a specification of how to collapse its monad transformer `t`{.haskell}. -Using universal quantification this is straightforward: - -~~~ {.haskell} -data QMConfig m = forall t. ( MonadTrans t - , MFunctor t - , Monad (t STM) - , MonadIO (t IO) - ) => QMConfig { manager :: QueueManager t - , collapse :: (t IO) :~> m - } - -runQM' :: Printer -> QMConfig m -> m () -runQM' printer (QMConfig qm nat) = unNat nat $ runQM qm printer -~~~ diff --git a/provider/posts/thermoprint-6.lhs b/provider/posts/thermoprint-6.lhs deleted file mode 100644 index 9182427..0000000 --- a/provider/posts/thermoprint-6.lhs +++ /dev/null @@ -1,142 +0,0 @@ ---- -title: Deriving a Client Library for Interacting with Character-Oriented Printers -tags: Thermoprint -published: 2016-02-18 ---- - -> {-# LANGUAGE DataKinds #-} -> {-# LANGUAGE TypeOperators #-} -> {-# LANGUAGE ViewPatterns #-} -> {-# LANGUAGE RecordWildCards #-} -> -> -- | A client library for 'Thermoprint.API' -> module Thermoprint.Client -> ( Client(..) -> , mkClient, mkClient' -> , throwNat, ioNat -> -- = Reexports -> , ServantError(..) -> , module Servant.Common.BaseUrl -> , module Control.Monad.Trans.Either -> , module Servant.Server.Internal.Enter -> ) where -> -> import Thermoprint.API -> import Data.Map (Map) -> import Data.Sequence (Seq) -> import Data.Time (UTCTime) -> -> import Servant.Client hiding (HasClient(..)) -> import qualified Servant.Client as S -> import Servant.Common.BaseUrl -> import Servant.API -> import Servant.Server.Internal.Enter -> import Control.Monad.Trans.Either -> -> import Control.Monad.Catch (Exception, MonadThrow(..)) -> import Control.Monad.IO.Class (MonadIO(..)) -> -> import Control.Monad -> import Control.Category -> import Prelude hiding (id, (.)) -> -> instance Exception ServantError - -We encapsulate all api operations in a single record parametrized over the monad we intend to use -them in. -Construction of such a record is pure since all we require to do so is a `BaseUrl`{.haskell}. -Using RecordWildCards we can bring all operations into scope with extreme ease. - -> -- | All 'ThermoprintAPI'-functions as a record -> -- -> -- Use like this: -> -- -> -- > {-# LANGUAGE RecordWildCards #-} -> -- > -> -- > main :: IO () -> -- > -- ^ Display a list of printers with their status -> -- > main = print =<< printers -> -- > where Client{..} = mkClient' $ Http "localhost" 3000 -> data Client m = Client -> { printers :: m (Map PrinterId PrinterStatus) -> -- ^ List all printers -> , jobs :: Maybe PrinterId -> -> Maybe (Range (JobId)) -> -> Maybe (Range (UTCTime)) -> -> m (Seq (JobId, UTCTime, JobStatus)) -> -- ^ List a selection of jobs -> , jobCreate :: Maybe PrinterId -> Printout -> m JobId -> -- ^ Send a 'Printout' to be queued -> , job :: JobId -> m Printout -> -- ^ Retrieve the contents of a job -> , jobStatus :: JobId -> m JobStatus -> -- ^ Query a jobs status -> , jobDelete :: JobId -> m () -> -- ^ Delete a job from the queue (not from history or while it is being printed) -> , drafts :: m (Map DraftId (Maybe DraftTitle)) -> -- ^ List all saved drafts -> , draftCreate :: Maybe DraftTitle -> -> Printout -> -> m DraftId -> -- ^ Create a new draft -> , draftReplace :: DraftId -> -> Maybe DraftTitle -> -> Printout -> -> m () -> -- ^ Replace the contents and title of an existing draft -> , draft :: DraftId -> m (Maybe DraftTitle, Printout) -> -- ^ Retrieve the contents and title of a draft -> , draftDelete :: DraftId -> m () -> -- ^ Delete a draft -> , draftPrint :: DraftId -> Maybe PrinterId -> m JobId -> -- ^ Send a draft to be printed -> } - -[servant documentation](https://haskell-servant.github.io/tutorial/server.html#nested-apis) -advises factoring out apis to make the specification more concise. -We are rightly advised that doing so has an effect on the types of the -corresponding `Server`{.haskell}s and `Client`{.haskell}s. -To cope with this we introduce a helper function that allows us, when -used with ViewPatterns, to nontheless simply pattern match on `client`{.haskell}. - -> withArgs :: (x -> a :<|> b) -> ((x -> a) :<|> (x -> b)) -> -- ^ Undo factoring of APIs -> withArgs subAPI = (\(a :<|> _) -> a) . subAPI :<|> (\(_ :<|> b) -> b) . subAPI - -`withArgs`{.haskell} as presented here does not recurse and thus -doesn't handle more than one occurence of `:<|>`{.haskell}. -We have to to so ourselves using nested ViewPatterns (see -`mkClient`{.haskell}). - -> mkClient :: (EitherT ServantError IO :~> m) -- ^ A monad functor ('Nat') used to make the api functions work in any monad which can do 'IO' and handle errors -> -> BaseUrl -> -> Client m -> -- ^ Generate a 'Client' - -RecordWildCards also allows us to construct a record from components -in scope. - -> mkClient n url = Client{..} -> where -> printers -> :<|> (jobs :<|> jobCreate) -> :<|> (withArgs -> job :<|> (withArgs -> jobStatus :<|> jobDelete)) -> :<|> (drafts :<|> draftCreate) -> :<|> (withArgs -> draftReplace :<|> (withArgs -> draft :<|> (withArgs -> draftDelete :<|> draftPrint))) -> = enter n $ client thermoprintAPI url - -We also provide some additional convenience functions so the user -doesn't have to construct their own `Nat`{.haskell}ural -transformations. - -> mkClient' :: (MonadThrow m, MonadIO m) => BaseUrl -> Client m -> -- ^ @mkClient' = mkClient $ ioNat . throwNat@ -> mkClient' = mkClient $ ioNat . throwNat -> -> throwNat :: (Exception e, MonadThrow m) => EitherT e m :~> m -> -- ^ Squash a layer of 'EitherT' into the underlying monad supporting 'throwM' -> throwNat = Nat $ either throwM return <=< runEitherT -> -> ioNat :: MonadIO m => IO :~> m -> -- ^ @ioNat = Nat liftIO@ -> ioNat = Nat liftIO diff --git a/provider/posts/thermoprint/1.md b/provider/posts/thermoprint/1.md new file mode 100644 index 0000000..032e2f6 --- /dev/null +++ b/provider/posts/thermoprint/1.md @@ -0,0 +1,130 @@ +--- +title: On the Architecture of a tool-set for interacting with character-oriented printers +published: 2015-12-25 +tags: Thermoprint +--- + +# Motivation + +Some time ago I bought a cheap Chinese +[thermoprinter](https://en.wikipedia.org/wiki/Thermal_printing) off eBay. +As expected the printers firmware is really awkward to use (including binary +control codes used to switch between char sets such as bold, italic, underlined, +etc.). +The obvious solution was to write a library to parse a more sensible +representation and send it to be printed. + +Since there might, at some point, be other users wanting to print to my +acquisition the architecture is intended to be present a somewhat usable +interface to the uninitiated. + +# Implementation + +## Location + +Recently I created a new branch in +[thermoprint](https://git.yggdrasil.li/thermoprint) called +[rewrite](https://git.yggdrasil.li/thermoprint?h=rewrite). + +## Architecture Overview + +The new macroscopic architecture I´m currently aiming for is quite similar to +the old one: + + * A server intended to run on the machine connected to my cheap printer talking + directly to the printer on one end and serving a + [json api](https://hackage.haskell.org/package/servant) on the other. + * A (hopefully) tiny cli tool for debugging and personal use. + * A website (it will probably end up being based on + [yesod](https://hackage.haskell.org/package/yesod)) presenting a web interface + similar to the cli tool. + +## Features + +Features I intend to implement include: + + * A parser for a bbcode-dialect which should be used in both the cli tool and the + website (it will probably end up using + [attoparsec](https://hackage.haskell.org/package/attoparsec)) -- bbcode as + presented on [Wikipedia](https://en.wikipedia.org/wiki/BBCode) is a proper + superset of the feature-set of my cheap Chinese printer. + * Reasonable test coverage using + [QuickCheck](https://hackage.haskell.org/package/QuickCheck), + [HUnit](http://hackage.haskell.org/package/HUnit). + + Automatic testing with [cabal](https://www.haskell.org/cabal/) facilitated by + [hspec](https://hackage.haskell.org/package/hspec). + * Support and server-side storage for drafts. + * The Website should provide some richer formats than bbcode which will + probably find inclusion in the payload datastructure such as lists, + checklists, tables, etc. + + The cli-tool should be able to use these too (the input will probably end up + being json-formatted). + +## Work so far + +### Prototype + +I already have a prototype. +It's quite bug-ridden and has recently developed serious problems actually +printing after working satisfactorily for a few weeks. + +It also does not include a web-interface and I am quite unsatisfied with the +overall code quality. + +The [685 lines of code](http://cloc.sourceforge.net/) can be found in the +[repo](https://git.yggdrasil.li/thermoprint?h=master) as well. + +### Rewrite + +Currently the [rewrite](https://git.yggdrasil.li/thermoprint?h=rewrite) contains a +single file of moment -- spec/src/Thermoprint/Printout.hs -- wherein we define +the payload for the api -- our take on a structured document format (somewhat +inspired by +[pandoc](http://hackage.haskell.org/package/pandoc-types/docs/Text-Pandoc-Definition.html)): + +~~~ {.haskell} +-- | A 'Printout' is a sequence of visually seperated 'Paragraph's +type Printout = Seq Paragraph + +-- | A 'Paragraph' is a non-seperated sequence of 'Chunk's +type Paragraph = Seq Chunk + +-- | We introduce both 'Chunk' and 'Paragraph' mainly to allow 'Raw'. +-- +-- Were we to disallow 'Raw', 'Block' would be identical to 'Paragraph' +data Chunk = Cooked Block -- ^ text semantically structured to be rendered in accordance with the display format of printer + | Raw ByteString -- ^ direct instructions to the printer + deriving (Generic, NFData, Show, CoArbitrary) + +-- | 'Block' is the entry point for our structured document format +data Block = Line Line -- ^ a single 'Line' of text + | VSpace Integer -- ^ vertical space of height equivalent to 'Integer' lines + | NewlSep (Seq Block) -- ^ A sequence of 'Block's seperated by newlines + deriving (Generic, NFData, Show, CoArbitrary) + +{- | A 'Line' is one of: + + * a single word + * horizontal space equivalent to the width of 'Integer' `em`. + * a sequence of words seperated by spaces + +We don't export all constructors and instead encourage the use of 'text'. +-} +data Line = Word Text + | HSpace Integer + | SpaceSep (Seq Line) + deriving (Generic, NFData, Show, CoArbitrary) +~~~ + +(The code is verbatim as of 8307d7e). + + + + + diff --git a/provider/posts/thermoprint/2.lhs b/provider/posts/thermoprint/2.lhs new file mode 100644 index 0000000..a144fb5 --- /dev/null +++ b/provider/posts/thermoprint/2.lhs @@ -0,0 +1,262 @@ +--- +title: On the design of a structured document format compatible with character oriented printers +published: 2016-01-11 +tags: Thermoprint +--- + +This post is an annotated version of the file [spec/src/Thermoprint/Printout.hs](https://git.yggdrasil.li/thermoprint/tree/spec/src/Thermoprint/Printout.hs?h=rewrite&id=f6dc3d1) as of commit `f6dc3d1`. + +> {-# LANGUAGE DeriveGeneric, DeriveAnyClass #-} +> {-# LANGUAGE OverloadedStrings #-} +> {-# OPTIONS_HADDOCK show-extensions #-} + +Motivation +---------- + +We want our codebase to be compatible with as many different models of printers as we are willing to implement. +It is therefore desirable to maintain a structured document format which we can transform into a printer-specific representation of the payload to be printed with minimal effort. + +In this post we present one such format. + +Contents +-------- + +> -- | This module contains the definition of the structure -- 'Printout' -- used to represent the content of a job +> module Thermoprint.Printout +> ( Printout(..) +> , Paragraph(..) +> , Chunk(..) +> , Block(..) +> , Line( HSpace +> , SpaceSep +> ) +> , text, cotext +> , prop_text +> ) where + +Preliminaries +------------- + +> import Data.Sequence (Seq, (|>), (<|)) + +A Sequence represents the same structure as the linked lists common in haskell but supports $O(1)$ `snoc`, which is desirable since we intend to iteratively build up the structure when parsing input formats. + +> import Data.Text.Lazy (Text) +> +> import Data.ByteString.Lazy (ByteString) + +The entire structure will be lazy by default but an instance of `NFData`, thus the lazy variants of `Text` and `ByteString`. + +> import GHC.Generics (Generic) + +We will use derived instances of `Generic` to get handed suitable instances of rather complicated classes such as `Arbitrary` and `FromJSON` + +> import Control.DeepSeq (NFData) + +Instances of `NFData` allow us to strictly evaluate our document structure when needed + +> import Data.Aeson (FromJSON(..), ToJSON(..), Value(..)) +> import qualified Data.Aeson as JSON (encode, decode) +> import Data.Aeson.Types (typeMismatch) + +We will encode the document as a [json](https://en.wikipedia.org/wiki/JSON) object during transport + +> import Test.QuickCheck.Arbitrary (Arbitrary(..), CoArbitrary, genericShrink) +> import Test.QuickCheck.Modifiers (NonNegative(..)) +> import Test.QuickCheck.Gen (oneof, suchThat, scale) +> import Test.QuickCheck.Instances +> import Test.QuickCheck (forAll, Property) + +We will use [QuickCheck](https://hackage.haskell.org/package/QuickCheck) for automatic test generation. + +> import qualified Data.Text.Lazy as TL (split, null, pack, filter, intercalate, map) +> import qualified Data.Text as T (pack) +> import Data.Char (isSpace) +> +> import Data.Monoid (Monoid(..), (<>)) +> +> import Data.List (dropWhile, dropWhileEnd, groupBy, genericLength, genericReplicate) +> +> import Data.Sequence as Seq (fromList, null, singleton) +> +> import Data.Function (on) +> +> import Data.Foldable (toList, fold) + +We will need to do some parsing and pretty-printing to implement `text` and `cotext`, respectively. + +> import Data.Encoding (encodeLazyByteStringExplicit, decodeLazyByteString) +> import Data.Encoding.UTF8 +> import qualified Data.ByteString.Base64.Lazy as Base64 (encode, decode) + +Since we want end users to be able to include direct instructions the printer in the form of a lazy [`ByteString`](https://hackage.haskell.org/package/bytestring) we need some way to encode `ByteString`s in JSON. +We chose [base64](https://hackage.haskell.org/package/base64-bytestring). + +> import Prelude hiding (fold) +> +> +> -- | A 'Printout' is a sequence of visually seperated 'Paragraph's +> type Printout = Seq Paragraph + +"visually seperated" will most likely end up meaning "seperated by a single blank line" + +> -- | A 'Paragraph' is a non-seperated sequence of 'Chunk's +> type Paragraph = Seq Chunk +> +> -- | We introduce both 'Chunk' and 'Paragraph' mainly to allow 'Raw'. +> -- +> -- Were we to disallow 'Raw', 'Block' would be identical to 'Paragraph' +> data Chunk = Cooked Block -- ^ text semantically structured to be rendered in accordance with the display format of printer +> | Raw ByteString -- ^ direct instructions to the printer +> deriving (Generic, NFData, Show, CoArbitrary) +> +> instance FromJSON Chunk where +> parseJSON s@(String _) = Raw <$> ((either fail return . decodeBase64) =<< parseJSON s) +> where +> decodeBase64 :: String -> Either String ByteString +> decodeBase64 s = (either (Left . show) Right . encodeLazyByteStringExplicit UTF8Strict $ s) >>= Base64.decode +> parseJSON o@(Object _) = Cooked <$> parseJSON o +> parseJSON v = typeMismatch "Chunk" v +> +> instance ToJSON Chunk where +> toJSON (Raw bs) = String . T.pack . decodeLazyByteString UTF8Strict . Base64.encode $ bs +> toJSON (Cooked block) = toJSON block + +We provide custom instances of `FromJSON Chunk` and `ToJSON Chunk` so that we might reduce the sice of the resulting JSON somewhat (this is an opportune target since disambiguaty is simple) + +> -- | 'Block' is the entry point for our structured document format +> data Block = Line Line -- ^ a single 'Line' of text +> | VSpace Integer -- ^ vertical space of height equivalent to 'Integer' lines +> | NewlSep (Seq Block) -- ^ A sequence of 'Block's seperated by newlines +> deriving (Generic, NFData, Show, CoArbitrary, FromJSON, ToJSON) +> +> {- | A 'Line' is one of: +> +> * a single word +> * horizontal space equivalent to the width of 'Integer' `em`. +> * a sequence of words seperated by spaces +> +> We don't export all constructors and instead encourage the use of 'text'. +> -} +> data Line = Word Text +> | HSpace Integer +> | SpaceSep (Seq Line) +> deriving (Generic, NFData, Show, CoArbitrary, FromJSON, ToJSON) +> +> instance Monoid Block where +> mempty = NewlSep mempty +> x@(NewlSep xs) `mappend` y@(NewlSep ys) +> | Seq.null xs = y +> | Seq.null ys = x +> | otherwise = NewlSep (xs <> ys) +> (NewlSep xs) `mappend` y +> | Seq.null xs = y +> | otherwise = NewlSep (xs |> y) +> x `mappend` (NewlSep ys) +> | Seq.null ys = x +> | otherwise = NewlSep (x <| ys) +> x `mappend` y = NewlSep $ Seq.fromList [x, y] +> +> instance Monoid Line where +> mempty = SpaceSep mempty +> x@(SpaceSep xs) `mappend` y@(SpaceSep ys) +> | Seq.null xs = y +> | Seq.null ys = x +> | otherwise = SpaceSep (xs <> ys) +> (SpaceSep xs) `mappend` y +> | Seq.null xs = y +> | otherwise = SpaceSep (xs |> y) +> x `mappend` (SpaceSep ys) +> | Seq.null ys = x +> | otherwise = SpaceSep (x <| ys) +> x `mappend` y = SpaceSep $ Seq.fromList [x, y] + +The Monoid instances for `Block` and `Line` are somewhat unwieldy since we want to guarantee minimum overhead by reducing expressions such as `SpaceSep (fromList [x])` to `x` whenever possible. + +The same effect would have been possible by introducing the monoid structure *one level higher* -- we could have introduced constructors such as `Line :: Seq Word -> Block`. +This was deemed undesirable since we would not have been able to implement instances such as `Monoid Line` which allow the use of more generic functions during parsing. + +> text :: Text -> Either Block Line +> -- ^ Smart constructor for 'Line'/'Block' which maps word and line boundaries (as determined by 'isSpace' and '(== '\n')' respectively) to the structure of 'Block' and 'Line'. +> -- +> -- Since we are unwilling to duplicate the list of chars from 'isSpace' we cannot reasonably determine a width for the various whitespace 'Char's. +> -- Thus they are all weighted equally as having width 1 `em`. +> text t = case splitLines t of +> [] -> Right mempty +> [Line x] -> Right x +> xs -> Left $ mconcat xs +> where +> splitLines :: Text -> [Block] +> splitLines t = map toBlock +> . groupBy ((==) `on` TL.null) +> $ TL.split (== '\n') t +> splitWords :: Text -> [Line] +> splitWords t = map toLine +> . groupBy ((==) `on` TL.null) +> $ TL.split isSpace t +> toBlock [] = mempty +> toBlock xs@(x:_) +> | TL.null x = VSpace $ genericLength xs - 1 +> | otherwise = mconcat . map (Line . mconcat . splitWords) $ xs +> toLine [] = mempty +> toLine xs@(x:_) +> | TL.null x = HSpace $ genericLength xs - 1 +> | otherwise = mconcat . map Word $ xs +> list :: b -> (a -> [a] -> b) -> [a] -> b +> list c _ [] = c +> list _ f (x:xs) = f x xs + +Implementations using `TL.lines` and `TL.words` were tested. +We chose to use `TL.split`-based solutions instead because the more specific splitting functions provided by [text](https://hackage.haskell.org/package/text) drop information concerning the exact amount of whitespace. + +> cotext :: Block -> Text +> -- ^ inverse of +> -- @ +> -- either id Line . `text` +> -- @ +> cotext (VSpace n) = TL.pack . genericReplicate n $ '\n' +> cotext (NewlSep xs) = TL.intercalate "\n" . map cotext . toList $ xs +> cotext (Line x) = cotext' x +> where +> cotext' (Word x) = x +> cotext' (HSpace n) = TL.pack . genericReplicate n $ ' ' +> cotext' (SpaceSep xs) = TL.intercalate " " . map cotext' . toList $ xs + +We provide cotext for testing `text` and to enable determining semantic equality of `Printout`s at a later date + +> prop_text :: Text -> Bool +> -- ^ prop> (`cotext` . either id Line . `text` $ x) == x +> -- +> -- Where 'x' is restricted to those `TL.Text` which do not contain whitespace besides ' ' and '\n'. +> prop_text x = (cotext . either id Line . text $ x') == x' +> where +> x' = TL.map normSpace x +> normSpace c +> | isSpace c +> , c `elem` keep = c +> | isSpace c = ' ' -- We have to do this because all whitespace gets interpreted as width 1 +> | otherwise = c +> keep = [' ', '\n'] +> +> -- | We don't test 'Raw' 'Chunk's +> instance Arbitrary Chunk where +> shrink = genericShrink +> arbitrary = Cooked <$> arbitrary +> +> instance Arbitrary Block where +> shrink = genericShrink +> arbitrary = oneof [ Line <$> arbitrary +> , VSpace . getNonNegative <$> arbitrary +> , NewlSep <$> scale' arbitrary +> ] +> +> instance Arbitrary Line where +> shrink = genericShrink +> arbitrary = oneof [ Word . TL.filter (not . isSpace) <$> arbitrary -- 'isSpace '\n' == True' +> , HSpace . getNonNegative <$> arbitrary +> , SpaceSep <$> scale' arbitrary +> ] +> +> scale' = scale (round . sqrt . fromInteger . toInteger) + +Failing to properly scale the tested structures was shown to use more than 8GiB of RAM during testing diff --git a/provider/posts/thermoprint/3.lhs b/provider/posts/thermoprint/3.lhs new file mode 100644 index 0000000..5e7eee9 --- /dev/null +++ b/provider/posts/thermoprint/3.lhs @@ -0,0 +1,92 @@ +--- +title: Thoughts on a network protocol for a toolset for interacting with character-oriented printers +published: 2016-01-11 +tags: Thermoprint +--- + +This post is an annotated version of the file [spec/src/Thermoprint/API.hs](https://git.yggdrasil.li/thermoprint/tree/spec/src/Thermoprint/API.hs?h=rewrite&id=3ad700c) as of commit `3ad700c`. + +> {-# LANGUAGE DeriveGeneric, DeriveAnyClass #-} +> {-# LANGUAGE TypeOperators, DataKinds #-} +> {-# LANGUAGE OverloadedStrings #-} +> +> module Thermoprint.API +> ( PrinterStatus(..) +> , JobStatus(..) +> , ThermoprintAPI +> , thermoprintAPI +> , module Thermoprint.Identifiers +> , module Thermoprint.Printout +> ) where +> +> import Thermoprint.Printout + +See [a previous post](https://dirty-haskell.org/posts/thermoprint-2.html). + +> import Thermoprint.Identifiers + +`Thermoprint.Identifiers` provides some newtypes of `Integer` to add some typesafety to dealing with objects identified by autoincremented numbers + +> import Servant.API +> import Servant.Docs +> import Data.Aeson + +We will define our API to be compatible with [servant](https://hackage.haskell.org/package/servant) + +> import Data.Set (Set) +> import Data.Sequence (Seq) + +Higher performance versions of lists for our various applications + +> import GHC.Generics (Generic) +> +> import Data.Proxy (Proxy(..)) +> +> import Control.Exception (Exception) +> import Data.Typeable (Typeable) +> +> data PrinterStatus = Busy JobId +> | Available +> deriving (Generic, Show, FromJSON, ToJSON) +> +> data JobStatus = Queued +> | Printing +> | Done +> | Failed PrintingError +> deriving (Generic, Show, FromJSON, ToJSON) +> +> data PrintingError = UnknownError +> deriving (Typeable, Generic, Show, FromJSON, ToJSON, Exception) + +We expect the definiton of `PrintingError` to grow considerably while implementing a server for this API + +We support the following actions through our API: + +> type ThermoprintAPI = "printers" :> Get '[JSON] (Set PrinterId) -- List the identifiers of all available printers (/printers) +> :<|> "printer" :> Capture "printerId" PrinterId :> ( +> ReqBody '[JSON] Printout :> Post '[JSON] JobId -- Add a new job to the bottom of the queue by sending its content (/printer:printerId) +> :<|> "status" :> Get '[JSON] PrinterStatus -- Query the current status of a printer (/printer:printerId/status) +> ) +> :<|> "jobs" :> ( +> QueryParam "printer" PrinterId :> QueryParam "min" JobId :> QueryParam "max" JobId :> Get '[JSON] (Seq JobId) -- List all jobs allowing for selection by printerId and pagination (/jobs?printer=*&min=*&max=*) +> ) +> :<|> "job" :> Capture "jobId" JobId :> ( +> Get '[JSON] Printout -- Get the contents of a job currently known to the server (/job:jobId) +> :<|> "status" :> Get '[JSON] JobStatus -- Get the status of a job (/job:jobId/status) +> :<|> "printer" :> Get '[JSON] PrinterId -- Find the printer a job was queued for (/job:jobId/printer) +> :<|> Delete '[] () -- Abort a job (which we expect to make it unknown to the server) (/job:jobId) +> ) +> :<|> "drafts" :> ( +> Get '[JSON] (Set DraftId) -- List the identifiers of all drafts known to the server (/drafts) +> :<|> ReqBody '[JSON] Printout :> Post '[JSON] DraftId -- Make a draft known to the server by submitting its contents (/drafts) +> ) +> :<|> "draft" :> Capture "draftId" DraftId :> ( +> ReqBody '[JSON] Printout :> Put '[] () -- Update a draft by replacing its contents (/draft:draftId) +> :<|> Get '[JSON] Printout -- Get the contents of a draft (/draft:draftId) +> :<|> Delete '[] () -- Delete a draft (/draft:draftId) +> ) +> +> thermoprintAPI :: Proxy ThermoprintAPI +> thermoprintAPI = Proxy + +servant needs an object of type `Proxy ThermoprintAPI` in various places diff --git a/provider/posts/thermoprint/4.md b/provider/posts/thermoprint/4.md new file mode 100644 index 0000000..756c166 --- /dev/null +++ b/provider/posts/thermoprint/4.md @@ -0,0 +1,116 @@ +--- +title: On the Design of a Parser +published: 2016-01-12 +tags: Thermoprint +--- + +The concrete application we’ll be walking through is a naive parser for [bbcode](https://en.wikipedia.org/wiki/BBCode) +-- more specifically the contents of the directory `bbcode` in the +[git repo](https://git.yggdrasil.li/thermoprint/tree/bbcode?h=rewrite&id=dc99dae). + +In a manner consistent with designing software as +[compositions of simple morphisms](https://en.wikipedia.org/wiki/Tacit_programming) we start by determining the type of +our solution (as illustrated by the following mockup): + +~~~ {.haskell} +-- | Our target structure -- a rose tree with an explicit terminal constructor +data DomTree = Element Text (Map Text Text) [DomTree] + | Content Text + deriving (Show, Eq) + +bbcode :: Text -> Maybe DomTree +-- ^ Parse BBCode +~~~ + +Writing a parser capable of dealing with `Text` directly from scratch would be unnecessarily abstruse, we’ll be using +the [attoparsec](https://hackage.haskell.org/package/attoparsec/docs/Data-Attoparsec-Text.html) family of parser +combinators instead. + +We reproduce an incomplete version of the lexer below (it’s missing tag attributes and self-closing tags). + +We introduce `escapedText`, a helper function for extracting text until we reach one of a set of delimiting characters +(exclusive). +While doing this we also parse any delimiting character iff it's prefixed with an escape character (we use `\`) -- the +escape character itself needs only be escaped if encountered directly before one of the delimiting characters. + +~~~ {.haskell} +data Token = BBOpen Text -- ^ "[open]" + | BBClose Text -- ^ "[/close]" + | BBStr Text -- ^ "text" + +token :: Parser [Token] +token = BBClose <$ "[/" <*> escapedText' [']'] <* "]" + <|> BBOpen <$ "[" <*> escapedText' [']'] <* "]" + <|> BBStr <$> escapedText ['['] + +escapedText' :: [Char] -> Parser Text +escapedText' = option "" . escapedText + +escapedText :: [Char] -> Parser Text +escapedText [] = takeText -- No delimiting characters -- parse all remaining input +escapedText cs = recurse $ choice [ takeWhile1 (not . special) -- a series of characters we don't treat as special + , escapeSeq -- an escaped delimiter + , escapeChar' -- the escape character + ] + where + escapeChar = '\\' + special = inClass $ escapeChar : cs + escapeChar' = string $ T.singleton escapeChar + escapeSeq = escapeChar' *> (T.singleton <$> satisfy special) -- escape character followed by a special character (which encludes the escape character) + recurse p = mappend <$> p <*> escapedText' cs -- parse a prefix and optionally append another chunk of escapedText + +runTokenizer :: Text -> Maybe [Token] +runTokenizer = either (const Nothing) Just . parseOnly (many token <* endOfInput) +~~~ + +We have now reduced the Problem to `[Token] -> DomTree`. +We quickly see that the structure of the problem is that of a +[fold](https://hackage.haskell.org/package/base/docs/Data-Foldable.html). + +Having realised this we require a function of type `Token -> DomTree -> DomTree` to recursively build up our target +structure. + +In general we’ll want to not only keep track of the `DomTree` during recursion but also maintain a reference to the +position at which we’ll be inserting new tokens. +This kind of problem is well understood and solved idiomatically by using a +[zipper](https://en.wikipedia.org/wiki/Zipper_(data_structure)) +([a cursory introduction](http://learnyouahaskell.com/zippers)). + +Writing zippers tends to be tedious. We’ll therefore introduce an +[additional intermediate structure](https://hackage.haskell.org/package/containers/docs/Data-Tree.html) for which an +[implementation](https://hackage.haskell.org/package/rosezipper) is available readily. +The morphism from this new structure (`Forest BBLabel`) to our `DomTree` will be almost trivial. + +~~~ {.haskell} +import Data.Tree.Zipper (TreePos, Empty, Full) +import qualified Data.Tree.Zipper as Z + +data BBLabel = BBTag Text + | BBPlain Text + +rose :: [BBToken] -> Maybe (Forest BBLabel) +rose = Z.toForest <$> foldM (flip rose') (Z.fromForest []) + +rose' :: BBToken -> TreePos Empty BBLabel -> Maybe (TreePos Empty BBLabel) +rose' (BBStr t) = return . Z.nextSpace . Z.insert (Node (BBPlain t) []) -- insert a node with no children and move one step to the right in the forest we’re currently viewing +rose' (BBOpen t) = return . Z.children . Z.insert (Node (BBTag t) []) -- insert the node and move into position to insert it's first child +rose' (BBClose t) = close t -- haskell complains if multiple equations for the same function have a differing number of arguments, therefore: 'close' + where + close :: Text -> TreePos Empty BBLabel -> Maybe (TreePos Empty BBLabel) + close tag pos = do + pos' <- Z.parent pos -- fail if we're trying to close a tag that does not have a parent (this indicates imbalanced tags) + let + pTag = (\(BBTag t) -> t) $ Z.label pos' -- yes, this will fail unceremoniously if the parent is not a tag, this poses no problem since we're constructing the structure ourselves. The proof that this failure mode does not occur is left as an exercise for the reader. + guard (pTag == tag) -- The structure shows that this mode of failure (opening tags content does not match the closing tags) is not logically required -- it only serves as a *notification* to the user + return $ Z.nextSpace pos' -- move one level up and to point at the next sibling of the parent +~~~ + +All that is left to do now is present our final morphism: + +~~~ {.haskell} +dom :: Forest BBLabel -> [DomTree] +dom = map dom' + where + dom' (Node (BBPlain t) []) = Content t + dom' (Node (BBTag t) ts = Element t $ map dom' ts +~~~ diff --git a/provider/posts/thermoprint/5.md b/provider/posts/thermoprint/5.md new file mode 100644 index 0000000..0249734 --- /dev/null +++ b/provider/posts/thermoprint/5.md @@ -0,0 +1,198 @@ +--- +title: Building an Extensible Framework for Specifying Compile-Time Configuration using Universal Quantification +tags: Thermoprint +published: 2016-02-18 +--- + +When I write *Universal Quantification* I mean what is commonly referred to as +[existential quantification](https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/data-type-extensions.html#existential-quantification), +which I think is a misnomer. To wit: + +$( \exists x \ldotp f(x) ) \to y$ is isomorphic to $\forall x \ldotp (f(x) \to y)$ (I +won´t try to back this claim up with actual category theory just now. You might want to +nag me occasionally if this bothers you -- I really should invest some more time into +category theory). Since haskell does not support `exists` we´re required to use the +`forall`-version, which really is universally quantified. + +## Printer Configuration + +What we want is to have the user provide us with a set of specifications of how to +interact with one printer each. +Something like the following: + +~~~ {.haskell} +newtype PrinterMethod = PM { unPM :: Printout -> IO (Maybe PrintingError) } + +data Printer = Printer + { print :: PrinterMethod + , queue :: TVar Queue + } +~~~ + +The first step in refining this is necessitated by having the user provide the +[monad-transformer-stack](http://book.realworldhaskell.org/read/monad-transformers.html) +to use at compile time. +Thus we introduce our first universal quantification (in conjunction with +[polymorphic components](https://prime.haskell.org/wiki/PolymorphicComponents)) -- this +one is not isomorphic to an existential one: + +~~~ {.haskell} +newtype PrinterMethod = PM { unPm :: forall m. MonadResource m => Printout -> m (Maybe PrintingError) } +~~~ + +Since we don´t want to *burden* the user with the details of setting up `TVar Queue`{.haskell} we +also introduce function to help with that: + +~~~ {.haskell} +printer :: MonadResource m => PrinterMethod -> m Printer +printer p = Printer p <$> liftIO (newTVarIO def) +~~~ + +We could at this point provide ways to set up `PrinterMethod`{.haskell}s and have the user +provide us with a list of them. + +We, however, have numerous examples of printers which require some setup (such opening a +file descriptor). The idiomatic way to handle this is to decorate that setup with some +constraints and construct our list of printers in an +[`Applicative`{.haskell}](https://hackage.haskell.org/package/base/docs/Control-Applicative.html#t:Applicative) +fashion: + +~~~ {.haskell} +printer :: MonadResource m => m PrinterMethod -> m Printer +printer p = Printer <$> p <*> liftIO (newTVarIO def) +~~~ + +At this point a toy implementation of a printer we might provide looks like this: + +~~~ {.haskell} +debugPrint :: Applicative m => m PrinterMethod +debugPrint = pure . PM $ const return Nothing <=< liftIO . putStrLn . toString + +toString :: Printout -> String +toString = undefined +~~~ + +## Management of Printer Queues + +We would like the user to be able to modify the printer queues we maintain in arbitrary +ways. +The motivation for this being various cleanup operations such as pruning all successful +jobs older than a few minutes or limiting the size of history to an arbitrary number of +entries. + +A pattern for this type of modification of a value residing in a `TVar`{.haskell} might +look like this: + +~~~ {.haskell} +modify :: TVar a -> StateT a STM () -> IO () +modify q f = atomically $ writeTVar =<< runStateT f =<< readTVar q +~~~ + +A rather natural extension of this is to allow what we will henceforth call a +`QueueManager`{.haskell} (currently `StateT a STM ()`{.haskell}) to return an indication +of when it wants to be run again: + +~~~ {.haskell} +type QueueManager = StateT Queue STM Micro + +runQM :: QueueManager -> TVar Queue -> IO () +runQM qm q = sleep << qm' + where + qm' = atomically $ (\(a, s) -> a <$ writeTVar q s) =<< runStateT qm =<< readTVar q + sleep (abs -> delay) = threadDelay (fromEnum delay) >> runQM qm q +~~~ + +It stands to reason that sometimes we don't want to run the `QueueManager`{.haskell} ever +again (probably causing the thread running it to terminate). +For doing so we +[extend the real numbers](https://en.wikipedia.org/wiki/Extended_real_number_line) as +represented by `Micro`{.haskell} to +[`Extended Micro`{.haskell}](https://hackage.haskell.org/package/extended-reals): + +~~~ {.haskell} +type QueueManager = StateT Queue STM (Extended Micro) + +runQM … + where + … + sleep (abs -> delay) + | (Finite d) <- delay = threadDelay (fromEnum d) >> runQM qm q + | otherwise = return () +~~~ + +`QueueManager`{.haskell}s whose type effectively is `Queue -> STM (Queue, Extended Micro)`{.haskell} +are certainly useful but can carry no state between invocations (which would be useful +e.g. for limiting the rate at which we prune jobs). + +Therefore we allow the user to provide an arbitrary monad functor (we use +`MFunctor`{.haskell} from +[mmorph](https://hackage.haskell.org/package/mmorph-1.0.6/docs/Control-Monad-Morph.html#t:MFunctor) +instead of `Servant.Server.Internal.Enter` because +[servant-server](https://hackage.haskell.org/package/servant-server-0.4.4.6/docs/Servant-Server-Internal-Enter.html#v:Nat) +doesn't provide all the tools we require for this) which can carry all the state we could +ever want: + +~~~ {.haskell} +type QueueManager t = QueueManagerM t (Extended Micro) +type QueueManagerM t = ComposeT (StateT Queue) t STM -- 'ComposeT' is required since we need 'QueueManagerM' to have the form 't' STM' for some 't'' in order to be able to use 'lift' + +runQM :: (MFunctor t, MonadTrans t, MonadIO (t IO), Monad (t STM)) => QueueManager t -> TVar Queue -> t IO () +runQM … -- nearly identical except for a sprinkling of 'lift' +~~~ + +The final touches are to introduce a typeclass `HasQueue`{.haskell} for convenience: + +~~~ {.haskell} +class HasQueue a where + extractQueue :: a -> TVar Queue + +instance HasQueue (TVar Queue) where + extractQueue = id + +instance HasQueue Printer where + extractQueue = queue +~~~ + +and provide some utility functions for composing `QueueManager`{.haskell}s: + +~~~ {.haskell} +intersection :: (Foldable f, MonadState Queue (QueueManagerM t)) => f (QueueManager t) -> QueueManager t +-- ^ Combine two 'QueueManager's keeping only 'QueueEntry's both managers decide to keep +-- +-- Side effects propagate left to right + +idQM :: Monad (QueueManagerM t) => QueueManager t +-- ^ Identity of 'intersect' + +union :: (Foldable f, MonadState Queue (QueueManagerM t)) => f (QueueManager t) -> QueueManager t +-- ^ Combine two 'QueueManager's keeping all 'QueueEntry's either of the managers decides to keep +-- +-- Side effects propagate left to right + +nullQM :: MonadState Queue (QueueManagerM t) => QueueManager t +-- ^ Identity of 'union' +~~~ + +We merge the effects of two `QueueManager`{.haskell}s by converting the resulting +`Queue`{.haskell}s to `Set`{.haskell}s and using `Set.union`{.haskell} and +`Set.intersection`{.haskell} with appropriate `Ord`{.haskell} and `Eq`{.haskell} +instances. + +### Configuration of `QueueManager`{.haskell}s + +A `QueueManager`{.haskell}s configuration shall be a `QueueManager t`{.haskell} associated +with a specification of how to collapse its monad transformer `t`{.haskell}. +Using universal quantification this is straightforward: + +~~~ {.haskell} +data QMConfig m = forall t. ( MonadTrans t + , MFunctor t + , Monad (t STM) + , MonadIO (t IO) + ) => QMConfig { manager :: QueueManager t + , collapse :: (t IO) :~> m + } + +runQM' :: Printer -> QMConfig m -> m () +runQM' printer (QMConfig qm nat) = unNat nat $ runQM qm printer +~~~ diff --git a/provider/posts/thermoprint/6.lhs b/provider/posts/thermoprint/6.lhs new file mode 100644 index 0000000..9182427 --- /dev/null +++ b/provider/posts/thermoprint/6.lhs @@ -0,0 +1,142 @@ +--- +title: Deriving a Client Library for Interacting with Character-Oriented Printers +tags: Thermoprint +published: 2016-02-18 +--- + +> {-# LANGUAGE DataKinds #-} +> {-# LANGUAGE TypeOperators #-} +> {-# LANGUAGE ViewPatterns #-} +> {-# LANGUAGE RecordWildCards #-} +> +> -- | A client library for 'Thermoprint.API' +> module Thermoprint.Client +> ( Client(..) +> , mkClient, mkClient' +> , throwNat, ioNat +> -- = Reexports +> , ServantError(..) +> , module Servant.Common.BaseUrl +> , module Control.Monad.Trans.Either +> , module Servant.Server.Internal.Enter +> ) where +> +> import Thermoprint.API +> import Data.Map (Map) +> import Data.Sequence (Seq) +> import Data.Time (UTCTime) +> +> import Servant.Client hiding (HasClient(..)) +> import qualified Servant.Client as S +> import Servant.Common.BaseUrl +> import Servant.API +> import Servant.Server.Internal.Enter +> import Control.Monad.Trans.Either +> +> import Control.Monad.Catch (Exception, MonadThrow(..)) +> import Control.Monad.IO.Class (MonadIO(..)) +> +> import Control.Monad +> import Control.Category +> import Prelude hiding (id, (.)) +> +> instance Exception ServantError + +We encapsulate all api operations in a single record parametrized over the monad we intend to use +them in. +Construction of such a record is pure since all we require to do so is a `BaseUrl`{.haskell}. +Using RecordWildCards we can bring all operations into scope with extreme ease. + +> -- | All 'ThermoprintAPI'-functions as a record +> -- +> -- Use like this: +> -- +> -- > {-# LANGUAGE RecordWildCards #-} +> -- > +> -- > main :: IO () +> -- > -- ^ Display a list of printers with their status +> -- > main = print =<< printers +> -- > where Client{..} = mkClient' $ Http "localhost" 3000 +> data Client m = Client +> { printers :: m (Map PrinterId PrinterStatus) +> -- ^ List all printers +> , jobs :: Maybe PrinterId +> -> Maybe (Range (JobId)) +> -> Maybe (Range (UTCTime)) +> -> m (Seq (JobId, UTCTime, JobStatus)) +> -- ^ List a selection of jobs +> , jobCreate :: Maybe PrinterId -> Printout -> m JobId +> -- ^ Send a 'Printout' to be queued +> , job :: JobId -> m Printout +> -- ^ Retrieve the contents of a job +> , jobStatus :: JobId -> m JobStatus +> -- ^ Query a jobs status +> , jobDelete :: JobId -> m () +> -- ^ Delete a job from the queue (not from history or while it is being printed) +> , drafts :: m (Map DraftId (Maybe DraftTitle)) +> -- ^ List all saved drafts +> , draftCreate :: Maybe DraftTitle +> -> Printout +> -> m DraftId +> -- ^ Create a new draft +> , draftReplace :: DraftId +> -> Maybe DraftTitle +> -> Printout +> -> m () +> -- ^ Replace the contents and title of an existing draft +> , draft :: DraftId -> m (Maybe DraftTitle, Printout) +> -- ^ Retrieve the contents and title of a draft +> , draftDelete :: DraftId -> m () +> -- ^ Delete a draft +> , draftPrint :: DraftId -> Maybe PrinterId -> m JobId +> -- ^ Send a draft to be printed +> } + +[servant documentation](https://haskell-servant.github.io/tutorial/server.html#nested-apis) +advises factoring out apis to make the specification more concise. +We are rightly advised that doing so has an effect on the types of the +corresponding `Server`{.haskell}s and `Client`{.haskell}s. +To cope with this we introduce a helper function that allows us, when +used with ViewPatterns, to nontheless simply pattern match on `client`{.haskell}. + +> withArgs :: (x -> a :<|> b) -> ((x -> a) :<|> (x -> b)) +> -- ^ Undo factoring of APIs +> withArgs subAPI = (\(a :<|> _) -> a) . subAPI :<|> (\(_ :<|> b) -> b) . subAPI + +`withArgs`{.haskell} as presented here does not recurse and thus +doesn't handle more than one occurence of `:<|>`{.haskell}. +We have to to so ourselves using nested ViewPatterns (see +`mkClient`{.haskell}). + +> mkClient :: (EitherT ServantError IO :~> m) -- ^ A monad functor ('Nat') used to make the api functions work in any monad which can do 'IO' and handle errors +> -> BaseUrl +> -> Client m +> -- ^ Generate a 'Client' + +RecordWildCards also allows us to construct a record from components +in scope. + +> mkClient n url = Client{..} +> where +> printers +> :<|> (jobs :<|> jobCreate) +> :<|> (withArgs -> job :<|> (withArgs -> jobStatus :<|> jobDelete)) +> :<|> (drafts :<|> draftCreate) +> :<|> (withArgs -> draftReplace :<|> (withArgs -> draft :<|> (withArgs -> draftDelete :<|> draftPrint))) +> = enter n $ client thermoprintAPI url + +We also provide some additional convenience functions so the user +doesn't have to construct their own `Nat`{.haskell}ural +transformations. + +> mkClient' :: (MonadThrow m, MonadIO m) => BaseUrl -> Client m +> -- ^ @mkClient' = mkClient $ ioNat . throwNat@ +> mkClient' = mkClient $ ioNat . throwNat +> +> throwNat :: (Exception e, MonadThrow m) => EitherT e m :~> m +> -- ^ Squash a layer of 'EitherT' into the underlying monad supporting 'throwM' +> throwNat = Nat $ either throwM return <=< runEitherT +> +> ioNat :: MonadIO m => IO :~> m +> -- ^ @ioNat = Nat liftIO@ +> ioNat = Nat liftIO diff --git a/provider/posts/ymir.md b/provider/posts/ymir.md deleted file mode 100644 index 83e5811..0000000 --- a/provider/posts/ymir.md +++ /dev/null @@ -1,7 +0,0 @@ ---- -title: Moved servers -published: 2015-11-05 -tags: Blog Software ---- - -dirty-haskell.org now lives on ymir.yggdrasil.li. -- cgit v1.2.3