summaryrefslogtreecommitdiff
path: root/provider/posts/beuteltier
diff options
context:
space:
mode:
Diffstat (limited to 'provider/posts/beuteltier')
-rw-r--r--provider/posts/beuteltier/1.lhs327
-rw-r--r--provider/posts/beuteltier/2.lhs179
-rw-r--r--provider/posts/beuteltier/3.md19
-rw-r--r--provider/posts/beuteltier/4.lhs220
4 files changed, 745 insertions, 0 deletions
diff --git a/provider/posts/beuteltier/1.lhs b/provider/posts/beuteltier/1.lhs
new file mode 100644
index 0000000..7789f40
--- /dev/null
+++ b/provider/posts/beuteltier/1.lhs
@@ -0,0 +1,327 @@
1---
2title: On the Design of Overly Complicated Feedreaders
3published: 2015-08-04
4tags: Beuteltier
5---
6
7I like feedreaders.
8Thus, of course, I had to implement my own, because, as always, all existing software does
9not fullfill my exceedingly unrealistic expectations with respect to customizability and
10extendability.
11
12This post marks the start of a series describing and documenting the design of the current
13iteration of `Beuteltier` (`Beutel` kind of sounds like [Beuter](https://newsbeuter.org)
14and is german for bag, which is what we call our backstore since it is held at such an
15universal and unstructured level that the analogy is fitting. `Tier` is german for animal
16and taken to mean "Thing that does stuff". In conjunction `Beuteltier` means
17[Marsupial](https://en.wikipedia.org/wiki/Marsupial)).
18
19It should be noted that the library described here is not finished or ready for use in any
20sense of the word (at the time of writing a "trivial" implementation of a `Beutel` shipped
21with the library supports only `run`, `search`, and `delete`). Searching a way to
22procrastinate implementing the more arduous `insert` (it requires nubbing—deduplication in
23the backstore) I decided to, instead, start this series of posts and put the thought that
24went into the library so far in a form that I can read again for later reference.
25
26We begin, as is to be expected for a haskell project, with type definitions and, thus,
27design philosophy.
28
29This post in particular reproduces the file `beuteltier/Beuteltier/Types.hs` from the
30git repo with annotiations to provide some motivation.
31
32The `Beuteltier` library itself only provides primitives for (and a default implementation
33of) access to what we call a backstore. A backstore is, to us, an instance of the
34typeclass `Beutel` which contains the most primitive of primitives for storing, searching
35for and deleting representations of the objects we care about from the store.
36
37It is recommended that the reader not try to follow the rest of this post linearly but start
38at the end with the definition of the `Beutel` class and work their way backwards.
39
40> {-# LANGUAGE FlexibleInstances, StandaloneDeriving, KindSignatures, MultiParamTypeClasses, TypeFamilies #-}
41>
42> module Beuteltier.Types
43> ( -- * Types
44> Object
45> , ObjectGen(..)
46> , SubObject(..)
47> , MetaData(..)
48> , Thunk(..)
49> , ThunkState(..)
50> , ThunkResult(..)
51> , Tag
52> , Flag(..)
53> , SubObjectName
54> , ThunkName
55> , SearchQuery
56> , Predicate
57> , Beutel(..)
58> ) where
59
60`Flag` ends up being a [sum type](https://en.wikipedia.org/wiki/Sum_type) holding values
61such as `Seen`, `Old`, or `Hidden`.
62We define it externally.
63
64> import Beuteltier.Types.Flags
65
66The `Identity` functor serves as basis for many a Monadtransformer-stack.
67
68> import Data.Functor.Identity
69> import Data.Functor.Classes ()
70
71Binary contents are encoded as `ByteStrings`
72
73> import qualified Data.ByteString.Lazy as Lazy (ByteString)
74> import qualified Data.ByteString.Lazy as LBS
75
76Unicode text as `Text`
77
78> import Data.Text (Text)
79
80Long unicode text as lazy `Text`
81
82> import qualified Data.Text.Lazy as Lazy (Text)
83> import qualified Data.Text.Lazy as LT
84>
85> import Data.Set (Set)
86>
87> import Data.Map (Map)
88>
89> import Data.Time (UTCTime)
90>
91> import Data.Function (on)
92> import Data.Ord (comparing)
93> import Control.Applicative
94
95`Data.Default` provides some convenience when constructing extensive record structures.
96
97> import Data.Default
98
99The `boolexpr` package provides us with a structure for representing boolean expressions
100supporting functor operations and evaluation.
101
102> import Data.BoolExpr
103
104Previous iterations of Beuteltier acted on Objects that were kept completely in RAM during
105all operations.
106This proved to be unsustainable, not only because nubbing (deduplication in the store of
107all objects) tended to exceed all RAM constraints (>4GiB for a few hundred objects), but
108also because cheaper operations on objects, like presentation to the user, got painfully
109slow once large `SubObject`s (like videos) were introduced into the store.
110
111The straight forward solution was to enrich the `Object` structure with provisions for
112explicit lazyness and partial construction.
113
114> -- | We deal in, at runtime, partially retrieved Objects
115> data ObjectGen (f :: * -> *) = ObjectGen
116> { _oMeta :: f MetaData
117> -- ^ An undetermined set of Metainformation
118> , _oContent :: f (Map SubObjectName (f SubObject))
119> -- ^ A list of undetermined length of undetermined
120> -- 'SubObject's with guaranteed unique 'SubObjectName's
121> , _oThunks :: f [f Thunk]
122> -- ^ A list of undetermined length of undetermined Thunks.
123> -- There is such a thing as thunk colissions (i.e.: two
124> -- thunks promise or even create 'SubObject's with the
125> -- same name).
126> -- Precedence in such a case is to be as suggested by
127> -- the list structure (later thunks override earlier ones).
128> }
129>
130> instance Monad f => Default (ObjectGen f) where
131> def = ObjectGen { _oContent = return def
132> , _oThunks = return def
133> , _oMeta = return def
134> }
135
136It is straight forward to collapse the more advanced representation of `Object`s back to
137the old behaviour by parametrising over the Identity functor, which is simply a newtype
138wrapper over the contained structure.
139
140> -- | An entirely retrieved Object
141> type Object = ObjectGen Identity
142>
143> -- -- | The default 'Object' is empty except for metadata
144> -- instance Default Object where
145> -- def = ObjectGen { _oContent = return def
146> -- , _oThunks = return def
147> -- , _oMeta = return def
148> -- }
149>
150> -- | Equality simply gets deferred to all subcomponents
151> deriving instance Eq Object
152>
153> -- | 'Object's compare as their 'MetaData'
154> instance Ord Object where
155> compare = comparing _oMeta
156
157We would like to associate some set of meta information with all objects.
158Therefore, we do.
159
160> -- | Metadata associated with an Object
161> data MetaData = MetaData
162> { _mRetrieved :: UTCTime -- ^ Time of creation
163> , _mTags :: Set Tag -- ^ Tags such as the name of the author,
164> -- the title of the work represented in
165> -- the 'Object', ….
166> -- We use something like @show . _mTags@
167> -- to identify an 'Object' to the user
168> , _mFlags :: Set Flag -- ^ Flags such as \"Read\" or \"Spam\"
169> } deriving (Show, Ord)
170> -- | Tags are unicode text
171> type Tag = Text
172>
173> -- | 'MetaData' equates as the contained tags
174> instance Eq MetaData where
175> (==) = (==) `on` _mTags
176>
177> -- | The default MetaData has no tags, no flags, and an undefined timestamp
178> instance Default MetaData where
179> def = MetaData { _mFlags = def
180> , _mTags = def
181> , _mRetrieved = undefined -- There really is no such thing as a default time
182> }
183
184Objects are no fun if they don´t contain anything of interest in the end.
185
186Below we see a remnant of an older model of associating names to `SubObject`s. We switched
187to using a `Map` for reasons of deduplication. Inserting into a `Map` carries some
188guarantees that keys end up being unique.
189
190Note below: creation of a `SubObject` is an update. It is thus expected, that `SubObject`s
191created at the same time as the `Object` they are associated to encode an update
192time that matches the `Object`s creation time.
193
194> -- | Contents of an object
195> data SubObject = SubObject
196> -- { _sId :: SubObjectName
197> -- ^ We associate a name to every chunk of content to determine
198> -- how to present an object to the user
199> { _sContent :: Lazy.ByteString
200> , _sUpdates :: [UTCTime]
201> -- ^ Times of witnessed updates to this 'SubObject'
202> } deriving (Show)
203>
204> -- | No content, no witnessed updates
205> instance Default SubObject where
206> def = SubObject { _sContent = def
207> , _sUpdates = def
208> }
209>
210> -- | Extensionality for 'SubObject's:
211> --
212> -- > (==) = (==) `on` _sContent
213> instance Eq SubObject where
214> (==) = (==) `on` _sContent
215
216The distinguishing feature of Beuteltier is it´s support for `Thunk`s. They are, as the
217name suggests, loosly based on the concept of lazy evaluation. They are, however, less
218transparent and thus more explicit than implementations as they are used in, for example
219haskell.
220
221As far as Beuteltier is concerned `Thunk`s are executables that are expected to produce
222files in the directory they are executed in in a pure manner. That is to say they do not
223access external resources, where possible. A `Thunk` that downloads a video from the
224internet will, of course, access the internet and can thus fail. We expect it, however, to
225not to try and access the users home directory to look for e.g. credentials for
226authentication it intends to use to its primary job.
227
228When a `Thunk`s executable gets executed the files it creates (excluding itself) get
229translated to `SubObject`s with the filenames (directories stripped of course) as their
230`SubObjectName`s and the file contents as their… well, their contents. It is understood,
231that not all possible `SubObjectName`s can be created thus (we restrict ourselves to valid
232filenames on whatever system we happen to be on). We do not consider this to be a great
233loss.
234
235The advanced equality checks mentioned below are, in fact, implemented and will be explained
236in more detail in a later post concerned with the file `beuteltier/Beuteltier/Types/Util.hs`.
237
238> -- | Thunks are at runtime not yet known parts of an object
239> data Thunk = Thunk
240> { _tId :: ThunkName -- ^ For debugging
241> , _tScript :: Lazy.ByteString
242> -- ^ A Thunk is, in the end, a shell script that is expected to generate
243> -- 'SubObject's
244> , _tPromises :: Maybe [SubObjectName]
245> -- ^ Maybe we already know what our script is going to generate?
246> -- This would enable us to do some more advanced equality checks under
247> -- the assumption that scripts are pure
248> , _tState :: ThunkState
249> }
250> deriving (Show)
251>
252> -- | Empty id, empty script, promises nothing, and with default state
253> instance Default Thunk where
254> def = Thunk { _tId = def
255> , _tScript = def
256> , _tPromises = def
257> , _tState = def
258> }
259>
260> -- | Equality on 'Thunk's ignores '_tState' and '_tId'
261> instance Eq Thunk where
262> a == b = and $ [ (==) `on` _tScript
263> , (==) `on` _tPromises
264> ] <*> pure a <*> pure b
265>
266> -- | The states in which a 'Thunk' can be encountered.
267> data ThunkState = NotExecuted
268> | Executed [SubObjectName] ThunkResult
269> deriving (Show)
270>
271> -- | Return the default 'ThunkResult' upon forcing
272> instance Default ThunkState where
273> def = NotExecuted
274>
275> -- | Thunks generate some data during execution
276> data ThunkResult = ThunkResult
277> { _rOutErr, _rOutStd :: Lazy.Text
278> , _rExit :: Integer -- ^ Numerical exit code (0 usually means success)
279> }
280> deriving (Show)
281>
282> -- | Empty output, and with undefined exit code (no execution took place and we can´t
283> -- encode this in a numerical exit code)
284> instance Default ThunkResult where
285> def = ThunkResult { _rOutErr = LT.empty, _rOutStd = LT.empty
286> , _rExit = undefined
287> }
288>
289> -- | We expect identifiers for 'SubObject's to be short, thus 'String'
290> type SubObjectName = String
291> -- | We expect identifiers for 'Thunk's to be short, thus 'String'
292> type ThunkName = String
293>
294> -- | @LBS.empty@
295> instance Default (Lazy.ByteString) where
296> def = LBS.empty
297
298What good is a library for managing a backstore if it does not support search operations?
299We consider the answer to be "very little" and, thus, support searches.
300
301> type SearchQuery f = BoolExpr (Predicate f)
302> -- data Predicate f = Prim (ObjectGen f -> f Bool)
303> -- | Meta (MetaData -> Bool)
304> type Predicate f = ObjectGen f -> f Bool
305
306The heart of the `Beuteltier` library is the typeclass reproduced below. We expect
307implementations of backstores to be `Monad`s so that we may be able to construct
308complicated actions that act on the backstore in question.
309Once we have constructed such an action using the three primitives `search`, `insert`, and
310`delete` we additionally require a way to execute that action from within the `IO`
311`Monad`.
312
313Additional primitives, such as those for "forcing" and resetting thunks, are provided in
314additional libraries and, thus, later posts.
315
316> -- | We have the user supply the functions we use to interact with whatever backstore
317> -- she uses
318> class Monad functor => Beutel (functor :: * -> *) where
319> data Config :: *
320> run :: Config -> functor a -> IO a
321> -- ^ Actually run whatever action we constructed against the backstore
322> search :: SearchQuery functor -> functor [ObjectGen functor]
323> -- ^ Perform a search
324> insert :: Object -> functor ()
325> -- ^ Insert an object
326> delete :: SearchQuery functor -> functor ()
327> -- ^ Delete the results of a search
diff --git a/provider/posts/beuteltier/2.lhs b/provider/posts/beuteltier/2.lhs
new file mode 100644
index 0000000..8da4711
--- /dev/null
+++ b/provider/posts/beuteltier/2.lhs
@@ -0,0 +1,179 @@
1---
2title: "\"Type level\" utilities for an overly complicated feedreader"
3published: 2015-08-05
4tags: Beuteltier
5---
6
7By popular (n=1) demand we will, in this post, be taking a look at
8`beuteltier/Beuteltier/Types/Util.hs` the, creatively named, module providing some "type
9level" utilities.
10
11What I mean when I say "type level" is: additional instances (placed here when they
12contain major design decisions and are not "Ord" or "Eq"), utilities not connected to
13beuteltier itself (like the different flavours of `alter` below)
14
15In contrast to the first, this post is straightforward enough to be read linearly.
16
17> {-# LANGUAGE TypeSynonymInstances, FlexibleInstances #-}
18>
19> module Beuteltier.Types.Util
20> ( -- * Constructing structures
21> construct
22> , construct'
23> , alter
24> , alter'
25> -- * Dealing with 'ObjectGen's (here be dragons)
26> , generateObject
27> , liftGen
28> -- * Equivalence on 'Object's (for nubbing)
29> , Equivalent(..)
30> -- * Operations on 'SearchQuery's
31> , runQuery
32> -- , runExpr
33> ) where
34>
35> import Beuteltier.Types
36> import Beuteltier.Types.Lenses
37
38We make use of lenses (as provided by [lens](http://hackage.haskell.org/package/lens))
39extensively.
40We won´t dedicate a post to `beuteltier/Beuteltier/Types/Lenses.hs` because it consists
41mostly of the canonical invocations of
42[makeLenses](http://hackage.haskell.org/package/lens-4.12.3/docs/Control-Lens-TH.html#v:makeLenses).
43
44> import Data.Default
45>
46> import Prelude hiding (sequence)
47> import Data.Traversable (sequence)
48>
49> import Control.Lens
50>
51> import Control.Monad.State.Lazy hiding (sequence) -- Why is this exported?
52>
53> import Data.Map (Map)
54> import qualified Data.Map as Map
55>
56> import Data.Set (Set)
57> import qualified Data.Set as Set
58>
59> import Data.Hashable (Hashable(..), hashUsing)
60>
61> import Data.Monoid ((<>))
62>
63> import Data.Function (on)
64> import Data.Maybe (mapMaybe)
65>
66> import Data.BoolExpr
67
68Quite often we find ourselves in the position that we want to alter some small parts of a
69complicated structure. We would therefore like to write the following:
70
71~~~ {.haskell}
72updateFoo :: Foo -> Monad Foo
73updateFoo x = alter x $ do
74 bar <~ (constructNewBar :: Monad Bar)
75 buz .= (makeConstantBuz :: Buz)
76~~~
77
78The definitions below allow us not only to do so, but also provide some convenience
79functions for constructing entirely new values and performing both operations in a pure
80context.
81
82> alter :: Monad m => s -> StateT s m a -> m s
83> -- ^ Alter a complex structure monodically
84> alter = flip execStateT
85>
86> alter' :: s -> State s a -> s
87> -- ^ Specialization of 'alter' to 'Identity'
88> alter' s = runIdentity . alter s
89>
90> construct :: (Monad m, Default s) => StateT s m a -> m s
91> -- ^ Compute a complex structure monadically
92> construct = alter def
93>
94> construct' :: Default s => State s a -> s
95> -- ^ Specialization of 'construct' to 'Identity'
96> construct' = runIdentity . construct
97
98Sometimes we just really want to translate an `ObjectGen` to an `Object`.
99
100> generateObject :: Monad f => ObjectGen f -> f Object
101> -- ^ Run an object generator.
102> -- Use iff /all/ components of an object are needed /in RAM now/.
103> generateObject gen = construct $ do
104> content <- lift $ gen ^. oContent >>= sequence
105> thunks <- lift $ gen ^. oThunks >>= sequence
106> meta <- lift $ gen ^. oMeta
107> oContent .= return (fmap return content)
108> oThunks .= return (fmap return thunks)
109> oMeta .= return meta
110>
111> liftGen :: Monad f => Object -> ObjectGen f
112> -- ^ Lift an 'Object' to be an 'ObjectGen' in any 'Monad' by the power of 'return'
113> liftGen obj = construct' $ do
114> oContent .= return (Map.map return $ obj ^. oContent')
115> oThunks .= return (map return $ obj ^. oThunks')
116> oMeta .= return (obj ^. oMeta')
117
118We expect implementations of `insert` to perform what we call nubbing. That is removal of
119`Object`s that are, in some sense, `Equivalent` to the new one we´re currently
120inserting. Thus we provide a definition of what we mean, when we say `Equivalent`.
121
122> class Equivalent a where
123> (~~) :: a -> a -> Bool
124>
125> -- | Two 'Object's are equivalent iff their content is identical as follows:
126> -- the set of 'SubObjectName's both promised and actually occurring is identical
127> -- and all 'SubObject's that actually occurr and share a 'SubObjectName' are
128> -- identical (as per '(==)')
129> --
130> -- Additionally we expect their 'Metadata' to be identical (as per '(==)')
131> instance Equivalent Object where
132> a ~~ b = (contentCompare `on` content) a b && ((==) `on` (^. oMeta')) a b
133> where
134> contentCompare :: (Ord k, Eq v) => Map k (Maybe v) -> Map k (Maybe v) -> Bool
135> contentCompare a b = Map.foldl (&&) True $ Map.mergeWithKey combine setFalse setFalse a b
136> combine _ a b = Just $ cmpMaybes a b
137> setFalse = Map.map $ const False
138>
139> cmpMaybes Nothing _ = True
140> cmpMaybes _ Nothing = True
141> cmpMaybes (Just a) (Just b) = a == b
142
143To speed up nubbing we also provide a quick way to "cache results". To make caching
144meaningful we of course expect the following to hold:
145
146~~~
147a ~~ b ⇒ (hash a) == (hash b)
148~~~
149
150Note that we do not expect the converse to hold. We will thus require a second pass over
151all objects sharing a hash to determine true equivalency.
152
153> -- | Two 'Object's´ hashes are a first indication of whether they are 'Equivalent'
154> instance Hashable Object where
155> hashWithSalt = hashUsing $ \a -> (a ^. oMeta', Map.keys $ content a)
156>
157> instance Hashable MetaData where
158> hashWithSalt = hashUsing $ Set.toList . (^. mTags)
159>
160> content :: Object -> Map SubObjectName (Maybe SubObject)
161> content obj = promised obj <> actual obj
162> actual :: Object -> Map SubObjectName (Maybe SubObject)
163> actual = fmap Just . (^. oContent')
164> promised :: Object -> Map SubObjectName (Maybe SubObject)
165> promised = Map.fromList . map (\n -> (n, Nothing)) . concat . promises
166> promises :: Object -> [[SubObjectName]]
167> promises = mapMaybe (^. tPromises) . (^. oThunks')
168
169Evaluating a `SearchQuery` against an `ObjectGen` is, due to the structure of elementary
170`SearchQuery`s quite straightforward.
171
172> runQuery :: Monad f => SearchQuery f -> ObjectGen f -> f Bool
173> -- ^ Run a 'SearchQuery' against an 'ObjectGen'
174> runQuery query obj = liftM reduceBoolExpr $ sequence $ fmap ($ obj) query
175>
176> -- runExpr :: Monad f => ObjectGen f -> Predicate f -> f Bool
177> -- -- ^ Run a 'Predicate' (»an atomic 'SearchQuery'«) against an 'ObjectGen'
178> -- runExpr obj (Prim f) = f obj
179> -- runExpr obj (Meta f) = liftM f (obj ^. oMeta)
diff --git a/provider/posts/beuteltier/3.md b/provider/posts/beuteltier/3.md
new file mode 100644
index 0000000..9b699c1
--- /dev/null
+++ b/provider/posts/beuteltier/3.md
@@ -0,0 +1,19 @@
1---
2title: An Update to the Type Level Utilities of an Overly Complicated Feedreader
3published: 2015-08-12
4tags: Beuteltier
5---
6
7I commited a change to `beuteltier/Beuteltier/Types/Util.hs` ("[…] Hashable update"). I
8replaced the `Hashable` instances for `Object` and `MetaData` with a single, better
9optimized function:
10
11~~~ {.haskell}
12objHash :: Applicative f => ObjectGen f -> f Int
13-- ^ Two 'ObjectGen's hashes are a first indication of whether they are 'Equivalent'
14objHash o = fmap hash $ (,) <$> (Set.toList . (^. mTags) <$> o ^. oMeta) <*> (Map.keys <$> o ^. oContent)
15~~~
16
17The new implementation allows computation of hashes without calling `generateObject` (that
18function is evil — it makes sure the entire `Object` is "in RAM" (it isn´t actually, of
19course (because haskell is lazy)—but I have no guarantee of that)).
diff --git a/provider/posts/beuteltier/4.lhs b/provider/posts/beuteltier/4.lhs
new file mode 100644
index 0000000..478cbac
--- /dev/null
+++ b/provider/posts/beuteltier/4.lhs
@@ -0,0 +1,220 @@
1---
2title: Business Logic for an Overly Complicated Feedreader
3published: 2015-08-12
4tags: Beuteltier
5---
6
7It turns out I don´t have to write much in the way of comments—the source file is already
8quite well documented.
9
10> {-# LANGUAGE KindSignatures #-}
11>
12> module Beuteltier
13> ( -- * Forcing (Executing) 'Thunk's
14> forceAllThunks
15> , WithObjects
16> , forceThunk
17> , resetAllThunks
18> , resetThunk
19> -- * Higher level interactions with a 'Beutel'
20> , replace
21> , eqTo
22> , update
23> , forceAllThunks'
24> , resetAllThunks'
25> , module Beuteltier.Util
26> , module Beuteltier.Types.Common
27> ) where
28>
29> import Beuteltier.Types.Common
30> import Beuteltier.Util
31>
32> import Data.Map (Map)
33> import qualified Data.Map as Map
34>
35> import Control.Lens
36>
37> import qualified Data.ByteString.Lazy as LBS
38> import qualified Data.ByteString as BS
39>
40> import qualified Data.Text as T
41> import qualified Data.Text.IO as T
42> import qualified Data.Text.Lazy as TL
43>
44> import System.FilePath
45> import System.Directory
46> import System.Posix.Files
47> import System.Posix.Temp
48>
49> import System.Environment
50> import System.IO
51> import GHC.IO.Handle
52> import System.Process as P
53> import System.Exit
54>
55> import Control.Concurrent
56> import Control.Applicative
57> import Control.Monad.Morph
58> import Control.Monad.Trans.State
59> import Control.Monad.Writer
60> import Control.Monad.Trans.Resource
61> import Control.Monad (liftM)
62>
63> import Data.Time.Clock
64>
65> import Data.BoolExpr
66
67The distinguishing feature of our Overly Complicated Feedreader™ is it´s support for
68`Thunk`s. It is thus reasonable to expect, that we have some functions to actually
69interact with them. Most striking in that hypothetical set of functions would be one that
70executes all thunks associated with a single `ObjectGen` and return a new one with the
71content generated by `Thunk`s filled in.
72
73Enter `forceAllThunks`.
74
75> forceAllThunks :: (MonadIO f, MonadResource f)
76> => (Thunk -> Bool) -- ^ Select which 'Thunk's to force
77> -> ObjectGen f -> f (ObjectGen f)
78> -- ^ Force all thunks in place and update '_oContent'
79> --
80> -- The reason we require a 'MonadResource' instance is that we would like to store our expensive to hold in RAM 'SubObject' contents in temporary files.
81> forceAllThunks pred = flip alter $ do
82> pureThunks <- use oThunks >>= lift >>= mapM lift
83> (newThunks, objectResults) <- mapAndUnzipM forceThunk' pureThunks
84> assign oThunks $ return $ map return newThunks
85> oContent %= liftM (<> mconcat objectResults)
86> where
87> forceThunk' thunk
88> | pred thunk = forceThunk thunk
89> | otherwise = return (thunk, Map.empty)
90
91`forceAllThunks'` (`resetAllThunks'` below, too) is tainted by the evil of
92`generateObject` but included for convenience.
93
94> forceAllThunks' :: (MonadIO f, MonadResource f) => (Thunk -> Bool) -> StateT Object f ()
95> -- ^ Version of 'forceAllThunks' suitable for use with 'update'
96> forceAllThunks' pred = get >>= lift . forceAllThunks pred . liftGen >>= lift . generateObject >>= put
97>
98> -- | Internal helper to track computations creating 'SubObject's
99> type WithObjects (n :: * -> *) (m :: * -> *) = WriterT (Map SubObjectName (n SubObject)) m
100>
101> forceThunk_ :: (MonadIO m, MonadResource m) => Thunk -> WithObjects (ResourceT IO) m Thunk
102> -- ^ Make sure the 'ThunkState' contained within a 'Thunk' is 'Executed'
103> forceThunk_ = flip alter $ do
104> -- (result, subObjects) <- liftIO $ runWriterT $ parseThunk thunk
105> -- writer ((), subObjects)
106> (result, subObjects) <- listen . hoist (hoist liftResourceT) . lift . parseThunk =<< get
107> tState .= Executed (Map.keys subObjects) result
108>
109> forceThunk :: (MonadResource m, MonadResource n) => Thunk -> m (Thunk, Map SubObjectName (n SubObject))
110> -- ^ Force a 'Thunk' and return it in 'Executed' state together with the 'SubObject's it created during execution
111> forceThunk thunk = liftM (_2 %~ fmap liftResourceT) $ (runWriterT . forceThunk_) thunk
112
113Quite often we want to undue the harm done by `forceAllThunks` (to save space, usually).
114
115> resetAllThunks :: Monad f
116> => (Thunk -> Bool) -- ^ Select which 'Thunk's to reset
117> -> ObjectGen f -> f (ObjectGen f)
118> -- ^ Undoes 'forceAllThunks':
119> --
120> -- prop> forceAllThunks (const True) obj >>= resetAllThunks (const True) >>= forceAllThunks (const True) = forceAllThunks (const True) obj
121> --
122> -- This inevitably drops information ('ThunkResult's for one).
123> --
124> -- In the case where 'forceAllThunks' does not drop information (i.e.: no 'SubObjectName' collisions ocurr) the following, stronger property holds:
125> --
126> -- prop> forceAllThunks (const True) obj >>= resetAllThunks (const True) = return obj
127> resetAllThunks pred = flip alter $ do
128> thunks <- liftM (map lift) (use oThunks >>= lift) >>= sequence
129> let
130> (subObjectNames, newThunks) = over _1 concat $ unzip $ map resetThunk' thunks
131> oThunks .= return (map return newThunks)
132> oContent %= (>>= return . Map.filterWithKey (\k _ -> k `notElem` subObjectNames))
133> where
134> resetThunk' thunk
135> | pred thunk = resetThunk thunk
136> | otherwise = ([], thunk)
137>
138> resetAllThunks' :: (MonadIO f, MonadResource f) => (Thunk -> Bool) -> StateT Object f ()
139> -- ^ Version of 'resetAllThunks' suitable for use with 'update'
140> resetAllThunks' pred = get >>= lift . resetAllThunks pred . liftGen >>= lift . generateObject >>= put
141>
142> resetThunk :: Thunk -> ([SubObjectName], Thunk)
143> -- ^ Reset a thunk and return the 'SubObjectName's of the 'SubObject's it once created.
144> -- This forgets information.
145> resetThunk thunk = case thunk ^. tState of
146> NotExecuted -> ([], thunk)
147> Executed created _ -> (created, set tState NotExecuted thunk)
148>
149> parseThunk :: Thunk -> WithObjects (ResourceT IO) (ResourceT IO) ThunkResult
150> -- ^ Generate a runnable action from a 'Thunk'
151> --
152> -- Regarding the "inner" and "outer" 'Monad' here being 'IO': We have not, at time of forcing, a neccessary connection to our backstore and thus cannot expect the monads to be anything else.
153> parseThunk thunk = do
154> tmpDirName <- liftIO getTemporaryDirectory
155> progName <- liftIO getProgName
156> let
157> tmpDirName' = tmpDirName </> progName
158> (_, tmpDir) <- allocate (mkdtemp tmpDirName') removeDirectoryRecursive
159> let exec = tmpDir </> "exec"
160> out = tmpDir </> "out"
161> result <- liftIO $ do
162> createDirectory out
163> LBS.writeFile exec script
164> setFileMode exec $ foldl unionFileModes nullFileMode [ownerReadMode, ownerExecuteMode]
165> (Just std_in, Just std_out, Just std_err, ph) <- createProcess $ (P.proc exec []) { cwd = Just out, std_in = CreatePipe, std_out = CreatePipe, std_err = CreatePipe }
166> hClose std_in
167> hSetBinaryMode std_out True
168> hSetBinaryMode std_err True
169> std_out `sendTo` stdout
170> std_err `sendTo` stderr
171> construct $ do
172> rOutStd <~ TL.fromStrict <$> liftIO (T.hGetContents std_out) -- Yes, sadly we have to be strict here
173> rOutErr <~ TL.fromStrict <$> liftIO (T.hGetContents std_err)
174> rExit <~ toNum <$> liftIO (waitForProcess ph)
175> outputFiles <- liftIO $ getDirectoryContents out
176> let
177> outputFiles' = filter fileFilter outputFiles
178> fileFilter = and . (<*>) [(/=) ".", (/=) ".."] . pure . takeFileName
179> mapM_ tell =<< mapM (liftResourceT . toSubObject) outputFiles'
180> return result
181> where
182> script = thunk ^. tScript
183> toSubObject :: FilePath -> ResIO (Map SubObjectName (ResIO SubObject))
184> -- ^ Using 'ResourceT' provides us with the guarantee, that the 'FilePath' we´re referring to should still exist when we actually try to get the 'SubObject'´s contents
185> toSubObject name = fmap (Map.singleton name' . return) $ construct $ do
186> sContent <~ liftIO (LBS.readFile name)
187> sUpdates <~ pure <$> liftIO getCurrentTime
188> where
189> name' = takeFileName name
190> sendTo input output = do
191> input' <- hDuplicate input
192> forkIO $ do
193> hSetBuffering input' NoBuffering
194> LBS.hGetContents input' >>= LBS.hPutStr output
195> return ()
196> toNum :: Num a => ExitCode -> a
197> toNum ExitSuccess = 0
198> toNum (ExitFailure i) = fromInteger $ toInteger i
199
200We provide `update`, a convenience function for high-level interactions (though costly on
201large sets of equivalent objects (which should not exist due to nubbing)).
202
203> eqTo :: Monad f => Object -> SearchQuery f
204> -- ^ @eqTo o@ constructs a 'SearchQuery' that matches all 'Object's 'Equivalent' to @o@
205> --
206> -- This is costly because it calls 'generateObject' on the contents of the entire 'Beutel'.
207> eqTo o = BConst ((>>= return . (~~) o) . generateObject)
208>
209> update :: Beutel f => SearchQuery f -> StateT Object f a -> f ()
210> -- ^ @update search action@ replaces /all/ 'Object's matching @search@ within the 'Beutel' by versions of themselves modified by applying @action@.
211> --
212> -- Does not handle '_sUpdates'.
213> --
214> -- This is costly because it calls 'generateObject' on the contents of the entire 'Beutel' /and/ all results of @search@ (in order to use 'Eq' on 'Object's to delete the results of the initial 'search').
215> update query alteration = do
216> matches <- search query
217> matches' <- mapM generateObject matches
218> delete $ BConst ((>>= return . (`elem` matches')) . generateObject)
219> mapM_ (\o -> alter o alteration >>= insert) matches'
220> return ()