Archive for Uncategorized

Comonads in everyday life.

This post is a literate haskell file. As is usual with such things, you can go ahead and paste it into a .lhs file and load it right up in ghci. As such, first some boring preliminaries.


> module CoMenu where
> import Control.Applicative; import Data.List; import Data.Tree; import Data.Maybe
> import Network.Frameworks.HVAC; import Network.Frameworks.HVAC.AltController

Anyway, say you’re serving up a website. And say this website has hierarchical menus. Maybe you want to display them with javascript, maybe statically, whatever. But of course, you also need some component of these menus to be generated server-side regardless, so that the structure matches up with the location of the current page in the hierarchy of content. One common way to do this, which isn’t all that bad actually, is to hardcode the whole menu, or to generate it from some list of sections/subsections, etc. that associates them with individual actions, be they .php or .jsp pages, or be they in servlet mappings or, you know, whatever. And in this menu, be it hardcoded or partly generated, there’s some additional code that conditionally displays subhierarchies (i.e. if you’re in the “About” section it will display the subsection selection as well, or etc.) or at a minimum disables the link to the current page, to give some indication of where the user currently is.

There are ways to fancy this up and reduce the overhead, but that’s the general notion — you have a dispatch system, and you have a menu system, and the information in the dispatch and menu systems don’t necessarily correspond. Or if you’re using “configuration by convention” maybe you’ve built something that relies on a correspondence, so that, e.g., a page can “autodiscover” its location in the hierarchy by introspection on its url.

So anyway, you’re describing the structure of your website twice, in perhaps very different formats, and the extent that you can improve on things is really improving the brevity of the formats and throwing some logic to the wind on the assumption that the descriptions really are rendundant.

The natural big little idea that I’m going to discuss then, is how we might use the same structure to describe both the dispatch and the menu, and so cut out all that nonsense. Now doing this is at least slightly trickier than it seems. That’s because in the typical imperative approach we’re thinking in passes from the top down — so even assuming we have a dispatch tree that we walk based on a url parse — and with, e.g., the hvac framework I’m working on, that’s a only a few lines — well, even assuming we have the dispatch down pat, consider the standard imperative approach to generating the menu itself.

For the purposes of this discussion the menu will be a horizontal one, with the top row containing the top level, the next row containing the paths of the selected sublevel, etc. The approach itself will generalize to any sort of menu, however. In any case, we start by rendering the top row. But wait! We have to distinguish the selected section. So, uh, on rendering the link to each section we check if it matches the first portion of the path. And then we react accordingly. And then, using the section we’ve found that matches that first portion we recurse into the next row and soforth.

But wait, that’s not how menus work! Generally, though not always, if you pick the a top section then it won’t show a special “top section” page but will show the page that you get if you “drill down” through first options until you eventually find a leaf. So we’re not talking about a rose tree here, but about a tree with content only at the leaf nodes, which is an irregular structure. On the other hand, maybe there’s content only at the leaf nodes 90% of the time, and 10% of the time there’s content elsewhere. So we’ll actually use a rose tree structure (from Data.Tree) with each node containing the following:


> data MItem a = MItem {
>       miName :: String,
>       miPath :: String,
>       miAction :: Maybe a } deriving Show

Of course with this there’s no guarantee that a leaf node contains a page action either, but it’ll do for the purposes of this discussion.

In any case, you see the issue — the dispatcher “drills down” to an action page, but the rendering of that page depends on the whole context of the menu, not just the initial path we were given. There are various ways to hack around this, but I won’t enumerate them all here. And then of course there’s the insight that even though the menu is “dynamic” on the page that it is rendered in, its static with regards to any given page and as such, a proper pre-traversal of the entire tree can (lazily, mind you) render the menus for every page at once.

So what we want is to traverse the tree in such a way that we preserve the context of the location we’re in with regards to everywhere else in the tree, so that we can render the menu properly, relative urls and all. And we want to do this entire traversal only once, rather than piecewise and repeatedly with each request. So what recursion scheme works? Well, fmap doesn’t, nor are any varients of traverse or mapAccum strong enough — they only tell us where we’ve been, not where we’re going. If only we could associate the context with each node beforehand — then a simple map would do the trick!

Well, the contents of a node plus all the associated context of a node have a common representation — the zipper, which is precisely the “one-hole context” of a datastructure. By definition, the zipper, or derivative, of any data structure, will contain enough information to reconstruct both the structure as well as an indext to a node within it. So here’s a simple and partial implementation of a zipper over a Tree. The included functions actually take it to and from a Forest, which is really just an alias for [Tree a].


> data Zipper a = Zipper {
>       prevLevels :: [Zipper a],
>       leftForest :: Forest a,
>       rightForest :: Forest a } deriving Show

> instance Functor Zipper where
>     fmap f (Zipper p l r) = Zipper (fmap f <$> p) (fmap f <$> l) (fmap f <$> r)

> forest2zip :: Forest a -> Zipper a
> forest2zip ts = Zipper [] [] ts

> zip2forest :: Zipper a -> Forest a
> zip2forest z = case prevLevels z of
>                  [] -> rightForest z
>                  pl -> rightForest . head . reverse $ pl

Now actually we’d typically include a host of functions for “walking around” within the zipper, including generalized depth-first traversal, etc. But it turns out that for our purposes we only ever need to step down, so the rest is ommited for brevity.


> zDown :: Zipper a -> Maybe (Zipper a)
> zDown x = case rightForest x of
>             [] -> Nothing
>             (t:_) -> case subForest t of
>                      (t':ts) -> Just $ Zipper {
>                                         prevLevels = x : prevLevels x,
>                                         leftForest = [],
>                                         rightForest = t':ts }
>                      _ -> Nothing

This gives us a representation of each node and its context. But still, how to associate them with each node, so as to preserve the overall structure? We could concievably turn our entire menu into a zipper, and then step through it, at each location replacing the action by one which includes the proper menu. But this would mean introducing another irregularity — holding

Either (Zipper a, a) a

or some equivalent (c.f. Conor McBride’s Clowns and Jokers). Yeesh! We want to do it all at once. I.e. we want a signature of

Zipper a -> Zipper (Zipper a)

. Wait! Hang on! That’s a specialization of “duplicate” which is to a comonad as “join” is to a monad! And in fact, as any fule kno, The dual of (monadic) substitution is (comonadic) redecoration. Oh, and it gets better! Behind every zipper is a comonad. And better yet! We also know that a rose tree is a cofree comonad over the list functor, which should somewhat help to tie this all together in terms of why and how a zipper is a comonad.

In any case, we provide the basic comonad class, and an instance for our zipper:


> class Functor w => Comonad w where
>         extract :: w a -> a
>         duplicate :: w a -> w (w a)
>         extend :: (w a -> b) -> w a -> w b
>         extend f = fmap f . duplicate
>         duplicate = extend id
>
> instance Comonad Zipper where
>     extract = rootLabel . head . rightForest
>     duplicate z@(Zipper p l r) = Zipper p' l' r'
>        where p' = fmap duplicate p
>              (l',r') = splitAt (length l) fs
>              lrs = l ++ r
>              fs = map (go p) $ zip3 (inits lrs) (tails lrs) lrs
>              go :: [Zipper a] -> ([Tree a],[Tree a],Tree a) -> Tree (Zipper a)
>              go ls (i,t,tr) =
>                         let z = Zipper {prevLevels = ls,
>                                         leftForest = i,
>                                         rightForest = t}
>                             sf' = subForest tr
>                         in tr {rootLabel = z,
>                                subForest = map (go (z:ls)) $
>                                            zip3 (inits sf') (tails sf') sf'}

Ok, so that duplicate code isn’t the easiest to write. Nonetheless, the beautiful thing is you only have to write it once. And it’s not that painful once you realize that we’ve really just got a generalization of the list comonad. But with all that traversal out of the way, a function to render menus almost writes itself. This one is parameterized over the function that draws the actual menu, and includes a little tweak to deal with irregular trees — if there’s no action at the given node, it traverses down until it finds a node that does provide an action.


> renderMenu ::
>     (Zipper (MItem (String -> a)) -> String)
>     -> Forest (MItem (String -> a)) -> Forest (MItem a)
> renderMenu renderRows = zip2forest . extend go . forest2zip
>     where
>       drillToAct :: Zipper (MItem a) -> Zipper (MItem a)
>       drillToAct z = case rightForest z of
>                        [] -> z
>                        (t:ts) -> case miAction . rootLabel $ t of
>                                    Nothing -> case zDown z of
>                                                 Nothing -> z
>                                                 Just z' -> drillToAct z'
>                                    _ -> z
>       go z = (extract z) {miAction = ($ renderRows z) <$>
>                           miAction (extract . drillToAct $ z)}

And here’s one example of a function that turns that zipper into a rendered menu. It makes lots of controversial design choices, has some quirks, and is really just hacked up for the purposes of this discussion, but the main point is that the legwork we did before lets us abstract the rendering from the traversal, so we can swap in whatever we want with no hassle.


> menuRows :: Zipper (MItem a) -> String
> menuRows z = concatMap (renderRow curDepth "") (reverse $ prevLevels z) ++
>              renderRow curDepth "" z ++
>              (case zDown z of
>                         Just z' -> renderRow curDepth ((miPath . extract) z ++ "/") z'
>                         Nothing -> "")
>     where curDepth = (length . prevLevels) z
>           renderRow curDepth prevPath z =
>                     "<div>" ++
>                     intercalate " | " (map mkLink (leftForest z)) ++
>                     (if null prevPath
>                        then " | " ++ (miName . extract) z ++ " | " ++
>                             intercalate " | "
>                                (map mkLink (tail . rightForest $ z))
>                        else " | " ++ intercalate " | "
>                                (map mkLink (rightForest z))) ++
>                     "</div>"
>               where mkLink x = "<a href='" ++ concat (replicate depth "../") ++
>                                prevPath ++
>                                (miPath . rootLabel) x ++ "/'>" ++
>                                (miName . rootLabel) x ++ "</a>"
>                     depth = curDepth - (length . prevLevels $ z)

And to round out the picture, the promised simple dispatch function, and below it, some sample data.


> menu2dispatch :: Forest (MItem (String -> HCGI q s CGIResult))
>                 -> HCGI q s CGIResult
> menu2dispatch mis' = endPath *>
>                        (fromJust . miAction . rootLabel . head) renderedMenu
>                      <|> go renderedMenu
>     where renderedMenu = renderMenu menuRows mis'
>           hasPath t pth = (miPath . rootLabel) t == pth
>           go mis = do
>             p <- takePath
>             case find (`hasPath` p) mis of
>               Nothing -> continue
>               Just mi -> endPath *> (fromJust . miAction . rootLabel) mi
>                          <|> go (subForest mi)
>
> foo = [
>        Node 2 [Node 10 [Node 11 []], Node 23 []],
>        Node 4 [Node 5 [], Node 7 []],
>        Node 8 [Node 9 [], Node 13 []],
>        Node 25 []
>       ]
>
> bar = map (fmap (\x -> MItem {
>                       miName = show x,
>                       miPath = show x,
>                       miAction = if even x
>                                  then Nothing
>                                  else Just $
>                                        \y -> y ++ "\n" ++ "body: " ++ show x}))
>       foo

> {-
> *CoMenu> fromJust . miAction . rootLabel . head $ renderMenu menuRows bar
> "<div> | 2 | <a href='4/'>4</a> | <a href='8/'>8</a> | <a href='25/'>25</a></div><div> | <a href='2/10/'>10</a> | <a href='2/23/'>23</a></div>\nbody: 11"
> -}

So yeah, comonads in practical day-to-day programming. Not as rare as you would suspect, and occasionally just what the doctor ordered.

Comments (3)

Some concepts behind hvac

Having written the hvac and HStringTemplate libraries, I need to get around to documenting them better, not to mention polishing hvac up for a hackage release. But in the meantime, by way of introduction to (hopefully) a series of tutorials, I figured I’d go through some of the basic principles behind what I’ve been doing. Which is sort of an arrogant thing, so apologies therefore for how dull or obvious the following may be.

Web programming is an exercise in managing scope

Even a reasonably well-architected web framework in Java or Python or what-have-you will at some point throw scope and typing to the wind. You have page-level scope, request-level scope, cookie-level scope, session-level scope, database-level persistent scope, application-level scope, server-level scope, and on top of that maybe conversation-level, or a few others depending on how fancy things get. (Oh, and GET and POST params fall into different scopes too… and then there are rest-style url params too).

In php you generally get this stuff (application and db aside) shoved into what are called “superglobal” variables, which are maps from strings to… stuff. Rails does the same thing with hashes, and sometimes with symbols rather than strings. Java generally tends to shove stuff into hash-like things as well (e.g. request, pageContext [which is, horrors, indexed by scope *and* name]). And with java frameworks using beans, you tend to get a stack in the page scope as well, so tags have something implicit to operate over (although this tends to be hidden from the end-user).

So scope then gives us a number of problems — anything can be anywhere, named anything, and of any type. And most webapps are built on, at some level, just trusting that some component somewhere up the chain got the right thing and set it to the right type, and that therefore when you look up the name you get it, and when you try to use it as the type you expect nothing goes terribly wrong. And don’t even get me started on xml configs (injection-style or otherwise).

Thus to make web programming sane, we want some guarantees that would be elementary in any language since, I guess, Fortran. We want to have a guarantee that the things you expect to exist do exist,and furthermore, that they are what you expect them to be. We’re not talking Coq or even Haskell level strong static typing here, but more like Java, or C for that matter.

So wipe the slate clean and pretend we have no scope at all outside of the request itself — i.e. that our server is a pure function from Request -> String. Now, our request gives us a url, some get and post params, and some other possible stuff (user-agent, language, cookies, what-have-you). We want to map that request to a function that handles it. The normal way to do this is to do some silly regexp or write an ugly xml file, or both, or just to write a naive switch expression, or finally, to simply have a php model where each request maps to a file on the directory hierarchy (Oh, and of course, throw in a smattering of mod_rewrite rules as well!).

The directory model actually turns out to be the most interesting. This is because, each step into the tree we take, we narrow the possibilities of each further step. So we’re actually traversing a very tiny grammar. And if we match a file, then the path is a well-formed statement in that grammar. This brings us to a core principle of hvac — do you see it?

Controllers/dispatchers are parsers

We can explicitly write our controller function now as something which parses a request, and whose result is an http response. Or, to generalize back out, whose result is a function on the execution environment as a whole, yielding an http response.

Naturally, we want a backtracking parser, because we want to allow the end-user to construct an arbitrary grammar rather than a tightly restricted one. But not everything needs to backtrack. We “tokenize” our url into /-separated strings and consume and backtrack on those. But, GET params, etc? We don’t need to consume them, just check their existence.

And the really nice thing about Haskell here is that it comes (in the form of the Alternative class in Control.Applicative) with a built-in almost-dsl for writing neat little monadic parsers. Rather than write plain old functions, we can write combinator functions and so wrap the parser functionality up in something that reads like what it does. So a small snippet of hvac code from a controller might look like this:

 h |/ "board" |\ \(boardId::Int) ->
     (h |/ "new" >>
         h |// "POST" *> ... {- POST stuff -}
	  <|> ... {- GET stuff -} )
      <|> ... {-OTHER stuff-}

The hs are null parsers necessary to set off a chain of combinators. This code says: if the next part of the path is “board” and the part after that can be parsed to an Int, then if the part after that is “new” and its a POST request, do one set of things. But if its a GET request, do another. And if the part after the boardId *isn’t* “new” then go do some other stuff instead.

So we see already how treating our controller as a parser both lets us cut down on code duplication (we’re only getting boardId once, for all paths that have it), lets us embed fancy logic very compactly, and most importantly, guarantees that, in this case, we’ll have something named boardId in scope, and of the right type.

There are a few more dsl combinators in hvac, but they all operate essentially the same. So now, this gets us partway there. However, it hasn’t addressed get/post params, cookies, sessions, databases, etc. at all yet (although there are combinators to match simple equality on get/post params too). Let’s add just get/post params for the next level of complexity. Now get/post params are used slightly differently, because they’re based generally on user input. So our criteria for them need to be different — there are optional fields that may not be filled in, “valid” values might not meet other criteria (too short, invalid dates, not a genuine email address etc.) and soforth. Furthermore we don’t generally want bad values to prevent matching a page, rather we want an error page that highlights *all* the errors. So we know we need another layer for this.

Generally, web frameworks will hold another validation layer that purely performs validation. ASP bakes this in rather obscurely. PHP generally provides no particular solution. Python, as I understand it uses newForms and bakes validation into a forms layer in general. Ruby does it through activeRecord (which begs the question, how to validate data that isn’t destined for an activeRecord?). Java frameworks, unsurprisingly, will tend to use XML (Or nowadays, some annotation monstrosity too, I imagine, as tied to, e.g., Hibernate).

These styles of validation can lead to lost information at times — e.g., even once a validator passes you still may need subsequent manual casting or parsing. Furthermore, they tend to mix layers in that “valid” for the database (i.e. well-typed) should well be different than “valid” in a more narrow sense (i.e. a proper email address). And then an “invalid” userid depends on whether that userid is in the database as well as if that userid has enough of only the allowed characters, etc. You can guess where I’m going, right?

Validation is parsing (in context)

We can place all our constraints on incoming data, as well as instructions on how to parse it into the data type we want (which might even mean, e.g., turning a user id into a full-fledged hunk of user data as built from a database query). Here’s an example validation function that does just that (positing that mkUser takes the returned map and shoves it into a user object).

validUser :: ValidationFunc s String User
validUser s = do
  u <- selectRow "* from users where name=?"
  if M.null u
    then fail $ "No such user: " ++ s
    else return mkUser u

The type signature here says that this function operates on any session type (which we’ll ignore for now) and takes a String and returns a User (or, of course, an error). One neat thing worth mentioning here, even though I haven’t touched on databases and application state yet, is that the type of ValidationFunc allows us to read things such as databases, but *not* to write to them, since it would lead to all sorts of trouble if validators started preforming unrestricted side-effects.

We might invoke the validator (and a few others, just to show how nifty it is to chain them) like this:

withValidation
  (("user", trim >=> lengthAtLeast 2 >=> userValid),
   ("pass", lengthAtLeast 3))
  (\ (user, pass) -> ... )

Here, we see that the action inside of the ellipsis is performed only if the validation and parsing succeeds, and again, when it is performed, it is guaranteed that the values it recieves are in the data type we expect. (If the validation fails, the enclosing “parse” fails the controller takes over again, but first the specific validation failures are recorded elsewhere for handy reference).

That’s enough for one post, I think, and I haven’t even dealt with templates yet, much less scope and mutable state.

Comments (7)

Simple Extensible Records — Quick Generic Tricks, Pt. 1

There have been a few discussions lately about how to do quick and easy typesafe extensible records in Haskell. And there have been a number of discussions lately about extensions to do it more cleanly (see, e.g. this one on haskell-cafe). This came up on the channel the other day and somebody immediately suggested HList, which seems too heavyweight for this sort of thing to me. HaskellDB, for example, also rolls its own fairly heavyweight new records system.

Anyway, in discussion, we cooked up an idea which Kamina, who originally asked the question, later implemented very cleanly. The gist is that one defines a map of properties, and implements a typeclass for accessing them. Subsequently, one defines a typeclass for each property, whose default methods know how to query them from the properties map. Then, providing any data type with a new property is just a quick instance declaration away. There’s a slight messiness in that one needs to directly index properties by strings, or by a Property enumeration that’s set in stone. (One can actually index them by TypeReps using Typeable, but that’s sort of messy.)

However, there’s another issue as well — we want records extensible in two ways. The typical way is just being able to reuse an accessor and setter function over multiple data types with the same property. But we also want to be able to actually extend records with something like inheritance — i.e. to declare some records strict supersets of others. The following solution, with generics, both solves that issue, and is significantly more lightweight, although it still requires mild boilerplate. Additionally, while the former solution lets one “break” it by circumventing the typeclass-bound “smart accessor” functions and operating on the properties map directly, this solution can also be “broken” by giving it a bad instance declaration.

Both solutions, one should note, can easily be augmented with Template Haskell to reduce the boilerplate still further.

We start by realizing that rather than creating a properties map, it would be nice if we could walk directly over the data in our record and select something of the right type. This, of course, is something that the Scrap Your Boilerplate generics library excels at. In fact, it turns out we have the function we need right at our fingertips: gfindtype, which is of type (Data x, Typeable y) => x -> Maybe y . We ask it for a specific type, and if that type is an immediate subterm of x, then it returns it, otherwise it fails.

Now we need a modifier. There isn’t one built in, but its simple enough to add one. The type signature is immediately evident: greplace :: (Data a, Typeable b) => a -> b -> Maybe a . However, what to do with this type signature is a bit trickier. Data.Generics gives us gmapMo of type (Data a, MonadPlus m) => (a -> m a) -> a -> m a. It traverses the immediate subterms of a type, and applies a transformation. We build our traverser using `extM` which composes a chain of actions to try, matching from right to left the first with an appropriate type. Our default action when nothing matches is to const Nothing (i.e., to leave the term unchanged). When the term does match, we return just the term. This gives us:

greplace x y = gmapMo (const Nothing `extM` (const (Just y))) x

And indeed, this does what we want. However, we don’t have any type safety constraints yet. Every query and update gives us a maybe, depending if it matches. But we can combine this system with the typeclass constraints from the other system and get a very clean interface. Assume we have properties of type Color and Size:

data Color = Red | Yellow | Green | Blue | White | Black deriving (Show, Data, Typeable)
data Size = Size Int deriving (Show, Data, Typeable)

Now we can just write:

class Data a => HasColor a where
    color :: a -> Color
    color = fromJust . gfindtype
    setColor :: a -> Color -> a
    setColor = (fromJust .) . greplace

And for anything with a Color, e.g., a car, we just can write instance HasColor Car. Not too shabby!

However, as we’ve written it, this only works for immediate subterms. So, just like our old solution with an explicit properties map, we still can’t extend our records just by wrapping them as part of a larger record.

Luckily, the generics libraries make it trivial to do deep searches as well. We can replace gfindnew with gfind = something (const Nothing `extQ` Just) . This function is built much the same as our greplace was — yielding Just a value of the appropriate type, or Nothing otherwise. The something combinator just maps our query function deeply over every term, from top to bottom, until it hits a match. And as for greplace, we can just replace gmapMo with “somewhere” for a similar effect. Now, if we have data DrivenCar = DrivenCar Person Car we can give the appropriate instance declarations, and get and set the color of the car directly here as well.

One more problem though: the somewhere combinator will apply our transformation at least once — which means that it might apply it many multiple times, and even if it doesn’t, it’ll keep working ever after it succeeds. This too is easy enough to fix. Look at the source and we see that somewhere is just defined as f x `mplus` gmapMq (somewhere f) x. So its trying the transform, and if the transform fails, its mapping the transform over all the immediate subterms, recursively. All we have to do is write a function “once” that looks exactly the same, but with gmapMo instead of gmapMq, and we have what we really need.

A complete working example follows below:

{-# LANGUAGE FlexibleInstances, FlexibleContexts, DeriveDataTypeable  #-}

module Records where

import Data.Generics.Basics
import Data.Generics.Aliases
import Data.Generics.Schemes
import Control.Monad
import Data.Maybe

greplace :: (Data a, Typeable b) => a -> b -> Maybe a
greplace x y = once (const Nothing `extM` (const (Just y))) x

once :: MonadPlus m => GenericM m -> GenericM m
once f x = f x `mplus` gmapMo (once f) x

gfind :: (Data a, Typeable b) => a -> Maybe b
gfind = something (const Nothing `extQ` Just)

data Color = Red | Yellow | Green | Blue | White | Black deriving (Show, Data, Typeable)
newtype Size = Size Int deriving (Show, Data, Typeable)
unSize (Size a) = a

data Car = Car Color Size deriving (Data, Typeable, Show)
data Cube = Cube Color deriving (Data, Typeable, Show)

data Two = Two Int Car Car deriving (Data, Typeable, Show)

class Data a => HasColor a where
    getColor :: a -> Color
    getColor = fromJust . gfind
    setColor :: a -> Color -> a
    setColor = (fromJust .) . greplace

class Data a => HasSize a where
    getSize :: a -> Int
    getSize = unSize . fromJust . gfind
    setSize :: a -> Int -> a
    setSize = (fromJust .) . (. Size) . greplace

instance HasColor Car
instance HasColor Cube
instance HasSize Car
instance HasColor Two
instance HasSize Two

{-
*Records> getColor (Two 12 (Car Red (Size 34)) (Car Blue (Size 93)))
Red

*Records> setSize (Two 12 (Car Red (Size 34)) (Car Blue (Size 93))) 232
Two 12 (Car Red (Size 232)) (Car Blue (Size 93))
-}

Comments (1)

Type hackery for the practical programmer pt. II

This post is a long time coming, and sort of anti-climactic, but I wanted to just finish off what I’d begun describing in the previous post.

We have, if you will recall:

class MapFromTuple a b where
      mapFromTuple :: a -> [b]

This is our handy way to marshal a heterogenous tuple into a uniformly-typed list. Now, we can perform our operations on this uniform list. The next step, however, is to go back from our transformed list to a heterogenous tuple. Immediately an answer presents itself:

class MapToTuple a b where
    mapToTuple :: [b] -> a

The instance declarations are correspondingly the inverse of those for MapFromTuple:

instance (Sat (MapToTupleD a x), Sat (MapToTupleD b x))
    => MapToTuple (a,b) x where
    mapToTuple [a,b] = (fromGenD dict a, fromGenD dict b)
instance (Sat (MapToTupleD a x), Sat (MapToTupleD b x), Sat (MapToTupleD c x))
    => MapToTuple (a,b,c) x where
    mapToTuple [a,b,c] = (fromGenD dict a, fromGenD dict b, fromGenD dict c)

But there’s a problem here. While mapFromTuple was a total function, mapToTuple is not, because it is polymorphic on its return type, not its argument type. So if we use mapToTuple in a casual fashion, we introduce a great deal of extra partiality. What we need is a way to carry the type context of our original tuple over to the return argument of mapToTuple, and thus guarantee that it yields the correct types in a correctly-sized tuple. And thus, we have the complete type signature for withValidation':

withValidation' :: (MapFromTuple a (String, ValidationFuncIntern s String Dynamic),
                    MapToTuple a1 Dynamic,
                    TupleMatchTwo a a1) =>
                   a -> (a1 -> HCGI r s t) -> HCGI r s t

The first restriction says that a is something which can be transformed into something of type (String, ValidationFunctionIntern s String Dynamic). The second restriction says that a1 is something which can be transformed *from* Dynamic. The trick here is the ridiculous TupleMatchTwo class, defined as such:

class TupleMatchTwo a b | a -> b where {}
instance TupleMatchTwo () ()
instance TupleMatchTwo (Box (foo, bar-> mk a)) (Box a)
instance TupleMatchTwo ((foo, bar -> mka a), (foo, bar -> mkb b)) (a,b)
instance TupleMatchTwo ((foo, bar -> mka a), (foo, bar -> mkb b), (foo, bar -> mkc c)) (a,b,c)

… and soforth. Lots of boilerplate, but luckily, as with the other boilerplate instances, this is write-once boilerplate that saves us lots of work elsewhere. Although I’m probably butchering the term, what TupleMatchTwo gives us is a very limited type-level logic-programming function that “extracts” the last bit of a rather complicated signature from each of a set of tupled values, and returns them alone. Unfortunately, as its not a real type-level function, it has to be unrolled by hand. The end result, however, is that now we can extract what we “know” our return type must be from our argument type, and by asserting that return type statically at compile time, we can guarantee what we sought to from the beginning — that withValidation’ can be passed an arbitrary tuple of validator functions and will produce only a well typed tuple of the corresponding validated inputs.

I might be asserting too much here, but it seems like this basic paradigm maps to 90% of the use cases for heterogeneous lists — we want to operate over them in a uniform way, and then we want to properly varigate our input again before exiting the library function.

This style of input is now in the current repo of hvac, for both validation and database functions. There are also a few other nice changes I plan to blog about soon, mainly along the lines of cleaning things up and reducing dependencies.

I’ve also run into a really nifty example of what can be done with this “tuple-level programming” in my current project — a little mini-dsl for SQL suitable for embedding into hvac, on which more later.

Leave a Comment

Type-hackery for the practical programmer

So hvac has a pretty nice validation framework built on the withSomething idiom (i.e. withSomething (\something -> etc)) which is a sort of continuation passing model. More particularly, it has: withValidation :: [(String, ValidationFunc s String String)] -> ([String] -> HCGI q s CGIResult) -> HCGI q s CGIResult. This is to say that it takes a list of pairs of validation functions and the request parameters to retrieve/process/validate, and a success handler which takes a list of those retrieved/processed strings, and it returns a result. If the validation fails, the errors are stashed into an appropriate location, and the cgi action fails.

So far so good — but basic strong static typing gives us a wart here. It takes a list and returns (so to speak) a list — this means that everything must be of a uniform type. Currently that type is String, although it can be generalized. But even then, every validator associated with a particular call to withValidation has to produce something of the same type.

So say you want to retrieve and Int that is greater than 3 and a String. Well, you have a validator for the String. All good. And you have a validator that read the Int, and then checks its value is greater than three, and then finally, maddeningly, you have to show it again to convert it back to a string.

The handler function then must read the sucker again, which duplicates work, although not too much. Even worse, even though our validation has “proved” that the value can be read without error, there’s no way to express this. So there are important properties for the soundness of our program that are true and exist, but only in our heads, and only to the extent we remember them — what we want is a way to express this knowledge concretely.

The second concern is the bigger concern, but let’s ignore it for now and concentrate first on just getting this thing working with polymorphic types at all. Our use case is actually just what Dynamics are for. Dynamics give you two core handy functions — toDyn :: Typeable a => a -> Dynamic and fromDynamic :: Typeable a => Dynamic -> Maybe a, which, obviously, returns Just something if the Dynamic is of the right type, and Nothing otherwise. So now we can pack arbitrary (Typeable) objects into a single list of a uniform type, and unpack them afterwards.

Similar idioms are used very frequently — HDBC, which is the database interface used by hvac, uses toSql and fromSql. The Takusen database interface uses bindP, which is like toSql but, erm, fancier. Cleverly, it avoids using fromSql and instead allows for runtime failure (I think) should the result set of a query not type properly into what the code expects. I’d very much appreciate comments on where else this sort of thing crops up.

In any case, in the withValidation function, and the other examples given, there’s a significant silliness here. These are library functions, supposedly with nice clean interfaces — but we get an ugly situation with all these toDyns or toSqls cluttering up our pretty list of values. Intolerable!

Typeclasses to the rescue. toDyn and friends all share a certain trait — they’re all, in essence, of the form func :: (Typeclass a) => a -> ConcreteType. And conceptually, we’re mapping this polymorphic function over a heterogeneous list to get a homogenous one that we can then operate on. The first thing that comes to mind is, therefore, HList. But HList is pretty complicated, and furthermore, my sense is that once you use HList, you’re pretty much locked into the “HList way,” although I’m welcome to be proven wrong.

Instead, we can implement a solution without requiring all of that HList stuff. The end result here is an extensible, simple way to implement a mapFromTuple function that takes an arbitrary length tuple, all of whose members are subject to a typeclass constraint, and returns a list with the proper function mapped over them. This can then be used by any code that runs into a similar issue.

Rather than HList, we’ll be borrowing ideas from Scrap Your Boilerplate, and particularly “syb3″ — i.e. the “Scrap your boilerplate with class: extensible generic functions” paper.

The trick here is pretty elegant, based on using typeclasses to abstract over dictionaries, apparently originally from Restricted Data Types in Haskell by John Hughes. Pretty much all the magic is in the one class declaration:

class Sat a where dict :: a

Seriously. That’s it. Reification of typeclass dictionaries in one simple line. Now we can write a concrete dictionary:

data MapFromTupleD a b = MapFromTupleD {toGenD :: a -> b}

And then a class which “unpacks” the dictionary:

class MapFromTuple a b where
    mapFromTuple :: a -> [b]

And boilerplate instances for tuples of various lengths:

instance (Sat (MapFromTupleD a x), Sat (MapFromTupleD b x))
    => MapFromTuple (a,b) x where
    mapFromTuple (a,b) = [toGenD dict a, toGenD dict b]

instance (Sat (MapFromTupleD a x), Sat (MapFromTupleD b x), Sat (MapFromTupleD c x))
    => MapFromTuple (a,b,c) x where
    mapFromTuple (a,b,c) = [toGenD dict a, toGenD dict b, toGenD dict c]

…and soforth. The key thing to think about is the magic of Sat in the context constraint. The context constraint on Sat specifies which dict gets pulled out of thin air, and so lets us write a function that abstracts over typeclasses the same way normal polymorphic functions abstract over concrete types.

Finally, we’re done with the machinery. Now to implement the dictionary for Typeable:

instance (Typeable x) => Sat (MapFromTupleD x Dynamic) where
    dict = MapFromTupleD { toGenD = toDyn }

And for Show while we’re at it:

instance (Show x) => Sat (MapFromTupleD x String) where
    dict = MapFromTupleD { toGenD = show }

Now we can do:

*Main> mapFromTuple (1,2) :: [String]
["1","2"]
*Main> mapFromTuple (1,2) :: [Dynamic]
[<<Integer>>,<<[Integer]>>]
*Main> mapFromTuple ("foo",2.3, Just 43) :: [String]
["\"foo\"","2.3","Just 43"]

Now if we wanted to get this functionality for sql types, we’d just have to add the single instance declaration for Sat, along the lines of the previous two… and soforth.

This gets us everything we want for the basic database-style examples, actually. But for hvac’s validation, things are more complicated, because even if we can marshall a tuple into a set of dynamics, then we’re still just returning a list of dynamics, rather than a tuple of actual values. However, we’re partway there on that, and I’ll leave off the rest until the next post.

Comments (5)

ANN: hvac 0.1b, a transactional, declarative framework for lightweight web applications

hvac (short for http view and controller) has been my project for the last little while, and is finally in a fairly usable state, so I’m opening up the repo (darcs get http://community.haskell.org/~sclv/hvac/) for folks to play with and to get some feedback. While its not yet ready for hackage, the package as provided should be fully cabal installable. Documentation is available at http://community.haskell.org/~sclv/hvac/html_docs/hvac/

The aim of hvac is to provide an environment that makes the creation of lightweight fastcgi-based web applications as simple as possible, with an emphasis on concise, declarative style code, correct concurrent transactional logic, and transparency in adding caching combinators.

There are two included example programs, naturally neither of which is feature complete. They share a common login module of about 50 lines of code, excluding imports and templates.

The first program is a classic, greenspun-style message board with basic login functionality. It totals roughly 40 lines and tends to use just under 4mb of memory on my system.

The second is a wiki based on Pandoc and the PandocWiki code. The code totals roughly 30 lines (rendering borrowed from PandocWiki aside) and uses about 5mb of memory on my system.

hvac processes all requests in the STM monad, with some bells attached to properly interleave STM with session, database and filesystem operations such that they all conceptually occur together in a single transaction per request. Currently it is only fully tested with sqlite, but it should operate, modulo a few tweaks, with an database accessible via HDBC.

hvac is particularly designed to use the HStringTemplate library as an output layer, in a simple declarative fashion, and HStringTemplate has been improved in the process (about which more below). As the StringTemplate grammar is explicitly sub-turing, this ensures a clean separation of program logic from presentation, while providing a nonetheless fairly powerful language to express typical display tasks.

The included cache combinators, still experimental, should allow a simple and fine-grained control over the level of caching of various disk-bound operations. Phantom types are used to ensure that no functions that modify state may be cached.

To give a flavor of hvac code, the following is the complete (twenty lines!) source of the wiki controller (due to sql statements, some lines are rather long, and I’ve added a few odd breaks to make them work on this page):

wikiController tmpl =
 h |/ "login" *> login_plug tmpl
 <|>
 (h |/ "wiki" |\\ \pageName ->
    h |// "POST" *>
          withValidation [ ("contents", return) ]
          (\ [contents] -> do
             pageId <- selectVal "id from pages where name=?" [toSql pageName]
             maybe (addErrors [("Login","must be logged in.")] >> continue)
                (\user -> case fromSql pageId of
                            Just (_::Int) ->
                              execStat
     "insert into page_hist(pageId,contents,author) values(?,?,?)"
          [pageId, toSql contents, toSql . userName $ user]
                            Nothing -> do
                              execStat
     "insert into pages(name,locked) values(?,?)"
          [toSql pageName, toSql (0::Int)]
                              pid <- selectVal "max(id) from pages" []
                              execStat
     "insert into page_hist(pageId,contents,author) values(?,?,?)"
          [pid, toSql contents, toSql . userName $ user]) =<< getSes
             continue)
         <|> do
         pageId <- selectVal "id from pages where name=?" [toSql pageName]
         (join $ renderf (tmpl "showPage") ("pageName", pageName)
               "pageContents" |= selectRow
     "* from page_hist where pageId=? order by time desc limit 1" [pageId] ))
   <|> (redirect . ( ++ "/wiki/Index") =<< scriptName)

Future directions for work on hvac include: Stress testing for correctness of transactional logic and benchmarks. Exploration of various efficiency tweaks. Unit tests. Further development of the cache combinator API. Improvement of the example apps and addition of a few others (a blog maybe). Expansion of the library of validator functions. Exploration of transferring hvac to the standard FastCGI bindings (currently it uses a custom modified version to work properly with STM). Improvement of the database layer, particularly with regards to common paging functions. Creation of a set of simple combinators for generating CRUD (create read update delete) pages. Creation of a minimal set of standard templates (maybe).

Comments (12)

ANN: HStringTemplate 0.3.1

This release of HStringTemplate (up now at Hackage) fixes a number of bugs pointed out to me by its small but growing user base (thanks, cinema, elliottt!) ranging from the minor (a particular two-level iteration pattern wasn’t working properly) to the truly irritating (poor handling of file groups). It’s still unfortunately skimpy on the docs, outside of the haddocks and the main StringTemplate grammar documentation at http://www.stringtemplate.org (although the examples from my new project [coming soon!] should also prove helpful). However, it does have a set of very nice and handy new features for development.

* renderf, a function similar in spirit to printf, that takes an arbitrary number of heterogeneous (String, value) tuples as arguments. This should cut down considerably on long setAttribute chains. Additionally, with custom instances (not, I’ll grant, trivial to write) it can be used to declaratively chain together strings of attribute retrieval functions in arbitrary monads, as in the above code example from hvac.

* dumpAttribs, a function/template that prints out the tree of the entire attribute environment a template is operating in — extremely handy for development.

* nullGroup, also for use in development, a simple way to display more information about templates that can’t be found. Error messages in usafeVolatileDirectoryGroup have also been significantly improved.

* getStringTemplate’, a version of getStringTemplate guaranteed not to be inlined. While the optimizer will still sometimes rearrange code such that a volatile group is not updated properly, this at least helps remedy the situation (I think).

* Some minor changes: For grammar reasons, dots have been removed from template names — however, underscores and slashes are now available. Additionally, there’s a much improved logic for which aspects of a local environment are overridden and preserved when a template is called from another.

Comments (2)

Older Posts »
Follow

Get every new post delivered to your Inbox.