Simple Extensible Records — Quick Generic Tricks, Pt. 1

There have been a few discussions lately about how to do quick and easy typesafe extensible records in Haskell. And there have been a number of discussions lately about extensions to do it more cleanly (see, e.g. this one on haskell-cafe). This came up on the channel the other day and somebody immediately suggested HList, which seems too heavyweight for this sort of thing to me. HaskellDB, for example, also rolls its own fairly heavyweight new records system.

Anyway, in discussion, we cooked up an idea which Kamina, who originally asked the question, later implemented very cleanly. The gist is that one defines a map of properties, and implements a typeclass for accessing them. Subsequently, one defines a typeclass for each property, whose default methods know how to query them from the properties map. Then, providing any data type with a new property is just a quick instance declaration away. There’s a slight messiness in that one needs to directly index properties by strings, or by a Property enumeration that’s set in stone. (One can actually index them by TypeReps using Typeable, but that’s sort of messy.)

However, there’s another issue as well — we want records extensible in two ways. The typical way is just being able to reuse an accessor and setter function over multiple data types with the same property. But we also want to be able to actually extend records with something like inheritance — i.e. to declare some records strict supersets of others. The following solution, with generics, both solves that issue, and is significantly more lightweight, although it still requires mild boilerplate. Additionally, while the former solution lets one “break” it by circumventing the typeclass-bound “smart accessor” functions and operating on the properties map directly, this solution can also be “broken” by giving it a bad instance declaration.

Both solutions, one should note, can easily be augmented with Template Haskell to reduce the boilerplate still further.

We start by realizing that rather than creating a properties map, it would be nice if we could walk directly over the data in our record and select something of the right type. This, of course, is something that the Scrap Your Boilerplate generics library excels at. In fact, it turns out we have the function we need right at our fingertips: gfindtype, which is of type (Data x, Typeable y) => x -> Maybe y . We ask it for a specific type, and if that type is an immediate subterm of x, then it returns it, otherwise it fails.

Now we need a modifier. There isn’t one built in, but its simple enough to add one. The type signature is immediately evident: greplace :: (Data a, Typeable b) => a -> b -> Maybe a . However, what to do with this type signature is a bit trickier. Data.Generics gives us gmapMo of type (Data a, MonadPlus m) => (a -> m a) -> a -> m a. It traverses the immediate subterms of a type, and applies a transformation. We build our traverser using `extM` which composes a chain of actions to try, matching from right to left the first with an appropriate type. Our default action when nothing matches is to const Nothing (i.e., to leave the term unchanged). When the term does match, we return just the term. This gives us:

greplace x y = gmapMo (const Nothing `extM` (const (Just y))) x

And indeed, this does what we want. However, we don’t have any type safety constraints yet. Every query and update gives us a maybe, depending if it matches. But we can combine this system with the typeclass constraints from the other system and get a very clean interface. Assume we have properties of type Color and Size:

data Color = Red | Yellow | Green | Blue | White | Black deriving (Show, Data, Typeable)
data Size = Size Int deriving (Show, Data, Typeable)

Now we can just write:

class Data a => HasColor a where
    color :: a -> Color
    color = fromJust . gfindtype
    setColor :: a -> Color -> a
    setColor = (fromJust .) . greplace

And for anything with a Color, e.g., a car, we just can write instance HasColor Car. Not too shabby!

However, as we’ve written it, this only works for immediate subterms. So, just like our old solution with an explicit properties map, we still can’t extend our records just by wrapping them as part of a larger record.

Luckily, the generics libraries make it trivial to do deep searches as well. We can replace gfindnew with gfind = something (const Nothing `extQ` Just) . This function is built much the same as our greplace was — yielding Just a value of the appropriate type, or Nothing otherwise. The something combinator just maps our query function deeply over every term, from top to bottom, until it hits a match. And as for greplace, we can just replace gmapMo with “somewhere” for a similar effect. Now, if we have data DrivenCar = DrivenCar Person Car we can give the appropriate instance declarations, and get and set the color of the car directly here as well.

One more problem though: the somewhere combinator will apply our transformation at least once — which means that it might apply it many multiple times, and even if it doesn’t, it’ll keep working ever after it succeeds. This too is easy enough to fix. Look at the source and we see that somewhere is just defined as f x `mplus` gmapMq (somewhere f) x. So its trying the transform, and if the transform fails, its mapping the transform over all the immediate subterms, recursively. All we have to do is write a function “once” that looks exactly the same, but with gmapMo instead of gmapMq, and we have what we really need.

A complete working example follows below:

{-# LANGUAGE FlexibleInstances, FlexibleContexts, DeriveDataTypeable  #-}

module Records where

import Data.Generics.Basics
import Data.Generics.Aliases
import Data.Generics.Schemes
import Control.Monad
import Data.Maybe

greplace :: (Data a, Typeable b) => a -> b -> Maybe a
greplace x y = once (const Nothing `extM` (const (Just y))) x

once :: MonadPlus m => GenericM m -> GenericM m
once f x = f x `mplus` gmapMo (once f) x

gfind :: (Data a, Typeable b) => a -> Maybe b
gfind = something (const Nothing `extQ` Just)

data Color = Red | Yellow | Green | Blue | White | Black deriving (Show, Data, Typeable)
newtype Size = Size Int deriving (Show, Data, Typeable)
unSize (Size a) = a

data Car = Car Color Size deriving (Data, Typeable, Show)
data Cube = Cube Color deriving (Data, Typeable, Show)

data Two = Two Int Car Car deriving (Data, Typeable, Show)

class Data a => HasColor a where
    getColor :: a -> Color
    getColor = fromJust . gfind
    setColor :: a -> Color -> a
    setColor = (fromJust .) . greplace

class Data a => HasSize a where
    getSize :: a -> Int
    getSize = unSize . fromJust . gfind
    setSize :: a -> Int -> a
    setSize = (fromJust .) . (. Size) . greplace

instance HasColor Car
instance HasColor Cube
instance HasSize Car
instance HasColor Two
instance HasSize Two

{-
*Records> getColor (Two 12 (Car Red (Size 34)) (Car Blue (Size 93)))
Red

*Records> setSize (Two 12 (Car Red (Size 34)) (Car Blue (Size 93))) 232
Two 12 (Car Red (Size 232)) (Car Blue (Size 93))
-}

Comments (1)

Type hackery for the practical programmer pt. II

This post is a long time coming, and sort of anti-climactic, but I wanted to just finish off what I’d begun describing in the previous post.

We have, if you will recall:

class MapFromTuple a b where
      mapFromTuple :: a -> [b]

This is our handy way to marshal a heterogenous tuple into a uniformly-typed list. Now, we can perform our operations on this uniform list. The next step, however, is to go back from our transformed list to a heterogenous tuple. Immediately an answer presents itself:

class MapToTuple a b where
    mapToTuple :: [b] -> a

The instance declarations are correspondingly the inverse of those for MapFromTuple:

instance (Sat (MapToTupleD a x), Sat (MapToTupleD b x))
    => MapToTuple (a,b) x where
    mapToTuple [a,b] = (fromGenD dict a, fromGenD dict b)
instance (Sat (MapToTupleD a x), Sat (MapToTupleD b x), Sat (MapToTupleD c x))
    => MapToTuple (a,b,c) x where
    mapToTuple [a,b,c] = (fromGenD dict a, fromGenD dict b, fromGenD dict c)

But there’s a problem here. While mapFromTuple was a total function, mapToTuple is not, because it is polymorphic on its return type, not its argument type. So if we use mapToTuple in a casual fashion, we introduce a great deal of extra partiality. What we need is a way to carry the type context of our original tuple over to the return argument of mapToTuple, and thus guarantee that it yields the correct types in a correctly-sized tuple. And thus, we have the complete type signature for withValidation’:

withValidation' :: (MapFromTuple a (String, ValidationFuncIntern s String Dynamic),
                    MapToTuple a1 Dynamic,
                    TupleMatchTwo a a1) =>
                   a -> (a1 -> HCGI r s t) -> HCGI r s t

The first restriction says that a is something which can be transformed into something of type (String, ValidationFunctionIntern s String Dynamic). The second restriction says that a1 is something which can be transformed *from* Dynamic. The trick here is the ridiculous TupleMatchTwo class, defined as such:

class TupleMatchTwo a b | a -> b where {}
instance TupleMatchTwo () ()
instance TupleMatchTwo (Box (foo, bar-> mk a)) (Box a)
instance TupleMatchTwo ((foo, bar -> mka a), (foo, bar -> mkb b)) (a,b)
instance TupleMatchTwo ((foo, bar -> mka a), (foo, bar -> mkb b), (foo, bar -> mkc c)) (a,b,c)

… and soforth. Lots of boilerplate, but luckily, as with the other boilerplate instances, this is write-once boilerplate that saves us lots of work elsewhere. Although I’m probably butchering the term, what TupleMatchTwo gives us is a very limited type-level logic-programming function that “extracts” the last bit of a rather complicated signature from each of a set of tupled values, and returns them alone. Unfortunately, as its not a real type-level function, it has to be unrolled by hand. The end result, however, is that now we can extract what we “know” our return type must be from our argument type, and by asserting that return type statically at compile time, we can guarantee what we sought to from the beginning — that withValidation’ can be passed an arbitrary tuple of validator functions and will produce only a well typed tuple of the corresponding validated inputs.

I might be asserting too much here, but it seems like this basic paradigm maps to 90% of the use cases for heterogeneous lists — we want to operate over them in a uniform way, and then we want to properly varigate our input again before exiting the library function.

This style of input is now in the current repo of hvac, for both validation and database functions. There are also a few other nice changes I plan to blog about soon, mainly along the lines of cleaning things up and reducing dependencies.

I’ve also run into a really nifty example of what can be done with this “tuple-level programming” in my current project — a little mini-dsl for SQL suitable for embedding into hvac, on which more later.

Comments

Type-hackery for the practical programmer

So hvac has a pretty nice validation framework built on the withSomething idiom (i.e. withSomething (\something -> etc)) which is a sort of continuation passing model. More particularly, it has: withValidation :: [(String, ValidationFunc s String String)] -> ([String] -> HCGI q s CGIResult) -> HCGI q s CGIResult. This is to say that it takes a list of pairs of validation functions and the request parameters to retrieve/process/validate, and a success handler which takes a list of those retrieved/processed strings, and it returns a result. If the validation fails, the errors are stashed into an appropriate location, and the cgi action fails.

So far so good — but basic strong static typing gives us a wart here. It takes a list and returns (so to speak) a list — this means that everything must be of a uniform type. Currently that type is String, although it can be generalized. But even then, every validator associated with a particular call to withValidation has to produce something of the same type.

So say you want to retrieve and Int that is greater than 3 and a String. Well, you have a validator for the String. All good. And you have a validator that read the Int, and then checks its value is greater than three, and then finally, maddeningly, you have to show it again to convert it back to a string.

The handler function then must read the sucker again, which duplicates work, although not too much. Even worse, even though our validation has “proved” that the value can be read without error, there’s no way to express this. So there are important properties for the soundness of our program that are true and exist, but only in our heads, and only to the extent we remember them — what we want is a way to express this knowledge concretely.

The second concern is the bigger concern, but let’s ignore it for now and concentrate first on just getting this thing working with polymorphic types at all. Our use case is actually just what Dynamics are for. Dynamics give you two core handy functions — toDyn :: Typeable a => a -> Dynamic and fromDynamic :: Typeable a => Dynamic -> Maybe a, which, obviously, returns Just something if the Dynamic is of the right type, and Nothing otherwise. So now we can pack arbitrary (Typeable) objects into a single list of a uniform type, and unpack them afterwards.

Similar idioms are used very frequently — HDBC, which is the database interface used by hvac, uses toSql and fromSql. The Takusen database interface uses bindP, which is like toSql but, erm, fancier. Cleverly, it avoids using fromSql and instead allows for runtime failure (I think) should the result set of a query not type properly into what the code expects. I’d very much appreciate comments on where else this sort of thing crops up.

In any case, in the withValidation function, and the other examples given, there’s a significant silliness here. These are library functions, supposedly with nice clean interfaces — but we get an ugly situation with all these toDyns or toSqls cluttering up our pretty list of values. Intolerable!

Typeclasses to the rescue. toDyn and friends all share a certain trait — they’re all, in essence, of the form func :: (Typeclass a) => a -> ConcreteType. And conceptually, we’re mapping this polymorphic function over a heterogeneous list to get a homogenous one that we can then operate on. The first thing that comes to mind is, therefore, HList. But HList is pretty complicated, and furthermore, my sense is that once you use HList, you’re pretty much locked into the “HList way,” although I’m welcome to be proven wrong.

Instead, we can implement a solution without requiring all of that HList stuff. The end result here is an extensible, simple way to implement a mapFromTuple function that takes an arbitrary length tuple, all of whose members are subject to a typeclass constraint, and returns a list with the proper function mapped over them. This can then be used by any code that runs into a similar issue.

Rather than HList, we’ll be borrowing ideas from Scrap Your Boilerplate, and particularly “syb3″ — i.e. the “Scrap your boilerplate with class: extensible generic functions” paper.

The trick here is pretty elegant, based on using typeclasses to abstract over dictionaries, apparently originally from Restricted Data Types in Haskell by John Hughes. Pretty much all the magic is in the one class declaration:

class Sat a where dict :: a

Seriously. That’s it. Reification of typeclass dictionaries in one simple line. Now we can write a concrete dictionary:

data MapFromTupleD a b = MapFromTupleD {toGenD :: a -> b}

And then a class which “unpacks” the dictionary:

class MapFromTuple a b where
    mapFromTuple :: a -> [b]

And boilerplate instances for tuples of various lengths:

instance (Sat (MapFromTupleD a x), Sat (MapFromTupleD b x))
    => MapFromTuple (a,b) x where
    mapFromTuple (a,b) = [toGenD dict a, toGenD dict b]

instance (Sat (MapFromTupleD a x), Sat (MapFromTupleD b x), Sat (MapFromTupleD c x))
    => MapFromTuple (a,b,c) x where
    mapFromTuple (a,b,c) = [toGenD dict a, toGenD dict b, toGenD dict c]

…and soforth. The key thing to think about is the magic of Sat in the context constraint. The context constraint on Sat specifies which dict gets pulled out of thin air, and so lets us write a function that abstracts over typeclasses the same way normal polymorphic functions abstract over concrete types.

Finally, we’re done with the machinery. Now to implement the dictionary for Typeable:

instance (Typeable x) => Sat (MapFromTupleD x Dynamic) where
    dict = MapFromTupleD { toGenD = toDyn }

And for Show while we’re at it:

instance (Show x) => Sat (MapFromTupleD x String) where
    dict = MapFromTupleD { toGenD = show }

Now we can do:

*Main> mapFromTuple (1,2) :: [String]
["1","2"]
*Main> mapFromTuple (1,2) :: [Dynamic]
[<<Integer>>,<<[Integer]>>]
*Main> mapFromTuple (”foo”,2.3, Just 43) :: [String]
["\"foo\"","2.3","Just 43"]

Now if we wanted to get this functionality for sql types, we’d just have to add the single instance declaration for Sat, along the lines of the previous two… and soforth.

This gets us everything we want for the basic database-style examples, actually. But for hvac’s validation, things are more complicated, because even if we can marshall a tuple into a set of dynamics, then we’re still just returning a list of dynamics, rather than a tuple of actual values. However, we’re partway there on that, and I’ll leave off the rest until the next post.

Comments (4)

ANN: hvac 0.1b, a transactional, declarative framework for lightweight web applications

hvac (short for http view and controller) has been my project for the last little while, and is finally in a fairly usable state, so I’m opening up the repo (darcs get http://community.haskell.org/~sclv/hvac/) for folks to play with and to get some feedback. While its not yet ready for hackage, the package as provided should be fully cabal installable. Documentation is available at http://community.haskell.org/~sclv/hvac/html_docs/hvac/

The aim of hvac is to provide an environment that makes the creation of lightweight fastcgi-based web applications as simple as possible, with an emphasis on concise, declarative style code, correct concurrent transactional logic, and transparency in adding caching combinators.

There are two included example programs, naturally neither of which is feature complete. They share a common login module of about 50 lines of code, excluding imports and templates.

The first program is a classic, greenspun-style message board with basic login functionality. It totals roughly 40 lines and tends to use just under 4mb of memory on my system.

The second is a wiki based on Pandoc and the PandocWiki code. The code totals roughly 30 lines (rendering borrowed from PandocWiki aside) and uses about 5mb of memory on my system.

hvac processes all requests in the STM monad, with some bells attached to properly interleave STM with session, database and filesystem operations such that they all conceptually occur together in a single transaction per request. Currently it is only fully tested with sqlite, but it should operate, modulo a few tweaks, with an database accessible via HDBC.

hvac is particularly designed to use the HStringTemplate library as an output layer, in a simple declarative fashion, and HStringTemplate has been improved in the process (about which more below). As the StringTemplate grammar is explicitly sub-turing, this ensures a clean separation of program logic from presentation, while providing a nonetheless fairly powerful language to express typical display tasks.

The included cache combinators, still experimental, should allow a simple and fine-grained control over the level of caching of various disk-bound operations. Phantom types are used to ensure that no functions that modify state may be cached.

To give a flavor of hvac code, the following is the complete (twenty lines!) source of the wiki controller (due to sql statements, some lines are rather long, and I’ve added a few odd breaks to make them work on this page):

wikiController tmpl =
 h |/ "login" *> login_plug tmpl
 <|>
 (h |/ "wiki" |\\ \pageName ->
    h |// "POST" *>
          withValidation [ ("contents", return) ]
          (\ [contents] -> do
             pageId <- selectVal “id from pages where name=?” [toSql pageName]
             maybe (addErrors [("Login","must be logged in.")] >> continue)
                (\user -> case fromSql pageId of
                            Just (_::Int) ->
                              execStat
     “insert into page_hist(pageId,contents,author) values(?,?,?)”
          [pageId, toSql contents, toSql . userName $ user]
                            Nothing -> do
                              execStat
     “insert into pages(name,locked) values(?,?)”
          [toSql pageName, toSql (0::Int)]
                              pid <- selectVal “max(id) from pages” []
                              execStat
     “insert into page_hist(pageId,contents,author) values(?,?,?)”
          [pid, toSql contents, toSql . userName $ user]) =<< getSes
             continue)
         <|> do
         pageId <- selectVal “id from pages where name=?” [toSql pageName]
         (join $ renderf (tmpl “showPage”) (”pageName”, pageName)
               “pageContents” |= selectRow
     “* from page_hist where pageId=? order by time desc limit 1″ [pageId] ))
   <|> (redirect . ( ++ “/wiki/Index”) =<< scriptName)

Future directions for work on hvac include: Stress testing for correctness of transactional logic and benchmarks. Exploration of various efficiency tweaks. Unit tests. Further development of the cache combinator API. Improvement of the example apps and addition of a few others (a blog maybe). Expansion of the library of validator functions. Exploration of transferring hvac to the standard FastCGI bindings (currently it uses a custom modified version to work properly with STM). Improvement of the database layer, particularly with regards to common paging functions. Creation of a set of simple combinators for generating CRUD (create read update delete) pages. Creation of a minimal set of standard templates (maybe).

Comments (9)

ANN: HStringTemplate 0.3.1

This release of HStringTemplate (up now at Hackage) fixes a number of bugs pointed out to me by its small but growing user base (thanks, cinema, elliottt!) ranging from the minor (a particular two-level iteration pattern wasn’t working properly) to the truly irritating (poor handling of file groups). It’s still unfortunately skimpy on the docs, outside of the haddocks and the main StringTemplate grammar documentation at http://www.stringtemplate.org (although the examples from my new project [coming soon!] should also prove helpful). However, it does have a set of very nice and handy new features for development.

* renderf, a function similar in spirit to printf, that takes an arbitrary number of heterogeneous (String, value) tuples as arguments. This should cut down considerably on long setAttribute chains. Additionally, with custom instances (not, I’ll grant, trivial to write) it can be used to declaratively chain together strings of attribute retrieval functions in arbitrary monads, as in the above code example from hvac.

* dumpAttribs, a function/template that prints out the tree of the entire attribute environment a template is operating in — extremely handy for development.

* nullGroup, also for use in development, a simple way to display more information about templates that can’t be found. Error messages in usafeVolatileDirectoryGroup have also been significantly improved.

* getStringTemplate’, a version of getStringTemplate guaranteed not to be inlined. While the optimizer will still sometimes rearrange code such that a volatile group is not updated properly, this at least helps remedy the situation (I think).

* Some minor changes: For grammar reasons, dots have been removed from template names — however, underscores and slashes are now available. Additionally, there’s a much improved logic for which aspects of a local environment are overridden and preserved when a template is called from another.

Comments (2)

More Fun With Monoids (and some Functors)

Having based the core of HStringTemplate on Monoids, I started looking for other places to apply them in my code. A natural choice was a string template group. Conceptually, I wanted a very open API — a group is anything you can query for a specific template, be it a data structure, or a wrapper for an unsafe function that accesses a directory, or something that makes a StringTemplate on the fly even. But how do we compose these? I.e., we have two groups, and wish to implement “inheritance” such that one group is queried and if the query fails, then we check for the “higher” group. And most importantly, how do we build this such that any combination of different types of groups can be composed with any other? The first obvious thing to do is to use types to whittle everything down to a basic interface.

From an OO mindset, we would tend to, thinking of an “interface,” use a typeclass with a query function: class Group where query :: Group -> String -> Maybe StringTemplate. But oy vey! Now we need a function addGroups :: (Group x, Group y) => x -> y -> ???. Well, to a new concrete group type, I suppose. data ComposedGroup x y => ComposedGroup x y. And: instance (Group x, Group y) -> Group (ComposedGroup x y) where query s = case query x s of Just a -> Just a; Nothing -> query y s. Well, this is fine, if a bit bulky. But compose a few groups and a few more and we can see that even if we don’t have to interact with them, the type signatures will be pretty huge. We could avoid this with existential types, but it hardly seems like that improves the elegance of our solution.

There’s a useful insight, however, that I’ve been able to apply at least 50% of the time that I’ve been tempted to reach for existentials — for all practical purposes, a single function typeclass is equivalent to a lambda expression. In fact, all you’re doing when you make such a substitution is rendering the dictionary passing explicit. So switching over to type STGroup = String -> (Maybe StringTemplate) we see immediate benefits. First, this directly represents what a group is — not even a newtype constructor to get in the way. Second, we can write addGroups x y s = case query x s of Just a -> Just a; Nothing -> query y s directly, without all the typeclass machinery getting in our way.

Ok, where I’m going with this should be howling obvious by now. We have a bunch of things which have an append operation that is associative, and an obvious zero (const Nothing) — we have a monoid! There’s actually a slight catch here — we have two monoids, depending on whether the append operation favors the left or right value passed in. And hence, the Data.Monoid library wraps Maybe as newtype First a = First { getFirst :: Maybe a }, and Last which gives you the last Just value instead of the first. (Side not here: I’d much prefer First was simply the default Monoid instance for Maybe, as Last is its dual [i.e. the same thing with the arguments to mappend reversed], and you as you can wrap any monoid in Dual to get its Dual for free.)

So now we have STGroup = String -> (First StringTemplate). And thanks to our old friend the reader monoid [instance Monoid b => Monoid (a -> b)], we again get the function we really wanted for free: addGroups becomes mappend!

Another side note: First and Last, absurdly don’t come with built-in functor instances, even though as wrappers for Maybe, the functor instance is obvious: instance Functor First where fmap f x = First . fmap f . getFirst $ x

Now we can “drill down” directly into our group, also for free. So say we have, and we do, a function optInsertTmpl that sets options (as a list of string pairs) for a single template. Now, let’s define a helper function () x y = (fmap . fmap) x y. [This function should be in everyone's standard libs by the way. As a generalization from the "owl" which composes a function with two arguments into a function with one argument: o :: (b -> c) -> (a -> a1 -> b) -> a -> a1 -> c; o = (.).(.), I end up using it all the time.] The reader monad (i.e. (String ->) ) is a functor. We’ve just defined First as a functor. So we can simply write:

optInsertGroup :: [(String, String)] -> Group -> Group
optInsertGroup opts f = optInsertTmpl opts <$$> f

We can think of this as “reaching in” two levels deep (past the reader, past the first) and operating directly on every single possible template our group can generate, at once.

Different context than the last example, same conclusion: Get the types and classes right, and the rest of the code almost seems to melt away.

Comments (3)

On Monoids

So HStringTemplate was built around a simple notion: Everything takes an environment, which is a data structure holding attributes and some other good stuff, and eventually returns a string. This was abstracted later on, but we’ll come back to that. Plan text? That returns a string. A reference to an attribute? It returns a string. A nested template? It returns a string. So when we parse/interpret any template, we end up with a list of functions of type Env -> String. Simple enough to get going with.

So how do we execute this template? We need a function of type [Env -> String] -> Env -> String. If we weren’t thinking functionally, we’d have to write a loop. Create an empty string, and for each function in the list, apply it to our environment, append it to the string, and repeat:

function render(template, env) {
	str = "";
	foreach (func in template) {
		str += func(env);
	}
	return str;
}

In functional terms we can write (ignoring stack and efficiency issues) a fold:
render tmp env = foldr (flip (++) . ($ env)) [] tmp

If we think about it a bit, this is actually concat $ map ($env) tmp, or simpler still concatMap ($env) tmp

Step back from the function signature again, and think of it one type at a time and we get an additional abstraction. Instead of ([Env -> String], Env) -> String. we can view it as [Env -> String] -> (Env -> String). But our code doesn’t really represent this, as it explicitly asks for the env parameter. But then again, we could rewrite it as render tmp = \env -> concatMap ($env) tmp, and then pointfreeing the lambda expression we get (tmp >>=) . flip ($) or (concatMap tmp) . flip ($) depending if we want the in this case useless obfuscation of using the list monad.

Good enough, right? Well, imagine that we’ve got our perfect-abstractionometer turned really high for some reason today. There’s a flip and a ($). It looks a bit ugly. It looks a bit obscure. We’ve got compression sure, but not necessarily any more clarity than when we started with. Well, instead of starting from the imperative mindset of how we want this function to work, let’s look at the type signature again and start thinking about what could produce what we’re describing. First, let’s desugar String back to its underlying type, [Char]. That gives us [Env -> [Char]] -> Env -> [Char]. Now since we’re thinking of Env and Char here as abstract atomic types, we can ask, what gives us [a -> [b]] -> a -> [b]? Well, this might be a job for the unwrapped reader monad, since the sequence operation (Monad m => [m a] -> m [a]) resembles what we want. Substitute in (e->) for the m, and we get [e -> a] -> e -> [a], which in our case becomes [Env -> [Char]] -> Env -> [[Char]]. Almost there! Throw in a concat and we get: (concat .) . sequence of type [a -> [b]] -> a -> [b]. And now, not only is our code short, pointfree, and elegant, but it expresses what we wanted all along more clearly than any previous construct.

And yet.. and yet… some nagging bit of our brain says that this isn’t a compound operation, but an elementary one, for a proper algebra. The only question is what that algebra is. Enter monoids. The Data.Monoid library isn’t extensively documented, but the types almost explain themselves. A monoid is about the simplest algebraic structure that one can imagine. In category theory, a monoid can be viewed as a category with one object.

Ok yeah, but what is it? A monoid is a set of elements, an associative binary operator between them, and an identity element. So, you might have a monoid (Integers,+,0) because + is associative between all integers, and X + 0 = X. Integers also give rise to another monoid: (Integers,*,1). And indeed the Sum and Product monoids are both defined in Data.Monoid. We also get other simple ones for booleans. And, of course, we get, for all a, ([a],++,[]). In fact, the list monoid is also known as the free monoid, essentially because, as best I (sorta) understand it, it comes without any constraints. I.e. if for the integers we have [1,4,2,3] we have “1423″ (from the monoid on integers derived by turning them to strings and appending them) and we have 24 from the product monoid and we have 10 from the sum monoid and etc. Thus, the free monoid keeps all information that goes into it, and therefore, so to speak freely arises (without constraints) for any set of objects.

What does all this get us? Well, it gets us generality, especially when we throw higher-order functions back in the mix. In particular, we can now notice that Data.Monoid gives us, for instance the monoid of endomorphisms. So we can conceptually “add” (or rather, mappend), for example, a whole set of functions of type “String -> String” into one big “String -> String” function.

Well, that looks helpful, but it’s not exactly what we want. There is, however, an even cooler Monoid: The reader monoid. Monoid b => Monoid (a -> b). Conceptually, there’s only one way its mappend function can work: feed the value a into two functions, yielding two bs. Now, given that these bs are, by definition, Monoids as well, we mappend their results.

Along with mappend, which is the associative element, we of course got mconcat for free, which is defined by default as foldr mappend mempty, and as such is the obvious generalization of concat. So now we can look at the type of mconcat specialized to the reader monoid: Monoid b => [a -> b] -> a -> b. And that, ladies and gentlemen, is what we’ve been looking for all along. (concat .) . sequence becomes simply mconcat! Without a lick of coding on our part.

Oh yeah, here’s the icing: we get an even more general type signature. The function as we initially defined it worked for Strings, and by extension, other lists. But it didn’t work for values of, for example, type ShowS (i.e. String -> String, which is a common hack to avoid stack issues with concatenation by encoding strings as functions). But look! Even if ShowS isn’t a list, it is, after all, an endomorphism! And we know by our discussion above endomorphisms are monoids too. And you know what else is a monoid? Oh yeah, a bytestring! And soforth.

I eventually took this idea even further in the HStringTemplate library, and also found another nice use of monoids that I’ll blog about later. The lesson I got out of this though: sometimes, it turns out the simplest abstractions are the most powerful.

Comments (1)

ANN: strictify 0.1

Dear Friend,

Have you ever stared at a maze of Haskell code and thought “Gosh, I’m just feeling lazy today. Everything works, and I mainly understand the strictness characteristics of my program. But I just can’t decide where to put those last few bangs. Maybe I don’t need to micro-optimize anyway.” Sure you have. It’s ok. You can let it out. You’re among friends here. Why, I bet you’ve even gone and sprinkled some bangs randomly, hoping for the best. We’ve all been there.

Maybe you even got those pesky strictness annotations right, but a new version of GHC with its ever improving strictness analyzer came along and rendered some obsolete. And yet there they still lie, consuming whole tens of picoseconds worth of your hard-earned processor cycles and troubling your dreams.

Well, friend, your dreams need be troubled no more. The time has come for a new era! The era of strictifiericization version 0.1. An era where you can kiss those micro-optimization troubles over silly shootout entries goodbye! Instead of brute-forcing your way through permutations of strictness entries let your computer do the work for you. That’s why we invented them after all!

Rather than putting “!” in just the right places, put “{- ! -}” in every place you _might_ want a strictness annotation. Then sit back, pop open a cool refreshing brew, and wait (and wait and wait) as strictify does the rest!*

Available now now now at hackage: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/strictify-0.1
Darcs repository at: http://code.haskell.org/~sclv/strictify/

*To prevent exponential running time, strictify only produces a local rather than global optimum of strictness patterns, which should be sufficient in 99% of all cases. Strictify only works on POSIX-compliant systems. Strictify is currently only suitable for “shootout”-style executables which are contained in a single main file, accepting input from argv, stdin, or both. Strictify will not fully execute until it recieves an EOF in stdin. If you cannot produce an EOF through ctrl-D and are not streaming an input file in, you may be forced to run “echo “” | strictify”. No guarantee is made, express or implied, as to strictify improving the speed of your program or your mental acuity regarding strictness. Offer void where prohibited. As strictify contains small parts, it is not suitable for children under the age of five. Do not ingest strictify more than seventy five times in a one hour period. If symptoms persist, consult your physician.

Comments

ANN: HStringTemplate 0.2

HStringTemplate is a general purpose templating system, geared especially towards HTML and based on Terrence Parr’s Java library.

On Hackage at: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/HStringTemplate-0.2

Development version at: darcs get http://code.haskell.org/HStringTemplate/

Haddocks at: http://code.haskell.org/HStringTemplate/dist/doc/html/HStringTemplate/Text-StringTemplate.html

Additional documentation on the grammar at: http://www.antlr.org/wiki/display/ST/StringTemplate+3.1+Documentation as well as in some posts at http://fmapfixreturn.wordpress.com

Lots of cleanup, lots of additions, hopefully this is an extremely usable release. Still no group or interface files, but I suspect that those are mainly useful for code-generation, which seems like something Haskell programmers would want to use something other than a templating system for in any case.

On to the good stuff:

  • Now on hackage!
  • Generics. Not one but two types. One set of simple bindings for the standard Data class, and one for syb-with-class. (Alex Drummond’s RJson library, which is really cool, was very helpful in figuring out how to do this).
  • A withContext method that turns any set of name-value bindings into the context for a StringTemplate. Along with the syb-with-class bindings, this should make for relatively seamless interoperability with HAppS.
  • Lots of other additional bindings for working with standard time formats, numeric types, etc.
  • Encoders. A standard mechanism for HTML-escaping strings (or javascript-escaping, or urlencoding them, or etc.) that A) ensures it happens uniformly and B) ensures it happens no more than once.
  • Improved pretty printing support, eliminating corner-cases where wrapping did not occur.
  • Creation of directory groups is now done properly in the IO monad, with caching functionality moved to the scarily-named unsafeVolatileDirectoryGroup.
  • 80% Top-Level testing coverage in the base. (And more to come, but I only have so much time).

And of course, patches and feedback always welcome.

Comments

Turtles all the way.

HStringTemplate had a very irritating hack in it up until about twenty minutes ago. Every time we worked with some of StringTemplate’s fancier constructs, particularly chaining template application (i.e. $foo:bar():baz():etc()$ which applies foo to bar, feeds it to baz, then feeds it in turn to etc) or using a template as an attribute (i.e. ${$foo$}$ which creates an anonymous template, that in turn grabs foo from the outer scope) the value had to be rendered down to a string before passing it back into the next scope. Now, this normally would have been unfortunately ugly, but not that harmful. However, because of the wrapping and indentation support that HStringTemplate provides, it meant that we couldn’t get proper indentation in such a case since we didn’t know where to wrap the expression yet (i.e. how far it would be indented) at the time we evaluated it.

That last ugliness is now gone. The approach is genuinely turtles all the way down, as the saying goes. This not only makes things work right, but it makes the entire data model cleaner and more efficient. And, while there were a number of changes to type signatures to pass the right information around (sometimes I feel type signatures are the Dorian Gray’s portrait of the Haskell world — the more elegant your code gets, the more complicated and unwieldy your signatures), the end result required only three changes to lines of actual code, and actually resulted in something slightly smaller!

I hope to get to this in a later blog post, but the lesson is that static typing and type inference aren’t just about bondage and discipline and safety. They can also give you a sort of incredible expressive power, where the information on what you’re doing is carried around invisibly in the type system as a sort of secret parameter that can give you a great deal of control over the way your code actually behaves. While the Java version of StringTemplate, for example, uses separate renderers that control how output displays, everything in HStringTemplate (lexing, parsing, construction of functions, establishing pretty-printing relationships) can be done in a single pass. And thanks to laziness and higher-order functions, this pass can produce partially applied functions where everything else (such as indentation levels) can be figured out in a second single pass at runtime. (NB: this idiom, completely natural to Haskell, is sometimes known as runtime compilation although it’s a different use of the term than is usually the case.)

So what does that mean to the end user? Well, you can do this, for one:


> putStr $ PP.renderStyle (PP.style {PP.lineLength=80}) $ toPPDoc $
  setAttribute “foo” [[(1::Double)..30],[200..230]] $ newSTMP
  ”Some long text: $foo:{Some label: [$it$]}:{#$it$#};separator=’, ‘$ END”

Some long text: #Some label: [1.0, 2.0, 3.0, 4.0,
                              5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0,
                              14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0,
                              22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0,
                              30.0]
                 Some label: [200.0, 201.0, 202.0, 203.0, 204.0,
                              205.0, 206.0, 207.0, 208.0, 209.0, 210.0, 211.0,
                              212.0, 213.0, 214.0, 215.0, 216.0, 217.0, 218.0,
                              219.0, 220.0, 221.0, 222.0, 223.0, 224.0, 225.0,
                              226.0, 227.0, 228.0, 229.0, 230.0]# END

So what’s happening with this expression? The newSTMP call, as usual, parses a string into a template. The setAttribute call inserts, “foo”, a list of two lists of doubles, into the template. The toPPDoc call evaluates the template in the context of the attributes we’ve set, and returns a Pretty Printer Doc. The render call is to the Text.PrettyPrint.HughesPJ library (which I’ve imported, qualified as PP), and tells it how long we want the lines to be, and finally, the putStr call displays the returned string with newline characters appropriately rendered.

Now onto the mechanics of the StringTemplate itself. First we print some text, and then foo, as a list, is iterated over and repeatedly passed into the next part of the expression. The sublists of foo are each printed, preceded by “Some label: ” and surrounded in braces. The values of each sublist are separated by commas, because we set that option in the top level expression, and that gets propagated downwards as well, and then the whole resulting expression gets wrapped in pound signs by the final template application. And the new cool part is, because everything is “turtles all the way,” those sublists are not rendered and wrapped until the entire shebang is called, meaning that not only do we preserve multiple levels of indentation, but that when those sublists are displayed, everything, including the pound signs that were only inserted *after* we evaluated the sublists of foo, is taken into account such that they indent and wrap properly.

The beauty of programming in a language that supports abstractions as elegantly as Haskell does is that doing it this way not only gives nicer performance, and nicer features, but that it creates cleaner, simpler code. Implementing this feature properly didn’t mean fighting the language, but just giving into to what the types had been whispering to me all along.

Comments (1)

« Previous entries