Archive for February, 2008

More Fun With Monoids (and some Functors)

Having based the core of HStringTemplate on Monoids, I started looking for other places to apply them in my code. A natural choice was a string template group. Conceptually, I wanted a very open API — a group is anything you can query for a specific template, be it a data structure, or a wrapper for an unsafe function that accesses a directory, or something that makes a StringTemplate on the fly even. But how do we compose these? I.e., we have two groups, and wish to implement “inheritance” such that one group is queried and if the query fails, then we check for the “higher” group. And most importantly, how do we build this such that any combination of different types of groups can be composed with any other? The first obvious thing to do is to use types to whittle everything down to a basic interface.

From an OO mindset, we would tend to, thinking of an “interface,” use a typeclass with a query function: class Group where query :: Group -> String -> Maybe StringTemplate. But oy vey! Now we need a function addGroups :: (Group x, Group y) => x -> y -> ???. Well, to a new concrete group type, I suppose. data ComposedGroup x y => ComposedGroup x y. And: instance (Group x, Group y) -> Group (ComposedGroup x y) where query s = case query x s of Just a -> Just a; Nothing -> query y s. Well, this is fine, if a bit bulky. But compose a few groups and a few more and we can see that even if we don’t have to interact with them, the type signatures will be pretty huge. We could avoid this with existential types, but it hardly seems like that improves the elegance of our solution.

There’s a useful insight, however, that I’ve been able to apply at least 50% of the time that I’ve been tempted to reach for existentials — for all practical purposes, a single function typeclass is equivalent to a lambda expression. In fact, all you’re doing when you make such a substitution is rendering the dictionary passing explicit. So switching over to type STGroup = String -> (Maybe StringTemplate) we see immediate benefits. First, this directly represents what a group is — not even a newtype constructor to get in the way. Second, we can write addGroups x y s = case query x s of Just a -> Just a; Nothing -> query y s directly, without all the typeclass machinery getting in our way.

Ok, where I’m going with this should be howling obvious by now. We have a bunch of things which have an append operation that is associative, and an obvious zero (const Nothing) — we have a monoid! There’s actually a slight catch here — we have two monoids, depending on whether the append operation favors the left or right value passed in. And hence, the Data.Monoid library wraps Maybe as newtype First a = First { getFirst :: Maybe a }, and Last which gives you the last Just value instead of the first. (Side not here: I’d much prefer First was simply the default Monoid instance for Maybe, as Last is its dual [i.e. the same thing with the arguments to mappend reversed], and you as you can wrap any monoid in Dual to get its Dual for free.)

So now we have STGroup = String -> (First StringTemplate). And thanks to our old friend the reader monoid [instance Monoid b => Monoid (a -> b)], we again get the function we really wanted for free: addGroups becomes mappend!

Another side note: First and Last, absurdly don’t come with built-in functor instances, even though as wrappers for Maybe, the functor instance is obvious: instance Functor First where fmap f x = First . fmap f . getFirst $ x

Now we can “drill down” directly into our group, also for free. So say we have, and we do, a function optInsertTmpl that sets options (as a list of string pairs) for a single template. Now, let’s define a helper function () x y = (fmap . fmap) x y. [This function should be in everyone's standard libs by the way. As a generalization from the "owl" which composes a function with two arguments into a function with one argument: o :: (b -> c) -> (a -> a1 -> b) -> a -> a1 -> c; o = (.).(.), I end up using it all the time.] The reader monad (i.e. (String ->) ) is a functor. We’ve just defined First as a functor. So we can simply write:

optInsertGroup :: [(String, String)] -> Group -> Group
optInsertGroup opts f = optInsertTmpl opts <$$> f

We can think of this as “reaching in” two levels deep (past the reader, past the first) and operating directly on every single possible template our group can generate, at once.

Different context than the last example, same conclusion: Get the types and classes right, and the rest of the code almost seems to melt away.

Comments (3)

On Monoids

So HStringTemplate was built around a simple notion: Everything takes an environment, which is a data structure holding attributes and some other good stuff, and eventually returns a string. This was abstracted later on, but we’ll come back to that. Plan text? That returns a string. A reference to an attribute? It returns a string. A nested template? It returns a string. So when we parse/interpret any template, we end up with a list of functions of type Env -> String. Simple enough to get going with.

So how do we execute this template? We need a function of type [Env -> String] -> Env -> String. If we weren’t thinking functionally, we’d have to write a loop. Create an empty string, and for each function in the list, apply it to our environment, append it to the string, and repeat:

function render(template, env) {
	str = "";
	foreach (func in template) {
		str += func(env);
	}
	return str;
}

In functional terms we can write (ignoring stack and efficiency issues) a fold:
render tmp env = foldr (flip (++) . ($ env)) [] tmp

If we think about it a bit, this is actually concat $ map ($env) tmp, or simpler still concatMap ($env) tmp

Step back from the function signature again, and think of it one type at a time and we get an additional abstraction. Instead of ([Env -> String], Env) -> String. we can view it as [Env -> String] -> (Env -> String). But our code doesn’t really represent this, as it explicitly asks for the env parameter. But then again, we could rewrite it as render tmp = \env -> concatMap ($env) tmp, and then pointfreeing the lambda expression we get (tmp >>=) . flip ($) or (concatMap tmp) . flip ($) depending if we want the in this case useless obfuscation of using the list monad.

Good enough, right? Well, imagine that we’ve got our perfect-abstractionometer turned really high for some reason today. There’s a flip and a ($). It looks a bit ugly. It looks a bit obscure. We’ve got compression sure, but not necessarily any more clarity than when we started with. Well, instead of starting from the imperative mindset of how we want this function to work, let’s look at the type signature again and start thinking about what could produce what we’re describing. First, let’s desugar String back to its underlying type, [Char]. That gives us [Env -> [Char]] -> Env -> [Char]. Now since we’re thinking of Env and Char here as abstract atomic types, we can ask, what gives us [a -> [b]] -> a -> [b]? Well, this might be a job for the unwrapped reader monad, since the sequence operation (Monad m => [m a] -> m [a]) resembles what we want. Substitute in (e->) for the m, and we get [e -> a] -> e -> [a], which in our case becomes [Env -> [Char]] -> Env -> [[Char]]. Almost there! Throw in a concat and we get: (concat .) . sequence of type [a -> [b]] -> a -> [b]. And now, not only is our code short, pointfree, and elegant, but it expresses what we wanted all along more clearly than any previous construct.

And yet.. and yet… some nagging bit of our brain says that this isn’t a compound operation, but an elementary one, for a proper algebra. The only question is what that algebra is. Enter monoids. The Data.Monoid library isn’t extensively documented, but the types almost explain themselves. A monoid is about the simplest algebraic structure that one can imagine. In category theory, a monoid can be viewed as a category with one object.

Ok yeah, but what is it? A monoid is a set of elements, an associative binary operator between them, and an identity element. So, you might have a monoid (Integers,+,0) because + is associative between all integers, and X + 0 = X. Integers also give rise to another monoid: (Integers,*,1). And indeed the Sum and Product monoids are both defined in Data.Monoid. We also get other simple ones for booleans. And, of course, we get, for all a, ([a],++,[]). In fact, the list monoid is also known as the free monoid, essentially because, as best I (sorta) understand it, it comes without any constraints. I.e. if for the integers we have [1,4,2,3] we have “1423″ (from the monoid on integers derived by turning them to strings and appending them) and we have 24 from the product monoid and we have 10 from the sum monoid and etc. Thus, the free monoid keeps all information that goes into it, and therefore, so to speak freely arises (without constraints) for any set of objects.

What does all this get us? Well, it gets us generality, especially when we throw higher-order functions back in the mix. In particular, we can now notice that Data.Monoid gives us, for instance the monoid of endomorphisms. So we can conceptually “add” (or rather, mappend), for example, a whole set of functions of type “String -> String” into one big “String -> String” function.

Well, that looks helpful, but it’s not exactly what we want. There is, however, an even cooler Monoid: The reader monoid. Monoid b => Monoid (a -> b). Conceptually, there’s only one way its mappend function can work: feed the value a into two functions, yielding two bs. Now, given that these bs are, by definition, Monoids as well, we mappend their results.

Along with mappend, which is the associative element, we of course got mconcat for free, which is defined by default as foldr mappend mempty, and as such is the obvious generalization of concat. So now we can look at the type of mconcat specialized to the reader monoid: Monoid b => [a -> b] -> a -> b. And that, ladies and gentlemen, is what we’ve been looking for all along. (concat .) . sequence becomes simply mconcat! Without a lick of coding on our part.

Oh yeah, here’s the icing: we get an even more general type signature. The function as we initially defined it worked for Strings, and by extension, other lists. But it didn’t work for values of, for example, type ShowS (i.e. String -> String, which is a common hack to avoid stack issues with concatenation by encoding strings as functions). But look! Even if ShowS isn’t a list, it is, after all, an endomorphism! And we know by our discussion above endomorphisms are monoids too. And you know what else is a monoid? Oh yeah, a bytestring! And soforth.

I eventually took this idea even further in the HStringTemplate library, and also found another nice use of monoids that I’ll blog about later. The lesson I got out of this though: sometimes, it turns out the simplest abstractions are the most powerful.

Comments (1)

ANN: strictify 0.1

Dear Friend,

Have you ever stared at a maze of Haskell code and thought “Gosh, I’m just feeling lazy today. Everything works, and I mainly understand the strictness characteristics of my program. But I just can’t decide where to put those last few bangs. Maybe I don’t need to micro-optimize anyway.” Sure you have. It’s ok. You can let it out. You’re among friends here. Why, I bet you’ve even gone and sprinkled some bangs randomly, hoping for the best. We’ve all been there.

Maybe you even got those pesky strictness annotations right, but a new version of GHC with its ever improving strictness analyzer came along and rendered some obsolete. And yet there they still lie, consuming whole tens of picoseconds worth of your hard-earned processor cycles and troubling your dreams.

Well, friend, your dreams need be troubled no more. The time has come for a new era! The era of strictifiericization version 0.1. An era where you can kiss those micro-optimization troubles over silly shootout entries goodbye! Instead of brute-forcing your way through permutations of strictness entries let your computer do the work for you. That’s why we invented them after all!

Rather than putting “!” in just the right places, put “{- ! -}” in every place you _might_ want a strictness annotation. Then sit back, pop open a cool refreshing brew, and wait (and wait and wait) as strictify does the rest!*

Available now now now at hackage: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/strictify-0.1
Darcs repository at: http://code.haskell.org/~sclv/strictify/

*To prevent exponential running time, strictify only produces a local rather than global optimum of strictness patterns, which should be sufficient in 99% of all cases. Strictify only works on POSIX-compliant systems. Strictify is currently only suitable for “shootout”-style executables which are contained in a single main file, accepting input from argv, stdin, or both. Strictify will not fully execute until it recieves an EOF in stdin. If you cannot produce an EOF through ctrl-D and are not streaming an input file in, you may be forced to run “echo “” | strictify”. No guarantee is made, express or implied, as to strictify improving the speed of your program or your mental acuity regarding strictness. Offer void where prohibited. As strictify contains small parts, it is not suitable for children under the age of five. Do not ingest strictify more than seventy five times in a one hour period. If symptoms persist, consult your physician.

Leave a Comment

Follow

Get every new post delivered to your Inbox.