Posts Tagged polymorphism

Simple Extensible Records — Quick Generic Tricks, Pt. 1

There have been a few discussions lately about how to do quick and easy typesafe extensible records in Haskell. And there have been a number of discussions lately about extensions to do it more cleanly (see, e.g. this one on haskell-cafe). This came up on the channel the other day and somebody immediately suggested HList, which seems too heavyweight for this sort of thing to me. HaskellDB, for example, also rolls its own fairly heavyweight new records system.

Anyway, in discussion, we cooked up an idea which Kamina, who originally asked the question, later implemented very cleanly. The gist is that one defines a map of properties, and implements a typeclass for accessing them. Subsequently, one defines a typeclass for each property, whose default methods know how to query them from the properties map. Then, providing any data type with a new property is just a quick instance declaration away. There’s a slight messiness in that one needs to directly index properties by strings, or by a Property enumeration that’s set in stone. (One can actually index them by TypeReps using Typeable, but that’s sort of messy.)

However, there’s another issue as well — we want records extensible in two ways. The typical way is just being able to reuse an accessor and setter function over multiple data types with the same property. But we also want to be able to actually extend records with something like inheritance — i.e. to declare some records strict supersets of others. The following solution, with generics, both solves that issue, and is significantly more lightweight, although it still requires mild boilerplate. Additionally, while the former solution lets one “break” it by circumventing the typeclass-bound “smart accessor” functions and operating on the properties map directly, this solution can also be “broken” by giving it a bad instance declaration.

Both solutions, one should note, can easily be augmented with Template Haskell to reduce the boilerplate still further.

We start by realizing that rather than creating a properties map, it would be nice if we could walk directly over the data in our record and select something of the right type. This, of course, is something that the Scrap Your Boilerplate generics library excels at. In fact, it turns out we have the function we need right at our fingertips: gfindtype, which is of type (Data x, Typeable y) => x -> Maybe y . We ask it for a specific type, and if that type is an immediate subterm of x, then it returns it, otherwise it fails.

Now we need a modifier. There isn’t one built in, but its simple enough to add one. The type signature is immediately evident: greplace :: (Data a, Typeable b) => a -> b -> Maybe a . However, what to do with this type signature is a bit trickier. Data.Generics gives us gmapMo of type (Data a, MonadPlus m) => (a -> m a) -> a -> m a. It traverses the immediate subterms of a type, and applies a transformation. We build our traverser using `extM` which composes a chain of actions to try, matching from right to left the first with an appropriate type. Our default action when nothing matches is to const Nothing (i.e., to leave the term unchanged). When the term does match, we return just the term. This gives us:

greplace x y = gmapMo (const Nothing `extM` (const (Just y))) x

And indeed, this does what we want. However, we don’t have any type safety constraints yet. Every query and update gives us a maybe, depending if it matches. But we can combine this system with the typeclass constraints from the other system and get a very clean interface. Assume we have properties of type Color and Size:

data Color = Red | Yellow | Green | Blue | White | Black deriving (Show, Data, Typeable)
data Size = Size Int deriving (Show, Data, Typeable)

Now we can just write:

class Data a => HasColor a where
    color :: a -> Color
    color = fromJust . gfindtype
    setColor :: a -> Color -> a
    setColor = (fromJust .) . greplace

And for anything with a Color, e.g., a car, we just can write instance HasColor Car. Not too shabby!

However, as we’ve written it, this only works for immediate subterms. So, just like our old solution with an explicit properties map, we still can’t extend our records just by wrapping them as part of a larger record.

Luckily, the generics libraries make it trivial to do deep searches as well. We can replace gfindnew with gfind = something (const Nothing `extQ` Just) . This function is built much the same as our greplace was — yielding Just a value of the appropriate type, or Nothing otherwise. The something combinator just maps our query function deeply over every term, from top to bottom, until it hits a match. And as for greplace, we can just replace gmapMo with “somewhere” for a similar effect. Now, if we have data DrivenCar = DrivenCar Person Car we can give the appropriate instance declarations, and get and set the color of the car directly here as well.

One more problem though: the somewhere combinator will apply our transformation at least once — which means that it might apply it many multiple times, and even if it doesn’t, it’ll keep working ever after it succeeds. This too is easy enough to fix. Look at the source and we see that somewhere is just defined as f x `mplus` gmapMq (somewhere f) x. So its trying the transform, and if the transform fails, its mapping the transform over all the immediate subterms, recursively. All we have to do is write a function “once” that looks exactly the same, but with gmapMo instead of gmapMq, and we have what we really need.

A complete working example follows below:

{-# LANGUAGE FlexibleInstances, FlexibleContexts, DeriveDataTypeable  #-}

module Records where

import Data.Generics.Basics
import Data.Generics.Aliases
import Data.Generics.Schemes
import Control.Monad
import Data.Maybe

greplace :: (Data a, Typeable b) => a -> b -> Maybe a
greplace x y = once (const Nothing `extM` (const (Just y))) x

once :: MonadPlus m => GenericM m -> GenericM m
once f x = f x `mplus` gmapMo (once f) x

gfind :: (Data a, Typeable b) => a -> Maybe b
gfind = something (const Nothing `extQ` Just)

data Color = Red | Yellow | Green | Blue | White | Black deriving (Show, Data, Typeable)
newtype Size = Size Int deriving (Show, Data, Typeable)
unSize (Size a) = a

data Car = Car Color Size deriving (Data, Typeable, Show)
data Cube = Cube Color deriving (Data, Typeable, Show)

data Two = Two Int Car Car deriving (Data, Typeable, Show)

class Data a => HasColor a where
    getColor :: a -> Color
    getColor = fromJust . gfind
    setColor :: a -> Color -> a
    setColor = (fromJust .) . greplace

class Data a => HasSize a where
    getSize :: a -> Int
    getSize = unSize . fromJust . gfind
    setSize :: a -> Int -> a
    setSize = (fromJust .) . (. Size) . greplace

instance HasColor Car
instance HasColor Cube
instance HasSize Car
instance HasColor Two
instance HasSize Two

{-
*Records> getColor (Two 12 (Car Red (Size 34)) (Car Blue (Size 93)))
Red

*Records> setSize (Two 12 (Car Red (Size 34)) (Car Blue (Size 93))) 232
Two 12 (Car Red (Size 232)) (Car Blue (Size 93))
-}

Comments (1)

Type hackery for the practical programmer pt. II

This post is a long time coming, and sort of anti-climactic, but I wanted to just finish off what I’d begun describing in the previous post.

We have, if you will recall:

class MapFromTuple a b where
      mapFromTuple :: a -> [b]

This is our handy way to marshal a heterogenous tuple into a uniformly-typed list. Now, we can perform our operations on this uniform list. The next step, however, is to go back from our transformed list to a heterogenous tuple. Immediately an answer presents itself:

class MapToTuple a b where
    mapToTuple :: [b] -> a

The instance declarations are correspondingly the inverse of those for MapFromTuple:

instance (Sat (MapToTupleD a x), Sat (MapToTupleD b x))
    => MapToTuple (a,b) x where
    mapToTuple [a,b] = (fromGenD dict a, fromGenD dict b)
instance (Sat (MapToTupleD a x), Sat (MapToTupleD b x), Sat (MapToTupleD c x))
    => MapToTuple (a,b,c) x where
    mapToTuple [a,b,c] = (fromGenD dict a, fromGenD dict b, fromGenD dict c)

But there’s a problem here. While mapFromTuple was a total function, mapToTuple is not, because it is polymorphic on its return type, not its argument type. So if we use mapToTuple in a casual fashion, we introduce a great deal of extra partiality. What we need is a way to carry the type context of our original tuple over to the return argument of mapToTuple, and thus guarantee that it yields the correct types in a correctly-sized tuple. And thus, we have the complete type signature for withValidation’:

withValidation' :: (MapFromTuple a (String, ValidationFuncIntern s String Dynamic),
                    MapToTuple a1 Dynamic,
                    TupleMatchTwo a a1) =>
                   a -> (a1 -> HCGI r s t) -> HCGI r s t

The first restriction says that a is something which can be transformed into something of type (String, ValidationFunctionIntern s String Dynamic). The second restriction says that a1 is something which can be transformed *from* Dynamic. The trick here is the ridiculous TupleMatchTwo class, defined as such:

class TupleMatchTwo a b | a -> b where {}
instance TupleMatchTwo () ()
instance TupleMatchTwo (Box (foo, bar-> mk a)) (Box a)
instance TupleMatchTwo ((foo, bar -> mka a), (foo, bar -> mkb b)) (a,b)
instance TupleMatchTwo ((foo, bar -> mka a), (foo, bar -> mkb b), (foo, bar -> mkc c)) (a,b,c)

… and soforth. Lots of boilerplate, but luckily, as with the other boilerplate instances, this is write-once boilerplate that saves us lots of work elsewhere. Although I’m probably butchering the term, what TupleMatchTwo gives us is a very limited type-level logic-programming function that “extracts” the last bit of a rather complicated signature from each of a set of tupled values, and returns them alone. Unfortunately, as its not a real type-level function, it has to be unrolled by hand. The end result, however, is that now we can extract what we “know” our return type must be from our argument type, and by asserting that return type statically at compile time, we can guarantee what we sought to from the beginning — that withValidation’ can be passed an arbitrary tuple of validator functions and will produce only a well typed tuple of the corresponding validated inputs.

I might be asserting too much here, but it seems like this basic paradigm maps to 90% of the use cases for heterogeneous lists — we want to operate over them in a uniform way, and then we want to properly varigate our input again before exiting the library function.

This style of input is now in the current repo of hvac, for both validation and database functions. There are also a few other nice changes I plan to blog about soon, mainly along the lines of cleaning things up and reducing dependencies.

I’ve also run into a really nifty example of what can be done with this “tuple-level programming” in my current project — a little mini-dsl for SQL suitable for embedding into hvac, on which more later.

Leave a Comment

Type-hackery for the practical programmer

So hvac has a pretty nice validation framework built on the withSomething idiom (i.e. withSomething (\something -> etc)) which is a sort of continuation passing model. More particularly, it has: withValidation :: [(String, ValidationFunc s String String)] -> ([String] -> HCGI q s CGIResult) -> HCGI q s CGIResult. This is to say that it takes a list of pairs of validation functions and the request parameters to retrieve/process/validate, and a success handler which takes a list of those retrieved/processed strings, and it returns a result. If the validation fails, the errors are stashed into an appropriate location, and the cgi action fails.

So far so good — but basic strong static typing gives us a wart here. It takes a list and returns (so to speak) a list — this means that everything must be of a uniform type. Currently that type is String, although it can be generalized. But even then, every validator associated with a particular call to withValidation has to produce something of the same type.

So say you want to retrieve and Int that is greater than 3 and a String. Well, you have a validator for the String. All good. And you have a validator that read the Int, and then checks its value is greater than three, and then finally, maddeningly, you have to show it again to convert it back to a string.

The handler function then must read the sucker again, which duplicates work, although not too much. Even worse, even though our validation has “proved” that the value can be read without error, there’s no way to express this. So there are important properties for the soundness of our program that are true and exist, but only in our heads, and only to the extent we remember them — what we want is a way to express this knowledge concretely.

The second concern is the bigger concern, but let’s ignore it for now and concentrate first on just getting this thing working with polymorphic types at all. Our use case is actually just what Dynamics are for. Dynamics give you two core handy functions — toDyn :: Typeable a => a -> Dynamic and fromDynamic :: Typeable a => Dynamic -> Maybe a, which, obviously, returns Just something if the Dynamic is of the right type, and Nothing otherwise. So now we can pack arbitrary (Typeable) objects into a single list of a uniform type, and unpack them afterwards.

Similar idioms are used very frequently — HDBC, which is the database interface used by hvac, uses toSql and fromSql. The Takusen database interface uses bindP, which is like toSql but, erm, fancier. Cleverly, it avoids using fromSql and instead allows for runtime failure (I think) should the result set of a query not type properly into what the code expects. I’d very much appreciate comments on where else this sort of thing crops up.

In any case, in the withValidation function, and the other examples given, there’s a significant silliness here. These are library functions, supposedly with nice clean interfaces — but we get an ugly situation with all these toDyns or toSqls cluttering up our pretty list of values. Intolerable!

Typeclasses to the rescue. toDyn and friends all share a certain trait — they’re all, in essence, of the form func :: (Typeclass a) => a -> ConcreteType. And conceptually, we’re mapping this polymorphic function over a heterogeneous list to get a homogenous one that we can then operate on. The first thing that comes to mind is, therefore, HList. But HList is pretty complicated, and furthermore, my sense is that once you use HList, you’re pretty much locked into the “HList way,” although I’m welcome to be proven wrong.

Instead, we can implement a solution without requiring all of that HList stuff. The end result here is an extensible, simple way to implement a mapFromTuple function that takes an arbitrary length tuple, all of whose members are subject to a typeclass constraint, and returns a list with the proper function mapped over them. This can then be used by any code that runs into a similar issue.

Rather than HList, we’ll be borrowing ideas from Scrap Your Boilerplate, and particularly “syb3” — i.e. the “Scrap your boilerplate with class: extensible generic functions” paper.

The trick here is pretty elegant, based on using typeclasses to abstract over dictionaries, apparently originally from Restricted Data Types in Haskell by John Hughes. Pretty much all the magic is in the one class declaration:

class Sat a where dict :: a

Seriously. That’s it. Reification of typeclass dictionaries in one simple line. Now we can write a concrete dictionary:

data MapFromTupleD a b = MapFromTupleD {toGenD :: a -> b}

And then a class which “unpacks” the dictionary:

class MapFromTuple a b where
    mapFromTuple :: a -> [b]

And boilerplate instances for tuples of various lengths:

instance (Sat (MapFromTupleD a x), Sat (MapFromTupleD b x))
    => MapFromTuple (a,b) x where
    mapFromTuple (a,b) = [toGenD dict a, toGenD dict b]

instance (Sat (MapFromTupleD a x), Sat (MapFromTupleD b x), Sat (MapFromTupleD c x))
    => MapFromTuple (a,b,c) x where
    mapFromTuple (a,b,c) = [toGenD dict a, toGenD dict b, toGenD dict c]

…and soforth. The key thing to think about is the magic of Sat in the context constraint. The context constraint on Sat specifies which dict gets pulled out of thin air, and so lets us write a function that abstracts over typeclasses the same way normal polymorphic functions abstract over concrete types.

Finally, we’re done with the machinery. Now to implement the dictionary for Typeable:

instance (Typeable x) => Sat (MapFromTupleD x Dynamic) where
    dict = MapFromTupleD { toGenD = toDyn }

And for Show while we’re at it:

instance (Show x) => Sat (MapFromTupleD x String) where
    dict = MapFromTupleD { toGenD = show }

Now we can do:

*Main> mapFromTuple (1,2) :: [String]
["1","2"]
*Main> mapFromTuple (1,2) :: [Dynamic]
[<<Integer>>,<<[Integer]>>]
*Main> mapFromTuple ("foo",2.3, Just 43) :: [String]
["\"foo\"","2.3","Just 43"]

Now if we wanted to get this functionality for sql types, we’d just have to add the single instance declaration for Sat, along the lines of the previous two… and soforth.

This gets us everything we want for the basic database-style examples, actually. But for hvac’s validation, things are more complicated, because even if we can marshall a tuple into a set of dynamics, then we’re still just returning a list of dynamics, rather than a tuple of actual values. However, we’re partway there on that, and I’ll leave off the rest until the next post.

Comments (5)