Finally figured out how to make haskell process a big pile of data without passing each drop of data around as an explicit argument. I can’t believe it has taken me a month to achieve such a simple outcome. The difficulty with the most obvious approach of bundling all the data together into a record and just changing the fields you are interested in is that you rewrite the master record each time. That’s a lot of bookkeeping for the savings. A Monadic approach using a state monad is the way around that problem but it still scares me from an efficiency point of view. Even though this will be a shallow copy per update and not that expensive it still rankles to be do all that work for no reason.
The solution to my concerns turns out to be a special haskell monad called the ST monad. It is a monad much like the state monad in its conception but to the programmer seems more like a lightweight version of the IO monad. It advances a state through the series of computations and throws away the old state immediately upon use, just as an imperative language should, and enforces this through an extension to the type system available in all newer versions of haskell. Having enforced the imperative mindset, the ST monad allows you to define references (STRefs) which are pointers to a modifyable object; this is an analogue to IORefs which do the same thing in the IO monad.
The IO monad is monolithic and indivisible. When writing imperative code you really need modifyable objects and IORefs give you that ability but at a cost of ruining modularity. Every thing in the IO monad occurs at the global level and even though the main monad can be assembled by functions this has a totally different feel than subroutines in an imperative language. Assemble is the key word here. You assemble a main monad through functions that are akin to a macro language when considered from an imperative mindset. The functions may use data only available at run-time but they do so in a way which assembles a program to be run at a single level of abstraction and with global variables only. This is not an imperative language without OO but an imperative language without subroutines.
ST, which stands State Threads, is the answer. The name is overly ambitious, however. What is provides is not the equivalent of threading, or even object-orientation, but simple subroutines. My latest code hase a data type I call Frame (yup, you have to define your own stack frame when you build an imperative language out of functional bits) which contains a pile of STRefs to the data that will be local to my ‘subroutine’ which I then pass through a ReaderT monad to all the functions which comprise the whole. Voila. I have a subroutine. I call runST on my monad and I have called a subroutine. It has local variables that progress imperatively from initialization to final values and the answers to the caller. I can call this code in a purely functional setting (hence the silly State Thread name) and get an answer in a nice respectable functional setting without dirtying the IO monad or even caring what is going on at the global IO level. Just what an imperative programmer wants from a functional language.
Comments
Sun, 2008/03/16 - 1:31pm — mejd
It turns out I was mixing
It turns out I was mixing things up a bit. You can just use a (ReaderT XFrame IO x) monad to call a `subroutine’ directly from the IO monad without going through the functional wrapped ST monad at all. Because you create and then wrap-up the frame of IORefs explicitly you get state local to the sub-monad. This gives you a subroutine of the main IO monad just as if you were dealing in an exclusively imperative language and offers a fair degree of modularity. So IO versus ST is orthogonal to the modularity provided by subroutines (ie local frames) and instead differentiates itself on the calling context; i.e. whether from the main monad or from a function.