Logic and Implementation
24 Nov 2006Thesis
Lately, in addition to spending more time writing fiction than reading it, I’ve been studying Haskell, a programming language. Familiarity with Haskell in this case is not really necessary, as the theory that that I’ve been formulating while comparing Haskell and Lisp has little to do with the practice of programming. This might be old hat to old time programmers, but I’ve never seen it formulated in quite this manner before. People often talk about ‘power’ and ‘expressiveness’, when talking of high-level languages like Haskell and Lisp, but I wonder if this isn’t missing the point, or expressing the correct point in a way that obscures what’s really going on. I’ve begun to think about programming as having two parts, Logic and Implementation. This could be extended to have a third level, Runtime System, but even then, I’m thinking of the Runtime as parts of the Implementation that are so popular and useful to users of a given language that they’ve calcified into the substrate of that language; there is a level at which this distinction is important, however, and that is compatibility. The thrust of the argument that I’m putting forward is this: The looser the coupling between the Logic and Implementation in a particular programming language, the more powerful and expressive that programming language is. Which is to say, for those of you who abhor semi-circular definitions, the looser this coupling is, the easier the language is to use for truly wizardtastic (technical term) uses of computing power.
DWIM, ye beastie!
At its most basic level, programming is the task of telling a computer what to do. Ideally we’d be able to tell a computer to ‘log all the incoming connections to port 1337 and sue the sender’ or ‘reply to all the wfm postings on craigslist with that picture of Brad Pitt with my face’ or just ‘fix that shit!’. It doesn’t work that way, fortunately, which is why the rest of you pay those of us who can so very much money. So in getting a computer to do something, you have to tell it what to do (Logic) and how to do it (Implementation). In most languages, telling the machine these things are so entangled that they’re essentially the same step. In C, programming is the process of telling the machine in really fucking small, fiddly steps, do a then do b then do c with a and b.blerg. I make it sound horrible, but it isn’t. C is a great implementation language. It’s a terrible language for expressing high level program logic in a modular way, though, because there is no clean separation between what you’re doing and how you’re doing it. In C, you are what you do. This is fine for prototyping, but it’s a pain in the ass when it comes to refactoring and maintenance. If, in a C program, something that we’re doing is too slow, we actually have to visibly change what we’re doing to alter what should rightly be an implementation detail.
This is not to say that you cannot create this separation in C or any of the other languages out there. They’re all Turing complete and can all do the same things. The problem is that you have to think about it all the time, which means that you’re spending less time thinking about more important things, like how that obscure corner case on multi-core machines could really bite you when a switch port starts to flap, or whether the attractive person in the office across the hall is available this weekend for a date. Not having to think about the separation between what and how means it’s easier to design the system in a way that the Implementation is as orthogonal as possible to the Logic, meaning that it’s easier to chunk it up and give different parts to other people, or to improve performance without having to alter your understanding of how the entire system works. The ultimate expression of this is moving to a compiler that’s smarter about how it does things, and then turns out faster code without any effort on your part, but it functions on the applications level as well.
I would argue that it also makes it easier to reason about parts of your program as well. Not in the high level, formal correctness reasoning type of way, but just to break it into chunks and move them around in your head, thinking of new ways to solve problems, or better ways to do what you’re already doing. There’s nothing magic about this. The brain (oh, here he goes, appealing to science…) can only hold so many parts of a complicated problem at any one time. So the coarser chunks you can break a problem up into, while still being able to usefully think about the way that they interact and interrelate, the better off you are.
Additionally, I would like to put forward that OO-type implementation hiding isn’t really a useful exemplar of this strategy all on its own. This may have nothing to do with what it’s capable of and more to do with the way that it is used, which seems to me to be overly focused upon code sharing and ‘defining good APIs’. I am an OO skeptic in general and I don’t think much of code sharing as a target in and of itself (‘First, order within. Second, order within the familiy…’).
How I think it works.
In Lisp, you have macros. Not like crappy C preprocessor macros that
take code as input and return text to be parsed, but functions that
take ASTs and then can manipulate them in arbitrary ways. A trivial
example is something like with-mutex
which you pass a chunk of code,
presumably containing stuff that pertains to the mutex that you’ve
grabbed. The macro then grabs the mutex that you’ve specified,
executes the code that you’ve passed in, and releases the mutex. More
complex uses loop
a deeply complex mini-language within Lisp having
to do with simply expressing really hairy looping constructs. It’s
code that writes other code.
In Haskell, you have monads. Disregarding their category theory use, monads are a really clever way of abstracting away state changes in a purely functional language, which otherwise does not allow the mutation of values. A monad returns actions which then can be taken by a program, which may then alter the program’s state. Examples include IO, parser state, and most of the other interesting things that you want to do with a program. It’s code that expresses actions in a packagable, cleanly reusable way.
Although their implementation could not be more different, I say that these two mechanisms are doing more or less the same thing, which is enabling the separation of what you’re doing from how you’re doing it. They’re both often cited as things in these languages that are hard to wrap your head around, which I think stems from the fact that this separation isn’t always an easy one to make, at least on the level of thinking about a program. Once you get it, though, they seem almost like magical tools, allowing you to act as if your language arbitrarily powerful primitives. You can define things that act like new control constructs or operators. This is a huge win because once you have them right you don’t have to think about them anymore. On the flip side, if I need to change something about how something is done, either for robustness or efficiency, I can change it in one place, without having to worry about the rest of the program. We, as programmers, like to think of ourselves as smart people, perhaps uniquely skilled, but it all comes back to our limitations. The smaller bits of the program that we can work with, the better we are at it.
This, I would argue, is why more excellent programmers are drawn to languages with these properties (lest I be accused of arrogance, I am explicitly not an excellent programmer. I am, however, quite good at learning from the mistakes that other, smarter people make). It allows them to elide away the niggling details of the huge problems that they’re hacking away at, building a structure for reasoning more effectively about the problem that they are trying to solve. Paul Graham has talked about bottom up programming, which, I suppose is the root of the ideas that I’m trying to express here, but I think that it’s useful to re-express the process as defining what you’re trying to do, then defining how to do it, which sounds top down, but which has a lot of bottom-up parts, mostly because if you find that the way that what you’re doing is wrong, you can often re-use parts of the implementation of the previous strategy without having to change them. Instead of spending a lot of time defining a spec and interfaces and objects, you just write a quick spec right there in the code, then write the code that implements that code right under it.
This method of programming, I think, eludes a quick, visual metaphor like an arch or a lintel or a blueprint, mostly because its consequences are so subtle. Most of the work is done in the bottom-up style, but there are many top-down aspects, as they allow you to better reason about which bits at the base are important to get to first. Perhaps a good metaphor would be found in Haskell’s default of lazy evaluation. The top level spec defines a massive solution space comprised of all of the possible programs that you could write to do the thing that you’re trying to do, and then you only write just the bits that you need to write to get it working, much as Haskell can define a list or matrix of theoretically infinite extent and then pluck out just the values it actually needs, rather than precomputing the entire thing. By exposing the shape of this solution space, the high level spec allows us to better reason about what low level chunks of the program we need to attack first.