Thesis
Lately, in addition to spending more time writing fiction than reading
it, I’ve been studying Haskell, a programming language. Familiarity
with Haskell in this case is not really necessary, as the theory that
I’ve been formulating while comparing Haskell and Lisp has little
to do with the practice of programming. This might be old hat to
old-time programmers, but I’ve never seen it formulated in quite this
manner before. People often talk about ‘power’ and ‘expressiveness’,
when talking of high-level languages like Haskell and Lisp, but
I wonder if this isn’t missing the point, or expressing the correct
point in a way that obscures what’s really going on. I’ve begun to
think about programming as having two parts, Logic and Implementation.
This could be extended to have a third level, Runtime System, but even
then, I’m thinking of the Runtime as parts of the Implementation that
are so popular and useful to users of a given language that they’ve
calcified into the substrate of that language; there is a level at
which this distinction is important, however, and that is
compatibility. The thrust of the argument that I’m putting forward is
this: The looser the coupling between the Logic and Implementation in
a particular programming language, the more powerful and expressive
that programming language is. Which is to say, for those of you who
abhor semi-circular definitions, the looser this coupling is, the
easier the language is to use for truly wizardtastic (technical term)
uses of computing power.
DWIM, ye beastie!
At its most basic level, programming is the task of telling a computer
what to do. Ideally we’d be able to tell a computer to ‘log all the
incoming connections to port 1337 and sue the sender’ or ‘reply to all
the wfm postings on craigslist with that picture of Brad Pitt with my
face’ or just ‘fix that shit!’. It doesn’t work that way,
fortunately, which is why the rest of you pay those of us who can program so
very much money. So in getting a computer to do something, you have
to tell it what to do (Logic) and how to do it (Implementation). In
most languages, these two things are so entangled that telling the
machine one and telling it the other are essentially the same step. In
C, programming is the process of telling the machine, in really fucking
small, fiddly steps: do a, then do b, then do c with a and b.blerg. I
make it sound horrible, but
it isn’t. C is a great implementation language. It’s a terrible
language for expressing high level program logic in a modular way,
though, because there is no clean separation between what you’re doing
and how you’re doing it. In C, you are what you do. This is fine for
prototyping, but it’s a pain in the ass when it comes to refactoring
and maintenance. If, in a C program, something that we’re doing is
too slow, we actually have to visibly change what we’re doing to alter
what should rightly be an implementation detail.
This is not to say that you cannot create this separation in C or any
of the other languages out there. They’re all Turing complete and can
all do the same things. The problem is that you have to think about
it all the time, which means that you’re spending less time thinking
about more important things, like how that obscure corner case on
multi-core machines could really bite you when a switch port starts to
flap, or whether the attractive person in the office across the hall
is available this weekend for a date. Not having to think about the
separation between what and how means it’s easier to design the system
in such a way that the Implementation is as orthogonal as possible to the
Logic, meaning that it’s easier to chunk it up and give different
parts to other people, or to improve performance without having to
alter your understanding of how the entire system works. The ultimate
expression of this is moving to a compiler that’s smarter about how it
does things, and then turns out faster code without any effort on your
part, but the same principle works at the application level as well.
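To make the what/how split concrete, here is a minimal Haskell sketch.
Everything in it is made up for illustration: the Queue class, the two
queue types, and roundTrip are hypothetical names, not part of any real
library. The Logic (drain and roundTrip) is written only against the
Queue interface, so the slow list-backed Implementation can be swapped
for the faster two-list one without touching the Logic.

```haskell
-- The interface the Logic is allowed to see (hypothetical, for illustration).
class Queue q where
  emptyQ :: q a
  push   :: a -> q a -> q a
  pop    :: q a -> Maybe (a, q a)

-- Implementation 1: a plain list. Simple, but push is O(n).
newtype ListQueue a = ListQueue [a]

instance Queue ListQueue where
  emptyQ = ListQueue []
  push x (ListQueue xs) = ListQueue (xs ++ [x])
  pop (ListQueue [])       = Nothing
  pop (ListQueue (x : xs)) = Just (x, ListQueue xs)

-- Implementation 2: front and back lists. push is O(1), pop is
-- amortised O(1) because the back list is reversed only when needed.
data TwoListQueue a = TwoListQueue [a] [a]

instance Queue TwoListQueue where
  emptyQ = TwoListQueue [] []
  push x (TwoListQueue front back) = TwoListQueue front (x : back)
  pop (TwoListQueue [] [])            = Nothing
  pop (TwoListQueue [] back)          = pop (TwoListQueue (reverse back) [])
  pop (TwoListQueue (x : front) back) = Just (x, TwoListQueue front back)

-- The Logic: says what happens, never how the queue is stored.
drain :: Queue q => q a -> [a]
drain q = case pop q of
  Nothing      -> []
  Just (x, q') -> x : drain q'

-- Push every element, then pop them all back out in arrival order.
roundTrip :: Queue q => q a -> [a] -> [a]
roundTrip start xs = drain (foldl (flip push) start xs)
```

Switching from roundTrip (emptyQ :: ListQueue Int) to
roundTrip (emptyQ :: TwoListQueue Int) changes the how and leaves the
what alone, which is the kind of change that would ripple through the
whole program in the C style described above.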
I would argue that it also makes it easier to reason about parts of
your program. Not in the high level, formal correctness
reasoning type of way, but just to break it into chunks and move them
around in your head, thinking of new ways to solve problems, or better
ways to do what you’re already doing. There’s nothing magic about
this. The brain (oh, here he goes, appealing to science…) can only
hold so many parts of a complicated problem at any one time. So the
coarser the chunks you can break a problem up into, while still being able
to usefully think about the way that they interact and interrelate,
the better off you are.
Additionally, I would like to put forward that OO-type implementation
hiding isn’t really a useful exemplar of this strategy all on its own.
This may have nothing to do with what it’s capable of and more to do
with the way that it is used, which seems to me to be overly focused
upon code sharing and ‘defining good APIs’. I am an OO skeptic in
general and I don’t think much of code sharing as a target in and of
itself (‘First, order within. Second, order within the family…’).
How I think it works.
In Lisp, you have macros. Not like crappy C preprocessor macros that
take text as input and return text to be parsed, but functions that
take ASTs and then can manipulate them in arbitrary ways. A trivial
example is something like with-mutex which you pass a chunk of code,
presumably containing stuff that pertains to the mutex that you’ve
grabbed. The macro then grabs the mutex that you’ve specified,
executes the code that you’ve passed in, and releases the mutex. More
complex uses include loop, a deeply complex mini-language within Lisp
for simply expressing really hairy looping constructs. It’s
code that writes other code.
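For readers who don’t speak Lisp, the with-mutex shape can be sketched
in Haskell. This is not a macro: in Haskell an IO action is already a
value that can be passed around unexecuted, so an ordinary higher-order
function is enough. The names Mutex, newMutex, withMutex and
guardedLength are my own inventions for illustration; MVar and bracket_
are the real functions from base.

```haskell
import Control.Concurrent.MVar (MVar, newMVar, putMVar, takeMVar)
import Control.Exception (bracket_)

-- A mutex modelled as an MVar that is full when unlocked (hypothetical name).
type Mutex = MVar ()

newMutex :: IO Mutex
newMutex = newMVar ()

-- Grab the mutex, run the body, release the mutex.
-- bracket_ runs the release step even if the body throws.
withMutex :: Mutex -> IO a -> IO a
withMutex m = bracket_ (takeMVar m) (putMVar m ())

-- Example use: the caller says what to protect, not how locking works.
guardedLength :: Mutex -> [a] -> IO Int
guardedLength lock xs = withMutex lock (pure (length xs))
```

The caller never sees takeMVar or putMVar, so the locking strategy can
change in one place.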
In Haskell, you have monads. Disregarding their category theory use,
monads are a really clever way of abstracting away state changes in a
purely functional language, which otherwise does not allow the
mutation of values. A monadic function returns actions, which can then
be run by a program and which may then alter the program’s state.
Examples include IO, parser state, and most of the other interesting
things that you want to do with a program. It’s code that expresses
actions in a packageable, cleanly reusable way.
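As a rough illustration of state hidden behind a monad, here is a
hand-rolled State type. I’ve written it out by hand only to keep the
example self-contained; real code would normally import
Control.Monad.State from the mtl package instead. The names fresh and
threeLabels are hypothetical.

```haskell
-- A state-passing computation: takes a state, returns a result and a new state.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s ->
    let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (State s) where
  State g >>= f = State $ \s ->
    let (a, s') = g s
    in runState (f a) s'

-- Read the current state.
get :: State s s
get = State $ \s -> (s, s)

-- Replace the current state.
put :: s -> State s ()
put s = State $ \_ -> ((), s)

-- The Logic: "give me a fresh label". No counter is passed around by hand.
fresh :: State Int String
fresh = do
  n <- get
  put (n + 1)
  pure ("tmp" ++ show n)

threeLabels :: State Int [String]
threeLabels = do
  a <- fresh
  b <- fresh
  c <- fresh
  pure [a, b, c]
```

Evaluating runState threeLabels 0 gives (["tmp0","tmp1","tmp2"],3).
Nothing is mutated; the counter is threaded through by the monad
instance, which is the only place that knows how.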
Although their implementation could not be more different, I say that
these two mechanisms are doing more or less the same thing, which is
enabling the separation of what you’re doing from how you’re doing
it. They’re both often cited as things in these languages that are
hard to wrap your head around, which I think stems from the fact that
this separation isn’t always an easy one to make, at least on the
level of thinking about a program. Once you get it, though, they seem
almost like magical tools, allowing you to act as if your language
had arbitrarily powerful primitives. You can define things that act like
new control constructs or operators. This is a huge win because once
you have them right you don’t have to think about them anymore. On
the flip side, if I need to change something about how something is
done, either for robustness or efficiency, I can change it in one
place, without having to worry about the rest of the program. We, as
programmers, like to think of ourselves as smart people, perhaps
uniquely skilled, but it all comes back to our limitations. The
smaller the bits of the program that we can work with, the better we are
at it.
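Here is a small sketch of what "things that act like new control
constructs" can look like on the Haskell side. The name retrying is
hypothetical; it is an ordinary function, but at the call site it reads
like a built-in loop.

```haskell
-- Run an action up to a given number of times, stopping at the first
-- Just result. Works in any monad, not only IO.
retrying :: Monad m => Int -> m (Maybe a) -> m (Maybe a)
retrying attempts action
  | attempts <= 0 = pure Nothing
  | otherwise = do
      result <- action
      case result of
        Just x  -> pure (Just x)
        Nothing -> retrying (attempts - 1) action
```

If the retry policy later needs a delay or a log line, that change
happens inside retrying and nowhere else.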
This, I would argue, is why more excellent programmers are drawn to
languages with these properties (lest I be accused of arrogance, I am
explicitly not an excellent programmer. I am, however, quite good at
learning from the mistakes that other, smarter people make). It
allows them to elide away the niggling details of the huge problems
that they’re hacking away at, building a structure for reasoning more
effectively about the problem that they are trying to solve. Paul
Graham has talked about bottom-up programming, which, I suppose, is the
root of the ideas that I’m trying to express here, but I think that
it’s useful to re-express the process as defining what you’re trying
to do, then defining how to do it, which sounds top-down, but which
has a lot of bottom-up parts, mostly because if you find that the way
that you’re doing it is wrong, you can often re-use parts of the
implementation of the previous strategy without having to change them.
Instead of spending a lot of time defining a spec and interfaces and
objects, you just write a quick spec right there in the code, then
write the code that implements that spec right under it.
This method of programming, I think, eludes a quick, visual metaphor
like an arch or a lintel or a blueprint, mostly because its
consequences are so subtle. Most of the work is done in the bottom-up
style, but there are many top-down aspects, as they allow you to
better reason about which bits at the base are important to get to
first. Perhaps a good metaphor would be found in Haskell’s default of
lazy evaluation. The top level spec defines a massive solution space
comprised of all of the possible programs that you could write to do
the thing that you’re trying to do, and then you only write just the
bits that you need to write to get it working, much as Haskell can
define a list or matrix of theoretically infinite extent and then
pluck out just the values it actually needs, rather than precomputing
the entire thing. By exposing the shape of this solution space, the
high level spec allows us to better reason about what low level chunks
of the program we need to attack first.
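The lazy evaluation half of that metaphor is easy to show. In this
sketch (squares and firstSquaresOver are hypothetical names), the list
is conceptually infinite, and only the elements that are actually
demanded ever get computed.

```haskell
-- Every perfect square, in order. Defining this costs nothing up front.
squares :: [Integer]
squares = map (^ 2) [1 ..]

-- Pluck out the first n squares that are larger than the given limit.
firstSquaresOver :: Integer -> Int -> [Integer]
firstSquaresOver limit n = take n (dropWhile (<= limit) squares)
```

For example, firstSquaresOver 50 3 evaluates to [64,81,100], and the
rest of the list is never built.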