Swaroop C H


Learning Clojure

08 Jun 2012

I once happened to attend a RubyConfIndia talk by C42's Steven Deobald who said:

data > functions > macros > compilers

That kind of stuck in my head even though I didn't know what it meant at that time. I understood it only after learning Clojure and "The Clojure / Lisp way". I realized it when I was writing Python code for work, and I suddenly noticed I was writing code differently and I had one of those good aha moments that is supposedly the start of a person's Lisp journey.

I'm now amused at how often I break down my Python or Java code into lots of little functions instead of the 100-liner functions that I used to write before, and I am still surprised that I never realized I was writing them! The good thing about "lots of little functions" is the modularity and the ease with which I can write, read, understand and, importantly, test the code without having to build an object hierarchy first.

For example, my code has now suddenly started looking like this, where the data structure is explicitly written down and the processing code is separate from it. This makes the code really reusable. It is a contrast to my earlier programming style, where I would probably have had the data structure implicit in the parsing code (which makes it less maintainable) or, worse, had classes and objects to do the same, and it would certainly not have been so reusable! Think of a typical Java programming workflow where I would have had to create a class to represent the data input, pass that to a processor class instance, and so on.

# http://www.lexicon.net/sjmachin/xlrd.html
import xlrd

DATA_SHEET_NUMBER = 0
START_ROW = 3 # skip headings

# Explicit structure of the data
COLUMN_MAPPING = {
    'name' : 0,
    'class' : 1,
    'maths' : 2,
    'geography' : 3,
    'english' : 4,
}

def row_to_dict(sheet, row_number):
    assert isinstance(sheet, xlrd.sheet.Sheet)
    assert isinstance(row_number, int) and \
        row_number > 0 and \
        row_number < sheet.nrows
    # Code that will work with changing structure
    return {key: sheet.cell_value(rowx=row_number, colx=column)
            for key, column in COLUMN_MAPPING.items()}

def import_excel(content):
    book = xlrd.open_workbook(file_contents=content)
    sheet = book.sheet_by_index(DATA_SHEET_NUMBER)
    # Code that will work with different spreadsheet formats
    sheet_data = [row_to_dict(sheet, row_number) for
                  row_number in range(START_ROW, sheet.nrows)]
    sheet_data = [data for data in sheet_data
                  if len(data["name"]) > 0] # Ignore empty rows
    return sheet_data

if __name__ == '__main__':
    from pprint import pprint
    # Read the workbook as bytes and close the file promptly
    with open('test.xls', 'rb') as workbook_file:
        pprint(import_excel(workbook_file.read()))
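
The reuse argument can be sketched without xlrd at all. This is only an illustration, assuming rows arrive as plain lists; the two mappings, the sample rows and the `rows_to_dicts` helper are made up for the example and are not part of the code above. The point is that supporting a new format means writing a new mapping, not a new function or class.

```python
# Hypothetical mappings: only the data changes between formats.
MARKS_MAPPING = {'name': 0, 'class': 1, 'maths': 2}
CONTACTS_MAPPING = {'name': 1, 'email': 0}


def row_to_dict(row, mapping):
    """Build a dict from a plain sequence using a name -> column index mapping."""
    return {key: row[index] for key, index in mapping.items()}


def rows_to_dicts(rows, mapping, required='name'):
    """Convert every row, skipping rows whose required field is empty."""
    converted = (row_to_dict(row, mapping) for row in rows)
    return [data for data in converted if data[required]]


if __name__ == '__main__':
    from pprint import pprint
    # Made-up sample rows, including one empty row that gets dropped
    marks = [['Asha', '10A', 91], ['', '', 0]]
    contacts = [['ravi@example.com', 'Ravi']]
    pprint(rows_to_dicts(marks, MARKS_MAPPING))
    pprint(rows_to_dicts(contacts, CONTACTS_MAPPING))
```

Because each function takes plain data in and returns plain data out, each one can be tested with a literal list and no fixtures.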
To be clear, Python was a good first step; what changed was the mindset after attempting to learn a Lisp language. As Peter Norvig once said:

Basically, Python can be seen as a dialect of Lisp with "traditional" syntax (what Lisp people call "infix" or "m-lisp" syntax). One message on comp.lang.python said "I never understood why LISP was a good idea until I started playing with python." Python supports all of Lisp's essential features except macros, and you don't miss macros all that much because it does have eval, and operator overloading, and regular expression parsing, so some--but not all--of the use cases for macros are covered.

A good friend of mine once said that Python is more popular because it is more approachable by traditional programmers and hence a more "social" programming language, whereas Lisp is a powerful language but not for everyone. That is explained in detail in the Lisp Curse essay.

So the first good thing about Clojure is that it is a Lisp. The second is that it runs on the JVM, which has solid performance, sometimes 20x better if you use it right. The third is solid Java interoperability. This was important to me because, as a consultant, Java is unavoidable and I've written more Java code this year than I ever have. Using a good dynamic language on top of the JVM with good Java interoperability is a path to making my work go faster. At least, that was how I got started. After all, your code will end up reflecting your company.

The downside I felt when I was grokking Clojure is that the syntax is not simple, even though simplicity is the claim of traditional Lisps. For example, #"" is a regex, #{} is a set, #_ makes the reader discard the next form (the form must still be readable, but it is treated as if it were commented out), #() is an anonymous function, #' is var-quote (it refers to the var itself rather than its value), and so on.

Here is a quick idea about Clojure's philosophies that I was pointed to:

Clojure - Three Circles

Another interesting point is that functional programming languages are growing and it is probably because the future is DSLs again.

If you're still not convinced, you should watch The Curious Clojureist. And you should definitely watch all the Rich Hickey talks.

How to learn Clojure

The O'Reilly Clojure book is the best book that I've come across yet.

However, equally important, my strong recommendation is that Clojure is good only when combined with Emacs and ghoseb's Emacs setup. After learning Clojure in that environment, writing Python again makes me miss so many goodies (to get some of that productivity back, I'm using PyCharm these days and am enjoying it).

To make my learning solid, I rewrote isbn.net.in for the third time in Clojure. The source code is at https://github.com/swaroopch/isbnnetinclj - be prepared to read some amateurish Clojure code.

I got a lot done in ~280 lines of Clojure code compared to 480+ lines of code in Ruby/Rails and a ton more boilerplate code. This difference in number of lines of code repeats often.

One interesting point is that, because of the Clojure way of thinking, I ended up using a simple combination of future and core.cache to fetch prices from book stores in parallel, rather than bringing in a full-fledged background job processor (delayed_job) to do that, which vastly simplified the system. You can read that code in stores.clj.
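
For readers who don't read Clojure yet, here is a rough Python analogue of that idea. It is not a translation of stores.clj: `concurrent.futures` stands in for Clojure's future, a hand-rolled dict with timestamps stands in for core.cache, and the store names, the `fetch_price` placeholder and the one-hour expiry are all assumptions made up for the sketch.

```python
import time
from concurrent.futures import ThreadPoolExecutor

CACHE_TTL_SECONDS = 60 * 60  # assumed expiry, not taken from the original code
_cache = {}  # isbn -> (time stored, {store_name: price})


def fetch_price(store_name, isbn):
    """Placeholder for a network call to one book store (hypothetical data)."""
    return {'store-a': 250, 'store-b': 245}.get(store_name)


def fetch_all_prices(isbn, store_names, fetch=fetch_price):
    """Fetch prices from every store in parallel and cache the combined result."""
    cached = _cache.get(isbn)
    if cached and time.time() - cached[0] < CACHE_TTL_SECONDS:
        return cached[1]  # still fresh, so skip the store lookups
    with ThreadPoolExecutor(max_workers=max(1, len(store_names))) as pool:
        # One future per store; all lookups run concurrently
        futures = {name: pool.submit(fetch, name, isbn) for name in store_names}
        prices = {name: future.result() for name, future in futures.items()}
    _cache[isbn] = (time.time(), prices)
    return prices


if __name__ == '__main__':
    print(fetch_all_prices('9780000000000', ['store-a', 'store-b']))
```

The whole "background job system" is one function and one dict, which is the simplification the paragraph above describes: no queue, no worker process, no job table.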

Ending Thoughts

I got started on this journey because of frustrations with Java, and at the same time I was trying not to be narrow-minded, with experience in just the Python/Ruby/Perl languages (they are so similar). I kept reminding myself of what Douglas Crockford said:

WHAT WERE THE TRAITS OF THE WEAK PROGRAMMERS YOU’VE SEEN OVER YOUR CAREER?

That’s an easy one—lack of curiosity. They were so satisfied with the work that they were doing was good enough (without an understanding of what ‘good’ was) that they didn’t push themselves.

I’m much more impressed with people that are always learning. The brilliant programmers I’ve been around are always learning.

You see so many people get into one language and spend their entire career in that language, and as a result aren’t that great as programmers.

A programming language becoming popular is almost never about the merits of the language itself, but rather about a virtuous cycle of programmer availability or platform requirements. JavaScript and Objective-C are popular because you have no other choice, not only because of the merits of the language. Similarly, Clojure leverages the JVM and whatever other host platform it runs on, and hence gets the initial lift needed to make the language appealing, since people don't want to learn and start on yet another ecosystem.

This is best explained by Alan Kay himself:

Q: What should Java have had in it to be a first-quality language, not just a commercial success?

Alan Kay: Like I said, it’s a pop culture. A commercial hit record for teenagers doesn’t have to have any particular musical merits. I think a lot of the success of various programming languages is expeditious gap-filling. Perl is another example of filling a tiny, short-term need, and then being a real problem in the longer term. Basically, a lot of the problems that computing has had in the last 25 years comes from systems where the designers were trying to fix some short-term thing and didn’t think about whether the idea would scale if it were adopted. There should be a half-life on software so old software just melts away over 10 or 15 years. It was a different culture in the ’60s and ’70s; the ARPA (Advanced Research Projects Agency) and PARC culture was basically a mathematical/scientific kind of culture and was interested in scaling, and of course, the Internet was an exercise in scaling. There are just two different worlds, and I don’t think it’s even that helpful for people from one world to complain about the other world—like people from a literary culture complaining about the majority of the world that doesn’t read for ideas. It’s futile.

Did you know that Lisp and Smalltalk are not so much in vogue because they were killed by bad hardware?

Alan Kay: Yes, actually both Lisp and Smalltalk were done in by the eight-bit microprocessor—it’s not because they’re eight-bit micros, it’s because the processor architectures were bad, and they just killed the dynamic languages. Today these languages run reasonably because even though the architectures are still bad, the level 2 caches are so large that some fraction of the things that need to work, work reasonably well inside the caches; so both Lisp and Smalltalk can do their things and are viable today. But both of them are quite obsolete, of course.

Lastly, I wanted to mention that my Clojure journey would not have been sustained if it wasn't for Baishampayan Ghose (a.k.a. @ghoseb, a.k.a. BG), whose untiring answers to my dumb questions were instrumental in me finally gaining some understanding of Clojure and Lisp in general. Thanks BG!

P.S. Watch this 2011 talk by Alan Kay. As @ghoseb would say, Be prepared to blow your mind.

Comments

Baishampayan Ghose says:

Swaroop, welcome to the club!

Btw, can you please change the URL to my Emacs config to https://github.com/ghoseb/dotemacs/tree/v2/ instead? I have moved things around a bit and it will become the canonical repo soon.

Swaroop says:

Thanks! And link updated :)

sivaram says:

interested in your observation on Erlang.very curious about why its not adaptable..

Swaroop says:

Hey Sivaram,

I have no idea about Erlang yet, probably BG or Ravi Mohan can answer your question till then!

Baishampayan Ghose says:

Sivaram - I love Erlang! It solves some really hard problems and does that with perfection. Erlang is designed to let you build incredibly fault tolerant applications, stuff that keeps on running for years without any failure.
Since Erlang focusses on solving a certain class of problems it's not as general purpose as something like Clojure. Also, the fact that the language itself hasn't evolved much since the 90's doesn't help much. Having said that, I think the Erlang community is excellent and there are a couple of very good books on Erlang as well.
So yeah, if you want to build a highly scalable, concurrent, distributed & fault tolerant application Erlang might be a very good choice.

Samrat Man Singh says:

Hi, I've been trying to learn Clojure for a while too. I'm struggling(I'm completely new to FP, Lisp and also the JVM), a lot so I recently decided to take a look at Haskell(some Stack Overflow answers suggested that I should learn a pure functional language first). Maybe, after I get used to functional programming I'll get back to Clojure.

What do you guys think of that approach?

@pradeepto says:

@ghoseb @swaroopch Hi, I saw Alan Kay's talk video on Programming and Scaling. Amazing! Thanks.

@sharat87 says:

@swaroopch Nice article. Makes me want to do clojure once again. Thanks.

@naiquevin says:

Awesome article http://t.co/tmxvBbns Can't wait for tomorrow to go through all links and resources mentioned in it

Feedback

There's no comment box, but please do email me or tweet me your thoughts and criticisms, and I will publish the relevant ones here.