Clojure Programming at Big Astronaut: Data Nirvana

Michael Drogalis

One of the most striking differences between Lisps and object-oriented languages is the Lisps' first-class support for data structures. This idea often ends up being a sticking point for those new to Lisp, and particularly Clojure. To be an effective functional programmer, it's critical to understand why Clojure chooses to make data structures, and not objects, king in its environment.

A common complaint from those new to Clojure is that at first glance, it seems that the language doesn't support abstraction very well. In an OO world, when we want to create a new abstraction - we do the only thing that we can - we make a new class. I wrestled with this notion for a long time. Working directly with data structures felt too unstructured, and there was no clear path to understanding data semantics. No lies here: learning Clojure was an uphill battle.

A few lines from Stu Halloway's "Narcissistic Design" talk finally made everything click for me. Stu describes why data structures are more abstract than custom types (e.g. classes), rather than the other way around:

"Information itself is already an abstraction. Sequences, maps, all those things - they're already an abstraction..."

And, sarcastically:

"The fact of the matter is that information comes in a few basic shapes. Information comes as ... numbers, booleans, characters - it comes in sequences of those ... But do not ever expose information this way - because if you expose information this way, other people already have collection libraries they can use to manipulate that. You don't wanna do that. You gotta screw those guys! You gotta make a (Java) Bean ... for which there is no extant API, and then they have to make an API (for themselves)!"
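To make Stu's point concrete, here is a minimal sketch. It assumes a made-up list of orders; the keys `:customer` and `:total` and all the values are hypothetical. Because the orders are plain maps in a vector, the core sequence functions already know how to work with them:

```clojure
;; Hypothetical order records: plain maps in a plain vector.
(def orders
  [{:customer "ada"   :total 120}
   {:customer "grace" :total 80}
   {:customer "ada"   :total 45}])

;; filter works on the data directly; no getters, no custom class.
(def large-orders
  (filter #(> (:total %) 50) orders))

;; reduce folds the same data into a map of customer -> summed total.
(def total-by-customer
  (reduce (fn [acc {:keys [customer total]}]
            (update acc customer (fnil + 0) total))
          {}
          orders))
```

Nothing here had to be written for "orders" specifically. Had the same information been hidden behind a bean, each of these operations would need its own method.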

Speaking data in the language and over the wire offers plenty of benefits:

  • Data transcends language. It's easy to use two different languages across the wire, given that neither language knows, nor cares, about the other. Producers, for example, can run Clojure, while consumers can run Ruby. The only contract between them need be a data format such as EDN, Fressian, or JSON.
  • There are no implementation details in data, hence no encapsulation is needed.
  • Data is both human and machine friendly, meaning that programs can interact with other programs in a sane manner.
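As a rough illustration of the first point, the sketch below round-trips a message through EDN text using `pr-str` and `clojure.edn/read-string`. The message contents are hypothetical, and both sides are Clojure here only to keep the example self-contained; a consumer in another language would parse the same text with an EDN library for that language.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical message a producer might send.
(def message {:event :user/signed-up :id 42 :tags #{"beta"}})

;; Producer side: print the data structure as EDN text.
(def wire-text (pr-str message))

;; Consumer side: read the text back into a data structure.
;; clojure.edn/read-string only reads data; it does not evaluate code.
(def received (edn/read-string wire-text))
```

The text on the wire is the whole contract. It is readable by a person and parseable by a program, and it says nothing about how either side is implemented.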

DSLs, on the other hand, are the opposite of data. DSLs are a language construct used to mimic data semantics. Unfortunately, DSLs offer none of the above benefits. Stu, in the same talk, sums it up by saying "Nothing says screw you like a DSL".

Rich Hickey, too, pleads on the matter, saying:

"Data - please! We’re programmers! We’re supposed to write data processing programs. There’s all these programs, and they don’t have any data in them. They have all these constructs you put around it, globbed on top of data. Data is actually really simple. There’s not a tremendous number of variations in the essential nature of data."

One of the more eye-opening things Stu is on record as saying is that "the ideal number of methods in an API is 0". Stu is advocating for exposing an information model, which is fundamentally different from an API in its temporal characteristics.

Datomic is a beautiful example of wielding information to its fullest. Query ability, schema declaration, transactions, and entity navigation are all driven through data alone. Notice how the Datomic team exposed a very minimal API for their database, and no DSL.
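The sketch below shows what "driven through data" looks like in practice. It does not connect to Datomic at all; it only builds the kind of values Datomic accepts as a query and as a schema attribute. The `:person/*` attributes and the `with-clause` helper are hypothetical names chosen for illustration.

```clojure
;; A Datalog query in vector form. It is just a quoted vector of
;; keywords, symbols, and nested vectors.
(def base-query
  '[:find ?name
    :where [?e :person/name ?name]])

;; Because the query is data, ordinary functions can extend it.
;; with-clause is a hypothetical helper, not part of any library.
(defn with-clause [query clause]
  (conj query clause))

(def adults-query
  (with-clause base-query '[?e :person/adult? true]))

;; A schema attribute is also data: a map describing one attribute.
(def name-attr
  {:db/ident       :person/name
   :db/valueType   :db.type/string
   :db/cardinality :db.cardinality/one})
```

Queries and schema can be stored, sent over the wire, and composed with `conj`, `assoc`, and friends, which is exactly what a string-based query language or a macro-based DSL makes hard.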

Even if Datomic isn’t a part of the stack that you work on, taking the time to understand what it is and how it’s built will take you a long way - for the same reasons as learning Lisp and Clojure. Before you resort to using a macro, or even function composition - take a step back and consider how using an information model, perhaps via Clojure EDN data, can influence your design.
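Here is one way that step back might look, as a small sketch rather than a recommended design. Instead of a validation macro, the rules are a vector of maps, and a single small function gives them meaning. The rule keys (`:field`, `:check`, `:value`) and the function names are all assumptions made up for this example.

```clojure
;; Hypothetical validation rules, expressed as data rather than a macro.
(def rules
  [{:field :name :check :required}
   {:field :age  :check :min :value 18}])

;; A small interpreter: returns true when the record breaks the rule.
(defn violates? [record {:keys [field check value]}]
  (case check
    :required (nil? (get record field))
    :min      (let [v (get record field)]
                (or (nil? v) (< v value)))))

;; All rules that the record breaks, in rule order.
(defn violations [record]
  (filter #(violates? record %) rules))
```

Calling `(violations {:name "Bob" :age 12})` returns the `:age` rule itself, as data. Because the rules are data, they could just as well be read from an EDN file or shared with a program written in another language.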

An aggressive use of data will quickly put you on the right path to designing simpler systems, and hence increase the opportunities for your systems to integrate with others.

At Big Astronaut we work with Clojure, contribute to Clojure, and support a broad range of cutting-edge programming languages and methodologies. In particular, we have deep expertise in Clojure. Pronounced "closure", it's an evolution of the Lisp programming language that runs on the JVM, the CLR, and JavaScript engines. Clojure is well suited to solving data-intensive computation problems robustly and at scale, using frameworks such as Storm, Cascalog, and PigPen. If your team needs training in Clojure, your organization is looking for Clojure consulting, or you have a big data challenge and are considering implementing a Clojure solution, we'd love to learn more about what you're working on and how we could help. Send us a message.