Click here to Skip to main content
15,891,864 members
Articles / WCF
Technical Blog

How to Make an SQL From Clojure or My Macros Kata: Part 2

Rate me:
Please Sign up or sign in to vote.
5.00/5 (1 vote)
8 Jun 2014CPOL11 min read 5.5K   1  
In a recent post I showed some strange things that one can do with macros to be able to write SQL-like queries right in Clojure. Several weeks ago we enabled selecting fields from records as well as filtering them with where clause.
In a recent post I showed some strange things that one can do with macros to be able to write SQL-like queries right in Clojure. Several weeks ago we enabled selecting fields from records as well as filtering them with where clause. In the end of the first post I promised you that we will soon add sorting capabilities to our query language and enable joining “tables” together. Not that I am going to totally break this promise, but I will deliver on it only partially: this time we will do sorting and the sweetest part, joins, we leave for the final post.<o:p>

Before proceeding, let me remind once more that the code obtained through this exercise is not something that can be used in any real application – it sucks from both performance and feature-richness perspectives and is hardly idiomatic, nice or readable. Furthermore, while doing the exercise I gradually came across the idea that what I show here is by no means a good way to utilize the power of macros. Still, I am sure that implementing a query language like this is a good exercise and can really help one understand what macros are about, how to handle them and, at least, where one should not use them. This way of learning is hard but it pays back a lot once you put enough effort into it.<o:p>

Let’s recap where did we stop last time. First of all, we mastered picking fields from records and can write simple queries like this one:<o:p>

(def table [<br />  {:a 1 :b 100 :c "100" :d 4}<br />  {:a 2 :b 200 :c "200" :d 3}<br />  {:a 3 :b 300 :c "300" :d 2}<br />  {:a 4 :b 400 :c "400" :d 1}])<br /><br />(select :c from table)<br />;=> ({:c "100"} {:c "200"} {:c "300"} {:c "400"})<br />

However, since this is hardly an impressing achievement and can be done in a much more concise way, we have gone further and allowed for filtering – starting from very simple conditions and proceeding to quite long and complex ones:<o:p>

(select :a :b from table where :a < :d)<br />;=> ({:a 1, :b 100} {:a 2, :b 200})<br /><br />(select :a :b from table where :a = 2 or (:d < 3 and :b > 300))<br />;=> ({:a 2, :b 200} {:a 4, :b 400})<br />

Our key goals were to make the syntax as close to SQL as possible and I think we succeeded at it – at the cost of introducing a lot of complex stuff, whose main purpose is to deal with symbols like from, where and others. Now let’s take a look at our macros and see what we can do to make ordering possible. The key player in our team is select:<o:p>

(defmacro select [& what]<br />  (let [fields (set <br />          (take-while #(not= 'from %) what))<br />      source (fnext <br />          (drop-while #(not= 'from %) what))<br />      conditions (next (drop-while #(not= 'where %) what))]<br />      `(map (fn [record#]<br />            (into {} (filter #(~fields (first %)) record#)))<br />          (filter #(condition % ~@conditions) ~source))))<br />

For now, select is aware of from and where clauses, which it uses to find the desired fields, the source table and the filtering conditions in a query. To enable sorting we have to pay attention to the order by clause as well and here I am going to cheat. For the sake of making code a tiny bit simpler I will use a single orderby word without a whitespace – this will save us a couple keystrokes. The only place in our code where we have to pay attention to this symbol is in the select macro, where we have to retrieve the actual ordering conditions from the original query. For this we use the same approach as with where – basically skip everything up to the orderby symbol and the symbol itself. At the same time, we should remember that the new clause at the end of the query might introduce a problem: we don’t want this stuff to get into the condition macro, because upon meeting orderings it will immediately spit an “Unable to resolve symbol: orderby” error. Thus, we also restrict the conditions list on both sides:<o:p>

(defmacro select [& what]<br />  (let [fields (set <br />          (take-while #(not= 'from %) what))<br />      source (fnext <br />          (drop-while #(not= 'from %) what))<br />      conditions (take-while #(not= 'orderby %) 
(next (drop-while #(not= 'where %) what)))<br />      orderings (next (drop-while #(not= 'orderby %) what))]<br />      `(map (fn [record#]<br />            (into {} (filter #(~fields (first %)) record#)))<br />          (filter #(condition % ~@conditions) ~source))))<br />

Now, once we have the orderings list extracted from the query, we can use it in some way. Because the select macro doesn’t look pretty straightforward already, it is a good idea to implement the sorting stuff elsewhere and just use it in select. For this purpose we are going to introduce another macro - order. To start with, let us make it distinguish between 2 situations: when no ordering is requested and when there is actually some stuff in the ordering conditions list. For now we can simply return nil in the latter case. The place to plug ordering in select seems quite obvious: sorting should be done after filtering so we will just wrap the existing body of select (the syntax-quoted part) in a call to the new macro:<o:p>

(defmacro order [orderings what]<br />  (if (empty? orderings)<br />    what<br />    nil))<br /><br />(defmacro select [& what]<br />  (let [fields (set <br />          (take-while #(not= 'from %) what))<br />      source (fnext <br />          (drop-while #(not= 'from %) what))<br />      conditions (take-while #(not= 'orderby %) <br />             (next (drop-while #(not= 'where %) what)))<br />      orderings (next (drop-while #(not= 'orderby %) what))]<br />      `(order ~orderings<br />        (map (fn [record#]<br />            (into {} (filter #(~fields (first %)) record#)))<br />          (filter #(condition % ~@conditions) ~source)))))<br />

If you execute any query with the updated macro you will see that the only thing that has changed is that now we get nil when there is a non-empty orderby clause:<o:p>

(select :c from table where :a > 1)<br />;=> ({:c "200"} {:c "300"} {:c "400"})<br /><br />(select :b :c from table where :a > 1 orderby :b)<br />;=> nil<br />

Let’s take this as an evidence that we didn’t break anything and proceed with sorting. First, what are we expecting to get in the list of ordering conditions? Definitely, it should contain keywords denoting the fields to order by. Besides, each of these keywords may have an optional asc or desc symbol after it, standing for ascending and descending order, respectively. When the keyword is not followed by such a symbol, we will assume ascending order, similarly to what SQL dialects do. Now, this is not a completely trivial thing to implement, so let us first do something simple – for example, handle the case when there are only keywords in the orderings list. For this we only need to use the built-in sort-by and juxt functions in the order macro:<o:p>

(defmacro order [orderings what]<br />  (if (empty? orderings)<br />    what<br />    `(sort-by (juxt ~@orderings) ~what)))<br /><br />(def table [<br />  {:a 1 :b 400 :c "200" :d 4}<br />  {:a 2 :b 300 :c "100" :d 3}<br />  {:a 3 :b 200 :c "400" :d 2}<br />  {:a 4 :b 100 :c "300" :d 1}])<br /><br />(select :b :c from table where :a > 1 orderby :b)<br />;=> ({:c "300", :b 100} {:c "400", :b 200} {:c "100", :b 300})<br /><br />(pprint (select :a :b :c :d from table orderby :d))<br />;=> ({:a 4, :c "300", :b 100, :d 1}<br />;=> {:a 3, :c "400", :b 200, :d 2}<br />;=> {:a 2, :c "100", :b 300, :d 3}<br />;=> {:a 1, :c "200", :b 400, :d 4})<br />

The examples show us that filtering and the simplest sorting facilities both work nicely. Still, if we chose to ask for ascending or descending order explicitly, the compiler will award us with something like “Unable to resolve symbol: asc”. Let us allow for these symbols and get closer to our goal. Here I want to start with something that I have been avoiding since the beginning of our strange exercise – we will get rid of the asc and desc symbols in the order macro well before doing anything else. Even though while implementing select and condition I got used to handling symbols in macros, they still bother me a lot and I want to force them out as fast as possible. We will replace asc with 1 and desc with -1 – this feels pretty logical and saves us from comparing everything to asc and desc as well as from the attempts to prevent Clojure from evaluating these symbols.<o:p>

(defmacro order [orderings what]<br />  (if (empty? orderings)<br />    what<br />    (let [orderings-desym<br />        (vec (map (fn [s]<br />        (cond (= 'desc s) -1<br />            (= 'asc s) 1<br />            (keyword? s) s))))]<br />    `(sort-by (juxt ~@orderings) ~what))))<br />

Next, we need a way to understand which keywords should go with which order. This would be very simple should we not allow to skip the asc symbol, but with this extra convenience feature we have to find a way to handle this without falling into some non-trivial loop-recur construct. Fortunately, this can be done in a not very elegant but satisfactory manner:<o:p>

(defmacro order [orderings what]<br />  (if (empty? orderings)<br />    what<br />    (let [orderings-desym <br />        (vec (map (fn [s]<br />        (cond (= 'desc s) -1<br />            (= 'asc s) 1<br />            (keyword? s) s))))<br />          orderings-separated <br />          (->> orderings-desym<br />              (partition-by keyword?)<br />              (map (partial interpose 1))<br />              flatten<br />              (partition-all 2))]<br />    `(sort-by (juxt ~@orderings) ~what))))<br />

This seemingly strange sequence of functions transforms the incoming list of ordering conditions into a sequence of pairs of the form (:key 1) or (:key -1). To achieve this result we first separate the list into sublists of keywords without ordering markers and those of ordering markers themselves. Then we use interpose, which will add missing 1‘s, and flatten the resulting list to finally split it into pairs. To make this easier to understand let’s take a look at the example – here are the stages of this transformation for a particular list of ordering conditions:<o:p>

;0. [:a :b :c 1 :d :e -1 :f] ;initial orderings<br />;1. ((:a :b :c) (1) (:d :e) (-1) (:f)) ;partition-by keyword?<br />;2. ((:a 1 :b 1 :c) (1) (:d 1 :e) (-1) (:f)) ;map (partial interpose 1)<br />;3. (:a 1 :b 1 :c 1 :d 1 :e -1 :f) ;flatten<br />;4. ((:a 1) (:b 1) (:c 1) (:d 1) (:e -1) (:f)) ;partition-all 2<br />

As you see, each keyword apart from the last one gets its 1 or -1. – missing tail is, in fact, not a big problem and we will deal with it shortly. Now we have set up everything except for the actual sorting part – we need to transform the new sequence of conditions into a list of functions suitable for sort-by. For this purpose, we can introduce an auxiliary function and consume it in the order macro:<o:p>

(defn order-fn<br />  ([k]<br />    k)<br />  ([k ord]<br />    (if (neg? ord)<br />      `(fn [r#] (- (~k r#)))<br />      k)))<br /><br />(defmacro order [orderings what]<br />  (if (empty? orderings)<br />    what<br />    (let [orderings-desym <br />        (map (fn [s]<br />        (cond (= 'desc s) -1<br />            (= 'asc s) 1<br />            (keyword? s) s)) orderings)<br />          orderings-separated <br />          (->> orderings-desym<br />        (partition-by keyword?)<br />        (map (partial interpose 1))<br />        flatten<br />        (partition-all 2))<br />          order-funcs <br />          (map #(apply order-fn %) orderings-separated)]<br />    `(sort-by (juxt ~@order-funcs) ~what))))<br /><br />(pprint (select :a :b :c :d from table orderby :a desc))<br /><br />;=> ({:a 4, :c "300", :b 100, :d 1}<br />;=> {:a 3, :c "400", :b 200, :d 2}<br />;=> {:a 2, :c "100", :b 300, :d 3}<br />;=> {:a 1, :c "200", :b 400, :d 4})<br />

It works! Let’s try to understand why. The key piece is the order-fn, which generates sorting functions for us. The one-argument version is trivial – it takes a single key and returns it to the caller. Why do we need something this stupid? Well, that’s simply our way to handle the last keyword in the orderings sequence when it comes without explicit asc or desc. As seen in the example above in this case it won’t have an ordering marker (1 or -1) in the orderings-separated list, but with the help of order-fn defined as above we can just ignore this and everything will work fine. 

The two-arguments version of order-fn is a bit less trivial but still manageable: it will either return the keyword if we want ascending order, or wrap it into a function that will "invert" the numeric value retrieved with the keyword otherwise.  At the same time, here we have one nasty feature: it returns a syntax-quoted fn form instead of a mere function. This might seem weird – and it is – but at the same time that’s the only way it is going to work. The problem here is that we use order-fn in a macro to produce another function – a closure around the k argument. This combination of macro, function and closure somehow makes compiler vomit “No matching ctor found for class user$order_fn$fn__*” errors. I must admit that I failed to understand this problem – if you can explain it I would be grateful. Still, for now we, at least, can go on with quoting.

The way we use order-fn is very simple – we map it onto the list of (key, sorting order) pairs and then pass the resulting functions to juxt. The latter produces a single function, which extracts all the values used for ordering from a record. Finally, in combination with sort-by this gives us a properly ordered results set. <o:p>

Now, you might have spotted several problems with our solution. The most severe one (apart from the approach in general) is that we allow sorting only by the numeric fields – strings won’t do. This is because to produce an ordering function for desc we use simple numeric negation, which isn’t compatible with strings, sequences or anything else. I have tried to come up with a more generic solution, but the time and brainpower that I had put into it were not enough. Fortunately, my goal of understanding macros better and learning to be not afraid of quoting and unquoting allows me to live with this. However, if you come up with a more generic implementation of order-fn it will be a great pleasure for me to see it – please share it in comments!<o:p>

On the other side, the use of quoting in the order-fn makes me think that something is definitely wrong with our approach to the problem. The degree of complexity of our macros also strengthens this feeling – I still can manage them, but it ceased to be an easy task long before we produced a working implementation of orderby. I have already stated that this way of implementing SQL-ish queries doesn’t look like an idiomatic way to use Clojure macros and all these concerns only reassure me in this idea. Still, the audible tension that I had in my brains when fighting this stuff confirms that the exercise is a damn good kata.<o:p>

Here are some of my takeaways from this iteration:<o:p>
  • Maybe the most frequent error that I faced here was the “Wrong number of arguments passed to a keyword” one. The reason for this was always that Clojure tried to evaluate a list starting from a keyword when I didn’t mean it to be evaluated. This same problem manifested itself in some other ways and I find it the most important thing to tackle: you should always pay close attention to what and when you are evaluating.
  • I have expected to face some difficulties with sorting over several fields – until I remembered that there is juxt in Clojure. So one of the lessons from each and every exercise in programming is actually that there is likely a function to solve your problem. That's actually one of the reasons why it is good to do these exercises – you discover and remember a lot of useful stuff.
  • One sign of using a wrong approach to a problem is that you have a lot of bindings or variables and find it difficult to come up with proper names for them (order, order-fn, orderings-separated, etc simply don’t sound very good). Maybe the most appropriate thing to do under such conditions is to step back and try to rethink the way you reason about the problem.

I will add joins to the mess in the next post – as promised. For now, feel free to play with what we already have – here is the code. By the way, there is a problem with our new select macro, which limits sorting capabilities, but is very easy to fix. Did you spot it? Let me know in the comments!<o:p>

In addition to showing me where I messed up, please tell whether you find this exercise a good way to accustom oneself with macros or do you think it is as useless as it is long and complex? Maybe you have your favorite Clojure kata, which helps to explore and understand macros better? I will be glad to hear about it!<o:p>

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer Acumatica
Russian Federation Russian Federation
I am a full-time .NET programmer and a lover of C#, C++, Clojure and Python at the same time. I do some Windows Phone development on my own and occasionally try myself at web-development. To enhance all these software-related activities I maintain a blog writing there on various topics, most of which actually come back to programming.

Comments and Discussions

 
-- There are no messages in this forum --