Grep in sections of a file

Processing files which are obviously organized in sections, chunks whatever you want to call it, happens and trying to find elements in it in an AND relationship is my most common use case. Sadly grep does not seem to have a nice way of processing based on a separator, but purely goes by line, so the easy way around this is to join all the lines in a chunk and grep in the result. Think input like this


Some AWK magic does the joining trick,

awk '/SECTION/ {printf "\n%s\n",$0;next} {printf "%s ",$0} END {print "\n"}' INPUT_FILE

And now

awk '/SECTION/ {printf "\n%s\n",$0;next} {printf "%s ",$0} END {print "\n"}' INPUT_FILE | \
grep 'foo=bar' | \
grep 'baz=blob'

Gives the matching section.

Solving Sudoku with Clojure core.logic

Recently I had some fun solving Sudoku using Clojure.core.logic. Afterwards figuring out that it would actually be the example on the Wiki, but hey fun non the less and my solution actually is very close to the Wiki one, especially after cleaning it up.

I find core.logic fascinating as it allows to express logic problems in a very straight forward fashion, and given I will not learn Prolog anytime soon, it get me to some understanding of the ideas.

Solving a puzzle like Sudoku basically breaks down towards the following steps:

  1. Break down the puzzle according to the rules, in case of Sudoku this means defining rows, columns and squares
  2. Defining the variables for the pieces to be computed, in my case this means replacing input zeros with core.logic lvars.
  3. Define the rules, in case of Sudoku there are only, constraint on the numbers to 1-9, constraint on rows, columns and squares to have each number show up only once.
(defn solve
  "Takes a sudoku as a vector, with all empty cells set to 0 and returns
  a solved sudoku as a vector.

    [0 0 0 2 6 0 7 0 1
     6 8 0 0 7 0 0 9 0
     1 9 0 0 0 4 5 0 0
     8 2 0 1 0 0 0 4 0
     0 0 4 6 0 2 9 0 0
     0 5 0 0 0 3 0 2 8
     0 0 9 3 0 0 0 7 4
     0 4 0 0 5 0 0 3 6
     7 0 3 0 1 8 0 0 0])
  (let [board (vec (map #(if (zero? %) (lvar) %) sudoku))
        rows (indexed-sub-board board row-indexes)
        cols (indexed-sub-board board column-indexes)
        squares (indexed-sub-board board square-indexes)]

     (run 1 [q]
       (== q board)
       ;; only 1 - 9 are allowed
       (everyg #(fd/in % (fd/interval 1 9)) board)

       ;; every number can appear only once per row
       (everyg fd/distinct rows)
       ;; every number can appear only once per column
       (everyg fd/distinct cols)
       ;; every number can appear only once per square
       (everyg fd/distinct squares)))))

Overall the whole solution ended up very straight forward, the only problem I faced was the very macro heavy definition of core.logic, making debugging not as straight forward and I ended up reading a log of library code to figure out how things were supposed to work. But I guess this also comes from my general lack of knowledge in this specific kind of programming, and therefore having to read up on very basic concepts along the way.

The full code is available on github.

A Lisp in Clojure? How useless! But fun non the less.

Having found a nice post on how to built a simple lisp like (scheme like) language (How to Write a (Lisp) Interpreter (in Python)) I was wondering what would this look like in Clojure. Writing a lisp in Clojure sounded very useless and fun so here we go the result in on Github.

A couple interesting observations:

Since Clojure lends itself to using immutable datastructures the inital implementation of the evaluation seemed a little complex as it needs to receive the environment along with the next token. But as soon as I was implementing function calls this suddenly made the stacking of environments trivial. No complex chain maps etc. needed, nice!

clojure.core.match is a very nice little lib, I wish it was part of core Clojure by default. I had first implemented a giant cond statement, match made it much more concise and robust, as it not only asserts the matching symbol but also the parameters to it.

(defn lclj-eval
  "Eval a expression X in the context of environment ENV, defaults to
  ([x] (lclj-eval x lclj-base-env))
  ([x env]
   (match [x]
          [(_ :guard symbol?)] {:result ((keyword x) env) :env env}
          [(_ :guard number?)] {:result x :env env}
          [['if _ _ _]] (lclj-eval (if-branch x env) env)
          [['define _ _]] {:env (define-in-env x env)}
          [['quote _]] {:env env :result (-> x rest first)}
          [['lambda _ _]] {:env env :result (lclj-fn x env)}
          :else (lclj-fn-call x env))))

Obviously this can all be done better but it was a fun (I guess in total) afternoon for sure, having never written a lisp interpreter before.

Reversing collection vs Java Array in Clojure

How big is the overhead of using Clojure datastructures vs native Java Arrays? I was rather suprised to find it is lower than I expected, and the implementation looks much better using Clojure native vectors and lists than Java Arrays.

First an essentail when doing something like this

(set! *warn-on-reflection* true)

Without that it is easy to accidentally create arrays very inefficient calls via the array accessor and setter methods by having to rely on reflection. Now for the code

(defn hn-reverse
  (reduce #(conj %1 %2) '() coll))

Simple basic clojure implementation walking the incoming collection and prepending to a list.

Using criterium for a quick benchmark on my machine gives me

(let [v (vec (range 1000))]
  (bench (hn-reverse v)))

On my little XPS 13 gives me

Evaluation count : 2370120 in 60 samples of 39502 calls.
             Execution time mean : 26.090496 microsecond
    Execution time std-deviation : 832.330358 ns
   Execution time lower quantile : 25.220196 microsecond ( 2.5%)
   Execution time upper quantile : 28.154777 microsecond (97.5%)
                   Overhead used : 11.468971 ns

Found 5 outliers in 60 samples (8.3333 %)
  low-severe     5 (8.3333 %)
 Variance from outliers : 18.9765 % Variance is moderately inflated by outliers

Now how would an array perform here? (and yes I assumed it would be faster hence the name)

(defn fast-reverse
  [^ints array]
  ;; ^ints type annotation is essential here, otherwise this will be very slow
  ;; due to reflection on aget and aset
  (let [l (alength array)]
    (loop [i (dec l)
           j 0
           a (int-array l)]
      (if (= j l) a
            (aset a i (aget array j))
            (recur (dec i) (inc j) a))))))

Again quick criterium test

(let [a (int-array (range 1000))]
  (bench (fast-reverse a)))

This time

Evaluation count : 5851680 in 60 samples of 97528 calls.
             Execution time mean : 10.561515 microsecond
    Execution time std-deviation : 336.219142 ns
   Execution time lower quantile : 10.240088 microsecond ( 2.5%)
   Execution time upper quantile : 11.385153 microsecond (97.5%)
                   Overhead used : 11.468971 ns

Found 3 outliers in 60 samples (5.0000 %)
  low-severe     3 (5.0000 %)
 Variance from outliers : 18.9721 % Variance is moderately inflated by outliers

Using native Java Arrays is about 2.5x faster than clojure vector and list implentation in this case.

I somehow expected a bigger difference given the jump from immutable to mutable datastructures and the less allocations needed but it seems like this kind of optimization only really makes sense in the edge cases.

Counting vowels, in a more declerative manner

Playing with some clojure code while solving some of karan/Projects I discovered my solutions differ a lot when thinking in a functional way doing programming in Clojure than doing the same in Ruby some time ago. almost arriving at a declerative style in this case

(defn count-vowels
  "Take string S count vowels return map of vowel to count."
  (let [up-inc (fnil inc 0)
        counts (reduce #(update %1 (string/lower-case %2) up-inc) {} s)
        vowels #{"a" "e" "i" "o" "u"}]
    (select-keys counts vowels)))

 (generators/string generators/printable-ascii-char 1000))

Not sure why the concept of building a dataset and selecting what is relevant felt more natural than imperativly building up the data here, but it did for me somehow and the code reads quite nicely.

Playing with lazy-seq

Played a little with lazy-seq vs loop recur today, and got a suprise that lazy-seq is not as slow as I would have thought. Most likely the usecases are very different so by no means is this interchangably for most applications, but I assumed that a lazy-seq carries a lot of overhead. Don’t know why I thought that so, but playing around with it and timing the output I realized it seems to be a wrong assumption. By no means this is a real benchmark but it makes me read up more for sure.

(defn fib
  "Return the Nth fibonacci number."
  (loop [i n
         p 0
         pp 1]
    (if (zero? i)
      (recur (dec i) (+ pp p) p))))

(defn lazy-fib
  "Return the fibonacci sequence as a lazy sequence."
  ([] (lazy-fib 0 1))
  ([p pp] (lazy-seq (cons p (lazy-fib pp (+ p pp))))))

(let [n 60]
  (println "Nth number:")
  (time (fib n))
  (println "Non lazy:")
  (time (last (take n (map fib (range)))))
  (println "Lazy:")
  (time (last (take n (lazy-fib)))))

;; Nth number:
;; "Elapsed time: 0.022058 msecs"
;; Non lazy:
;; "Elapsed time: 0.133195 msecs"
;; Lazy:
;; "Elapsed time: 0.026575 msecs"

Seems like lazy-seq is at least same order of magnitude to just calculating N for fibonacci here, and creating a non lazy seq seems much slower. No optimization what so ever here, and also being pretty brute force about the recursion in the lazy-fib.

Emacs, echo mike alpha charlie sierra

Having moved to a different country spelling words over the phone is a real pain as its not easy for people to understand a foreign name, and because I don’t know the military alphabet in english. So even when I decide spell things out I get stuck on letters as I don’t know what the common word would be. After not remembering how to spell out my last name over the phone again today I decide to create a quick little helper, after all I almost always take notes on my laptop anyway when calling.

Here we go quick emacs string-to-military, with configurable alphabet.

(defun string-to-military (s)
  (interactive "sEnter string to transfor to military: ")
  (flet ((string-blank-p (s) (string-match-p "^\\s-*$" s))
         (military-get-string (s)
                              (if (string-blank-p s) " "
                                  (alist-get s military-alphabet "" nil #'equal))))
    (let* ((chars (mapcar 'string s))
           (normalized-chars (mapcar 'downcase chars)))
      (string-join (mapcar 'military-get-string normalized-chars) " "))))

(defun string-to-military-region (start end)
  (interactive "r")
  (let ((content (buffer-substring-no-properties start end)))
    (delete-region start end)
    (insert (string-to-military content))))

and for the curious

(defvar military-alphabet '(("a" . "alpha")
                            ("b" . "bravo")
                            ("c" . "charlie")
                            ("d" . "delta")
                            ("e" . "echo")
                            ("f" . "foxtrot")
                            ("g" . "golf")
                            ("h" . "hotel")
                            ("i" . "india")
                            ("j" . "juliet")
                            ("k" . "kilo")
                            ("l" . "lima")
                            ("m" . "mike")
                            ("n" . "november")
                            ("o" . "oscar")
                            ("p" . "papa")
                            ("q" . "quebec")
                            ("r" . "romeo")
                            ("s" . "sierra")
                            ("t" . "tango")
                            ("u" . "uniform")
                            ("v" . "victor")
                            ("w" . "whiskey")
                            ("x" . "x-Ray")
                            ("y" . "yankee")
                            ("z" . "zulu"))
  "Military alphabet alist (CHARACTER . WORD), all characters are
lowercase for lookup.")

now part of my coders-little-helper.el in my local emacs.

Setup use-package

I like my emacs configuration to be managable in git installing everything on first start, as I migrated away from my home grown install and package manage to use-package I wanted to keep this nicety, so I need to make sure use-package gets installed if it isn’t already present.

;; setup use-package, if we don't have it, install it
(require 'use-package nil t)

(unless (featurep 'use-package)
  (package-install 'use-package)
  (require 'use-package))