Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

It is mentioned in the article, one of the last optimizations done there was switching to array. Also, Clojure’s HAMT was designed with CPU cache in mind and it’s performance characteristics don’t degrade over time. Immutable data structures will be slower than arrays - that’s true, but Clojure standard library works on arrays just fine, as demonstrated in the article.


You're right - I missed that, the author does mention arrays having better access characteristics, although he doesn't really explain why HAMTs specifically are slow.

How does Clojure's HAMT avoid fragmenting over time?

> Clojure standard library works on arrays just fine, as demonstrated in the article.

Right, but then you don't have immutability - so you lose all the guarantees that you originally had with immutable-by-default.


> although he doesn't really explain why HAMTs specifically are slow

Hello, author here :)

HAMTs are certainly slower, for "churning" operations, i.e., lots of updates, which is where Clojure exposes the transient API, which gives you limited localized mutability (some terms and conditions may apply)

Where iteration is concerned, they standard library implementation is pretty good. It relies on chunks of 64 element arrays which store the keys and values contiguously. Thus, APIs which expose direct iteration, Iterator and IReduce(Init) operate on these chunks one at a time. It isn't as fast as primitive arrays, but it's pretty fast.


> but then you don't have immutability - so you lose all the guarantees that you originally had with immutable-by-default

Not really, for example take the following solution with execution time mean : 1.678226 ms

    (defn smt-8''' [^ints times-arr]
      (loop [res (transient []) pointer-1 (int 0) pointer-2 (int 7)]
        (if (< pointer-2 (alength times-arr))
          (let [start-element (aget times-arr pointer-1)
                end-element (aget times-arr pointer-2)
                time-diff (- end-element start-element)]
            (recur
             (if (< time-diff 1000)
               (conj! res [(mapv #(aget times-arr (+ pointer-1 (int %))) (range 8))
                           time-diff])
               res)
             (inc pointer-1)
             (inc pointer-2)))
          (persistent! res))))
It only requires the input being an array, but it will return an immutable persistent vector of vectors. So it is very easy to selectively go down to an array in the performance critical sections while being immutable and idiomatic in most places.

> he doesn't really explain why HAMTs specifically are slow

Take this solution using HAMT vectors with execution time mean : 23.567174 ms

    (defn smt-8' [times-vec]
      (loop [res (transient []) pointer-1 (int 0) pointer-2 (int 7)]
        (if-let [end-element (get times-vec pointer-2)]
          (let [end-element (int end-element)
                start-element (int (get times-vec pointer-1))
                time-diff (- end-element start-element)]
            (recur
             (if (< time-diff 1000)
               (conj! res [(subvec times-vec pointer-1 (inc pointer-2))
                           time-diff])
               res)
             (inc pointer-1)
             (inc pointer-2)))
          (persistent! res))))
Now it is 14 times slower than my prior one which iterates over an array, but it is still pretty fast, so that's OP's point, things are often sufficiently fast, and when they are not you can selectively optimize those parts, and easily too, see how similar my two solutions are from one another.

Edit: These assume you've `(set! unchecked-math true)` prior to compiling the functions.




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: