
I'm reading this book right now. It's really great so far!

I've been working a lot with trees in Clojure and have been hitting serious limitations of my understanding.

I also found this YouTube video from a Clojure conference that reviews some different strategies for tree traversal in Clojure: https://youtu.be/YgvJqWiyMRY

I thought that learning a functional Lisp would make it really easy to traverse trees, since that is what the language does to actually execute its code.

Turns out all the stuff I want to do is simply hard.

Functional data structures are really awesome, though; they just seem to take a bit of up-front investment.



This has been a topic I've wanted to get into for a few years now, specifically because of Clojure! So if you have any additional recommendations I'd appreciate it.

I really enjoyed Friedman's book `Scheme and the Art of Programming` because it filled in some pieces missing from "The Little Schemer" (and "Seasoned Schemer"): building things like `kons`, `map`, and all that `letrec` stuff.

But the big difference between Scheme and Clojure is that in Scheme, while it's "functional by concept," you get to escape with `(set! ...)` whenever you want a more traditional data structure or algorithm (like, say, the n-queens problem).

In Clojure you can kind of do that by either escaping to Java, using protocols (with atoms), or using transients, but I often feel like there's a "more functional way to do stuff that hasn't been taught to me."
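
To make the transients option concrete, here's a toy sketch (my own example, nothing canonical): mutate locally with a transient, then freeze the result back into a persistent value before it escapes the function.

    (defn squares-up-to [n]
      ;; Build imperatively with conj!, then freeze back into a
      ;; persistent vector before anyone else can see it.
      (persistent!
        (loop [i 0, acc (transient [])]
          (if (< i n)
            (recur (inc i) (conj! acc (* i i)))
            acc))))

    (squares-up-to 5) ;=> [0 1 4 9 16]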

I've opened up the Okasaki dissertation, or the book, or both, but I've always had trouble reading them and then sticking with it. And some stuff, like looking at Rosetta Code and saying "to reverse a linked list in a lisp is easy... because it's a cons cell," seems like cheating. Almost like showing up to an interview, saying your "linked list" is implemented on top of an array, and then calling `reverse()` on it.
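
(For reference, the non-cheating version is just the accumulator idiom; a quick sketch:

    ;; Cons each head onto a fresh list; a single O(n) pass yields the
    ;; reversal, no set! required.
    (defn rev [xs]
      (reduce (fn [acc x] (cons x acc)) '() xs))

    (rev '(1 2 3)) ;=> (3 2 1)

Of course that still leans on the list being a chain of cons cells, which is exactly the objection.)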

Will watch that talk from 2014, must not have seen it before.

I guess, conceptually, day-to-day things in Clojure do feel pretty natural, even easier, and I think I have a decent understanding of it. But then when I look at leetcode-type problems, or something more involved, it takes a lot of mental effort to translate to it. Especially things like big O get thrown away in my mental model. I get it, persistent data structures and all that, but there's still a mystery there.
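
To make that mystery a bit more concrete for myself, here's a toy two-sum (the leetcode staple) as a single reduce over a persistent map; the names and approach are just my own sketch:

    ;; Single pass; the persistent map plays the "seen" hash table.
    ;; assoc/lookup cost O(log32 n), effectively constant, so the
    ;; whole thing is still O(n) like the imperative version.
    (defn two-sum [target xs]
      (reduce (fn [seen [i x]]
                (if-let [j (seen (- target x))]
                  (reduced [j i])      ; short-circuit with the answer
                  (assoc seen x i)))
              {}
              (map-indexed vector xs)))

    (two-sum 9 [2 7 11 15]) ;=> [0 1]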


I feel like they are... not so awesome: they are grossly inefficient due to all the pointer chasing, and are pretty much guaranteed to be slower than the alternative since they trash your cache.


I wish I could remember the exact book; I think it's _Writing Solid Code_. There was a long example about the Excel evaluation engine back in the day when it was shipped on CDs and had to be perfect. The approach was: use one big, dumb, slow, but understandable and correct implementation; in parallel, use the lightning-fast, super-whiz-bang new implementation. Since both shared the same interface, they could be swapped out, or both run at the same time.
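
A toy sketch of the pattern in Clojure (the names are made up, and `sum-fast` is only a stand-in for the whiz-bang engine):

    (defn sum-naive [xs] (reduce + 0 xs)) ; big, dumb, obviously correct
    (defn sum-fast  [xs] (apply + xs))    ; stand-in for the fast path

    (def check-against-reference? true)   ; flip off for release builds

    (defn sum [xs]
      ;; One interface; in dev builds, run both and assert they agree.
      (let [fast (sum-fast xs)]
        (when check-against-reference?
          (assert (= fast (sum-naive xs)) "fast path disagrees with reference"))
        fast))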

I think there is real value in starting with the pure functional versions, then swapping out when needed. One problem that seems fairly common in large codebases is using an object as a hash key. But as the code grows, that object gets passed around and eventually somebody updates it without rehashing. That's a tough bug to find. It is for me, anyway.
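
Here's that bug in miniature, reproduced via Java interop (Clojure's own immutable keys rule it out by construction):

    (def k (java.util.ArrayList. [1 2]))
    (def m (doto (java.util.HashMap.) (.put k :found)))

    (.add k 3)  ; someone "updates" the key in place...
    (.get m k)  ;=> nil -- the entry still exists, but is now unreachable

    ;; With a persistent map, the key can't change under you:
    (get {[1 2] :found} [1 2]) ;=> :found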

This is one of those rare cases where you can actually make it faster later without trashing the overall design.

I'd encourage starting with the pure functional version every time. I'd go further and say: leave some opportunity to flip back to the pure functional version in dev builds for isolating heisenbugs.

Blah blah, grain of salt, free advice, your mileage may vary, games have different requirements than web pages, everything is contextual.

This is one rare case where having the capability to swap back and forth is always worth it. "Just use it, and think hard before giving it up" is a really good default.


I wanted to verify the book reference, and I think you could be correct.

There is a website for it: https://writingsolidcode.com/

The Amazon page it links to includes a few index pages in the preview. I saw these under E:

Excel number formatting code 163

Excel's dialog handling code 113


You may feel this way, but if you do the math and run the benchmarks, you will be surprised.

You might not be running GHC's RTS on a resource-constrained target platform, but you can generate C code from Haskell that runs on that platform, leveraging the guarantees you get from Haskell's advanced type system.

Update: the point is that GHC can optimize code, and some of those optimizations can avoid pointer chasing! But like most optimizing compilers, GHC can sometimes miss them, and you have to do a bit of work to tell GHC where it can avoid chasing unnecessary pointers (or where it should or shouldn't inline, give it some help to know where it can do fusion, etc.).


Most of the problems I'm trying to solve require those pointers anyway.

Tracking diffs and versions of data over time: version control systems and RDBMSs just don't cut it.

Bitemporal databases are interesting to me as well.


There are entire fields of research dedicated to studying cache-oblivious persistent data structures, and they're almost always trees.

One of the key points of immutable data structures is that they're easy to reason about, given that the invariants are upheld. Designing an efficient memory representation that upholds those invariants is an entirely separate problem domain. Not every pointer dereference is slow, and not every tree is represented with pointers.


They're only grossly inefficient if you wouldn't otherwise need to frequently make copies of large mutable data structures. It all depends on what you're trying to do.
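
And even when you do make copies, structural sharing keeps them cheap; a quick sketch:

    (def v1 (vec (range 1000000)))
    (def v2 (assoc v1 0 :changed)) ; allocates a handful of trie nodes,
                                   ; not a million-element copy

    (nth v2 0) ;=> :changed
    (nth v1 0) ;=> 0  -- the old version is untouched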


I agree that it's a great book, but I wouldn't recommend it as an entry point into understanding purely functional data structures.

Okasaki fleshes out a couple of examples and explains his thought processes and heuristics for developing such data structures quite well, but they're very much still notes on the early-stage exploration of the concept.

From a historical perspective it's still a fascinating read, though, and I definitely recommend it if you want to read up on the origins of immutable data structures.




