One day a student came to Moon and said: “I understand how to make a better garbage collector. We must keep a reference count of the pointers to each cons.”
Moon patiently told the student the following story:
“One day a student came to Moon and said: ‘I understand how to make a better garbage collector...
If something is purely functional and doesn't have any fixed-point loopholes or other nonsense, you can know that no program creates circular pointers.
I don't think that's true. You could design a purely functional language where everything is immutable, yet a value contains a cycle by definition. You could just declare that l is the list of integers going from 0 to 5 and then back to 0, ad infinitum. IMO, circular references can exist without mutation if the language specifies that something is circular/recursive at the moment it's created: the circular linked list is defined declaratively rather than in terms of the instructions that build it. The list is created all at once. It comes into existence with a cycle in it.
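Python isn't purely functional, but the declarative-cycle idea can be sketched with a lazy iterator: the repeating sequence is declared in a single expression rather than built by patching pointers into a loop. (`itertools.cycle` does keep internal state, so this illustrates the interface only, not a truly immutable cyclic structure.)

```python
from itertools import cycle, islice

# A sequence "defined to be circular" at creation time:
# 0..5 repeating forever, declared in one expression.
l = cycle(range(6))

# There is no mutation step that closes the loop after the fact;
# consumers just see an endless 0,1,2,3,4,5,0,1,... stream.
print(list(islice(l, 8)))
```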
Prolog implementations provide destructive operations and mutation, but the example above does this with unification only. Arguably, you could say that unification performs mutation (but only once).
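That "mutate at most once" character can be sketched in a toy unifier (hypothetical helper, no compound terms and no occurs check): a variable is written exactly once, at the moment it's first bound, and everything after that is just lookup or failure.

```python
def unify(a, b, subst):
    """Toy unification over variables (strings starting with '?')
    and constants. Returns an extended substitution, or None on clash."""
    a, b = subst.get(a, a), subst.get(b, b)  # follow existing bindings
    if a == b:
        return subst            # already consistent, nothing to do
    if isinstance(a, str) and a.startswith("?"):
        return {**subst, a: b}  # the one and only "mutation" of ?a
    if isinstance(b, str) and b.startswith("?"):
        return {**subst, b: a}
    return None                 # two distinct constants: unification fails
```

Once `?x` is bound, a later `unify("?x", other, subst)` either succeeds trivially (same value) or fails; the binding itself is never overwritten.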
Better than that, you can "mutate" if you're not invalidating information about the variable. Oz (not incidentally, developed by Prolog people) lets you add constraints to a variable as long as they don't contradict the previous ones. Binding to a specific value is just a very strong constraint. Re-binding to the same value is okay.
This is less strict than pure functional programming, but still feels declarative and makes concurrent programming easy: no piece of code that looked previously at your variable will have its assumptions about it broken.
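A minimal sketch of that single-assignment discipline in Python (`DataflowVar` is a hypothetical name for illustration, not Oz's actual API): binding is allowed once, re-binding to the same value is a no-op, and a contradictory binding is an error.

```python
class DataflowVar:
    """Toy single-assignment variable in the spirit of Oz dataflow
    variables: constraints may be added, never invalidated."""
    _UNBOUND = object()

    def __init__(self):
        self._value = DataflowVar._UNBOUND

    def bind(self, value):
        if self._value is DataflowVar._UNBOUND:
            self._value = value  # first binding: the "very strong constraint"
        elif self._value != value:
            raise ValueError("binding contradicts an earlier one")
        # binding to the same value again is fine: no assumption is broken

v = DataflowVar()
v.bind(42)
v.bind(42)  # OK: consistent with what any reader already saw
```

Any code that observed `v` before keeps a valid view of it afterward, which is exactly what makes this model pleasant for concurrency.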
Arguably, everything flips 1s to 0s and back under the hood, though. Instantiating a new lexical environment with fresh bindings mutates something in the machine.
Ah that's interesting. I was going to say that in principle a strict language could have a special syntax for defining circular data structures, but I didn't know about that particular corner of CL.
In Standard ML, all recursive value definitions must have function literals as their right-hand sides. Thus, at runtime, recursion is guaranteed to go through functions (which normally aren't allocated in the heap) and/or mutable data structures (which can be handled in a special way by the garbage collector). Immutable data structures are generally guaranteed to be cycle-free.
That being said, reference counting is a really bad fit for functional languages for several other reasons.
You don't need the type of fixed points that cause the issues with circular pointers. Recursion can be done with (appropriately for HN) the Y combinator. The kind of tricks I'm talking about rely on laziness and are pretty complicated. See: http://stackoverflow.com/questions/37366222/why-is-this-vers...
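For illustration, here is the Y combinator's strict-evaluation cousin (the Z combinator) in Python: recursion with no self-referential binding and no circular data structure at all.

```python
# Z combinator: fixed points under strict evaluation.
# No name refers to itself anywhere; recursion falls out of application.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# Factorial written without ever mentioning "fact" in its own body.
fact = Z(lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1))

print(fact(5))  # → 120
```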
Having switched to managed code a little over a decade ago, I can say that garbage collection is a godsend, but something in this article flipped a switch for me. A conventional C programmer is taught, in a somewhat moralistic way, that it's their responsibility to track every resource and manage its lifetime, but with managed code we can delegate that to an admittedly very complicated program that takes care of it for us. Just like if I live in Manhattan, I don't need to know how to take my bag of garbage to the dump. I wonder what other metaphors are lurking right under our noses like that.
Driving conditions vary hugely. Stuck in city traffic, there's nothing to gain from a manual stick. But if you're coming down a steep mountain road or running a car race, you want to use your engine to brake as well as to accelerate. As far as I know, in such cases manual is unbeatable.
> A semi-automatic transmission ... is an automobile transmission that does not change gears automatically, but rather facilitates manual gear changes by dispensing with the need to press a clutch pedal at the same time as changing gears.
I wasn't talking about clutch-and-stick vs. push-a-button. I was talking about controlling which gear you're in vs. letting the car guess on its own. I've never heard of race cars where the driver doesn't decide himself which gear is engaged at each moment; if you have, thanks for letting me learn something new.
FWIW, I mostly program in C and I drive automatic. C enables me to be a control freak with my code, but I don't really have a need or desire to be a control freak with other things in my life…
(1984), so "large" = small by today's standards. The Symbolics machine used had a whopping 1M words (= 4.5 megabytes) of real and 15M words (= 67.5 megabytes) of virtual memory.
They used that 1:15 real:virtual ratio in their experiments and claim typical use was 1:30.
I think this system swapped quite a bit more than most modern systems. That must have had an impact on optimizing the garbage collector.
1 MW of RAM was very expensive in 1984. It was actually 36-bit + 8-bit-ECC memory. That was a limiting factor. Disks with >100 MB capacity were also quite expensive. With RAM, disk, tape and possibly other extensions, the price was easily $100k for a new system.
1 MW was not really enough for practical development use. After a few years, 4 MW (= 18 MB) was a useful RAM size. Disks might then be 300 MB or even larger, which made virtual memory of 200-400 MB possible, from the mid-80s onwards.
> That must have had an impact on optimizing the garbage collector.
That is one of the topics of the linked paper... ;-) We are not only talking about 'the' garbage collector, but the whole memory management system, which provided static/manual memory management and various forms of GCs.
There are all kinds of bells & whistles you can add on top of the general approach, mostly time-space tradeoffs, such as with a copying collector. But the overall time-space complexity generally remains the same. Sometimes it can get worse, like when you add concepts like ephemerons.
Well, defining the Java GC as tracing is underspecifying it a bit. Virtually all GCs are non-refcounting, AFAIK mostly because touching refcounts all the time destroys memory performance. A tracing GC can do that work in a batched and parallel fashion.
And then you realize that they are more or less the same in the end:
That paper discusses a reference counting GC which also has a trace phase in order to reclaim cycles.
"Reference counting must perform extra graph iterations in
order to complete collection of cyclic data. This is
typically viewed as a major drawback of reference counting"
What that paper really shows is the equivalence of any kind of tracing collector.
In practice, when people refer to reference counting it's usually implied that there's no trace phase and that the GC cannot reclaim cycles. Perl, Swift, etc. are the typical examples. Python is the counterexample, because it does both reference counting and tracing.
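CPython shows both mechanisms side by side: acyclic garbage dies the moment its refcount hits zero, while cycles survive until the tracing pass in the `gc` module finds them. A small sketch:

```python
import gc

class Node:
    def __init__(self):
        self.ref = None

# Acyclic object: reclaimed immediately by reference counting
# when its refcount drops to zero.
a = Node()
del a

# Cyclic pair: each keeps the other's refcount above zero,
# so only the cycle-detecting tracing pass can reclaim them.
b, c = Node(), Node()
b.ref, c.ref = c, b
del b, c

unreachable = gc.collect()  # tracing pass finds the orphaned cycle
print(unreachable > 0)      # the cycle was found and freed
```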