Regarding the rant at the end about papers being behind paywalls (which, BTW, I agree with):
If anyone doesn't know, CiteSeerX is a search engine which finds, collects, and caches publicly available papers from around the internet. It's invaluable for finding freely-available papers which the major sources (like the ACM) only make available from behind paywalls.
The nice thing about a Cheney-On-The-MTA style collector, in addition to allowing for relatively cheap first-class continuations, is that it makes the whole stack vs. heap allocation question moot.
Indeed, they use the stack as a fast bump allocator. That's really smart but probably fairly inefficient, since all the return addresses, stack frames and saved registers are just write-only junk.
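For anyone who hasn't seen the trick, here's a rough sketch of the control flow in C. It is not code from the paper: the names (`run_sum`, `sum_to`, `STACK_BUDGET`) and the 64 KiB budget are made up for illustration. A real implementation would run a Cheney-style copy of the live stack-allocated objects to the heap before the `longjmp`; here the only live state is two integers, so "evacuation" is just saving them in globals.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_BUDGET (64u * 1024u)  /* assumed limit in bytes, chosen arbitrarily */

static jmp_buf trampoline;            /* where we land when the stack is "full" */
static uintptr_t stack_base;          /* address near the bottom of our region */
static long long saved_n, saved_acc;  /* live state that survives a reset */
static long long result;
static int done;

/* Estimate stack usage from the address of a local variable. */
static int stack_exhausted(void) {
    char marker;
    uintptr_t here = (uintptr_t)&marker;
    uintptr_t used = here > stack_base ? here - stack_base : stack_base - here;
    return used > STACK_BUDGET;
}

/* CPS-style step: every exit is a call or a longjmp, never a return. */
static void sum_to(long long n, long long acc) {
    if (stack_exhausted()) {
        /* A real collector would copy live stack objects to the heap here. */
        saved_n = n;
        saved_acc = acc;
        longjmp(trampoline, 1);  /* discard every frame in one jump */
    }
    if (n == 0) {
        result = acc;
        done = 1;
        longjmp(trampoline, 1);
    }
    sum_to(n - 1, acc + n);  /* old frames below us are now dead weight */
}

/* Driver: establishes the trampoline and restarts after each reset. */
long long run_sum(long long n) {
    char base;
    assert(n >= 0);
    stack_base = (uintptr_t)&base;
    saved_n = n;
    saved_acc = 0;
    done = 0;
    setjmp(trampoline);
    if (!done)
        sum_to(saved_n, saved_acc);
    return result;
}
```

Calling `run_sum(10)` gives 55. If the compiler turns the recursive call into a jump, the stack never fills and the reset path simply never runs; the answer is the same either way.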
GCC at least has an annotation that can be used to tell the compiler that a function call never returns. I don't have the details, but I presume it causes the compiler to omit any save/restore code, though I would guess that the return address is still pushed (x86 call instead of jmp) so that you get decent stack traces. 16-byte stack alignment across calls means that the average cost of pushing the return pointer is 8 bytes. It's not free, but it's cheap.
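The annotation is spelled `__attribute__((noreturn))` in GCC and Clang (C11 also has `_Noreturn`). Here's a small sketch of how it might be applied to continuation-style functions. The function names are made up, and what the compiler actually emits for the calls depends on the target and optimization level, so check the disassembly rather than taking my word for it.

```c
#include <assert.h>
#include <setjmp.h>

#if defined(__GNUC__)
#define NORETURN __attribute__((noreturn))  /* GCC and Clang spelling */
#else
#define NORETURN                             /* fallback: no annotation */
#endif

static jmp_buf top;       /* the only place control ever "comes back" to */
static int answer;
static int answer_ready;

/* Final continuation: records the value and jumps out; it never returns. */
NORETURN static void deliver(int value) {
    answer = value;
    answer_ready = 1;
    longjmp(top, 1);
}

/* A CPS-style function: its last act is a call that never returns. */
NORETURN static void add_cps(int a, int b) {
    deliver(a + b);
}

/* Ordinary C entry point wrapping the non-returning world. */
int run_add(int a, int b) {
    answer_ready = 0;
    if (setjmp(top) == 0)
        add_cps(a, b);
    assert(answer_ready);
    return answer;
}
```

`run_add(2, 3)` returns 5.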
...But that's the part that allows for cheap first-class continuations, because in Scheme, those stack frames can be part of a data structure. Sort of. Continuations are confusing.
That's a problem when compiling via C (which admittedly is one of the major selling points of the technique). When targeting a lower-level intermediate language or even assembly, custom calling conventions are an option.
Here's CiteSeerX's page on the paper in question:
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.77.5...
I find it fantastically useful.