I like the characterization that Andrej Bauer uses: "while relates to goto, as effect handlers to shift/reset" :-)
That is, you indeed need delimited continuations to implement effect handlers, but they have more structure than, say, shift/reset. In particular, instead of being able to just yield (shift) with any function to an innermost context (reset), you can only perform a certain set of operations that are handled by a function defined in the handler.
This gives various advantages, in particular,
- you can give simple types to effect handlers (while shift/reset, for example, needs an answer-type system), and you can use fine-grained effect typing (instead of just a "shift" effect, you have "<exn,console>" effects)
- you can do better reasoning as the set of operations is restricted
- and (as a consequence) you have more opportunities to optimize them. In particular, tail-resumptive operations can be implemented to execute "in place" without capturing the continuation at all.
- finally, different effect handlers are compositional and can be combined freely (unlike shift/reset: shift always shifts to the innermost reset, so we cannot compose arbitrarily with multiple resets; see the sketch below).
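To make that last point concrete, here is a minimal Racket sketch (using `racket/control`) of the composition problem: shift always targets the nearest enclosing reset, so two conceptually independent delimiters interfere.

    (require racket/control)

    ;; Say the outer reset is for "state" and the inner one for
    ;; "exceptions". The shift cannot say which one it means:
    (reset                   ; outer delimiter
      (+ 1 (reset            ; inner delimiter
             (shift k 42)))) ; always captured by the *inner* reset,
                             ; even if we intended the outer one
    ;; => 43

(Racket works around this with tagged prompts, e.g. `reset-at`/`shift-at` from `racket/control`, which is morally the same move effect handlers make: naming which delimiter you mean.)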
"a bunch" is actually 2. What looks like a second nay vote is actually a rebuttal to the first. It seems most people just skipped the voting and went straight to the discussion.
I think an even more interesting perspective is that they aren't the same thing, but they can be implemented in terms of each other.
"Implemented in terms of each other" is an important metric, and in math that's not an unreasonable definition of "the same thing after all", but in engineering I would say it's not. If nothing else, the different affordances offered by each matter quite a bit in practice. Even in math that's still true, honestly, although in a different way.
> I think an even more interesting perspective is that they aren't the same thing, but they can be implemented in terms of each other.
I don’t know, I think that it’s perfectly reasonable even from an engineering/computer science perspective to say that they are literally the same thing, just viewed from two different perspectives.
After all, what is an object? It’s some code, with bundled data. What is a closure? It’s some code… with some bundled data. I think it’s fair to say that this is a “is it a vase or two faces?” situation. You’re looking at the same image, just seeing it from different perspectives.
I don't know where you're getting that from. In lambda calculus (where closures were first developed) there are no restrictions against mutation of captured variables or values.
Only in pure lambda calculus. But that's not because of closures. That's for all variables everywhere; captured, free, whatever. So your original statement — "usually the captured values in a closure can't be mutated" — still has no basis even if you insist "But, pure λ-calculus has no mutations". It's not because of closures that the captured variables became immutable.
When used in any of many lambda calculi that do have mutations, closures support mutation just the same as any variable. Whether a variable is free or captured, that doesn't change if or how it can be mutated.
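For instance, in Racket (it could just as well be any Scheme with set!), a closure happily mutates its captured variable, and at the same time doubles as the "object" from the vase/faces analogy above:

    ;; A counter "object" built from nothing but a closure; note that
    ;; the captured variable n is mutated with set! just fine.
    (define (make-counter)
      (let ([n 0])                  ; bundled data
        (lambda (msg)               ; bundled code, dispatching on a message
          (case msg
            [(inc) (set! n (add1 n)) n]
            [(get) n]))))

    (define c (make-counter))
    (c 'inc) ; => 1
    (c 'inc) ; => 2
    (c 'get) ; => 2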
Very pedantically, anything that occupies memory and has an address is an object in C and C++. If you mean in the OO way, I wouldn't call a lambda an object, as it doesn't even do dynamic dispatch.
I think this is really interesting. I'm in the same boat, and it's easy to dismiss the delimited people as deluded, but a lot of experienced people think shift/reset is _a lot_ easier to work with.
I can understand the basic case... but it's totally unclear to me how to compose them.
This is kind of a fundamental issue in language development... apparently we can get used to all sorts of terrible ideas, which makes evaluating them quite difficult.
(reset val) => val
(reset E[(shift k expr)]) => (reset ((lambda (k) expr)
                                     (lambda (v) (reset E[v]))))
; where E has no reset
E is the current continuation of the expression, and the comment is an invariant that must hold.
And I think call/cc is this:
E[(call/cc k expr)] => ((lambda (k) expr) (lambda (v) E[v]))
So imagine a stack like this:
g
h
call/cc my-fun
this captures the whole stack.
And now imagine:
g
reset
h
shift my-fun
this only captures h (see how the reduction rule moves E and wraps it in reset). So if I store k from shift and call it at a later time, it won't reach g; doing the same with call/cc, it would reach g.
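A runnable Racket sketch of exactly this (using `racket/control`), with `(+ 1000 _)` playing g and `(+ 1 _)` playing h:

    (require racket/control)

    (define saved-k #f)

    (+ 1000                  ; g: outside the delimiter
       (reset
         (+ 1                ; h: inside the delimiter
            (shift k (begin (set! saved-k k) 0)))))
    ;; => 1000  (the shift body returned 0 straight to the reset)

    (saved-k 5)
    ;; => 6     (only h runs again: 1 + 5; g is not re-entered)

With call/cc instead, invoking the saved continuation would re-enter the `(+ 1000 _)` frame as well, aborting whatever was running at the time.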
Thanks I think I did get that much, but your example still helps a bit.
Some questions:
- how (both formally and from an implementation point of view) does shift find the corresponding reset? Does it rely on some sort of dynamic scoping, or does the reset need to be lexically visible from the shift call? It doesn't help that in all the examples shift and reset are always called in the same expression (with shift in a lambda).
- Related to the above, how does typing work? Is the captured delimited continuation typed (i.e. does it take parameters of a specific type and return typed values), and if so, does the typing of shift and reset need to match? (Racket is dynamically typed, so this is not an issue there, of course.)
I have no formal language background, I just implement coroutines and (one shot) continuations for fun (in C++), so the key to my understanding is from an implementation point of view.
edit: from a comment elsethread it seems that I should look into effect systems.
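edit 2: a quick Racket experiment suggests the answer to the first question: the delimiter is found dynamically, not lexically. A minimal sketch (the helper name is made up):

    (require racket/control)

    ;; No reset is lexically visible from this shift:
    (define (yield-add-one)
      (shift k (k (k 0))))

    (reset (+ 1 (yield-add-one)))
    ;; => 2  -- the shift found the reset in its *dynamic* extent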
I looked up the Racket docs to understand shift/reset, and stumbled upon a litany of control structures. One is prompt/control, which has almost the same reduction rule:
(prompt E[(control k expr)]) => (prompt ((lambda (k) expr)
                                         (lambda (v) E[v])))
The missing inner reset is the difference. On simple examples they coincide, but they are not actually interchangeable: control does not reinstall the delimiter when the captured continuation is invoked, which matters as soon as the continuation's body captures again (see the sketch below).
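A classic pair of programs that tells them apart (runnable with `racket/control`):

    (require racket/control)

    (reset (let ([x (shift f (cons 'a (f '())))])
             (shift g x)))
    ;; => '(a)  -- f reinstalls the delimiter, so the inner shift
    ;;            is caught inside the call (f '())

    (prompt (let ([x (control f (cons 'a (f '())))])
              (control g x)))
    ;; => '()   -- f does not reinstall the delimiter, so the inner
    ;;            control escapes past (cons 'a _) to the outer prompt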
You can basically think of call/cc as subroutine + return/throw (break out of control structures), and delimited continuations as yield + next/send in Python.
It’s very likely specified as an acyclic graph (because, without checking, I see no reason why it would not be). So if you made a cyclic graph it would be wrong. It’s possible, likely even, that a lot of git’s implementation and associated tools don’t check for that, because it’s very hard[1] to achieve, but that just means stuff would break in more non-obvious ways.
[1] Constructing a cyclic git graph means finding a hash collision, and git uses SHA1 which is known to have weaknesses. With another hash algorithm it might be “practically impossible” instead of “very hard”.
You can use the "grafting" feature to emulate rewriting commit parents (but not true rewrite--as in the object itself is not actually modified and some tools do not respect it), so maybe give it a shot
Another variant is "law of excluded middle" continuations. It gives you a function, but when you call that then it will go back and give you the value passed instead. You can implement this in terms of call/cc and vice-versa.
You can implement delimited continuations with regular call/cc and a mutable cell. Of course, like all abstractions based on raw call/cc, it's not modular: you can't do that and use the result inside, for example, a multitasking scheduler that itself uses raw call/cc as well.
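A sketch of that construction in Racket, in the style of Filinski's "Representing Monads"; the single global cell `meta-k` is exactly what makes it non-modular:

    ;; One global cell holding "what to do when the current reset's
    ;; body finishes". Any other library that grabs raw continuations
    ;; can clobber or bypass it -- hence the modularity problem.
    (define meta-k #f)

    (define (run-reset thunk)
      (call/cc
       (lambda (outer)
         (let ([saved meta-k])
           (set! meta-k (lambda (v)
                          (set! meta-k saved) ; restore, then escape
                          (outer v)))
           (meta-k (thunk))))))

    (define (run-shift f)
      (call/cc
       (lambda (inner)
         (meta-k (f (lambda (v)
                      ;; reinstall a delimiter around the resumed context
                      (run-reset (lambda () (inner v)))))))))

    (run-reset (lambda () (+ 1 (run-shift (lambda (k) (k (k 0)))))))
    ;; => 2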
call/cc, a widget from Scheme / Lisp, is a glorified come_here statement (where "come_here" is roughly the exact opposite of a "goto" statement). It's gloriously complicated, but its applications are seemingly endless.
call/cc calls function "f" with a "come_here" label. Later, if you call the label, you return to the "come_here" point. call/cc saves off the stack-frame, so the "come_here" executes exactly as before (all local variables remain the same as they were before).
------
In C++ (call/cc doesn't exist, but we can imagine it in C++ lambdas)
int f(auto come_here) {
    come_here(25); // call the "come_here" function with the parameter 25;
                   // control jumps back to the call/cc site and never returns here
}
...
int x;
x = call/cc(f);
// x now equals 25, because f called come_here with 25.
We can imagine the "come_here" function to be the exact line of code "x = call/cc". Now funny things can happen if you do ... funny things.
int g(auto come_here) {
    some_global_variable = come_here; // Yeah, let's store this come_here label off for later.
    return 0; // Notice, come_here hasn't been called yet. Return 0 normally.
}
int x;
x = call/cc(g); // x defaults to 0, from the "return 0" in g.
if (x < 100) {
    some_global_variable(x + 1); // Loop back to the x = call/cc line, and set x to x+1.
}
Look Ma, I created an overly complicated while loop!
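For reference, here is the same loop in real Scheme/Racket, where call/cc actually exists (a sketch; `come-here` and `counting-loop` are made-up names):

    (define come-here #f)

    (define (counting-loop)
      (let ([x (call/cc (lambda (k)
                          (set! come-here k) ; store the label for later
                          0))])              ; first time through: "return 0"
        (printf "x = ~a\n" x)
        (when (< x 3)
          (come-here (+ x 1))) ; jump back: x is rebound to x+1
        'done))

    (counting-loop) ; prints x = 0, 1, 2, 3 and returns 'done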
---------
EDIT: Now for fun, try _writing_ try, catch, finally statements (from Java). By saving off the right labels in the right order, you can experiment with try/catch, or other complicated maneuvers (ex: the "Ambiguous" operator, often named "amb").
This call/cc exists because Scheme positions itself as a meta-language: a language to build other languages out of.
You don't need the global variable (at least, the Scheme equivalent does not - I'm not going to lawyer over how variable capture would work in a hypothetical dialect of C++ that added call/cc). Any variable in scope to be captured by 'g' will do. There may be some misunderstanding in your statement "all local variables remain the same as they were before" - the same bindings will exist, but the values are not restored to their values prior to the call/cc, such that any subsequent changes to those variables will be preserved, permitting exactly the sort of thing you used as an example (albeit using a global variable).
You're right: there's lots of little mistakes I have in that post. And I think the variable binding vs variable-values thing is a major mistake I left in there... and I appreciate the corrections.
Hopefully my relatively low-effort translation is useful to someone out there. But yes, I do admit that I "err on the side of incorrect", trying to favor simplicity in the explanation instead.
Interestingly it is relatively straightforward to implement one shot continuations in C++ (i.e. call/cc1), but your example implicitly relies on being able to invoke a continuation multiple times. You can sort of implement multishot continuations with fork (more than just a curiosity, that's how reversible debuggers work in fact).
There are two important ideas to pick up here, I think. First, functional programming prefers a reduction-based model of computation: the whole program text represents the current state of the program, and a "step" of computation corresponds to simplifying ("reducing") the expression in a well-defined way. Second, `call/cc` adds a novel reduction rule that behaves a bit like function application, but looks a bit backwards. (`shift` and `reset` then generalize the basic idea, but use the same machinery.)
In detail: Consider a program `fib 2`, which simply computes the third Fibonacci number and then exits. We can define `fib` like so (using Haskell-y syntax, which I find a little lighter for this purpose):
fib n = fib' n 0 1
fib' 0 a b = b
fib' n a b = fib' (n - 1) b (a + b)
If we run this program, what do the individual steps look like?
print (fib 2)
= print (fib' 2 0 1) -- Substitute the definition of fib
= print (fib' (2 - 1) 1 (0 + 1)) -- Substitute the second clause of fib'
= print (fib' 1 1 1) -- Simplify - and + (technically, this is two steps)
= print (fib' (1 - 1) 1 (1 + 1))
= print (fib' 0 1 2)
= print 2 -- Substitute the first clause of fib'
= ()
This isn't just an analogy; this is exactly how an unoptimized interpreter could fully implement a simple functional language. There's no need for an actual call stack when evaluating this program: the program text itself describes "what to do next" (which is the point of a call stack).
Now, what does `call/cc` do? Well, remember what function application does: we take something like `(\x. FOO) A` and substitute `A` into `FOO` wherever `x` appears, no matter how deep into the program it is (unless it's shadowed, i.e. not the same `x`). We could write this as `apply (\x. FOO) A` if we wanted, although by syntax rules we know that the first value in a call is the function to call. (The capitalized names could be entire other expressions, whereas `x` is just a variable name.)
Similarly, if the whole program looks like `A (call/cc \x. FOO)`, we substitute `\y. A y` into `FOO` wherever `x` appears (unless it's shadowed). It's a weird kind of function call, where the function is nested inside `call/cc` like normal, but the argument is everything outside `call/cc`. And since `A _` has got a value-shaped hole where the `call/cc` was, we know that if `FOO` is going to use it, it needs to provide it a value -- hence why we substitute `\y. A y`, not just `A`. (So we have to fill the hole with the new variable name `y`.)
print (call/cc (\x. x 42))
= (\x. x 42) (\y. print y) -- `\y. print y` is equivalent to `print`,
= (\y. print y) 42 -- so I'll elide this expansion next time.
= print 42
This example seems kind of useless -- call/cc did nothing! But what if we had more values to return than just `42` -- what if we had a whole bunch of things, and we wanted to run the same processing for all of them?
print (call/cc (\x. x 42 >> x 101 >> x 0))
= (\x. x 42 >> x 101 >> x 0) print
= print 42 >> print 101 >> print 0
(`>>` is just a Haskell operator for sequencing, like `;` in C or Java.)
Now we've replicated the rest of our program based on each of the three values returned by the inner function. It's like we've bifurcated (trifurcated?) our program into multiple independent programs that continue from where the original program left off. (You might know that `fork` in Linux returns a different value depending on whether you're in the original process or in the spawned child process. This is very similar!)
But what if we don't want to do everything in our program multiple times? What if we just want to do some of it for each value, but join everything back together at the end for some final processing? Well, `shift` and `reset` let you do that. The `shift` operator slots in where `call/cc` was, but you now get to choose how much of the rest of the program to replicate by placing a `reset` somewhere else in the program. You can wrap the whole program in `reset` to get the same behavior as `call/cc`, or you can scope it to a smaller portion:
(reset (print (shift (\x. x 42 >> x 101 >> x 0)))) >> print "Done"
= ((\x. x 42 >> x 101 >> x 0) print) >> print "Done"
= (print 42 >> print 101 >> print 0) >> print "Done"
(We definitely didn't want to print `Done` three times, right?)
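(The same program runs as-is in Racket, with `racket/control` and `displayln` standing in for `print`:)

    (require racket/control)

    (reset (displayln (shift k (begin (k 42) (k 101) (k 0)))))
    (displayln "Done")
    ;; prints 42, 101, 0, then Done exactly once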
None of the program's behavior needed call/cc or shift / reset. They're fundamentally rewriting rules that let us give such a program a different form. Sometimes the forms we can achieve with call/cc are more readable or provide better modularity. Sometimes they don't (so we shouldn't use them!).
Shift and reset let you write your own coroutine system like it was nothing, though, so it's certainly not without merit :)
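For instance, a tiny hand-rolled generator in Racket (just a sketch; Racket also ships a real `racket/generator` library):

    (require racket/control)

    (define (make-generator producer)
      (define resume
        (lambda (ignored)
          ;; Run the producer up to its next yield; if it finishes,
          ;; fall through to 'done.
          (reset (begin (producer (lambda (v)
                                    (shift k
                                      ;; Save "the rest of the producer"
                                      ;; and hand v to the caller.
                                      (begin (set! resume k) v))))
                        'done))))
      (lambda () (resume #f)))

    (define g (make-generator (lambda (yield)
                                (yield 1)
                                (yield 2))))
    (g) ; => 1
    (g) ; => 2
    (g) ; => 'done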
One thing that catches me all the time is that when writing examples in a functional language, the continuation of call/cc is on the left of the call to call/cc.
Rewriting your example as imperative statements might be more understandable to non-functional[1] programmers like me:
x = call/cc ...
print x ;; <- this is the continuation
Of course this does not work well with your nice reduction based explanation, but can be made to work if you go for a CPS transformation based explanation.
For sure! Like I said, this is very similar to fork(), and you've basically called out why :)
pid = fork();
if (pid != 0) {
printf("parent!");
} else {
printf("child!");
}
// Displays, in principle, though in any order:
// parent!
// child!
`fork` captures "the rest of the program" -- the part after the `;` on `fork()`, although confusingly including the `which = _` -- and causes it to be run twice, each with a different return value (which really means "passed to the continuation").
> the continuation of call/cc is on the left of the call to call/cc.
Kiiind of. It's really around the call to call/cc -- the "101" and "5" in this example are to the right, while the "+" is to the left:
(+ (call/cc (\k. k 42)) 101 5)
= (\k. k 42) (\x. (+ x 101 5))
= (\x. (+ x 101 5)) 42
= (+ 42 101 5)
In terms of a syntax tree, the continuation is above the call to call/cc. The left/right distinction only really shows up for super-simple examples. But, conversely, the argument of a function application is always below the application node, so there's a very reliable duality here.
(Another, slightly twisted perspective is that a syntax tree is, after all, a graph; and following the parent of a node leads to a subgraph just as well as following a child. The fact that this subgraph is "upside-down" is just an extra detail ;), but you can use this subgraph during reduction just like you would a child subgraph. In this way, call/cc is literally just function application, but applying the parent subgraph instead of a normal child.)
((There's a third, incredibly boring member of this family... we can pass one child to another with `apply`, and pass the context to a child with `call/cc`. What lets us pass a child to the context? `id`, or the no-op operator!))
Certainly adding call/cc doesn't allow you to represent more computable functions by programs. But there are plenty of internal features of programs that we're interested in, that let us meaningfully distinguish amongst programs that otherwise have the exact same observable behavior.
I like coroutines as an example here, because I have very direct experience writing programs with similar behavior in systems that either do or do not have coroutines. By far, I found the coroutine-based programs far easier to understand and extend.
Matthias Felleisen wrote the seminal paper on comparing the expressive power of Turing-complete languages ("On the Expressive Power of Programming Languages"). It's worth a read if it hasn't come across your radar before!
If you are interested in this, Ningning Xie and I give a tutorial about effect handlers (and more) at ICFP'21 tomorrow (Thursday 12:30 EST): <https://icfp21.sigplan.org/details/icfp-2021-tutorials/5/Pro...>
[1] <https://koka-lang.github.io/koka/doc/book.html>