Yes - impressive how good the small models are getting, and this "reasoning distillation" seems to have given them a significant boost.
Even though humor is largely about the unanticipated punchline, I'd have guessed (maybe wrongly) that there'd be enough analytical discussion of humor in the training set for a reasoning model to come up with a much more plausible attempt at a formulaic type of joke.
From the example given, it seems there's too much "thought" put into "what do I have to work with here", and not enough into conceiving/selecting a template for the joke. Maybe part of the problem is that the LLM doesn't realize that, being an LLM, its best chance of being funny to a human is to stick closely to a formula that humans find funny, rather than getting too clever trying to deconstruct it.