Where in the interview process was this question used? Were there other programming and/or system design questions before and after? Was this a senior-level candidate question?
We experimented quite a bit. Since it was relatively inexpensive to administer (almost anyone could ramp up to conducting it), we were able to test it in a variety of scenarios (even for intern candidates). It ended up being _after_ the initial phone screen, but early into the on-site. There were several other questions that tested other skills (system design, algorithms knowledge, reasoning about concurrency, etc.). Every engineering candidate did the question, regardless of seniority, at least during my tenure.
I recently got this question for an engineering manager position. I finished it in an hour and 20 minutes, with all of the presented test cases passing. But I don't think it's a particularly good question for senior-level candidates, since the implementation is mostly copying and pasting existing functions, renaming them, and adding a new memcached VERB.
I'd guess senior-level candidates should solve it without copying and pasting: perhaps first make sure test cases exist for the current behaviour before modifying anything, and then generalize the existing functions. Or they could explain why adding server-side functionality for every type of arithmetic operation would be unnecessary complexity, and instead add support for arbitrary atomic arithmetic operations with a transaction/UDF approach.
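
To make the "generalize the existing functions" idea concrete, here is a rough Python sketch. It is not memcached's code (which is C). `TinyStore`, `apply`, and `mult` are hypothetical names invented for illustration, and a single lock stands in for whatever atomicity the real server provides. The point is only that each arithmetic verb becomes a thin wrapper around one shared atomic read-modify-write helper instead of a pasted copy.

```python
import operator
import threading


class TinyStore:
    """Hypothetical in-memory stand-in for a memcached-like server."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def apply(self, key, fn, operand):
        """Atomically replace the value at `key` with fn(current, operand).

        Returns the new value, or None if the key does not exist.
        """
        with self._lock:
            if key not in self._data:
                return None
            new_value = fn(int(self._data[key]), operand)
            self._data[key] = new_value
            return new_value

    # Each verb is a one-line wrapper around the shared helper.
    def incr(self, key, delta):
        return self.apply(key, operator.add, delta)

    def decr(self, key, delta):
        # Mirrors memcached's documented behaviour of stopping at 0.
        return self.apply(key, lambda cur, d: max(cur - d, 0), delta)

    def mult(self, key, factor):
        # Assumed new verb, standing in for whatever the exercise asks for.
        return self.apply(key, operator.mul, factor)


store = TinyStore()
store.set("hits", 10)
store.incr("hits", 5)
store.mult("hits", 2)
```

Passing an arbitrary `fn` to `apply` is also roughly what the transaction/UDF alternative amounts to: the caller supplies the operation, and the server only guarantees that it runs atomically.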