I guess it seems like you chose this question somewhat by chance, but for new questions of a similar type, do you now do them yourself (or have a coworker try) to get at least an n=1 sample of how long they ought to take? I like to allow a 2x margin over my or a coworker's time, so anything that takes us over 30 minutes is right out, or needs to be simplified, when we only get an hour with the candidate. An experience with an intern at my last job who struggled with our large codebase (mostly in staying comfortable despite not being able to hold everything in working memory at once) led me to the same conclusion as you: being able to jump in and work effectively in a somewhat large existing codebase is an important signal, at least for our company's work.
I'm amused at the comments suggesting this is too easy (or your own suggestion that the redis one is), because I think if I tried this I would have filtered out almost everyone, at least within a single hour (minus time for soft resume questions to ease candidates in, and their questions at the end). Many programmers have never even used grep, and that's not necessarily something to hold against them, at least at my last job, where the "As hire Bs, Bs hire Cs, and Cs hire Ds" transition had long since occurred. I've made two attempts at the idea by crafting my own problem inside a little framework of its own. The latter attempt, which I used most, involves writing several lines of code (not even involving a loop) into an "INSERT CODE HERE" method body, in either Java or JavaScript, and even that was hard enough to filter some people while still producing a bit of score distribution among those who passed. Still, I think it's the right approach if you have a large or complicated existing codebase, and even in the confines of an hour it seems like a better use of time, approximating the gold-standard "work sample test", than asking a classic algorithm question.
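To make the shape of that concrete, here is a minimal sketch of what such an "INSERT CODE HERE" question might look like. The framework, names, and task are all invented for illustration; the point is that the surrounding code already exists, and the candidate only writes a few straight-line statements into the marked body.

```javascript
// Hypothetical mini-framework the candidate reads but does not write.
// The task and helper names are made up for this sketch.
function applyDiscount(price, fraction) {
  return price * (1 - fraction);
}

const TAX_RATE = 0.08;

// The candidate fills in this body: a few lines, no loop required.
// Something like the following would be a passing answer.
function finalPrice(basePrice, discountFraction) {
  // INSERT CODE HERE
  const discounted = applyDiscount(basePrice, discountFraction);
  const tax = discounted * TAX_RATE;
  return discounted + tax;
}

console.log(finalPrice(100, 0.25)); // → 81
```

Even something this small exercises reading unfamiliar code, finding the right helpers, and composing them correctly, which is exactly the signal being described.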
Yes definitely -- although it's striking how much being under pressure (or not) affects your ability to complete a question in X minutes. I've learned over time that good interviews take a lot of experimentation, and the best thing you can do is test and refine questions in real interviews before relying on them as a signal for a hiring call on a candidate.
I wouldn't necessarily consider these questions a substitute for an algorithms question, but rather a way of obtaining a different and important signal. An algorithms question may be a valuable signal too, depending on the nature of the role.