> In short, an interface should not be interactable until a few milliseconds after it has finished (re)rendering
I was a console game developer working on UI for many years so I am deeply familiar with the problem when a UI should be responsive to input while the visuals are changing and when it should not.
You might be surprised, but it turns out that blocking input for a while until the UI settles down is not what you want.
Yes, in cases where the UI is transitioning to an unfamiliar state, the input has a good chance to be useless or incorrect and would be better dropped on the floor. It's annoying when you think you're going to click X but the UI changes to stick Y under your finger instead.
However, there are times where you're tapping on a familiar app whose flow you know intimately and you know exactly where Y is about to appear and you want to tap on it as fast as you can. In those cases, it is absolutely infuriating if the app simply ignores your input and forces you to tap again.
I've watched users use software that temporarily disables input like this and what you see is that they either get trained to learn the input delay and time their tap as tightly as possible, or they just get annoyed and hammer inputs until one gets processed.
And, in practice, it turns out the times where a user is interacting with a familiar UI are 100x more common than the times where they misclick on an unfamiliar UI. So while the misclick case is super annoying, it's a better experience in aggregate if the app is as responsive as it can be, as quickly as it can be.
Perhaps there is a third way where an app makes a distinction between flows to static content versus dynamically generated content and only puts an input block in for the latter, but I expect the line between "static" and "dynamic" is too fuzzy. People certainly learn to rely on familiar auto-complete suggestions.
UI is hard. A box of silicon to a great ape is not an easy connection to engineer.
These are great points. But I would debate the 100x point a little. And I think there are some cases where ignoring fast taps is clearly preferable.
I’m specifically thinking about phone notifications that slide in from the top – ie, from an app other than the one you’re using.
So we have two options: ignore taps on these notification banners for ~200ms after the slide-down (risking a ‘failed tap’) or don’t (risking a ‘mis-tap’).
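To make the first option concrete, here is a minimal sketch of such a guard window. The `Banner` class, its field names, and the 200 ms default are all hypothetical, not any platform's actual API; timestamps are passed in explicitly so the behavior is easy to check.

```python
from dataclasses import dataclass


@dataclass
class Banner:
    """A notification banner that ignores taps shortly after it appears."""

    shown_at_ms: float       # when the slide-down finished (assumed known)
    grace_ms: float = 200.0  # hypothetical guard window; needs tuning per case

    def accepts_tap(self, tap_at_ms: float) -> bool:
        # A tap inside the guard window was probably aimed at whatever was
        # on screen before the banner arrived, so the banner ignores it.
        return tap_at_ms - self.shown_at_ms >= self.grace_ms


banner = Banner(shown_at_ms=1000.0)
print(banner.accepts_tap(1050.0))  # False: 50 ms after the slide-down
print(banner.accepts_tap(1250.0))  # True: 250 ms after the slide-down
```

Setting `grace_ms=0` gives the second option, where every tap is accepted immediately.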
I’d argue these are in different leagues of annoyingness, at least for notification banners, so their relative frequency difference is somewhat beside the point. A ‘failed tap’ is an annoying moment of friction - you have to wait and tap it again, which is jarring. Whereas a ‘mis-tap’ can sometimes force you to drop what you were doing and switch contexts - eg because you have now cleared the notification which would have served as a to-do, or because you’ve now marked someone’s message as read and risk appearing rude if you don’t reply immediately. Or sometimes even worse things than that.
So I would argue that even if it’s 100x less common, a mis-tap can be 1000x worse of an experience. (Take these numbers with a pinch of salt, obviously.)
Also, I’d argue a ‘failed tap’ in a power user workflow is not actually something that gets repeated that many times, as in those situations the user gets to learn (after a few jarring halts) to wait a beat before tapping.
All that said, this is all just theory, and if Apple actually implemented this for iOS notifications then it’s always possible I might change my view after trying it! In practice, I have added these post-rendering interactivity periods to UI elements myself a few times, and have found it always needs to be finely tuned to each case. UI is hard, as you say.
> So we have two options: ignore taps on these notification banners for ~200ms after the slide-down (risking a ‘failed tap’) or don’t (risking a ‘mis-tap’).
Yeah, notifications are an interesting corner case where by their nature you can probably assume a user isn't anticipating one and it might be worth ignoring input for a bit.
> Also, I’d argue a ‘failed tap’ in a power user workflow is not actually something that gets repeated that many times, as in those situations the user gets to learn (after a few jarring halts) to wait a beat before tapping.
You'd be surprised. Some users (and most software types are definitely in this camp) will learn the input delay and wait so they optimize their effort and minimize the number of taps.
But there are many other people on this planet who will just whale on the device until it does what they want. These are the same people who push every elevator and street crossing button twenty times.
I don't view notifications as a corner case. I think two factors are key:
1. Can the user predict the UI change? This is close to the static vs dynamic idea, except that what matters is not whether the UI changes but whether the user can learn to predict how it changes. If they can, processing the tap makes more sense, because it allows (power) users to be fast. You usually don't know that a notification is about to be displayed, so this doesn't apply there.
2. Is the action reversible? If a checkbox appears, undoing the misclick is trivial. Dismissing a potentially important notification with no history, deleting a file etc. should maybe block interactions for a moment to force the user to reconsider.
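As a rough sketch of how those two factors might combine into a single decision: the function name and the exact combination rule below are my assumption, not something spelled out above. In particular, a predictable but irreversible action such as deleting a file might also deserve a pause, which this sketch does not model.

```python
def should_delay_input(predictable: bool, reversible: bool) -> bool:
    """Return True if a freshly changed element should briefly ignore taps.

    Assumed rule: delay only when the change was unpredictable AND the
    action is hard to undo.
    """
    if predictable:
        # The user may be tapping ahead on purpose; stay responsive.
        return False
    # Unpredictable change: only guard actions that are hard to undo.
    return not reversible


# A checkbox that pops in unexpectedly: a misclick is trivially undone.
print(should_delay_input(predictable=False, reversible=True))   # False
# Dismissing a surprise notification that leaves no history.
print(should_delay_input(predictable=False, reversible=False))  # True
```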
Often even better is to offer undo (if possible). It lets you fast-track the happy path while still being able to recover from errors.
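A minimal sketch of the undo idea, assuming each action can be paired with a function that reverses it. `UndoStack` and the inbox example are hypothetical names for illustration only.

```python
from typing import Callable, List


class UndoStack:
    """Applies actions immediately and remembers how to reverse them."""

    def __init__(self) -> None:
        self._reverts: List[Callable[[], None]] = []

    def do(self, apply: Callable[[], None], revert: Callable[[], None]) -> None:
        apply()                       # no input delay on the happy path
        self._reverts.append(revert)  # keep the way back

    def undo(self) -> bool:
        """Reverse the most recent action; return False if there is none."""
        if not self._reverts:
            return False
        self._reverts.pop()()
        return True


inbox = ["Message from Alice"]
stack = UndoStack()

# A mis-tap dismisses the notification right away...
stack.do(
    apply=lambda: inbox.remove("Message from Alice"),
    revert=lambda: inbox.append("Message from Alice"),
)
# ...and one undo brings it back.
stack.undo()
print(inbox)  # ['Message from Alice']
```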
For that reason, it's wonderful when games provide a log of actions and/or recent dialogue, so you can easily see what you missed. That kind of functionality seems less common outside games.
I am 99% sure the NY Times Games app on Android is blocking input until fully rendered on its 'home' screen where all the games are listed, and it drives me nuts. I tap on the element I want and nothing happens, so I have to tap again. Maybe some kind of overlay or spinner would help signal that it's not accepting input? Arg.
I wonder if a good distinction is user initiated actions versus system initiated. If the user begins the action, the changes are immediate and buffered to the interface that appears next.
But when the system initiates it (eg. notifications, popups), then the prior interface remains active.
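Sketching that rule, under the assumption that the UI layer knows who triggered each change. The enum names and the 200 ms window are hypothetical; the window is borrowed from the figure floated earlier in the thread.

```python
from enum import Enum


class Initiator(Enum):
    USER = "user"
    SYSTEM = "system"


class Target(Enum):
    NEW = "new interface"
    PREVIOUS = "previous interface"


def route_tap(initiator: Initiator, ms_since_change: float,
              grace_ms: float = 200.0) -> Target:
    """Decide which interface a tap should be delivered to."""
    if initiator is Initiator.USER:
        # The user caused the change, so early taps go to what appears next.
        return Target.NEW
    if ms_since_change < grace_ms:
        # A system-initiated change (notification, popup) just landed:
        # treat the tap as aimed at what was there before.
        return Target.PREVIOUS
    return Target.NEW


print(route_tap(Initiator.USER, 20.0))     # Target.NEW
print(route_tap(Initiator.SYSTEM, 20.0))   # Target.PREVIOUS
print(route_tap(Initiator.SYSTEM, 500.0))  # Target.NEW
```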
This is not the only distinction, but it is one of them, and I think it is a good idea. Another distinction is whether the result of a user-initiated action is expected or unexpected, and that one is not always so clear.
Great point, and I suspected the problem might not be as easy as it appears at first glance. (Because of course it isn't...)
I also considered the case when you're rapidly scrolling through a page: if a naive approach simply made things non-interactable whenever they've recently moved, that would neuter re-scrolling until the scrolling halted, which is NOT what people want.
While your use case is valid, it's nowhere near as annoying to have to wait 1 second for a button to enable as it is to call a random person from your contacts because their name appeared under your fat finger. Maybe there can be a distinction between expected layout changes and ad-hoc elements appearing, like notifications, list updates etc. I would probably be going too far in asking for a setting of "time to enable after layout change".
> However, there are times where you're tapping on a familiar app whose flow you know intimately and you know exactly where Y is about to appear and you want to tap on it as fast as you can. In those cases, it is absolutely infuriating if the app simply ignores your input and forces you to tap again.
This is very true, but the app has to be explicitly designed around this e.g. by not injecting random UI elements that can affect the layout.
Unfortunately this seems to be regressing in modern app UX, and not just on mobile. For example, for a very long time, the taskbar in Windows was predictable in this sense because e.g. the Start button is always in the corner, followed by the apps that you've pinned always being in the same locations. And then Win11 comes and changes taskbar layout to be centered by default instead of left-adjusted - which means that, as new apps get launched and their icons added to taskbar, the existing icons shift around to keep the whole thing centered. Who thought this was a good idea? What metric are they using to measure how good their UX is?
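The shifting is easy to see with a little arithmetic. This is a simplified model with made-up pixel sizes, not how Windows actually lays out its taskbar: icons are equal width and either start at the left edge or are centered as a group.

```python
from typing import List


def icon_positions(count: int, icon_width: float, bar_width: float,
                   centered: bool) -> List[float]:
    """Return the x offset of each icon's left edge."""
    # Centered: the whole group is shifted so it sits in the middle.
    start = (bar_width - count * icon_width) / 2 if centered else 0
    return [start + i * icon_width for i in range(count)]


# Left-aligned: existing icons stay put when a fourth one is added.
print(icon_positions(3, 50, 1000, centered=False))  # [0, 50, 100]
print(icon_positions(4, 50, 1000, centered=False))  # [0, 50, 100, 150]

# Centered: every existing icon moves by half an icon width.
print(icon_positions(3, 50, 1000, centered=True))   # [425.0, 475.0, 525.0]
print(icon_positions(4, 50, 1000, centered=True))   # [400.0, 450.0, 500.0, 550.0]
```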
> Yes, in cases where the UI is transitioning to an unfamiliar state, the input has a good chance to be useless or incorrect and would be better dropped on the floor. It's annoying when you think you're going to click X but the UI changes to stick Y under your finger instead.
> However, there are times where you're tapping on a familiar app whose flow you know intimately and you know exactly where Y is about to appear and you want to tap on it as fast as you can. In those cases, it is absolutely infuriating if the app simply ignores your input and forces you to tap again.
I agree with both of these, though I think such a scheme works best with keyboard-oriented interfaces. It is still a good idea with a mouse or touch screen, even though there are fewer situations where you know what will come next. It can still matter there because of unexpected pop-ups from other programs, just as pop-ups that steal keyboard focus matter when you are using the keyboard. Since this can involve multiple programs on the same computer that do not necessarily know about each other, it cannot necessarily be solved from within a single program. (I think it will be another thing to consider in the UI of my operating system design.)
I've become obsessed with how Visual Studio Code or the Helix editor gives you a great big JSON settings/properties file for tweaking values. So much so that I despise other apps for their lack of "set-ability".
To the original author's point, the consternation arises when you as a programmer just know there is an animation time, or a delay time, etc. that is hardcoded into the app and you can't adjust the value. Not having that value exposed to the user is at least one major frustration, and exposing it could help OP.
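Something like the following is what I have in mind: a sketch of user-overridable timing values loaded from a JSON settings string. The key names and defaults are invented for illustration and do not come from any real app.

```python
import json

# Hypothetical setting names and defaults, in milliseconds.
DEFAULTS = {
    "ui.enableDelayAfterLayoutChangeMs": 200,
    "ui.animationDurationMs": 250,
}


def load_settings(user_json: str) -> dict:
    """Merge user overrides over the defaults, ignoring invalid entries."""
    settings = dict(DEFAULTS)
    overrides = json.loads(user_json)
    if not isinstance(overrides, dict):
        return settings
    for key, value in overrides.items():
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        # Only known keys with non-negative numeric values are accepted.
        if key in DEFAULTS and is_number and value >= 0:
            settings[key] = value
    return settings


# A power user who never wants input blocked sets the delay to zero.
settings = load_settings('{"ui.enableDelayAfterLayoutChangeMs": 0}')
print(settings["ui.enableDelayAfterLayoutChangeMs"])  # 0
print(settings["ui.animationDurationMs"])             # 250
```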