
Here's the best tool for finding usability issues: https://aistudio.google.com/live

You share the screen with Gemini, and tell it (using your voice) what you are trying to do. Gemini will look at your UI and try to figure out how to accomplish the task, then tell you (using its voice) what to click.

If Gemini can't figure it out, you have usability issues. Now you know what to fix!




I'll have to use this, thanks for sharing. Isn't it problematic since Gemini isn't representative of a real user, though?


Definitely a huge trap to replace real user insights with anything else.

But this looks like a nice level 0 of testing.


A real user might be worse. A program is less flexible (maybe) and more consistent (definitely) than a meatspace carbon-based lifeform.

The goal is not realism but a kind of ready-made "you must be this tall to ride the rollercoaster" threshold.

Discovering edge cases with dodgy human users has its value, but that's a different value.


A real user will be worse … but that’s kinda the point.

The most valuable thing you learn in usability research is not whether your experience works, but how it'll be misinterpreted, abused, and bent to do things it wasn't designed to do.


Enter "Drunk User Testing". Host a happy hour event and give some buzzed users some scenarios to test.

https://www.newyorker.com/magazine/2018/04/30/an-open-bar-fo...

https://uxpamagazine.org/boozeability/


More consistent? That's not a given with LLMs unless you set the temperature to 0.
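For what it's worth, here's a minimal sketch of pinning temperature to 0 (assuming the google-generativeai Python SDK; the model name, API key, and prompt are placeholders):

    import google.generativeai as genai  # assumed SDK: pip install google-generativeai

    genai.configure(api_key="YOUR_API_KEY")  # placeholder key
    model = genai.GenerativeModel(
        "gemini-1.5-flash",  # placeholder model name
        generation_config=genai.GenerationConfig(temperature=0),
    )

    # Same prompt, repeated runs: with temperature=0 the outputs should be
    # much more consistent across runs.
    for _ in range(3):
        print(model.generate_content("Where do I click to export the report?").text)

Even then, "more consistent" is the honest claim rather than "identical": hosted endpoints can still vary slightly run to run.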


You are right. LLMs are totally random and useless.

Thanks for playing.


You seem to disagree. Here's an interesting study: researchers used an OpenAI-LLM-based tool to grade student papers, and grading the same papers 10 times in a row produced vastly different results:

https://rainermuehlhoff.de/en/fobizz-AI-grading-assistant-te...

Quote: "The results reveal significant shortcomings: The tool’s numerical grades and qualitative feedback are often random and do not improve even when its suggestions are incorporated."


Very clever. Reminds me of using Alexa to test your pronunciation of foreign words. If Alexa has no idea, you probably said it wrong.



