Hacker News | fbrchps's comments

I'm curious how often you find factual inaccuracies in the LLM responses when doing that.

I've found that more often than not, it gets at least one key feature/option/etc. outright wrong whenever I've tried that, making it effectively useless for me. Since I need to verify the exact information myself anyway, I'm 90% of the way to just having the different items I'm comparing up in side-by-side browser tabs.


I usually use it for sub-$10 things (mostly groceries), and I'm actively grocery shopping when I do it. Say I'm standing in the store trying to decide between 8 different yogurts, all of which have different sales going on and different serving sizes. Instead of having to flip over and read all 8 brands and do the math to equalize everything myself, I take a photo of the shelf and ask Gemini which one has the most protein per dollar. It's usually pretty accurate; I'm doing the math in my head to check, but it's a time saver not to have to fuss every single time there's a sale.

It's not just yogurt, it's lots of things, like debating chicken vs. beef meatballs, or which breakfast cereal is closest to the current favorite because I don't want to go to an extra store when this one doesn't have it. When I first got Claude I was determined to save $200, having spent the $200 on Claude, and I would say it did assist in grocery shopping enough to make it worthwhile. It also helps keep a running memory for me of the prices of certain things. Did you know the price of Bonne Maman chocolate hazelnut spread has been fluctuating by $4? It goes from $4.50 to $8.50. I take photos of the egg section and ask which carton has the best hen-care-to-price ratio this week.

Probably the biggest oops I've had doing this was asking it how to replace my bicycle freewheel. It told me to watch a video and order the part, only for me to discover I'm not strong enough by far, and the real solution was to pay the guy at the bike shop 10 dollars to do it with his giant vise grip + special freewheel tool. I also had to pay bike shop guy an extra 10 dollars to fix my own attempt, which almost ruined the whole wheel.
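For anyone who wants to double-check the model's answer, the protein-per-dollar comparison is simple arithmetic. Here is a minimal sketch; the `Item` class, the function names, and all the shelf numbers are made up for illustration and are not real prices or nutrition labels.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    price: float                # shelf price in dollars, after any sale (hypothetical)
    servings: float             # servings per container
    protein_per_serving: float  # grams of protein per serving


def protein_per_dollar(item: Item) -> float:
    """Total grams of protein in the container divided by its price."""
    return item.servings * item.protein_per_serving / item.price


def best_value(items: list[Item]) -> Item:
    """Return the item with the most protein per dollar."""
    return max(items, key=protein_per_dollar)


# Made-up example shelf, not real products.
shelf = [
    Item("Yogurt A", price=5.00, servings=4, protein_per_serving=15),
    Item("Yogurt B", price=4.00, servings=5, protein_per_serving=10),
    Item("Yogurt C", price=6.00, servings=6, protein_per_serving=9),
]

if __name__ == "__main__":
    for item in shelf:
        print(f"{item.name}: {protein_per_dollar(item):.1f} g per dollar")
    print("Best value:", best_value(shelf).name)
```

Normalizing to grams per dollar is what lets different container sizes and sale prices be compared directly.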


I guess the Bonne Maman chocolate hazelnut spread trade is over then. Cat's outta the bag.


That was something we discussed at my workplace.

Our prior hosting provider was a little-known company with a decent enough track record, but because they employed humans, stuff would break. When it did break, the C-suite would panic about how much revenue was lost, etc.

The number of outages was "reasonable" to anyone who understood the technical side, but non-technical people would complain for weeks after an outage about how we're always down, "well BigServiceX doesn't break ever, why do we?", and again lost revenue.

Now on Azure/Cloudflare, we go down when everyone else does, but the C-suite goes "oh it's not just us, and it's out of our control? Okay let us know when it fixes itself."

A great lesson in optics and perception, for our junior team members.


The first 92% and the last 92%, exactly.


Unfortunately, Top Gun 2 was not "for entertainment's sake"; it was another round of US military advertising/propaganda, just like the first one.


If it wasn't sufficiently entertaining, it would be ineffective as propaganda.


I'm also getting the error on Android, latest Chrome.


Latest Firefox on Android does seem to work, oddly enough. How the turntables...


This format is unreadable on mobile, it keeps opening up my keyboard and scrolling up a bit when it does.

I understand and appreciate the "why" of the format, but this also could have been a non-editable "editor-like" presentation and achieved the same result.


This site's mobile experience mirrors my feelings when having to edit deeply nested and templated YAML.


Is editing deeply nested JSON and XML a better experience than YAML? I don't think so.


I’d argue yes, strictly due to the lack of significant whitespace.


That's a really odd reason to prefer something used as a configuration language where readability is important.


From the bottom of the article:

> # ps. By design, this website is as usable as YAML.

It's intentionally bad.


And from their own testimonials section:

> The good news (I realised) was that you can select all the text of the site, and then delete it. Problem solved.



This site is beautiful in it's own way.


Its. It's is a contraction of it+is


The problem with any tool like this is that people are often _terrible_ at knowing how to clearly explain what it is they want/what their actual issue is.

Key example: "login is broken!" Could be the captcha didn't load, captcha was blocked by their ad blocker, they are rate limited, they used the wrong email, they used the wrong password, they don't have an account, they aren't on the right website, etc.


That's not what my tool is for. A ticket system would be better for such cases. My tool is more for requesting features, such as “Please add PayPal as a payment method” or “Please implement website widgets, I need them urgently”.


Customers/users are not good at explaining those things either.

I've found the most success with collecting these requests internally, watching what bubbles up and then refining that into a solid use case - then present that to existing clients to vote/weight.


> a browser-based project that claimed to be undetectable

For now


That's just part of the game. Sometimes you're ahead, sometimes you're behind, but there's never a decisive winner.


That's not turnstile, that's a Managed Challenge.

Turnstile is the in-page captcha option, which, you're right, does affect page load. But they force a defer on the loading of that JS as best they can.

Also, Turnstile is a Proof of Work check, and is meant to slow down & verify would-be attack vectors. Turnstile should only be used on things like login, email change, "place order", etc.


Managed challenges actually come from the same "challenges" platform, which includes Turnstile; the only difference being that Turnstile is something that you can embed yourself on a webpage, and managed challenge is Cloudflare serving the same "challenge" on an interstitial web page.

Also, Turnstile is definitely not a simple proof of work check; it performs browser fingerprinting and checks for web APIs. You can easily check this by changing your browser's user-agent at the header level and leaving it as-is at the JavaScript level; this puts Turnstile into an infinite loop.
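One way a fingerprinting check could notice a spoofed user-agent is by comparing the value sent in the HTTP `User-Agent` header with the value the browser reports to scripts (`navigator.userAgent`). The sketch below only illustrates that general idea. The function name and the sample strings are hypothetical, and this is not Cloudflare's implementation, which isn't public.

```python
def user_agent_is_consistent(header_ua: str, script_ua: str) -> bool:
    """Return True when the header value and the script-visible value agree.

    header_ua: value of the HTTP User-Agent request header.
    script_ua: value the page's JavaScript reported (navigator.userAgent).
    Both parameters are hypothetical inputs for this illustration.
    """
    return header_ua.strip() == script_ua.strip()


# Made-up sample strings, shortened for readability.
real_ua = "Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0"
spoofed_ua = "Mozilla/5.0 (Windows NT 10.0) Chrome/126.0"

if __name__ == "__main__":
    # Header and script agree: nothing suspicious from this signal alone.
    print(user_agent_is_consistent(real_ua, real_ua))
    # Header was overridden but the script-visible value was left alone.
    print(user_agent_is_consistent(spoofed_ua, real_ua))
```

A real system would presumably combine many signals like this rather than rely on a single string comparison.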


Or the much more sensible, and MSFT, way of handling it (in Outlook):

ExternalUser: Hello here is a calendar invite I would like you to attend, please confirm or deny

User: Thank you, now I can verify the request and choose to add this to my calendar or not

