I find that it consistently breaks around that exact range you specified. In the sense that reliability falls off a cliff, even though I've used it successfully close to the 1M token limit.
At 500k+ I will define a task and it will suddenly panic and go back to a previous task that we just fully completed.
Interesting that you're migrating assistants and threads to the responses API, I presumed you were killing them off.
I started my MVP product with assistants and migrated to responses pretty easily. I handle a few more things myself but other than that it's not really been difficult.
Yes - with our built-in provider, we provide all the models that OpenRouter provides but without OpenRouter's 5% markup. We provide them at cost (the AI provider cost).
Right, my workflow to get even a basic prompt working consistently rarely involves fewer than like 10 cycles of [run it 10 times -> update the prompt extensively to knock out problems in the first step]
And then every time I try to add something new to the prompt, all the prompting for previously existing behavior often needs to be updated as well to account for the new stuff, even if it's in a totally separate 'branch' of the prompt flow/logic.
I'd anticipate that each individual MCP I wanted to add would require a similar process to ensure reliability.
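For what it's worth, the "run it 10 times, then fix the most common problem" loop described above can be sketched roughly like this. This is only a minimal illustration: `run_prompt`, `check`, and `fake_model` are hypothetical names, and `fake_model` is a stand-in for whatever real model call you use.

```python
import random
from collections import Counter
from typing import Callable, Optional


def failure_tally(run_prompt: Callable[[str], str],
                  check: Callable[[str], Optional[str]],
                  prompt: str,
                  runs: int = 10) -> Counter:
    """Run `prompt` `runs` times and count each failure label from `check`."""
    failures: Counter = Counter()
    for _ in range(runs):
        output = run_prompt(prompt)
        problem = check(output)  # None means this output passed
        if problem is not None:
            failures[problem] += 1
    return failures


# Hypothetical stand-in for a real model call (seeded so runs are repeatable).
def fake_model(prompt: str, _rng=random.Random(0)) -> str:
    return _rng.choice(["OK: done", "done", "OK: done", "ERROR"])


# Hypothetical checker: returns a short label for the problem, or None.
def check(output: str) -> Optional[str]:
    if output == "ERROR":
        return "error output"
    if not output.startswith("OK:"):
        return "missing OK prefix"
    return None


tally = failure_tally(fake_model, check, "Summarize the ticket.", runs=10)
print(tally.most_common())  # most frequent failure first
```

Rerunning the same tally after every prompt change (including changes to an unrelated 'branch') is one way to catch the regressions mentioned above before they pile up.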
We don't know whether pushing towards AGI is marching towards a dystopia.
If it's winner takes all for the first company/nation to have AGI (presuming we can control it), then slowing down progress of any kind with regulation is a risk.
I don't think there's a good enough analogy to be made, like your nuclear power/weapons example.
The hypothetical benefits of an aligned AGI outweigh those of any other technology by orders of magnitude.
As with nuclear weapons, there is non-negligible probability of wiping out the human race. The companies developing AI have not solved the alignment problem, and OpenAI even dismantled what programs it had on it. They are not going to invest in it unless forced to.
We should not be racing ahead because China is, but investing energy in alignment research and international agreements.
I think I linked to the verification process on OpenAI's website in the documentation.
You need to verify your identity with a driver's license or passport with OpenAI to have access to certain things like chain of thought summaries in the API and image generation with the new model.
Nothing I can do there, you gotta verify unfortunately.
The main issue I notice is that the up/down arrows and other UI elements are too tiny to easily/accurately tap with a touchscreen. Thankfully touchscreens are pretty good at guessing which HTML element my fat fingers are touching, but it still leaves me a bit paranoid about e.g. whether I upvoted or downvoted.
In tiny, gray text, with words that look similar to those with imperfect eyesight. It could easily be improved without breaking the site's entire aesthetic, IMO.
I am acutely nearsighted so I can still read 2mm tall characters unaided and am well past the point when most humans require corrective lenses for reading.
It's sort of a super power right about now.
So it depends. I dislike large text on my phone. old.reddit.com works just fine for me as well.
> If you are looking for an app, I am the developer of HACK
A little off-topic, but I'm totally blind (use VoiceOver) and just tried the app on iOS. A few issues:
* Several buttons are unlabelled, so I'm not sure what they do.
* Navigating through posts isn't the most efficient (I need to swipe several times to move from post to post).
* The relative level of comments within a thread isn't announced (i.e. replies to OP are at level 1, replies to those replies are level 2, etc). Are these indented visually?
* Accessibility actions should be added to quickly vote, reply to comments, etc.
Happy to provide additional feedback/model apps that do this well, but I'm not an iOS developer.
Thanks for the feedback. It’s my fault and I apologize because I hadn’t taken accessibility into account during development. I will look into making the app accessible in updates.
The lack of a redesign I think is a big part why the community remains (if it ain't broken, don't fix it!), plus HN's heavy moderation -- towards these very guidelines.
HN is more heavily moderated than all the sites people complain about on here. If you browse the front page you’ll frequently find titles which make no sense at all because they’re stripped of the adjective which meant anything.
If you browse any of it chronologically with showdead you'll find tons of completely innocent stuff moderated out before it would ever have been visible, quite often on a first post for no apparent reason.
Unlike other communities where excess moderation is a concern (a frequent discussion here), you can’t object to a decision publicly. I’m knowingly flouting the abstract version of that fact because I think it objectively contributes to the discussion. But I have every reason to believe even this is going too far.
For sure it’s very carefully moderated, dang works very hard to keep conversations civil and away from flaming, especially around politics. “Heavy” has some negative connotations though. I think the moderation is appropriate.
Dang tends to be fair in his assessments. The first and last time he called me out he made it clear that I ran afoul of the terms and that I should go read them again if I had not. I stopped posting and went and read them. Like another poster said above, this is a place where people get chances, and when those are given, even multiple times to the same person, before getting "heavy", there is a high likelihood that people can turn around for the better and permanently. This is now the only place I come, due to it being laid out by dang in the way that he did and my active decision to take the advice and become a better poster. There are a lot of very intelligent people here and sometimes I am in awe of the breadth of knowledge that is shared in the comments. I agree that the moderation has been appropriate and effective at that. Things are changing for the positive in my life all around thanks to that reminder and my choice to be a better person because of it.
The moderation is quiet but very firm. For example, I can't post more than like 8 comments per day without getting blocked by rate-limit errors because dang has decided I get in too many flamewars.
I think it is like 5 comments in the same hour and then you have to observe for the next 3 hours (or something like that, I don’t know exactly).
I love this feature. I’m of a personality which really easily gets baited into flamewars (which is against the guidelines). Being forced to stop after 5 posts forces me into a cooldown period where I can reflect on my behavior. When the shutdown is over I will probably be demotivated enough not to continue this recklessness, and the probability of me being yelled at by dang goes down with it.
If I have something important to say, it can wait a couple of hours.
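As a rough illustration of the cooldown described above: the sketch below assumes the numbers guessed in this thread (5 comments within an hour, then a 3 hour wait). None of it reflects how HN actually implements rate limiting, which isn't public as far as I know; the class and parameter names are made up.

```python
from collections import deque


class CommentCooldown:
    """Allow `max_posts` posts per `window` seconds, then block for `cooldown`.

    All three defaults are assumptions taken from the guess in this thread.
    """

    def __init__(self, max_posts: int = 5, window: float = 3600.0,
                 cooldown: float = 3 * 3600.0) -> None:
        self.max_posts = max_posts
        self.window = window
        self.cooldown = cooldown
        self.recent: deque = deque()  # timestamps of recent accepted posts
        self.blocked_until = 0.0

    def try_post(self, now: float) -> bool:
        """Return True if a post at time `now` (in seconds) is accepted."""
        if now < self.blocked_until:
            return False
        # Forget posts that have fallen out of the sliding window.
        while self.recent and now - self.recent[0] >= self.window:
            self.recent.popleft()
        self.recent.append(now)
        # Hitting the limit accepts this post but starts the cooldown.
        if len(self.recent) >= self.max_posts:
            self.blocked_until = now + self.cooldown
            self.recent.clear()
        return True


limiter = CommentCooldown()
# Six attempts, one per minute: the first five pass, the sixth is blocked.
results = [limiter.try_post(minute * 60.0) for minute in range(6)]
print(results)
```

Time is passed in explicitly rather than read from the clock, which keeps the sketch easy to test.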
Probably because dang and company try to nip it in the bud when the link is posted. Since most that pass that filter are relatively uncontroversial (they're things that interest some and are silently ignored by others), they only need to actively monitor those with an inherently high level of controversy (like partisan politics).
Between actual moderation and community moderation, yes. The way moderation works on HN is somewhat hidden though: dead comments will disappear, post titles and such are edited with no (publicly visible anyway) history, etc.
Inspired by Stalin, I imagine. I'm nervous to say 'it works', since we don't really know what's being suppressed or changed.
It’s not directly monetized so the design is meant to serve the user. In contrast social media design principles are about controlling user behavior to maximize revenue. This means HN is fast, simple, and information dense. Those are desirable properties on any device.
Those links are great, I just wish there was a way to get back after clicking parent.
In long threads, it's easy to lose track of which parent a given comment is replying to. So I'll click `parent`, which gives me the context I need. But then I have no way to get back to the reply I was reading previously.