We have changed our entire business model so that what we actually produce is very strongly aligned with pelicans on bicycles. This way we’ll always know which model is best for us.
Highly recommend this approach, saves us tons of eval time.
One very clever consequence of Anthropic's guarded release of the Mythos model is that they've kind of claimed the position of best in class here, and also positioned themselves as the responsible vendor in this space in one fell swoop.
OpenAI pulled the same trick with GPT-3. It's amazing how well it's working, judging by the comments I'm hearing from people I know exist. Because out there on social media, who knows.
Should be compulsory reading. Actually, now that I think about it, this would make a great interview question in the AI era: “what did you think of The Soul of a New Machine?”
Those two books are probably the two best about tech projects I've ever read. I worked at Data General as a product manager for about 13 years and know many of the individuals although I joined a few years after the book was written.
What strikes me is the stories that never get told. I met a retiree at a Java meetup once who had worked at Zilog during the Z8000 era. He was surprised to meet someone who knew about that.
Especially pre-web and pre-blogs, there's a great deal of tech industry history that largely doesn't exist any longer unless it was especially notable and/or some author decided to spend a year or two writing about it.
Long term this could make Anthropic stronger. Not sucking at the government teat, but actually building and iterating to solve real problems that a broad swath of industry needs.
There is one scenario it would be good for. People running stock trading programs often need a better network and always-on environment than they can get at home.
One superpower I wish I had is the incredible summarizing into single sentences that you can see in the LLM web UIs when they automatically make a title for a discussion.
If by "train" you mean "learn", it occurs to me you could try applying the "CAR story" interview technique of relating a story in 3 sentences (one each for the Challenge, Action and Result). Once you have it down to 3 sentences, distilling it into one, or producing a title, should become doable. HTH
This is a fine start for filesystem and network policies. But before I’m ever going to be comfortable with an OpenClaw-like thing running on my system on my behalf, I’m going to want policies at an application level as well - which emails can be read, sent, deleted. Same for calendar entries and instant messaging, etc.