
Definitely something in this realm; they call the models "preview" at a bunch of different points in the paper.

What I'm really hoping for is a double punch like with V3 -> R1


Comparing only on SOTA scores (ignoring price etc.) is like choosing your daily driver by looking at who makes the fastest sports car...


The constant improvement of SOTA is the main thing keeping the investment machine running. We can't really separate training costs from inference costs, because a bunch of the funding and loans for the inference hardware only exist because of the promises that continuous training makes (or tries to make).


Not really. SOTA vs. non-SOTA is "can I get my coding work actually done today" vs. "this can do customer support chat".

It's like a car vs. a kick scooter.


It really isn't. We get coding work actually done today on Opus 4.5. That's not SOTA any more, and anything proximate to that level, even quite loosely, is genuinely useful.


OK, so we're at the point where Opus 4.5 is not SOTA. By that definition... yes, you are right.


I mean it's almost half a year, I think that counts?


Time-wise you are correct.


> "can I get my coding work actually done today" vs. "this can do customer support chat"

I think you need to define "can get coding work done" for this to make sense. I've been using GPT-3 back then for basic scripts; does that count? Or only Claude Code?

I also think this is a false dichotomy: if you look at Project Vend or Vending-Bench, customer support etc. is by no means trivial. (Old but great story: https://www.businessinsider.com/car-dealership-chevrolet-cha...)


This. I have been doing my side-hustle code with opencode and the 3.2 reasoner, and it is way better than what I have at my day job with Copilot and whatever models are there.


Copilot is a bad harness that perverts the productivity of models like GPT 5.5.


Tell me more please!


Not really. The current SOTAs are already at the point where they can do that. The following models will start to surpass the daily-work level. It's a diminishing-returns situation, just like anything else in tech.


If you found a rare 9000 card with 200+ GB of VRAM, sure.


I think the general question is whether they'll release it at all; I haven't yet read anything stating that they will.


Well let me introduce people to a few brand new concepts:

https://en.wikipedia.org/wiki/Capitalism

https://en.wikipedia.org/wiki/Race_to_the_bottom

https://en.wikipedia.org/wiki/Arms_race

Of course they'll release it once they can de-risk it sufficiently and/or a competitor gets close enough on their tail, whichever comes first.


One can see the impact of this cultural wave on people above ~40 pretty heavily.

Hand in hand with the whole "Atomkraft? Nein danke" campaign. (https://en.wikipedia.org/wiki/Nuclear_Power%3F_No_Thanks)


The major selling point of the tinyboxes is that you're able to run them in your office without any hassle.

I used to own a Dell PowerEdge for my home office, but those fans, even on the minimal setting, kept me up at night.


Yes, but in practice land ownership is only zero-sum in places like Europe, where every square kilometer has 300 years of documented ownership etc., or in other high-density areas.

Asia, Africa & the Americas have so much unused space that isn't as inhospitable as central Australia.


Where in Asia do you have in mind? A few things I know offhand: Sri Lanka has a higher population density than Britain, Japan's is much higher than that, and Java has nearly the population of Russia in an area smaller than England (just England, not Britain or the UK). India and China are big, but have huge populations.

There is lots of "unused space" in places like Alaska or Siberia or deserts or mountains, but land is not a fungible commodity. Unused space is unused for a reason. In practice, almost all ownership of land is a zero sum game.


I think the author might argue that simply becoming more efficient at creating a rent-seeking mechanism is not beneficial. No matter how well motivated you are to improve your zero-sum-game skills, it's still zero-sum.

Or something like that.


You can already buy A100/H100s on eBay. While it might not ever be economical to run these at home (cost of electricity), it's plenty fun.
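
Quick napkin math on the electricity, with hypothetical numbers (~700 W board power under load, $0.30/kWh residential rate, hobbyist duty cycle; adjust for your card and local prices):

    board_power_kw = 0.7   # H100 SXM is rated around 700 W; PCIe cards draw less
    price_per_kwh = 0.30   # assumed residential rate in USD
    hours_per_day = 8      # hobbyist duty cycle, not a datacenter

    daily = board_power_kw * hours_per_day * price_per_kwh
    print(f"~${daily:.2f}/day, ~${daily * 30:.2f}/month")
    # ~$1.68/day, ~$50.40/month -- hobby money, not datacenter economics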


Cool idea, but kinda sad that it has to go through a cloud provider. I feel like there's a possibility, with an accelerator board (a Coral TPU or something), to make this into a totally local thing maybe? The longer wait time is surely not an issue considering how many people still use Polaroids.


We were looking to add on-device styles with the Raspberry Pi in order to keep the device cost low, though a Coral TPU would make this easier. The OnnxStream library appears to be able to do SD1.5 generation in 10 minutes on a Pi Zero, so with some optimization and reduced image resolution, img2img may be possible on the Pi in ~1 minute. We were also looking at style-transfer models, which are much more lightweight and could run fast on a Pi (https://github.com/tyui592/AdaIN_Pytorch/tree/master). Eventually our goal is to make this both on-device and relatively cheap.
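
For reference, the core operation in that linked repo (AdaIN) is tiny, which is why it's so much lighter than diffusion. A rough PyTorch sketch, not the repo's exact code; it assumes (N, C, H, W) feature maps from an encoder like VGG:

    import torch

    def adain(content, style, eps=1e-5):
        # Re-style content features by matching their per-channel
        # mean/std to the style features (Huang & Belongie, 2017).
        n, c = content.shape[:2]
        cf = content.view(n, c, -1)
        sf = style.view(n, c, -1)
        c_mean, c_std = cf.mean(-1, keepdim=True), cf.std(-1, keepdim=True) + eps
        s_mean, s_std = sf.mean(-1, keepdim=True), sf.std(-1, keepdim=True)
        return ((cf - c_mean) / c_std * s_std + s_mean).view_as(content)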


We were looking into OnnxStream (https://github.com/vitoplantamura/OnnxStream) and modifying it to support img2img (the basic idea is sketched below). We got pretty close, but yeah, the capabilities for running diffusion models on a Raspberry Pi are quite limited lol.

Alternatively we could use compute from your iPhone, but that adds additional dependencies on external hardware, which I don't quite like. We could use a Jetson, but then the power draw is quite high. I agree with you that on-device inference is the holy grail, but the best approach is something we are still trying to figure out.
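
To give an idea of what img2img adds on top of txt2img: encode the photo to latents, noise them partway along the schedule, then denoise as usual. In diffusers-style Python (just an illustration; the actual port would be in OnnxStream's C++, and the file names and strength are made up):

    import torch
    from diffusers import StableDiffusionImg2ImgPipeline
    from PIL import Image

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float32)
    init = Image.open("polaroid.jpg").convert("RGB").resize((512, 512))

    # strength sets how far into the noise schedule the latents start:
    # near 0.0 returns the input almost untouched, 1.0 is effectively txt2img
    out = pipe(prompt="watercolor style", image=init,
               strength=0.5, num_inference_steps=20).images[0]
    out.save("styled.png")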

