cranx · 2026-04-28T02:52:16 1777344736

This is flawed. Turquoise is not blue or green. Also, different displays will show different colors. And a lot of displays aren’t great at producing the hues in the green color space. Idk, the test seems arbitrary, but I’m no color expert.

skygazer · 2026-04-28T03:06:10 1777345570

I'm not sure what's going on, but in Chrome my median point is heavily on the blue side, while in Safari it's on the green side. Also, at least in Safari, reset leads to a series of perceptually unchanging turquoise screens; it seems like a bug. Refreshing fixes it for the next run.

cranx · 2026-02-12T15:36:23 1770910583

"Job’s done!"

cranx · 2026-01-23T12:04:58 1769169898

I find the title a bit misleading. I think it should be titled “It’s Faster to Copy Memory Directly than Send a Protobuf.” That then seems rather obvious: removing a serialization and deserialization step reduces runtime.

bluGill · 2026-01-23T13:54:01 1769176441

Protobuf does something important that copying memory cannot do: a protocol that can be changed separately on either end and things can still work. You have to build for "my client doesn't send some new data" (make a good default), or "I got extra data I don't understand" (ignore it). However the ability to upgrade part of the system is critical when the system is large and complex since you can't fix everything to understand your new feature without making the new feature take ages to roll out.

Protobuf also handles a bunch of languages for you. The other team wants to write in a "stupid language" - you don't have to have a political fight to prove your preferred is best for everything. You just let that team do what they want and they can learn the hard way it was a bad language. Either it isn't really that bad and so the fight was pointless, or it was but management can find other metrics to prove it and it becomes their problem to decide if it is bad enough to be worth fixing.
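To make the compatibility point above concrete, here is a minimal sketch of a reader that fills in defaults for fields the sender omitted and skips fields it doesn't recognize. It imitates the protobuf tag/varint/length-delimited layout but is not the real protobuf library; the schema (`id`, `name`) and all function names are hypothetical.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int):
    """Decode a varint starting at pos; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_field(number: int, value) -> bytes:
    """Encode an int (wire type 0) or a str (wire type 2) under a field number."""
    if isinstance(value, int):
        return encode_varint((number << 3) | 0) + encode_varint(value)
    data = value.encode("utf-8")
    return encode_varint((number << 3) | 2) + encode_varint(len(data)) + data


# Hypothetical schema as the *old* reader knows it: field number -> (name, default).
KNOWN_FIELDS = {1: ("id", 0), 2: ("name", "")}


def decode(buf: bytes) -> dict:
    """Decode a message, defaulting missing fields and ignoring unknown ones."""
    msg = {name: default for name, default in KNOWN_FIELDS.values()}
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        number, wire_type = key >> 3, key & 7
        if wire_type == 0:
            value, pos = decode_varint(buf, pos)
        elif wire_type == 2:
            length, pos = decode_varint(buf, pos)
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if number in KNOWN_FIELDS:
            msg[KNOWN_FIELDS[number][0]] = value
        # Unknown field numbers fall through here and are simply dropped.
    return msg


# A newer sender adds field 3; an older sender omits field 2.
from_new_sender = decode(encode_field(1, 7) + encode_field(2, "ada") + encode_field(3, "extra"))
from_old_sender = decode(encode_field(1, 7))
```

The same reader handles both messages without being rebuilt, which is the "upgrade one end at a time" property being described.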

vlovich123 · 2026-01-23T15:48:24 1769183304

But something more modern that doesn’t have the encoding/decoding penalty of Protobuf would be better (eg cap’n’proto but there’s a bunch now in this space).

bluGill · 2026-01-23T16:37:58 1769186278

Not that you are wrong, but in the real world this is not significant for most uses. If it is significant you are doing too much IPC. Or maybe using protobuf where you should be making a direct function call. Fix the architecture either way. (Similar to how I can make bubble sort faster with careful machine code optimization, but it is hard to make modern Timsort slower in the real world no matter how bad the implementation is.)

squirrellous · 2026-01-24T01:39:09 1769218749

Whether it is significant is highly specific to the domain. I think protobuf has accumulated enough problems that every large scale application will eventually run into a footgun or two. Almost all of them have to do with the generated types, not the wire format.

Examples that can noticeably slow things down even for “normal” web apps: map types and deeply nested messages.

MrDarcy · 2026-01-23T13:03:52 1769173432

TIL serializing a protobuf is only 5 times slower than copying memory, which is way faster than I thought it’d be. Impressive given all the other nice things protobuf offers to development teams.

dietr1ch · 2026-01-23T14:37:10 1769179030

I guess that number is as good or as bad as you want with the right nesting.

Protobuf is likely really close to optimally fast for what it is designed to be, and the flaws and performance losses left are most likely all in the design space, which is why alternatives are a dime a dozen.

infogulch · 2026-01-23T22:27:42 1769207262

Now check this out:

> Protobuf performs up to 6 times faster than JSON. - https://auth0.com/blog/beating-json-performance-with-protobu... (2017)

That's 30x faster just by switching to a zero-copy data format that's suitable for both in-memory use and the network. JSON services spend 20-90% of their compute on serde. A zero-copy data format would essentially eliminate it.

cmrdporcupine · 2026-01-23T15:22:16 1769181736

I wouldn't hold onto that number as any kind of fixed usable constant since the reality will depend entirely on things like cache locality and concurrency, and the memory bandwidth of the machine you're running on.

Going around doing this kind of pointless thing because "it's only 5x slower" means relying on a bad assumption.

jeffbee · 2026-01-23T19:09:10 1769195350

Serializing a protobuf can be significantly faster than memcpy, depending. If you have a giant vector of small numbers represented with wide types (4-8 bytes in the machine) then the cost of copying them as variable-length symbols can be less.
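A rough sketch of the size side of that argument, assuming 1000 hypothetical values that all fit in 7 bits but are stored as 8-byte integers in memory. It only compares byte counts, not wall-clock time; whether the smaller output actually beats memcpy depends on the machine and the data.

```python
import struct


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 payload bits per byte)."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


# Hypothetical data: small numbers held in a wide (64-bit) type.
values = [i % 100 for i in range(1000)]

# What a raw memory copy would move: 8 bytes per value.
fixed = struct.pack(f"<{len(values)}q", *values)

# What a varint encoding writes: 1 byte per value here, since every value < 128.
packed = b"".join(encode_varint(v) for v in values)

fixed_size = len(fixed)
varint_size = len(packed)
print(fixed_size, varint_size)
```

With these inputs the varint stream is one eighth the size of the raw buffer; larger values shrink that advantage, and values that need all 64 bits take more space as varints (up to 10 bytes each).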

satvikpendem · 2026-01-23T21:02:15 1769202135

5x is pretty slow honestly. Imagine anything happening 5x as slow as you'd expect it to. I mean, for a recent project I had to inline Rust structs rather than parsing JSON too for specific fields, and that definitely sped it up.

nicman23 · 2026-01-23T13:31:06 1769175066

that's actually crazy fast

miroljub · 2026-01-23T12:40:10 1769172010

Yep.

Just doing memcpy or mmap would be even faster. But the same Rust advocates bragging about Rust speed frown upon such insecure practices in C/C++.

infogulch · 2026-01-23T15:13:32 1769181212

Why don't we use standardized zero-copy data formats for this kind of thing? A standardized layout like Arrow means that the data is not tied to the layout/padding of a particular language, potential security problems like bounds checks are automatically handled by the tooling, and it works well over multiple communication channels.

mrlongroots · 2026-01-23T16:43:09 1769186589

While Arrow is amazing, it is only the C Data Interface that can be FFI'ed, which is pretty low level. If you have something higher-level like a table or a vector of recordbatches, you have to write quite a bit of FFI glue yourself. It is still performant because it's a tiny amount of metadata, but it can still be a bit tedious.

And the reason is ABI compatibility. Reasoning about ABI compatibility across different C++ versions and optimization levels and architectures can be a nightmare, let alone different programming languages.

The reason it works at all for Arrow is that the leaf levels of the data model are large contiguous columnar arrays, so reconstructing the higher layers still gets you a lot of value. The other domains where it works are tensors/DLPack and scientific arrays (Zarr etc). For arbitrary struct layouts across languages/architectures/versions, serdes is way more reliable than a universal ABI.

lenkite · 2026-01-23T14:40:36 1769179236

How we used Claude and bindgen to make Rust catch up with C's 5x performance.

cranx · 2025-10-25T13:53:39 1761400419

So this still uses the JNI for the ui? Unfortunate that is

cranx · 2025-10-25T13:34:35 1761399275

I find it strange that number 1 is not about political capital and how you get it or maintain it.

cranx · 2025-10-20T15:23:09 1760973789

This is cool, but the UX of the arrows should follow the scroll mode of the device. You drag down on an iPhone to scroll up. Following the arrow and dragging up causes nothing to happen

Aaargh20318 · 2025-10-20T16:02:09 1760976129

This. Took me a while to realize you have to scroll down, not up.

cranx · 2025-10-07T12:35:35 1759840535

This is cool, but do you throw away stuff when done with it or do you keep things at another permanent location? Like the tent that isn’t brought on every trip?

cranx · 2025-09-06T13:05:16 1757163916

But we do. A series of mathematical functions are applied to predict the next tokens. It’s not magic although it seems like it is. People are acting like it’s the dark ages and Merlin made a rabbit disappear in a hat.
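As a toy illustration of "a series of mathematical functions" producing a next-token prediction: a linear layer followed by a softmax over a three-word vocabulary. The vocabulary, weights, and hidden state are made up for the example, and a real model stacks many more layers before this final step.

```python
import math

# Hypothetical vocabulary and output weights (one row per token).
VOCAB = ["the", "rabbit", "hat"]
WEIGHTS = [
    [0.1, 0.2],
    [1.5, -0.3],
    [0.4, 0.9],
]


def softmax(xs):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_distribution(hidden):
    """Score each token with a dot product, then normalize with softmax."""
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in WEIGHTS]
    return dict(zip(VOCAB, softmax(logits)))


dist = next_token_distribution([1.0, 0.5])  # hypothetical hidden state
prediction = max(dist, key=dist.get)        # most probable next token
print(prediction, dist)
```

Every step here is fully specified arithmetic; the open question raised in the replies is why the learned weights end up encoding what they do.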

ekunazanu · 2025-09-06T13:57:32 1757167052

Depends on your definition of knowing. Sure, we know it is predicting next tokens, but do we understand why they output the things they do? I am not well versed in LLMs, but I assume even for smaller models interpretability is a big challenge.

chongli · 2025-09-06T16:25:40 1757175940

The answer is simple: the set of weights and biases comprise a mathematical function which has been specifically built to approximate the training set. The methods of building such a function are very old and well-known (from calculus).

There's no magic here. Most of people's awestruck reactions are due to our brain's own pattern recognition abilities and our association of language use with intelligence. But there's really no intelligence here at all, just like the "face on Mars" is just a random feature of a desert planet's landscape, not an intelligent life form.

lazide · 2025-09-06T14:03:29 1757167409

For any given set of model weights and inputs? Yes, we definitely do understand them.

Do we understand the emergent properties of almost-intelligence they appear to present, and what that means about them and us, etc. etc.?

No.

jvanderbot · 2025-09-06T15:15:21 1757171721

Right. The machine works as designed and it's all assembly instructions on gates. The values in the registers change but not the instructions.

And it happens to do something weirdly useful to our own minds based on the values in the registers.

umanwizard · 2025-09-06T16:55:38 1757177738

Doesn’t this apply to any process (including human brains) that outputs sequences of words? There is some statistical distribution describing what word will come next.

cranx · 2025-08-30T05:53:15 1756533195

I think a lot of engineers think, “I thought of a complicated set of abstract ideas that mean nothing to anyone else but will demonstrate my superior intellect” and then make that. It took a lot of self control to not curse.

cranx · on March 17, 2024

Why write this in rust to execute shell commands?

Saris · on March 17, 2024

Ease of use I suppose, looking up package names and manually pasting them in is really slow.