
protobufs are a data exchange format. The schema needs to map clearly to the wire format because it is a notation for the wire format, not your object model du jour.

If the protobuf schema code generator had to translate these suggestions into efficient wire representations and efficient language representations, it would be more complex than half the compilers of the languages it targets.

I do mourn the days when a project could bang out a great purpose-built binary serialization format and actually use it. But half the people I hire today, and everyone in the team down the hallway that needs to use our API, can no longer do that. I'm lucky if they know how two's complement works



> protobufs are a data exchange format. The schema needs to map clearly to the wire format because it is a notation for the wire format, not your object model du jour.

Yes, exactly this. I don't understand the vitriol here. Protobufs work fine for a wide variety of purposes and where the criticisms matter people use a different tool.

> I do mourn the days when a project could bang out a great purpose-built binary serialization format and actually use it. But half the people I hire today, and everyone in the team down the hallway that needs to use our API, can no longer do that. I'm lucky if they know how two's complement works

Would be great to work at a place where everyone knew how the machine worked, but the vast majority of developers entering the workplace since about 2000 have learned Java and web sh*t exclusively.


The problem is that protobufs aren’t just an interchange format, they’re also a system for generating the code used to interact with said interchange format. That generated code has a habit of leaking into the types used by your code base. It’s too easy to just pass protoc-generated objects around and use them all over your code base, hence the majority of the criticisms.

Protobuf seems to encourage this… instead of a hard boundary where your serialization logic ends and your business logic begins, every project I’ve worked on that uses protobuf tends to blur the lines all over the place, even going as far as to make protoc-generated types the core data model used by the whole code base.


You can take the wire format and ignore the code generator, if you like. I often do, especially for languages other than C++. For C++ I quite like the generated code.


I mean can you blame them? There's just so much java and web shit to learn. When are they going to pick up bit-banging when every week there's a new build toolchain or minifier or CSS standard or whatever other pile of abstraction upon abstraction of the day?


We use C and all our nodes are x86... so we just dump packed structs on the wire and the schema is a header file. I suspect this sort of simple approach works in most cases...

And really that's how the network stack on Linux and Co. is implemented (modulo taking care of endianness)


Yep, that does work quite well in a fixed scope.

Till one day someone wakes up and wants a UI in C#. Which is also not a big deal, but you have to either hand roll the serialization code or build a header file parser and generate it. And then someone needs to add a field - so we just tack it onto the end of the struct. This works fine as long as you have the length represented out of band somehow. If not then you get to make struct FooBar2 with your extra field. Then a year later someone has the great idea that it would be great to send text, and now you're making a variable length packet or just always sending around structs with a bunch of empty space and a length field (which is also not too bad). But wait, now the UI team totally needs that data in JavaScript for their new Web UI, so you're back to either generating or hand jamming dozens of structs. All the while stamping out bugs in various places where someone is running old code against new data or old data against new code - all of which have various expectations. Or that microcontroller that is blowing up because it just doesn't like that unaligned integer that is packed into the struct.

Not that I'm bitter about that life - it just involved a lot more troubleshooting protocols over the years than I'd have liked. Anyway, those are problems that Protocol Buffers helps solve. But as long as you're using just C and not changing much - packed structs are quite lovely.


Sure, a small header that contains type and length is trivial and pretty much implied in my previous comment.

I don't think people should think too much about "what if we change language", etc. because (1) that's unlikely to happen, (2) you have years ahead of you, (3) it may be simpler to convert structs (not the most complicated thing in the world, and supported in most languages in one form or another) into whatever else when/if you actually need it than to overengineer now 'just in case'.


The main downside I see is performing schema changes. A very common pattern I use with protobufs in distributed systems is:

1. Add a field to the schema.

2. Update and deploy the gRPC servers to start populating that new field. The old gRPC clients will continue to work, ignoring the new field.

3. Update and deploy the gRPC clients to start using the new field.

With packed structs you don’t get that forward compatibility, so schemas and endpoints have to be immutable, which is not the end of the world, but it does make changes more difficult.


Deprecating a field in the struct becomes difficult then without versioning.

And versioning only works when someone remembers to increment the version.


I notice this too. What happened to CS degrees? All I meet are React-infused-piles-of-tools people that don't seem to be able to code anything without a mile long pile of tools.


There are ~30K CS graduates from US universities per year. A big chunk of them are international students who have to (or want to) go back home. The remaining ones are being fought over to fill an estimated 500K-1.5M open software engineering jobs. So chances are your company is likely not hiring the cream of the crop.

Numbers are all rough, but seemed to be in the same ballpark in all the sources I checked (like https://thescalers.com/development-deep-dive-how-many-softwa...).


All of this has shifted during the layoffs. The numbers and the experience on the ground have drastically changed.


The recent rounds of layoffs, all put together, made up a tiny percentage of just the net new headcount added to the industry in the last ~2 years. And all of these companies have already started ramping hiring back up. There is no broad change in tech hiring.


No, you're just using old numbers; headcount has also shrunk.

You can't rely on those numbers and raw logic alone, because they carry various inaccuracies and unaccounted-for factors, especially when the data come from before the layoffs.

Reports from recruiters and from people who are unemployed tell a significantly different story. There has been a fundamental change in the job market.



