Hacker News

nsteel · 2024-09-23T12:40:18 1727095218

I don't think QUIC is a good example. If you realistically want to do something on the internet you absolutely have to use either TCP or UDP. There's no choice in it. But within the confines of your datacenter you can do whatever you want. Including re-inventing the wheel to be square-shaped if that fits better.

londons_explore · 2024-09-23T13:02:55 1727096575

Even if your training computer is fully within your control, occasionally you'll want to run your protocols over the internet, for example to test a node in a remote location or to debug some problem.

If you have requirements the internet cannot meet (e.g. "latency must be <500us for correctness"), then that limits what you can do.

fidotron · 2024-09-23T13:09:53 1727096993

Do you run PCI over the internet to test boards remotely?

londons_explore · 2024-09-23T13:16:54 1727097414

No, but that's a bit of a pain.

If it could run over the internet, I'd be able to use standard tooling (e.g. Wireshark, to see what messages are being sent when debugging my driver). I'd be able to connect to a PCI card on another machine remotely and have it 'just work', albeit with poor performance.

There's a lot you lose by inventing your own protocol, and you lose even more if your protocol can't be tunnelled over IP.
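To make the tunnelling idea concrete, here is a minimal sketch (not anything from Tesla or from a real PCI tool) of wrapping an opaque custom-protocol frame in a UDP payload so it can cross an IP network and show up in standard capture tools. The magic value, the header layout and the function names are all made up for illustration.

```python
import socket
import struct

# Hypothetical tunnel header: 4-byte magic + 2-byte payload length,
# both in network byte order. Neither field comes from a real protocol.
TUNNEL_MAGIC = 0x54554E31  # arbitrary illustrative value, "TUN1" in ASCII
TUNNEL_HEADER = struct.Struct("!IH")


def wrap(frame: bytes) -> bytes:
    """Prefix a raw custom-protocol frame with the illustrative tunnel header."""
    if len(frame) > 0xFFFF:
        raise ValueError("frame too large for the 16-bit length field")
    return TUNNEL_HEADER.pack(TUNNEL_MAGIC, len(frame)) + frame


def unwrap(datagram: bytes) -> bytes:
    """Validate the tunnel header and return the original frame."""
    if len(datagram) < TUNNEL_HEADER.size:
        raise ValueError("datagram shorter than tunnel header")
    magic, length = TUNNEL_HEADER.unpack_from(datagram)
    frame = datagram[TUNNEL_HEADER.size:]
    if magic != TUNNEL_MAGIC or length != len(frame):
        raise ValueError("not a valid tunnel datagram")
    return frame


def send_tunnelled(frame: bytes, host: str, port: int) -> None:
    """Send one wrapped frame as a single UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(wrap(frame), (host, port))
```

The receiving side would call `unwrap` on each datagram and hand the frame to the real driver. Latency and delivery are whatever the IP path gives you, so this kind of tunnel is for debugging, not performance.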

ethbr1 · 2024-09-23T13:58:15 1727099895

That's what engineering is though: losing things you care less about in exchange for gaining things you care more about.

Your point about being able to reach the Internet probably isn't as important a design goal as some latency ceiling within their cluster.

fidotron · 2024-09-23T12:44:42 1727095482

I legitimately couldn’t tell if the poster was sarcastic. Overpromotion of QUIC is verging on meme territory.

Alex4386 · 2024-09-26T03:52:40 1727322760

My point was that it is "worse than QUIC". What I really wanted to say was that even QUIC (the garbo protocol that decided to "encrypt" its flow-control information) chose to run on top of UDP.

Alex4386 · 2024-09-26T04:16:41 1727324201

Also, they are the company that decided to ditch CAN bus in their cars. This protocol itself is already a clown.

Cthulhu_ · 2024-09-23T12:49:42 1727095782

What makes you think Musk was behind this personally? There are very few things he is involved in when it comes to engineering.

Alex4386 · 2024-09-26T04:08:26 1727323706

It was a "joke", since Elon is being a clown on "the specific platform". But this design is absurd, because there is already a well-established high-bandwidth, low-latency, packet-based protocol for inter-chip networking (PCI-e), and because this is terrible to run on conventional Ethernet II infrastructure: there is no hardware acceleration support for it, such as checksum offloading.

I really think there is no performance gain, and probably even worse performance in a data center deployment, unless they design their own switches for everything. And considering that the packet header is about the size of a UDP header, there seems to be no advantage to using it instead of UDP with a broadcast IP address. Maybe, in their hopes and dreams, Broadcom or other chip vendors will support their EtherType, but that will be further in the future.
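As a rough illustration of the framing being compared here, the sketch below builds a bare Ethernet II frame with a custom EtherType and counts the fixed header bytes against the usual Ethernet + IPv4 + UDP stack. The Ethernet II (14 bytes), minimum IPv4 (20 bytes) and UDP (8 bytes) header sizes are standard. The EtherType 0x88B5 is an IEEE local-experimental value used as a stand-in, not the one Tesla's protocol uses, and the function names are made up. Nothing here models the custom protocol's own transport header.

```python
import struct

# Ethernet II header: destination MAC, source MAC, EtherType.
ETH_HEADER = struct.Struct("!6s6sH")
# UDP header: source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")
# IPv4 header without options.
IPV4_HEADER_MIN = 20

# Stand-in EtherType from the IEEE local-experimental range (assumption:
# used only for illustration, not the value of any real product).
EXPERIMENTAL_ETHERTYPE = 0x88B5
BROADCAST_MAC = b"\xff" * 6


def eth_frame(dst: bytes, src: bytes, ethertype: int, payload: bytes) -> bytes:
    """Build an Ethernet II frame (without preamble or FCS)."""
    return ETH_HEADER.pack(dst, src, ethertype) + payload


def udp_header(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Build a UDP header; checksum 0 means 'not computed', which IPv4 allows."""
    return UDP_HEADER.pack(src_port, dst_port, UDP_HEADER.size + len(payload), 0)


def overhead_raw_ethernet() -> int:
    """Fixed bytes before the payload when riding directly on Ethernet II."""
    return ETH_HEADER.size


def overhead_udp_over_ipv4() -> int:
    """Fixed bytes before the payload for Ethernet II + IPv4 + UDP."""
    return ETH_HEADER.size + IPV4_HEADER_MIN + UDP_HEADER.size


frame = eth_frame(BROADCAST_MAC, b"\x02\x00\x00\x00\x00\x01",
                  EXPERIMENTAL_ETHERTYPE, b"hello")
print(len(frame), overhead_raw_ethernet(), overhead_udp_over_ipv4())
```

Any custom transport header carried after the EtherType adds to the raw-Ethernet figure, so the byte saving over UDP/IPv4 shrinks accordingly. What the raw-Ethernet path gives up is the checksum and segmentation offload that NICs already implement for IP traffic.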

billsmithaustin · 2024-09-23T12:31:43 1727094703

Or perhaps Elon is not personally involved in what network protocols they use on their Dojo AI supercomputer.

londons_explore · 2024-09-23T12:59:35 1727096375

Sometimes reinventing the wheel is the right call...

But considering Dojo is years late and, as far as I can see, hasn't yet done any meaningful work, and Tesla is still buying up a lot of H100s, I'd say the bet on reinventing everything didn't work out this time.

It's so late that it probably wouldn't be on the forefront of FLOPs/$ anymore, making the whole project have a business value of $0 (and negative if you consider sunk costs).

Alex4386 · 2024-09-26T03:58:43 1727323123

In that case they really needed to ditch Ethernet II entirely. If they really wanted to do this, they should've built on top of UDP. Even QUIC did that (which is terrible for inline flow control, because the information useful for analysis sits inside the encrypted part), and it's gonna be slower due to the lack of hardware acceleration.

They ditched the wrong thing and replaced it with an even worse software implementation of what TCP/UDP already provide, and due to the lack of infrastructure support, this will almost certainly perform worse while doing exactly the same job.

KaiserPro · 2024-09-23T13:42:30 1727098950

You don't want to use QUIC for this.

QUIC is designed to handle loss, and it's not meant to be ultra low latency.

QUIC is designed to be used over IP, not on raw ethernet.

There is some method to doing it this way, especially as it's designed to pipe data directly into silicon.

Also, "the giants of UDP" seems a bit off. UDP is the underdeveloped stepchild of IP, completely outshone by TCP.

Alex4386 · 2024-09-26T04:03:36 1727323416

If this is designed to pipe data directly into silicon, allow me to introduce "PCI-e". It has similar packet-based networking, is designed to be latency tolerant (but supports high bandwidth and low latency), and has a similar addressing policy.

Also, there is nothing in the Ethernet frame itself that supports or achieves low latency and high bandwidth. Even if this is designed for silicon, it is just PCI-e with 6-byte addressing, but with way worse compatibility.

KaiserPro · 2024-09-26T10:23:44 1727346224

Look, I think it's silly too, but in their defence, switching PCI-e is really fucking hard.

Spooky23 · 2024-09-23T12:23:08 1727094188

Tesla is a weird company. They are hyperfocused on GM-like micro cost cutting in the cars.

Yet they spend expensive engineering time on stuff like this. Maybe there’s some big cost savings in the backend.

adgjlsfhk1 · 2024-09-23T16:51:40 1727110300

I think the difference is that Tesla plans like a tech company, in that they assume they will build an infinite number of cars, so any fixed-cost optimization is going to be worth it eventually.

zaroth · 2024-09-23T13:30:07 1727098207

It’s for FSD, which, when fully realized, will have a global market cap of over $10 trillion.

paxys · 2024-09-23T13:30:02 1727098202

You are right, Elon personally created the protocol and wrote the spec. It's not like Tesla employs engineers who can make decisions like these independently.

mkoubaa · 2024-09-23T13:44:10 1727099050

At best you can say his organizations give engineers the latitude to reinvent the wheel when they feel it's necessary.

olalonde · 2024-09-23T13:01:13 1727096473

> Elon being Elon and reinventing the wheel again!

Well he has a pretty good track record at doing that.