I don't think QUIC is a good example. If you realistically want to do something on the internet you absolutely have to use either TCP or UDP. There's no choice in it. But within the confines of your datacenter you can do whatever you want. Including re-inventing the wheel to be square-shaped if that fits better.
Even if your training computer is fully within your control, occasionally you'll want to run your protocols over the internet, for example to test a node in a remote location, debug some problem, etc.
If you have requirements the internet cannot meet (e.g. "latency must be <500us for correctness"), then that limits what you can do.
If it could run over the internet, I'd be able to use standard tooling (e.g. Wireshark, to see what messages are being sent when debugging my driver). I'd be able to connect to a PCI card on another machine remotely and have it 'just work', albeit with poor performance.
There's a lot you lose by inventing your own protocol, and you lose even more if your protocol can't be tunnelled over IP.
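To make the tunnelling point concrete, here is a minimal sketch of wrapping a custom layer-2 payload so it can ride inside a UDP datagram. The 4-byte tunnel header, `TUNNEL_MAGIC`, and the function names are all made up for illustration; none of this comes from any real protocol spec.

```python
import struct

# Hypothetical marker value, chosen only for this sketch.
TUNNEL_MAGIC = 0x5450
HEADER = struct.Struct("!HH")  # magic (2 bytes) + payload length (2 bytes)


def encapsulate(frame: bytes) -> bytes:
    """Prefix a raw custom-protocol frame with a tiny tunnel header."""
    if len(frame) > 0xFFFF:
        raise ValueError("frame too large for a 16-bit length field")
    return HEADER.pack(TUNNEL_MAGIC, len(frame)) + frame


def decapsulate(datagram: bytes) -> bytes:
    """Recover the original frame, rejecting anything malformed."""
    if len(datagram) < HEADER.size:
        raise ValueError("datagram shorter than tunnel header")
    magic, length = HEADER.unpack(datagram[:HEADER.size])
    if magic != TUNNEL_MAGIC:
        raise ValueError("not a tunnelled frame")
    if length != len(datagram) - HEADER.size:
        raise ValueError("length field does not match payload")
    return datagram[HEADER.size:]
```

The result of `encapsulate(frame)` can be handed to an ordinary `socket.SOCK_DGRAM` socket via `sendto`, at which point routers, firewalls and Wireshark all treat it as plain UDP. A protocol with hard sub-millisecond timing assumptions would still break over such a tunnel, which is the limitation described above.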
My point was that it is "worse than QUIC". What I really wanted to say was that even QUIC (the garbo protocol that decided to encrypt its flow-control information) chose to run on top of UDP.
It was a "joke", since Elon is being a clown on "the specific platform". But this design is absurd, since there is already a well-established high-bandwidth, low-latency, packet-based protocol for inter-chip networking (PCIe), and this one is terrible to run on conventional Ethernet II infrastructure because it gets no hardware acceleration such as checksum offloading.
I really think there is no performance gain, and heck, probably even worse performance, in a data center deployment until they design their own switches for everything. And considering that the packet header is about the size of a UDP header, there seems to be no advantage in using it instead of UDP with a broadcast IP address. Maybe in their hopes and dreams Broadcom or other chip vendors will support their EtherType, but that would be far in the future.
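For reference, a quick sketch of the standard header sizes in play. The Ethernet II, IPv4 (without options) and UDP sizes are the well-known ones; `ETHERTYPE_CUSTOM` uses one of the IEEE local-experimental EtherTypes purely as a placeholder and is not the value Tesla's protocol uses. The custom protocol's own header is not modelled here.

```python
import struct

# Placeholder: IEEE 802 local-experimental EtherType, NOT Tesla's real value.
ETHERTYPE_CUSTOM = 0x88B5
IPV4_MIN_HEADER = 20  # bytes, IPv4 header without options


def ethernet_header(dst: bytes, src: bytes, ethertype: int) -> bytes:
    """Ethernet II header: 6-byte dst MAC, 6-byte src MAC, 2-byte EtherType."""
    return struct.pack("!6s6sH", dst, src, ethertype)


def udp_header(src_port: int, dst_port: int, payload_len: int, checksum: int = 0) -> bytes:
    """UDP header: src port, dst port, length (header + payload), checksum."""
    return struct.pack("!HHHH", src_port, dst_port, 8 + payload_len, checksum)


eth = ethernet_header(b"\xff" * 6, b"\x02\x00\x00\x00\x00\x01", ETHERTYPE_CUSTOM)
udp = udp_header(5000, 5000, 64)

# Ethernet alone, UDP alone, and the full Ethernet + IPv4 + UDP stack.
print(len(eth), len(udp), len(eth) + IPV4_MIN_HEADER + len(udp))  # 14 8 42
```

So going through IP and UDP costs 28 bytes per packet on top of the Ethernet header that both approaches pay anyway, and in exchange existing NICs and switches can offload checksums and route the traffic.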
Sometimes reinventing the wheel is the right call...
But considering Dojo is years late and, as far as I can see, hasn't yet done any meaningful work, and Tesla is still buying up a lot of H100s, I'd say the bet on reinventing everything didn't work out this time.
It's so late that it probably wouldn't be on the forefront of FLOPs/$ anymore, making the whole project have a business value of $0 (and negative if you consider sunk costs).
Then they really needed to ditch Ethernet II entirely. If they really wanted to do this, they should've built on top of UDP.
Even QUIC did this (which is terrible for inline flow control, because information useful for analysis sits inside the encrypted part), and it's gonna be slower due to the lack of hardware acceleration.
They ditched the wrong thing and decided to replace it with an even worse software implementation of what TCP/UDP already provide. Due to the lack of infrastructure support, this will almost certainly perform worse while doing exactly the same job.
If this is designed to pipe data directly into silicon, allow me to introduce "PCIe". It has similar packet-based networking, is designed to be latency tolerant (while still supporting high bandwidth and low latency), and has a similar addressing scheme.
Also, there is nothing you can see in the Ethernet frame that supports or achieves low latency and high bandwidth. Even if this is designed for silicon, it is just PCIe with 6-byte addressing, but with way worse compatibility.
I think the difference is that Tesla plans like a tech company, in that they assume they will build an infinite number of cars, so any fixed-cost optimization is going to be worth it eventually.
You are right, Elon personally created the protocol and wrote the spec. It's not like Tesla employs engineers who can make decisions like these independently.