Oh, I was hoping this would be something built more directly over Ethernet, rather than on top of UDP/IP (if I'm understanding the layer diagram correctly).
I've been working with Ethernet devices a lot lately, using the network as a communication bus, essentially. I find that there's a lot of complexity that we simply don't need: ARP, DHCP, DNS... So many points of failure. We know all the devices on our LAN and their unique MAC addresses, and could do everything we need addressing-wise at Layer 2. But everything's built on Layer 3 and up, so we're effectively working backward to map devices to IP addresses and vice versa. It's unsatisfying.
1. Your forwarding table would have to be larger because Ethernet uses exact match instead of longest prefix match. For example, you might be limited to 128K servers total while Google has millions.
2. The Ethernet header has less entropy for ECMP than a UDP/IP header. Maybe you could add entropy somewhere but ASICs may not support it.
3. You're breaking compatibility with... everything. Maybe Google could afford this but no one else could.
1. Especially in the datacenter. When you add VMs to the mix you get LOADS of devices to address. Add on top that a single device has multiple connections (management, internet, storage, etc.), and you'd run out of capacity almost instantly.
2. The point is to find some bytes that are constant for a given logical stream of related packets. Taking bytes from outside the header means taking bytes from the payload, which by definition isn't constant across a flow. That's why everything identifies flows using the IPs + ports + protocol.
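To make that concrete, here's a minimal sketch of 5-tuple flow hashing in Python; hashlib stands in for whatever hash the ASIC actually implements:

    import hashlib

    def ecmp_path(src_ip, dst_ip, proto, sport, dport, num_paths):
        # Hash the 5-tuple so every packet of a flow takes the same path.
        # Payload bytes would break this: they differ packet to packet.
        key = f"{src_ip}|{dst_ip}|{proto}|{sport}|{dport}".encode()
        return int.from_bytes(hashlib.sha256(key).digest()[:4], "big") % num_paths

    print(ecmp_path("10.0.0.1", "10.0.0.2", 17, 40000, 4791, 8))  # stable per flow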
Okay, but my point there was that saying "Google has millions of servers" isn't relevant; we're not looking at the entire company.
Even with a few addresses per VM, how many racks do you need to put into the same shared-compute mass? One data center is the upper limit, but it doesn't have to be the entire data center.
Back-of-the-envelope math, using modern hypervisors that can fit loads of VMs in a single 1U server:
- Let's put 500 VMs on a single one. With 128c/256t CPUs it's easy.
- Say you can fit 30 of those in a single rack (the common rack is 42U) due to power constraints.
- And place 10 of those racks.
That's 500 x 30 x 10 = 150,000 nodes to address. With 10 racks you already blow past the MAC address scaling limits of the common datacenter switch. Here are the limits for Cisco's Nexus 9000 series, a very common datacenter switch: https://www.cisco.com/c/en/us/td/docs/switches/datacenter/ne...
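For what it's worth, a quick sanity check of that arithmetic against a hypothetical MAC-table capacity (the real limits vary by Nexus model; see the linked doc):

    vms_per_server = 500
    servers_per_rack = 30
    racks = 10
    nodes = vms_per_server * servers_per_rack * racks
    print(nodes)  # 150000
    # Hypothetical table size for illustration; real figures are per-model.
    mac_table_capacity = 90_000
    print(nodes > mac_table_capacity)  # True: the MAC table overflows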
Plus, layer 2 switches flood when they don't know the destination port. With so many hosts that would be absolutely horrific.
I've seen computers at moderately sized LAN parties (talking 100 nodes, far from the large or even massive events) that were literally crippled by the broadcast traffic. At some point the flooding and layer 2 discovery (ARP) would do the same as well.
Limiting broadcast domains with layer 3 is really what makes the Internet possible. Sure, you can have less overhead and do layer 2 only, and it really is completely possible. It's just such a rare use case that in practice it isn't important enough to actually do.
I think in High-Performance Computing a lot of the networking NICs behave in that way, because you are 100% sure that you know the fabric layout.
Stuff like InfiniBand, HPE Slingshot, Atos BXI, ...
There is a consortium that's building a specification for those kinds of things: https://ultraethernet.org/
There are systems like this: Fibre Channel has its own data link layer; they actually do reliability at the data link layer! I think InfiniBand is similar in this respect.
Actually, it's interesting that Google didn't choose any of these for their high-bandwidth storage needs. They have the money to do their own thing, but why should they?
Infiniband and most other specialized protocols have data-link-layer reliable messaging. The downside of this is that a congested switch cannot drop packets, so you end up backpressuring your network to death unless the people writing software really know what they're doing. Google was not able to make this work at scale.
Google migrated away from IB years ago. IIUC, the failure modes (e.g., a fully locked-up fabric) were too painful at the time, and they preferred to work with a mostly vanilla Linux kernel for userspace networking.
Ah yes, instead of going to google.com or 192.168.1.1 or adding a printer connected to my Wi-Fi, let me open my big yellow pages of globally unique MAC addresses…
How do networks manage the larger number of IPv6 addresses?
My cursory digging indicates that the secret sauce is to grant large IPv6 prefixes and delegate routing to the prefix. An informative-looking Reddit comment says there are 100k IPv6 prefixes (as of Oct. 2020), and each active route takes 1 KiB. [1]
So, IPv6 differs significantly from MAC addresses because you only need to track prefixes.
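If those numbers are roughly right, the total routing state is modest; a quick check:

    prefixes = 100_000       # active IPv6 prefixes (Oct. 2020, per the comment)
    bytes_per_route = 1024   # ~1 KiB per active route
    print(prefixes * bytes_per_route / 2**20)  # ~97.7 MiB of routing state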
Prefixes are routed. ARP, in IPv6, is replaced by a function of the Neighbor Discovery Protocol (NDP), Neighbor Advertisement (NA) and Neighbor Solicitation (NS), whose messages carry the L2 MAC. NS messaging leverages multicast to reach the relevant set of hosts for discovery. Basically, a host sends a message to the solicited-node multicast address derived from the IPv6 address so that it can discover the corresponding MAC. So the flooding/broadcasting for MACs in v4 is replaced by a much more efficient L3-to-L2 lookup in v6.
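For the curious, that solicited-node multicast address is derived mechanically from the target IPv6 address (ff02::1:ff00:0/104 plus the low 24 bits, per RFC 4291); a minimal Python sketch:

    import ipaddress

    def solicited_node(addr: str) -> ipaddress.IPv6Address:
        low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
        base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
        return ipaddress.IPv6Address(base | low24)

    print(solicited_node("fe80::1234:56ff:fe78:9abc"))  # ff02::1:ff78:9abc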
If you're flexible enough to forgo UDP/IP, why not use infiniband instead of ethernet? That gets rid of all the complexity you mentioned but still gives you ordered streams
Infiniband is basically an Nvidia monopoly since they bought Mellanox, and the hyperscalers, who already chafe at Nvidia's GPU pricing power, don't like it one bit, which is why they are working so hard on getting rid of it.
I don't understand enough about niche high-performance interconnects to know if CXL is a viable alternative for Infiniband where Ethernet-based solutions have too much latency.
I agree, it's hard to see what real world use case would benefit from dumping UDP/IP while holding on to Ethernet, and not moving over to an Infiniband or similar solution.
That also can't be satisfied by one of the existing specialized solutions that another user mentioned, such as EtherCAT.
Why though? You can't route MAC because... ?
Because ipv4 provides a higher entropy address?
Because MAC is self-assigned and resolving duplicates would require a higher-level system?
or just because we don't use MAC addresses that way?
I'm certain there are reasons IP came to live alongside/on top of MAC, but saying you can't do multi-hop routing with it just isn't true. If all the technologies of the Internet were reset tomorrow, how might you design the perfect layer 2 addressing and routing system?
MACs are random. Given a MAC and a connection to a LAN, you can easily answer the question, "is there a station with that MAC here?". If it's not here, and you have a single gateway to another network, you can figure out that to talk to that MAC, you need to go over the gateway. And then things eventually go funny. We hit a network that talks to four others. It has no idea where to send the packet destined for that MAC. It could send it to all four (flooding). Then when a reply comes from one of them, remember that destination for next time. Remember for how long? Sending a packet to every destination will cause an exponential explosion of that packet throughout the network.
It works on small scales. We can stitch together a few LANs with ethernet switches. The switches initially forward everything to all ports, but learn where the MACs are so as to send frames only to ports where the destination MAC is known to be.
Ethernet switching won't scale to anywhere near the complexity of the Internet.
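That learn-or-flood behavior fits in a few lines; a minimal sketch (the frame fields and port numbers are hypothetical):

    # Minimal learning switch: learn source MACs, flood unknown destinations.
    table = {}

    def handle_frame(src_mac, dst_mac, in_port, all_ports):
        table[src_mac] = in_port  # learn which port the sender lives on
        out = table.get(dst_mac)
        if out is not None:
            return [out]                               # known: one port
        return [p for p in all_ports if p != in_port]  # unknown: flood

    print(handle_frame("aa:aa", "bb:bb", 1, [1, 2, 3]))  # [2, 3] -- flood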
You can't route MAC because there is no prefix matching - only exact matching. That's exactly why you need to "switch" them... and incidentally this is what your proposal accomplishes – it's equivalent to a fully-switched network. Switches (especially L3 switches) maintain port-MAC association tables to switch packets between ports and they're available off the shelf.
IP addresses have structure because a single ISP buys a contiguous block, like 123.234.*.*. A simple routing table sends that whole block to a single network port.
The table required for the whole Internet is large, but not gigabytes.
You can't route by MAC-address because it's effectively random. You'd have to store the port number for every device separately. This works fine at LAN scale, but not for the whole Internet.
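A tiny sketch of the difference, using Python's ipaddress module for the longest-prefix match (table contents made up):

    import ipaddress

    # IP: a couple of prefixes cover whole blocks of addresses.
    routes = {
        ipaddress.ip_network("123.234.0.0/16"): "port1",
        ipaddress.ip_network("0.0.0.0/0"): "port0",  # default route
    }

    def lookup(dst: str) -> str:
        addr = ipaddress.ip_address(dst)
        # Longest prefix match: the most specific matching network wins.
        best = max((n for n in routes if addr in n), key=lambda n: n.prefixlen)
        return routes[best]

    print(lookup("123.234.56.78"))  # port1, via a single /16 entry

    # MAC: no structure to exploit, so it's one exact-match entry per host.
    mac_table = {"0c:f9:31:d2:db:51": "port7"}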
MAC addresses being random is a historical accident (because of hardware limitations). Today we can define them in software, and just like we have link-local IP addresses, we could self-assign link-local MAC addresses.
And I think the self-assigning protocol in link-local could even go a step further. Instead of hard-coding a subnet, it could detect the subnet by copying the one from its nearest neighbor: start with a random address, talk to a neighbor to learn the subnet (and netmask) in use, and switch to a new address within that subnet. Then possibly run DHCP and update the address again. For static addresses, DHCP could identify hosts by a cryptographic host key (like the one for SSH).
When two subnets join, one of them may have to adjust its prefix. More complex, but still possible.
Subnet prefixes could still be assigned to organizations to avoid overlap on a global level.
I'm sure I'm missing some details, but I think in general this could work.
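The self-assignment part at least is straightforward today; a sketch of generating a random locally-administered MAC (the two flag bits are per IEEE 802, the rest is arbitrary):

    import random

    def random_local_mac() -> str:
        octets = [random.randint(0, 255) for _ in range(6)]
        # First octet: set the locally-administered bit (0x02) and
        # clear the multicast bit (0x01).
        octets[0] = (octets[0] | 0x02) & 0xFE
        return ":".join(f"{o:02x}" for o in octets)

    print(random_local_mac())  # e.g. 06:3f:a2:11:90:7c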
Well, it's merging MAC and IP into one address. There is no need for two if the MAC address can be assigned dynamically. And it's extending the auto-discovery of the address to work over larger networks. So it's not reinventing but simplifying things. (Or not; I'm not familiar enough with the details to be aware of other problems that could complicate things again.)
> You can't route by MAC-address because it's effectively random. You'd have to store the port number for every device separately. This works fine at LAN scale, but not for the whole Internet.
Not that I see any advantages to the approach but it's almost workable(?), if a little silly, at internet scale:
If every device had a 64-byte ID, guesstimating 10 billion people * 100 devices/head gets us a 'measly' 64 TB of storage. Doubling that to include routing info gets us to ~128 TB. A bit much to be practical, but not entirely insane either.
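Sanity-checking that guesstimate:

    devices = 10_000_000_000 * 100        # 10 billion people * 100 devices each
    id_bytes = 64
    print(devices * id_bytes / 1e12)      # 64.0 TB for the IDs alone
    print(devices * id_bytes * 2 / 1e12)  # 128.0 TB with routing info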
The router needs to remember where each address goes; with MAC addresses being random, there is no shortcut. DNS is distributed and you look it up one subdomain level at a time, and that can be cached. Same for IP: the router only needs to store the subnet for each destination, not all IP addresses.
A central lookup database for MAC addresses (which could be distributed by having separate servers for segments of the address space) doesn't make much sense, because the distance from a server to the location of the device is too great and would make updates expensive.
So the router has to remember each address used. But at least it would not have to store all addresses in existence. Actually, I think the storage needs are similar to those for NAT. Well, except backbone routers, which have to store a lot more.
The actual problem is the initial discovery of a MAC address. Where does the routing information for a MAC address come from?
You need some peer-finding protocol like a DHT, and those are slower.
Because aggregation, summarization and continents are a thing. Also... there are things which speak IP and don't use Ethernet for underlying communications, specifically in the network carrier and high performance optical space.
I used a MAC address generator to get those two, but I think two is enough to frame the discussion. Current reality aside, would you be able to identify, with binary math, whether those are on the same network device, on different network devices, or across the world? MAC addresses on physical NICs are provided by the manufacturer; sure, you can adjust them, but I think that leaves the good-faith portion of this discussion.
So if you wanted those two to communicate no matter what, you would have to have one network device state "I'm network device A, I have this device 0C:F9:31:D2:DB:51" and another state "I'm network device B, I have this device AB:33:C6:C6:19:74". Then whenever 0C:F9:31:D2:DB:51 wants to talk with AB:33:C6:C6:19:74, its network device will have to just send it to the next upstream network device. Or, if there are multiple network devices that could be upstream, you could send it to them all (which is just not great for security whatsoever), or you now have to do a recursive lookup for whatever n devices might yet be upstream and wait for a response to see if one of those has it. Overall, trying to send Ethernet frames globally without an IP network sounds like not a great idea.
So it seems like the primary use of IP, as you describe, is to define a way to narrow the search to sub address groups so as to not require enumerating every address in the scheme.
Still, there doesn't seem to be any reason you couldn't just say "device 1 gets MAC 00:00:00:00:00:01" and "device 2 gets 00:00:00:00:00:02", the gateway controller gets :::00, and there's a special address on :::FF that can be used to talk to everyone...
Is that it? Is that all there is to IP? A loose pattern for reducing search scope, a couple reserved addresses for special cases, and a balance between address bitsize and total number of unique addresses (without requiring additional routing complexity)?
You could. Assuming all your equipment supports setting the MAC, and you make sure to operate on prefixes so you can route by prefix. There's nothing stopping you from doing so.
The reason we don't is because at the time IP was introduced, there were many alternative physical layers in active use. And while Ethernet is near ubiquitous now, what we learnt from that was that it is unreasonable to assume that all your data will go over the same physical layer. And so you need a standard addressing format that will work elsewhere too.
Nothing stops you from stripping it back locally and using MAC addresses for everything internal to you, and ditching IP, and "just" gateway to/from IP. Lots of people did gateway between different protocols before IP became the dominant choice.
But you won't get everyone else to change because it'd require new firewall and new routers, and all kinds of software rewrites, and you can see how long the IPv6 transition has taken, so you'd still need to wrap and unwrap TCP/IP and find a way to address IP for everything that isn't 100% local, and even for lots of local-only stuff unless you want to rewrite everything.
There would be potential ways. E.g. you could certainly use a few bits to say "this is external" and then have some convention to pack an IPv4 address into the MAC or let an IPv6 address overflow into the data, and use that to make gatewaying and routing to external networks easier, while everything else just relies on the MAC. But you'd still need a protocol header for other things too, and then the question is how much benefit you would gain from ditching pretty much just ARP, which isn't exactly complex, a lookup table, and replacing the IPs in the header with just a destination MAC. Because the rest of the complexity is still there.
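As a toy illustration of that packing idea (the 02:01 prefix convention here is entirely made up):

    import ipaddress

    def ipv4_in_mac(ip: str) -> str:
        # Hypothetical convention: 0x02 = locally administered, 0x01 = a
        # made-up "this is external" flag, then the four IPv4 octets.
        octets = ipaddress.IPv4Address(ip).packed
        return "02:01:" + ":".join(f"{b:02x}" for b in octets)

    print(ipv4_in_mac("192.0.2.1"))  # 02:01:c0:00:02:01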
And you can gain most of the benefit of that by getting an IPv6 EUI64 address [1]. They'll work with "normal" IP equipment, and you can optimize in your own software by having the IP stack ditch ARP lookups when they see a local EUI64 address. Whether that optimisation actually makes a difference is another question.
Then you realize doing some action ends up being O(n^2) so you add some workaround in your switch and cache some things. And you know what they say about cache invalidation. And vendor A implemented it wrong in 1993 so you have a special case for their systems. And then you want to handle abuse cases. And authentication. And you're competing against the whole rest of the world and your thing isn't enough better.
Then how do you send traffic to device1 on another network? You need globally unique addresses and hierarchy. Go back to the drawing board and come back when you’ve ended up inventing a worse IP protocol.
> It all seems so... simple
Because you haven’t even thought through basic use cases.
MAC is just one way to identify ("address") directly connected/visible nodes on a network. Not all L2 technologies use MAC addresses.
- "Directly connected/visibile" means node X can contact node Y simply by throwing something on the medium (wire, radio, etc.) and doesn't have to knowingly send to a middleman (router).
When Ethernet was invented in the early 80's there were a lot more L2 technologies. Most are uncommon now (Frame Link DLCIs I think fall in this category, and PPP/dialup was common at one time - no MACs there) except for one: I don't think the cellular network uses MAC addresses at all. I could be wrong with newer 4G/5G stuff which overlaps with Wi-Fi in various places.
> I'm certain there are reasons IP came to live alongside/on top of MAC
There were different teams/universities working on what today we would call LAN and WAN. I forget the details and history (I'm sure someone here, who was involved, could chime in, hah) and might have this wrong, but the result is LAN networking is MAC based while WAN networking is IP based.
It's one of those accidents of history that things are just the way they are, and many don't question it. I run into it a lot when describing basic networking concepts or early Cisco material, when people ask _why_ both MACs and IP addresses exist, and it's just... not always the correct time to explain those details to them.
The fact that there used to be a lot of alternative lower level network layers is incidentally also the best argument for IP: We needed a common shared layer because it was shit trying to gateway between multiple different protocols that user-level software had to know about. And as much as ethernet is dominant now, it's still not the only thing.
It's been done. There's XNS, and there's QNX networking over raw Ethernet. Both worked well. Look them up. You probably don't want to go that route, but it is technically possible on a LAN.
There's also Audio-over-Ethernet, which is still used for professional audio where latency must be held low.
I believe there are other broadcast protocols that run on Ethernet for the same reason.
Fibre Channel-over-Ethernet used to be a thing too, but I haven't seen it in a while. Perhaps latency wasn't as much of an issue as people thought, and it lost to iSCSI.
I don't think it's necessarily a bad idea to run protocols directly on the data link layer; the fewer parts the better. It's just that somewhere, someone probably wants to route it, and the more general usage tends to win. So it's always going to be a niche market where latency is really important.
eCPRI, essentially radio signal over Ethernet (i.e. from the base station indoors to the radio module at the top of the tower/roof), is another interesting use case (with even stricter latency requirements than audio).
Move to IPv6 and drop ARP and DHCP, which eliminates a good chunk of the older cruft. IPv6 builds everything on top of multicast support, which is required for IPv6 switches/routers. It's so much cleaner. You could even avoid DNS if you really want.
Hmmm, so much of this looks like an attempt to solve the problems that were solved with Fibre Channel a couple decades back. Which I guess is standard NIH, with the advantage of not having to pay the FC consortium's 95% HW margins.
But still, you would think that some of those lessons could be learned before replacing it. FC routes IP as one of its many protocols, on top of lower levels that provide far more service guarantees than one normally gets with Ethernet. Much of the QoS/latency/etc. machinery was designed into FC from the beginning for use on storage area networks (SANs). It just never took off as an IP transport because it cost 10x as much as Ethernet, including a decade ago when these same groups tried to dump it on an Ethernet MAC, only to discover that it requires special switches which were $$$$ because of "enterprise markup", defeating the whole point of cheap Ethernet PHYs. See FCoE.
And yet today there is NVMe-oF on FC, which is what one runs when it's important that someone scp'ing a file on your network doesn't cause your database queries to slow down.
What I don't get is why OCP doesn't just actually build some of these adapters/etc with a "we won't be greedy" take and sell them not only to the hyperscalers but on the open market. That way someone could actually build say, a FC adapter that has a price similar to an ethernet adapter.
Maybe a better example would be Infiniband which is a simple and efficient protocol... but it's basically owned by Nvidia. For whatever reason Broadcom won't make Infiniband ASICs and Google doesn't want to be locked in to Nvidia so they have to use Ethernet.
"Hardware transport" is kind of a misnomer, because this is a networking protocol. It just happens to be a networking protocol that requires hardware acceleration on the NIC. We already had that in things like Infiniband and Omnipath, but Nvidia bought one and Intel rugpulled the other. Meanwhile ethernet has been approaching parity in terms of throughput, but TCP introduces unpleasant latency, so this is Google's NIH-flavored DIY on the topic. It's something of a rite of passage for a company to convince itself that building this sort of thing in-house is necessary, and that it will revolutionize high-performance computing in all the ways that previous, nearly-identical projects have not.
"The ecosystem" is The Open Compute Project [1], a trade association which mostly puts together quasi-standards that provide targets so computer manufacturers can produce bleeding-edge gear with some hope that it will be interoperable. An example OCP production is the newer 21" racks that are starting to appear in datacenters.
> It's something of a rite of passage for a company to convince itself that building this sort of thing in-house is necessary, and that it will revolutionize high-performance computing in all the ways that previous, nearly-identical projects have not.
The same happens with:
- databases
- encryption
- operating systems
- frameworks
- programming languages
I've seen this so many times by now it stopped being funny.
Are you suggesting that it would have been better for Google to use off-the-shelf databases etc? Because at their scale it seems clearly necessary to bring that in-house.
At Google's scale a lot of the ordinary limitations do not apply. Unfortunately many companies believe that because Google does it it must be good. It's cargo cult reasoning and the result is endless NIH projects.
As for Google's 'Falcon' project: it smacks of NIH to me, but maybe their use cases are specific enough that none of the off-the-shelf bits were usable.
There seems to be quite a lot of overlap between Falcon and what the Ultra Ethernet Consortium is ostensibly working on. As well as Amazon's Scalable Reliable Datagram (SRD) thing. All of them, in a way, are about addressing deficiencies in RoCEv2 for large scale latency sensitive networking that you see in HPC and DL training.
But none of these are things you can buy today. Well, there's InfiniBand, but if you're wed to Ethernet..
I think it's a mistake to assume that an organization this large and sophisticated simply failed to try RoCE. They probably gave it a go but it didn't work out for some technical or economic reason.
When you have enough scale you can claim that one particular way of doing things is better than the others, when in most cases it's just one way of doing things. This is what we see here.
Microsoft, Facebook and Twitter all use a monorepo too. It's not just Google.
Granted, Facebook have written their own VCS and Microsoft heavily modified Git to make it usable with monorepos (but only on Windows).
Unfortunately, stock Git is bad at both multirepos and monorepos. At "hundreds of people working full time on a project" scale, stock Git doesn't have a good answer.
To this day I still haven't seen a more sensible API for low-latency Ethernet than Exablaze (was the market leader in low-latency trading, then got bought by Cisco).
The only thing blocking these from becoming standard is that it means userland has direct control of hardware.
I'm confused by this because we've been using Falcon at work for over a year now, perhaps longer, as I just started a year ago. What are they making available that wasn't already?
I don’t understand networking all that well. Is it interesting that the telcos and non-tech companies are moving away from specialized hardware toward software defined networks while the hyperscalers are using hardware acceleration?
SDN just means reconfiguring things that used to be manually configured on-the-fly.
E.g. instead of your little server setting firewall rules locally, it tells the router what traffic to allow. That router, in turn, tells upstream about its needs, and so on. Or a server reports its load, and the routers do active load balancing. The hardware wires are still there, as always.
Re. hardware acceleration, I think the earliest form of this was moving the checksum computation [1] from the CPU to the network device, even though the networking device didn't really know about the protocol it was doing the checksum for.
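For reference, that checksum is the RFC 1071 ones'-complement sum; this is the whole computation the NIC took over from the CPU:

    def inet_checksum(data: bytes) -> int:
        # RFC 1071: ones'-complement sum of 16-bit big-endian words.
        if len(data) % 2:
            data += b"\x00"  # pad odd-length input
        total = sum(int.from_bytes(data[i:i + 2], "big")
                    for i in range(0, len(data), 2))
        while total >> 16:  # fold carries back into the low 16 bits
            total = (total & 0xFFFF) + (total >> 16)
        return ~total & 0xFFFF

    # IPv4 header with its checksum field zeroed; yields the classic 0xb861.
    hdr = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
    print(hex(inet_checksum(hdr)))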
In both, it's just about parallelizing workloads, the same we do with microservices at the upper levels of the stack. Natural progression of distributed systems, with fancy names attached.
I don't think it's that hard to rely on hardware development; it's more a problem of rolling out a fleet of new hardware.
It's just not realistic to take all the switches, routers, and other garbage you've got in between points in the network off the rack/ceiling/wall/pole because the hardware can't support some protocol.
Good evidence for this is the rollout of fiber, which has been happening neighborhood by neighborhood and house by house for a decade.
I was mainly talking about your own DC, but yes, if you need to traverse public infra it's a complete non-starter. But also, it's not like middleboxes offer you any sort of SDN API; you still need to overlay.
> Is it interesting that the telcos and non-tech companies are moving away from specialized hardware toward software defined networks while the hyperscalers are using hardware acceleration?
Their SDN implementations are also hardware accelerated.
These things are partially related - telcos running things in the cloud, in software, is actually running on top of these hardware innovations, it is just abstracted from them.
It sounds like this builds on top of Ethernet to provide a higher performance alternative to UDP/TCP, with some sort of hardware acceleration.
I may be in over my head since I’m not an HPC/datacenter expert, but not sure I understand how you’d use this on the software side. Maybe someone is aware of specific examples? (beyond the vague “HPC/AI”)
edit: as another comment mentioned, the diagram shows it’s on top of UDP/IP, so it’s mostly an alternative to TCP/IP
I normally like Google blog announcements, as they are usually heavy on technical details. But not this one. Quoting, the meat of it is:
> Fine-grained hardware-assisted round-trip time (RTT) measurements with flexible, per-flow hardware-enforced traffic shaping, and fast and accurate packet retransmissions, are combined with multipath-capable and PSP-encrypted Falcon connections ... flexible ordering semantics and graceful error handling ... hardware and software are co-designed to work together to help achieve the desired attributes of high message rate, low latency, and high bandwidth
So like QUIC, but designed for low latency. Maybe. There is no indication of how they achieve it, if that's what it is, nor is there a link to further details. The bulk of the article is literally name-dropping: protocol names, FAANG company names, standards organisation names. It reads like C-suite bait. "Come join us boys - all the big guys already have. So it's a sure winner."
I was confused by the reference to “lossy” networks in this page. Does this have a different meaning in this context than something like lossy compression where data is actually discarded?
Ethernet, unlike, say, Infiniband, doesn't promise things will get where you sent them just because it didn't error initially, so other protocols handle this at higher levels to notice the failure cases.*
For an example of what this means, try setting your MTU above the limit, and watch the raw traffic.
* - it's been years since I cared about the formal definition, my apologies if I got it wrong.
All networks are “lossy” because any cable can be cut, etc.
A “lossy” protocol is one that doesn’t attempt to compensate for that. In most cases, but not all, that means a higher level protocol will need to ensure that every bit of data has made it through. (An example protocol that might not care is one for watching broadcast TV on the Internet… if you miss a few seconds it’s not a big deal).
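A minimal sketch of that compensation one layer up; the sequence numbers and retransmit request are hypothetical, just to show the shape:

    # Receiver side of a made-up reliable layer over a lossy transport.
    expected_seq = 0

    def on_packet(seq: int) -> None:
        global expected_seq
        if seq > expected_seq:
            # A gap means the layer below dropped packets; ask again.
            print(f"lost {expected_seq}..{seq - 1}, requesting retransmit")
        expected_seq = max(expected_seq, seq + 1)

    for s in [0, 1, 4, 5]:  # packets 2 and 3 were dropped in transit
        on_packet(s)        # prints: lost 2..3, requesting retransmit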
I guarantee that there will eventually be a vaguely similar (but different!) stack published by each of: NetFlix, Microsoft, Amazon, and Apple. Just kidding, Apple won't publish anything.
The IT ecosystem has fragmented into mutually incompatible cliques. You are either in the Google ecosystem, the Amazon ecosystem, or some other one, but there are no more truly open and industry-wide standards.
Look at WebAuthN: it enables a mobile device from "any" vendor to sign on to web pages without a password. Great! Can I transfer secrets from an Apple iPhone to a Google Android phone? Yes? No? Hello? Anyone there?
I just got a new camera. It can take HDR still images, which look astonishingly good. Can I send that to an Apple device? Sure! Can I send it to a Google device? Err... not without transcoding it first... on a Microsoft Windows box. Can I send it to a mailing list of people with mixed-vendor devices? Ha-ha... no.
This is the best argument I've seen for splitting up the FAANGs + Microsoft + NVIDIA. Once they get to this behemoth trillion-dollar scale, they become nations unto themselves and no longer need to cooperate, no longer need to use any open standards at all, and can start dictating and pushing third parties around.
Another random example is HTTP/3, which is basically the "What's best for Google" protocol.
Or gRPC, which is "What Google needs in their data centre".
And now Falcon, which is "The transport Google needs for their workloads".
Does it work for anyone else? I don't know, but it's a certainty that Google doesn't care and never will, because they don't need to.
This is exaggerated to the point that I consider it fiction. BTW Google doesn't substantially use gRPC within their datacenters.
The industry has always been this way. Back in the day there were many, many processor ISAs that have now consolidated. There have been many networking standards that consolidated (IPX/SPX, anyone?). New things often diverge because of new requirements, not out of spite. There is a push and pull between standardization and innovation. That doesn't make it particularly unhealthy unless you can point to specific metrics and compare trends throughout the long arc of time.
> This is exaggerated to the point that I consider it fiction. BTW Google doesn't substantially use gRPC within their datacenters.
For explicitness: Google uses Stubby, which shares a lot of interface-level commonality with gRPC, but there are differences at the runtime level. Nobody is slinging JSON or SOAP around Google data centers.
At least I enjoy taking the plane and I decide when to take it. Planes serve a purpose. Adtech is just parasitic. An Internet without ads wouldn't need QUIC.
I don't understand what the objection is to the methodology. Are you claiming there is another party that had a better sample (even subjectively so), was pushing it, and didn't succeed? If anything, standards committees are often overly annoying and biased the other way, just because some other company representative wants to justify their presence. That is also cherry-picked: for example, standardizing ALPN over NPN, which is largely a downgrade for the average user, was done in the standards process. The examples you use simply indicate a certain company is ahead of the game in solving problems others also have.
Not sure why Netflix would be in your list. AFAIK, they run their cloudy stuff on AWS, which isn't too unusual; Chaos Monkey is neat though? Their CDN boxes are exotic because they run a lot of sessions at relatively pedestrian bandwidths, and a lot of times 5-20Mbps adds up to a huge number. There's real work there and it's impressive, but it doesn't need exotic network protocols. Bulk encryption offloading NICs are super handy for their use case, which is certainly somewhat exotic.
They haven't said much lately about sending content updates to their CDN nodes, but I think the throughput requirements on that aren't as high.
I agree: the corporates have stopped building and making things _for_ their users. They make their services so the users have no option but to get more and more comfortable in their ecosystem, and never get out.
Very soon, we will have providers/companies/champions/fighters that keep building the middleware transports to connect their behemoths.
Btw, someone somewhere came up with a better term: AGAMEMNON (Apple, Google, Amazon, Microsoft, Ebay, Meta, Nvidia, OpenAI, Netflix).
Because the whole point of the linked article is that they're making it part of the Open Compute Project, whose entire existence is devoted to making sure things are compatible with other things.
> The IT ecosystem has fragmented into mutually incompatible cliques. You are either in the Google ecosystem, the Amazon ecosystem, or some other one, but there are no more truly open and industry-wide standards.
This is one excellent example of the reason that increased/renewed anti-trust actions by the FTC are necessary.
There's no interesting distinction between a "native" transport protocol and a transport protocol running on top of a UDP shim. The UDP header is probably necessary for ECMP.
Maybe I'm misunderstanding what you're trying to say, but there are major differences between an Ethernet + IP transport and other transports like Fibre Channel (or even Token Ring, ATM, etc.), which have buffer crediting/flow control, retransmission, prioritization, etc. built into the lowest layers. Sure, you can build much of that higher in the stack, but it requires everything in the network to be playing the same game to assure QoS metrics, and if that's the case you don't really have a normal IP network anymore.
By transport protocol I mean layer 4. FC/TR/ATM/IB are (mostly) layer 2 protocols.
I think the idea is that Falcon assumes the underlying network is semi-crappy and works around that (e.g., Falcon assumes that packets arrive out of order and then puts them back in order).
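A toy version of that reordering step (heap-based; nothing here is Falcon-specific, just the general technique):

    import heapq

    buf, next_seq = [], 0

    def deliver(payload):  # stand-in for handing data up the stack
        print("delivered:", payload)

    def receive(seq, payload):
        global next_seq
        heapq.heappush(buf, (seq, payload))
        # Release the longest in-order run we now have.
        while buf and buf[0][0] == next_seq:
            deliver(heapq.heappop(buf)[1])
            next_seq += 1

    for seq, p in [(1, "b"), (0, "a"), (2, "c")]:  # network reordered them
        receive(seq, p)  # still delivers a, b, c in order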
OSI is an entire networking stack designed by committee that died from disuse [1]. The only thing we now remember of it are the functional layers within the protocol.
For example, people sometimes refer to TCP as a "Layer 4" protocol even though (a) TCP predates the invention of Layer 4 and (b) TCP is a square peg that does not exactly fit into the round hole that is Layer 4.
I wish we could forget the other remnants of OSI like X.509, ASN.1, and LDAP. Those were the things good enough to be used for real systems, and I'd still rather crawl over broken glass than implement any of them.
The diagram shows it layered on top of RDMA and NVM Express, and supporting UDP and IP. Unless you think the reverse makes sense. The diagram is just upside down.
RDMA is a low-level physical transport. You are saying that they are going to emulate RDMA on top of Falcon. Are they going to run Ethernet and then IP on top of that?
The diagram is confusing since it's upside down relative to the layering direction, but the article is clear. Falcon is a hardware transport protocol, replacing Ethernet. Like Ethernet, it runs on top of a physical transport like RDMA. And IP runs on top of Falcon and Ethernet.
RDMA stands for Remote Direct Memory Access. The version that runs on ethernet is known as RDMA over Converged Ethernet, or RoCE. Until recently, the most common hardware interconnect used to support RDMA was Infiniband. Omnipath existed for a while.
The point is: RDMA is just a term for a computer reading memory on another computer without involving the operating system. The interconnect hardware and physical transport must support RDMA but that doesn't mean it is RDMA, as there are several different implementations of RDMA, and each kind of hardware supports a different subset of those implementations.
Maybe, but often in practice the amount of support needed gets high. It needs some very involved people to keep support going, otherwise it will bit rot and fail to function after a time.