
Being that prescriptive is unworkable in practice. Propagating unknown attributes is what made it possible to deploy 32-bit AS numbers (originally RFC 4893; unaware routers pass the `AS4_PATH` attribute along without needing to comprehend it), large communities (RFC 8092), the Only To Customer attribute (RFC 9234) and others.

A BGP Update message is mostly just a container of Type-Length-Value attributes. As long as the TLV structure is intact, you should be able to just pass on those TLVs without problems to any peers that the route is destined for.
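
To make the "container of TLVs" point concrete, here is a minimal Python sketch of walking the path attributes section of an Update. The header layout (flags octet, type code, then a one- or two-octet length depending on the Extended Length flag) follows RFC 4271; the names `PathAttribute` and `iter_path_attributes` are made up for this example, and this is nowhere near a full parser.

```python
import struct
from typing import Iterator, NamedTuple

# Attribute flag bits from RFC 4271, section 4.3.
FLAG_OPTIONAL = 0x80
FLAG_TRANSITIVE = 0x40
FLAG_PARTIAL = 0x20
FLAG_EXTENDED_LENGTH = 0x10


class PathAttribute(NamedTuple):  # hypothetical container type
    flags: int
    type_code: int
    value: bytes


def iter_path_attributes(buf: bytes) -> Iterator[PathAttribute]:
    """Yield each attribute in turn without interpreting its value."""
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < 3:
            raise ValueError("truncated attribute header")
        flags, type_code = buf[offset], buf[offset + 1]
        offset += 2
        if flags & FLAG_EXTENDED_LENGTH:
            # Two-octet length, network byte order.
            if len(buf) - offset < 2:
                raise ValueError("truncated extended length")
            (length,) = struct.unpack_from("!H", buf, offset)
            offset += 2
        else:
            length = buf[offset]
            offset += 1
        if offset + length > len(buf):
            raise ValueError("attribute value runs past end of buffer")
        yield PathAttribute(flags, type_code, buf[offset:offset + length])
        offset += length
```

Note that the walker only needs the length to step over an attribute it has never heard of; the value bytes can be copied verbatim.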

The problem fundamentally is three things:

1. The original BGP RFC suggests tearing down the connection upon receiving an erroneous message. This is a terrible idea, especially for transitive attributes: you'll just reconnect and your peer will resend you the same message, flapping over and over, and the attribute is likely to not even be your peer's fault. The modern recommendation is Treat As Withdraw, i.e. remove any matching routes from the same peer from your routing table.

2. A lack of fuzz testing and similar robustness testing by BGP implementers (Arista in this case).

3. Even among vendors that have done such testing, a number have decided (IMO stupidly) to require you to turn on these robustness features explicitly.
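
As a rough illustration of point 1, here is a toy Python sketch of treat-as-withdraw (the approach described in RFC 7606). Everything here is hypothetical: the RIB is a plain dict keyed by `(peer, prefix)`, `parse` stands in for whatever attribute parser is in use, and the sketch assumes the prefixes in the update could still be parsed even though an attribute could not.

```python
class MalformedAttribute(Exception):
    """Raised by the (hypothetical) parser when an attribute is broken."""


def handle_update(rib, peer, prefixes, raw_attrs, parse):
    """Install routes from an update, or withdraw them if it is malformed.

    The session with `peer` stays up either way; nothing here resets it.
    """
    try:
        attrs = parse(raw_attrs)
    except MalformedAttribute:
        # Treat as withdraw: drop only this peer's routes for these prefixes.
        for prefix in prefixes:
            rib.pop((peer, prefix), None)
        return "treat-as-withdraw"
    for prefix in prefixes:
        rib[(peer, prefix)] = attrs
    return "installed"
```

Routes learned from other peers for the same prefix are untouched, so traffic can fall back to them instead of the session flapping.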



PNG solved this problem when BGP was still young: each section of an image document is marked as to whether understanding it is necessary to process the payload or not. So image transform and palette data is intrinsic, but metadata is not. Adding EXIF for instance is thus made trivial. No browser needs to understand it so it can be added without breaking the distribution mechanism.
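
For anyone who hasn't looked at the format: the marking is bit 5 (0x20) of the first byte of the four-letter chunk type, so a lowercase first letter means ancillary and an uppercase one means critical. A small Python sketch of a decoder loop that relies on this; `KNOWN_CHUNKS`, `make_chunk` and `walk_chunks` are names invented for the example, and it is not hardened against truncated input.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# The chunks this toy decoder "understands" (an assumption for the example).
KNOWN_CHUNKS = {b"IHDR", b"PLTE", b"IDAT", b"IEND"}


def is_ancillary(chunk_type: bytes) -> bool:
    # Bit 5 of the first type byte: set (lowercase letter) means ancillary.
    return bool(chunk_type[0] & 0x20)


def make_chunk(chunk_type: bytes, body: bytes) -> bytes:
    """Build one chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + body)
    return struct.pack("!I", len(body)) + chunk_type + body + struct.pack("!I", crc)


def walk_chunks(data: bytes):
    """Return (handled, skipped) chunk types; fail on unknown critical chunks."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG stream")
    handled, skipped = [], []
    offset = len(PNG_SIGNATURE)
    while offset < len(data):
        length, chunk_type = struct.unpack_from("!I4s", data, offset)
        body = data[offset + 8:offset + 8 + length]
        (crc,) = struct.unpack_from("!I", data, offset + 8 + length)
        if zlib.crc32(chunk_type + body) != crc:
            raise ValueError("CRC mismatch")
        if chunk_type in KNOWN_CHUNKS:
            handled.append(chunk_type)
        elif is_ancillary(chunk_type):
            skipped.append(chunk_type)  # safe to ignore
        else:
            raise ValueError("unknown critical chunk")
        offset += 12 + length  # 4 length + 4 type + data + 4 CRC
    return handled, skipped
```

Feeding it a stream containing an `eXIf` chunk puts that chunk in `skipped` and decoding carries on, while an unknown chunk with an uppercase first letter stops it.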


This is also how BGP (mostly) solved it. Each attribute has a 'transitive' bit. Unknown attributes with the 'transitive' bit set are passed on; those without it are discarded.
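
A minimal Python sketch of that rule, as I read RFC 4271: unrecognized optional transitive attributes are forwarded with the Partial bit set, and unrecognized optional non-transitive ones are quietly dropped. The function name and the tuple representation are invented for the example, and recognized attributes are simply copied through here, whereas a real speaker would process each one.

```python
# Attribute flag bits from RFC 4271.
FLAG_OPTIONAL = 0x80
FLAG_TRANSITIVE = 0x40
FLAG_PARTIAL = 0x20


def attributes_to_propagate(attrs, known_types):
    """Filter (flags, type_code, value) tuples for re-advertisement."""
    out = []
    for flags, type_code, value in attrs:
        if type_code in known_types:
            # Simplification: a real implementation handles these per type.
            out.append((flags, type_code, value))
        elif flags & FLAG_OPTIONAL and flags & FLAG_TRANSITIVE:
            # Unknown but transitive: pass along, marked Partial.
            out.append((flags | FLAG_PARTIAL, type_code, value))
        elif flags & FLAG_OPTIONAL:
            # Unknown and non-transitive: quietly ignore.
            continue
        else:
            # Unknown well-known attribute; how to react is a separate question.
            raise ValueError(f"unrecognized well-known attribute {type_code}")
    return out
```

For example, with only type 1 (ORIGIN) known, an unknown attribute with flags 0xC0 comes out with flags 0xE0, and an unknown one with flags 0x80 disappears. The type codes in any such test are arbitrary values chosen for illustration.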


... Except for acTL, which is an exception because it turns out that wasn't sufficient to ensure consistency in 100% of cases.


I was never that enthusiastic about motion PNGs in the first place. We have so many other ways to achieve that now.


You're suggesting that being liberal in what you accept is necessary for forward evolution of the protocol, but I think you're presenting a false dichotomy.

In practice there are many ways to allow a protocol to evolve, and being liberal in what you accept is just about the worst way to achieve that. The most obvious alternative is to version the protocol, and have each node support multiple versions.

Old nodes will simply not receive messages for a version of the protocol they do not speak. The subset of nodes supporting a new version can translate messages into older versions of the protocol where it makes sense, and they can do this because they speak the new protocol, so can make an intelligent decision. This allows the network to function as a single entity even when only a subset is able to communicate on the newer protocol.
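
A toy Python sketch of what that could look like, purely to illustrate the idea. Nothing here corresponds to a real protocol: `Message`, `downgrade`, `send` and the per-version field table are all hypothetical, and "translation" is reduced to dropping fields the older version does not define.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:  # hypothetical message type
    version: int
    fields: dict


def downgrade(msg, target_version, fields_by_version):
    """Translate to an older version by keeping only the fields it defines."""
    allowed = fields_by_version[target_version]
    kept = {k: v for k, v in msg.fields.items() if k in allowed}
    return Message(target_version, kept)


def send(msg, sender_versions, receiver_versions, fields_by_version):
    """Return the message the receiver gets, or None if no version is shared."""
    usable = {v for v in sender_versions & receiver_versions if v <= msg.version}
    if not usable:
        return None  # the receiver never sees a version it cannot speak
    target = max(usable)
    if target == msg.version:
        return msg
    return downgrade(msg, target, fields_by_version)
```

The sender decides what survives translation because it understands both versions; the old node only ever receives messages that are valid in a version it speaks.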

With strict versioning and compliance with the specification, reference validators can be built and fitted as barriers between subnetworks so that problems in one are less likely to spread to others. It becomes trivial for anyone to quickly detect problems in the network.



