Going multipath without Multipath TCP (benjojo.co.uk)
114 points by FiloSottile on Feb 25, 2022 | 14 comments


I worked with the MPTCP Linux implementation before it was upstreamed. One of the exciting things about it was that it was the default: IPPROTO_TCP would attempt an MPTCP handshake. This meant that any unmodified application that had used TCP could inherit the magic of connection bonding and bandwidth aggregation, when used with an MPTCP capable server. Browser downloads, SSH connections, iperf, you name it!

It seems that one of the concessions necessary to get MPTCP upstreamed was to make it opt-in. Even if the distribution enables MPTCP in its kernel config, and even if the sysctl parameter that controls it leaves it enabled, applications must STILL request it via IPPROTO_MPTCP. It was probably the correct choice, but to me it destroyed a major benefit of MPTCP. If it can't (easily) be used to let unmodified applications reap its benefits, then why was it written as a standards-based, backward-compatible extension to TCP? Adopting those constraints meant so many extra hurdles to overcome. Modern protocol development, especially at the "transport" layer (though usually based on UDP), is most easily done in userspace these days. This enables easier experimentation, iteration, and delivery of updates to your users. See Chrome and SPDY/QUIC. (Incidentally, I recall some discussion of "multipath QUIC"; I wonder where that led.)
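For anyone wondering what "opt-in" looks like from the application side, here is a minimal sketch in Python. It assumes a Linux host; `open_stream_socket` is a name I made up for illustration, and the fallback value 262 is the Linux definition of IPPROTO_MPTCP for Python builds that don't expose the constant.

```python
import socket

# Linux defines IPPROTO_MPTCP as 262; older Python builds may lack the constant.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)


def open_stream_socket(family=socket.AF_INET):
    """Return (sock, used_mptcp): try MPTCP first, fall back to plain TCP.

    Hypothetical helper, not part of any library.
    """
    try:
        sock = socket.socket(family, socket.SOCK_STREAM, IPPROTO_MPTCP)
        return sock, True
    except OSError:
        # Kernel built without MPTCP, MPTCP disabled via sysctl, or non-Linux host.
        sock = socket.socket(family, socket.SOCK_STREAM, socket.IPPROTO_TCP)
        return sock, False


if __name__ == "__main__":
    sock, used_mptcp = open_stream_socket()
    print("MPTCP socket" if used_mptcp else "plain TCP fallback")
    sock.close()
```

The point is that the change is a single argument to `socket()`, but it still has to be made in every application (or interposed somehow), which is exactly the hurdle described above.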

To bring this comment back to the article, I love it. It sidesteps the protocol development hurdles and just says "if we're going to require modified clients and servers anyway, let's shove it into the application layer in a library". I'm sure approaches like this could handle all the same use cases:

- maximizing throughput of multiple interfaces, as the article does

- connection handover (by making a new connection over a new IP/interface before breaking the old one)

- reducing latency by racing the same data across all interfaces

- using metered (e.g. LTE) connections as backup for poorly performing default connections

Of course, applications would need to be modified to use such a library, but that's what the community seems to have decided is the way forward regardless :/
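To make the "shove it into a library" idea concrete, here is a rough sketch of the core trick: tag chunks with sequence numbers, spread them over several connections, and reorder on the far side. This is not the article's code; the framing format and the `stripe`/`reassemble` names are made up for illustration, and the "paths" are plain lists standing in for real sockets.

```python
import struct

# Hypothetical framing: 4-byte sequence number, 2-byte payload length.
HEADER = struct.Struct("!IH")


def stripe(data, num_paths, chunk_size=4):
    """Split data into framed chunks and deal them round-robin across paths."""
    paths = [[] for _ in range(num_paths)]
    for seq, offset in enumerate(range(0, len(data), chunk_size)):
        payload = data[offset:offset + chunk_size]
        paths[seq % num_paths].append(HEADER.pack(seq, len(payload)) + payload)
    return paths


def reassemble(frames):
    """Rebuild the byte stream from frames arriving in any order.

    Duplicate frames (e.g. the same chunk raced over two paths) collapse
    into one entry because they share a sequence number.
    """
    chunks = {}
    for frame in frames:
        seq, length = HEADER.unpack_from(frame)
        chunks[seq] = frame[HEADER.size:HEADER.size + length]
    return b"".join(chunks[seq] for seq in sorted(chunks))


if __name__ == "__main__":
    paths = stripe(b"hello multipath world", num_paths=2)
    arrived = paths[1] + paths[0]  # pretend path 1 delivered first
    print(reassemble(arrived))
```

A real library would also need to detect missing chunks, apply backpressure, and handle a path dying mid-transfer; this sketch does none of that.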


I didn't realize it was such a drop-in replacement. Couldn't you then use a LD_PRELOAD to translate TCP into MPTCP? Then applications wouldn't need to be modified.


It's probably possible to do that, assuming you have a kernel properly configured and the sysctls and privileges established. I had no problems using the out-of-tree MPTCP kernel for a variety of TCP based tasks, so most applications had no issue with it. Of course then the next requirement is a server which supports MPTCP as well... which means you need at least two hosts with very specific kernels, and maybe LD_PRELOAD setups :/

And to be honest, there are good reasons not to enable MPTCP by default. For one, there are security consequences. "Hi Bob, my name is Alice, I have a totally different IP address but would like to join into this conversation as well..." MPTCP has protections for this based on session keys that are exchanged. But for many people, it's still a concern. Another good reason is that most applications don't benefit from multiple paths, so it's a lot of extra complexity and packets being sent for no benefit. And finally, there are many possible use cases for multipath reliable sockets, as I detailed above. Each requires a different policy (to determine when to make new connections across different IP addresses, or how to schedule data to transmit across each subflow). It's unlikely that an unmodified application would get the kind of behavior that the user desires. How do you decide whether you should treat one subflow as a metered "backup" connection, or whether you should be blasting data across all subflows?

This could all be configured by a user who has dutifully read enough manual pages and implemented enough netlink applications (or BPF scripts?) to control the kernel's behavior, I'm sure.
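To show why one default can't fit everyone, here is a toy scheduler sketch covering the policies mentioned above. Everything in it (the `Subflow` class, the policy names, `pick_subflows`) is hypothetical and only looks at whether a link is up or metered; a real scheduler would also weigh RTT, loss, and congestion windows.

```python
from dataclasses import dataclass


@dataclass
class Subflow:
    name: str
    up: bool = True
    metered: bool = False  # e.g. an LTE link that should only act as a backup


def pick_subflows(subflows, policy, seq=0):
    """Return the subflows that should carry chunk number `seq`."""
    alive = [s for s in subflows if s.up]
    if not alive:
        return []
    if policy == "redundant":
        # Race the same data on every live path to cut latency.
        return alive
    if policy == "aggregate":
        # Spread chunks round-robin over all live paths for throughput.
        return [alive[seq % len(alive)]]
    if policy == "backup":
        # Use metered links only when no unmetered link is up.
        preferred = [s for s in alive if not s.metered] or alive
        return [preferred[seq % len(preferred)]]
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    wifi, lte = Subflow("wifi"), Subflow("lte", metered=True)
    for policy in ("redundant", "aggregate", "backup"):
        print(policy, [s.name for s in pick_subflows([wifi, lte], policy, seq=1)])
```

The same two links give three different answers depending on the policy, and only the application (or its user) knows which one is wanted.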

So, the moral of the story to me is that drop-in replacement was an exciting goal, and something that felt important. Losing it definitely meant that MPTCP was doomed not to reach critical mass, and it probably won't see widespread deployment. But losing it also means that we can see applications fine-tune a solution like this one for their use case, which feels like a win.


To expand on this - if you are going to mess with LD_PRELOAD, why not stick this whole library behind LD_PRELOAD?

You get similar levels of compatibility (except for the odd app that shirks libc and hand-codes its system calls...). But you also get the ability to do userspace protocol development, and you have an easier time configuring the policy for how you handle new connections and subflows.


Thanks for the explanation, the Linux internals of this are beyond my understanding. It reminded me of tsocks (for wrapping connections in a SOCKS proxy), so I thought a similar thing would work. But it makes sense that you don't really get benefits unless you've tuned MPTCP.


Super interesting, thanks for sharing.

Your mention of SSH makes me curious: could this be used for roaming SSH? Perhaps by patching only the client?


Yes, roaming is an explicit goal of MPTCP. MPTCP even contains "break before make" functionality. If you think about it, there are two options:

1. Make a new connection subflow on your new IP address, while still holding onto the old IP address and subflow. Once this is done, you can close the old connection and disconnect your old network interface. This is "make before break" and it is limited to those who actually have two separate network interfaces, and can only be used if they overlap (e.g. wifi connections plus a persistent LTE connection).

2. Break the existing subflow in a way that the higher-level MPTCP session management knows that you don't want to tear down the whole connection. Then, roam to a new IP address, and re-connect to the server using your new IP. So long as you cryptographically prove you're the same client (which MPTCP already requires), then you can re-establish the same logical connection. The MPTCP socket never "disconnected", and the application (in this case, SSH) continues to believe it has one connection. This is break-before-make.

MPTCP break before make is awesome because it's one of the few benefits which is available for single-homed clients. You don't need multiple interfaces (e.g. ethernet, wifi, and lte) to benefit from it!
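The "cryptographically prove you're the same client" step is the part that makes break-before-make safe, so here is a small sketch of the idea at the application layer. It is loosely inspired by the way MPTCP authenticates a joining subflow with an HMAC over exchanged keys and nonces, but it is NOT the MPTCP wire format; the `Server` class, `prove` function, and session-id scheme are all made up for illustration.

```python
import hashlib
import hmac
import secrets


class Server:
    """Hypothetical session table for an application-level resumable connection."""

    def __init__(self):
        self.sessions = {}  # session id -> shared key
        self.pending = {}   # session id -> outstanding challenge nonce

    def open_session(self):
        """First connection: hand the client a session id and a shared key."""
        session_id, key = secrets.token_hex(8), secrets.token_bytes(32)
        self.sessions[session_id] = key
        return session_id, key

    def challenge(self, session_id):
        """Issue a fresh nonce for one rejoin attempt."""
        nonce = secrets.token_bytes(16)
        self.pending[session_id] = nonce
        return nonce

    def rejoin(self, session_id, proof):
        """Accept a connection from a new address if the proof matches."""
        key = self.sessions.get(session_id)
        nonce = self.pending.pop(session_id, None)  # each nonce is single-use
        if key is None or nonce is None:
            return False
        expected = hmac.new(key, nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, proof)


def prove(key, nonce):
    """Client side: prove possession of the key without sending it."""
    return hmac.new(key, nonce, hashlib.sha256).digest()


if __name__ == "__main__":
    server = Server()
    session_id, key = server.open_session()   # over the old network
    nonce = server.challenge(session_id)      # after roaming to a new IP
    print(server.rejoin(session_id, prove(key, nonce)))
```

The server never looks at the client's IP address here: possession of the key is what identifies the session, which is why a single-homed client can drop off one network and reappear on another.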


OK, I'm not following completely, but I'm curious enough that I need to try to compile an SSH client with this. I appreciate the explanation, though.


Great post! One quibble is that it says this:

> This extension is currently used sparsely, with the only two commonly deployed uses being the OpenMTCPRouter Project and Apple’s Siri

Siri was the first application supported, but now it’s available more broadly on iOS:

https://developer.apple.com/documentation/foundation/urlsess...


There are many more than "two commonly deployed uses" of MPTCP. To name a few: OpenMPTCPRouter is an improved fork of OVH's Overthebox project ( https://www.ovhtelecom.fr/overthebox/ ), and Tessares is a company that provides turnkey MPTCP solutions for many ISPs ( https://www.tessares.net/customers ). MPTCP was designed to be transparent to the end user, so you might be using it without knowing it, from your phone, your computer, or your internet box.


This is the future of censorship-resistant networking.

Man-in-the-middle interception only works because all the information is dumped into one series of packets transferred over a single src-dst ip:port connection. Crack one stream and you get everything. If we somehow split the bytes and bits and shuffled them across multiple ip:port connections, spanning multiple ISPs, then middleboxes would be rendered useless. Targeting one TCP stream is easy, but correctly selecting M out of N connections is nearly impossible.


Most real-world censorship doesn't seem to happen at the level you are describing as a major weakness here, does it?


It's the link layer. You can't defeat censorship without the physical groundwork.


This is interesting. I was thinking of doing a similar thing but running it on top of quic instead of tcp.

Furthest I got was having a somewhat working `quicat`.



