It's not a focus of this project, but in a different project we have set up some pretty careful measurements (running every few days) to test whether this happens to different congestion-control schemes, including CUBIC (Linux's default for an Internet stream socket) and Sprout (which Salsify's transport protocol is based on).
The bottom line is that TCP CUBIC is actually pretty bad for reasonable-length flows, and Sprout is empirically better (at the cost of getting a lot less throughput!).
I guess one key thing to note here is that Salsify is not a "high-bandwidth" transport -- one of the main goals is to wait until the network is ready for a frame (killing off encoder outputs if necessary) to avoid overloading the network and provoking packet loss or queueing delay.
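A toy Python sketch of that "wait until the network is ready" idea, not Salsify's actual code. The names `pick_frame` and `simulate`, the per-frame byte budget, and the notion of offering several candidate encodings per frame are all illustrative assumptions:

```python
from typing import List, Optional, Sequence


def pick_frame(candidate_sizes: Sequence[int], budget_bytes: int) -> Optional[int]:
    """Return the index of the largest candidate that fits the budget.

    Returns None when nothing fits, meaning every encoder output for this
    frame is discarded and nothing is sent.
    """
    best: Optional[int] = None
    for i, size in enumerate(candidate_sizes):
        fits = size <= budget_bytes
        if fits and (best is None or size > candidate_sizes[best]):
            best = i
    return best


def simulate(frames: Sequence[Sequence[int]],
             budgets: Sequence[int]) -> List[Optional[int]]:
    """For each frame, record the size sent, or None if the frame was skipped.

    `budgets[i]` stands in for whatever the transport estimates the network
    can absorb at frame time i (a hypothetical input here).
    """
    sent: List[Optional[int]] = []
    for candidates, budget in zip(frames, budgets):
        choice = pick_frame(candidates, budget)
        sent.append(None if choice is None else candidates[choice])
    return sent


if __name__ == "__main__":
    # Two candidate encodings per frame (bytes); the budget shrinks over time.
    frames = [[1200, 800], [1200, 800], [1200, 800]]
    budgets = [1500, 900, 100]
    print(simulate(frames, budgets))
```

The point of the sketch is the `None` case: when the budget is too small, the sender drops the frame instead of pushing it into the queue.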
Video over RTP can mean a lot of things, including totally unresponsive traffic that doesn't vary the sending rate in response to congestion signals at all. If an app does this, yeah, life can suck.
I just remember going through a log and seeing 3 RTP streams that had gotten into a sort of harmonic oscillation: each would observe congestion at a different time and back off in turn. One stream would consume most of the bandwidth until its backoff triggered, then another stream would do the same.
I have no more detail on how they were managing backoff, though. This was about 6 years ago, with some random video-meeting software that probably doesn't exist anymore (I think hundreds of video-meeting companies started and died between 2010 and 2013).
For the CUBIC vs. Sprout measurements, see, e.g., http://pantheon.stanford.edu/result/1997/ and https://s3.amazonaws.com/stanford-pantheon/real-world/Stanfo...