
Hi everyone! I am one of the authors of Salsify. Ask me anything!


A practical issue I've seen with high-bandwidth non-TCP transports (including video over RTP) is how they behave in the presence of multiple other streams on the line; when bandwidth becomes limited, you often get one winner and a bunch of loser streams. Does Salsify address this at all?


It's not a focus of this project, but in a different project we have set up some pretty careful measurements (running every few days) to test whether this happens to different congestion-control schemes, including ones like CUBIC (Linux's default for an Internet stream socket) and Sprout (which Salsify's transport protocol is based on).

The bottom line is that TCP CUBIC is actually pretty bad for reasonable-length flows, and Sprout is empirically better (at the cost of getting a lot less throughput!).

See, e.g., http://pantheon.stanford.edu/result/1997/ and https://s3.amazonaws.com/stanford-pantheon/real-world/Stanfo...

I guess one key thing to note here is that Salsify is not a "high-bandwidth" transport -- one of the main goals is to wait until the network is ready for a frame (killing off encoder outputs if necessary) to avoid overloading the network and provoking packet loss or queueing delay.
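To make the "wait until the network is ready" idea concrete, here is a toy sketch, not Salsify's actual code. It assumes the transport hands the sender a per-frame byte budget and the encoder offers a few candidate encodings of the same frame; `pick_frame`, the candidate sizes, and the budgets are all hypothetical names and numbers for illustration.

```python
from typing import Optional, Sequence


def pick_frame(candidate_sizes: Sequence[int], budget_bytes: int) -> Optional[int]:
    """Return the index of the largest candidate that fits the budget.

    Returns None when nothing fits, meaning the frame is skipped and every
    encoder output for it is discarded rather than pushed onto the network.
    """
    best: Optional[int] = None
    for i, size in enumerate(candidate_sizes):
        fits = size <= budget_bytes
        if fits and (best is None or size > candidate_sizes[best]):
            best = i
    return best


# Hypothetical sizes (bytes) of a low- and a high-quality encoding of one frame.
candidates = [2000, 9000]

# Hypothetical budgets reported by the transport for three successive frames.
for budget in (12000, 3000, 500):
    choice = pick_frame(candidates, budget)
    if choice is None:
        print(f"budget={budget}: skip frame, send nothing")
    else:
        print(f"budget={budget}: send candidate {choice} ({candidates[choice]} bytes)")
```

The point of the sketch is only the shape of the decision: the sender never emits more than the network estimate allows, and dropping a frame is an acceptable outcome.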

Video over RTP can mean a lot of things, including totally unresponsive traffic that doesn't vary the sending rate in response to congestion signals at all. If an app does this, yeah, life can suck.


I just remember going through a log and seeing 3 RTP streams that had gotten into a sort of harmonic oscillation pattern where they would each observe congestion at different times and back off in turn, and then one stream would consume most of the bandwidth until its backoff triggered, then another stream would do the same.

I have zero more detail on how they were managing backoff though as this was like 6 years ago with some random video-meeting software that probably doesn't exist anymore (I think there were 100s of video meeting companies that started and died from 2010-2013).


How does this differ from SIP/SDP?


These are encapsulation and signaling protocols, but don't really cover the questions of when to send each compressed video frame, how big it should be on the wire, and how to write an encoder that can compress it to hit that target size accurately. Some of the schemes we compared against (WebRTC reference implementation in Chrome, Hangouts, FaceTime, Skype) certainly use various parts of the standard protocol suite (with different engines to decide when and how much to send). But the call setup and encapsulation are not the focus here -- I'm sure we could do Salsify within SIP or WebRTC if necessary.



