Flowdalic's comments

The kernel usually just needs to know which binary to execute as init, which is often provided as a kernel command-line argument by the bootloader.
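For illustration, the relevant bit of the kernel command line might look like this (the paths are just examples; if init= is omitted the kernel falls back to built-in defaults such as /sbin/init):

    root=/dev/sda2 ro init=/usr/lib/systemd/systemd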


I use syslinux, which is manually updated; no init is passed in.

In the Gentoo kernel make menuconfig there are options for Systemd vs OpenRC - not entirely sure what they do.


The problem does not seem to be that TCP_NODELAY is on, but that the packets being sent carry only 50 bytes of payload. If you send a large file, then I would expect you to invoke send() with page-sized buffers. This should give the TCP stack enough opportunity to fill the packets with a reasonable amount of payload, even in the absence of Nagle's algorithm. Or am I missing something?
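As a rough Go sketch of what I mean (names made up, not git-lfs's actual code), a bulk sender would hand the kernel big chunks per write:

    package example

    import (
        "io"
        "net"
        "os"
    )

    // sendFile copies a file to the connection in 64 KiB chunks, so every
    // write() hands the TCP stack plenty of payload to fill packets with,
    // even with TCP_NODELAY enabled.
    func sendFile(conn net.Conn, path string) error {
        f, err := os.Open(path)
        if err != nil {
            return err
        }
        defer f.Close()

        buf := make([]byte, 64*1024)
        for {
            n, rerr := f.Read(buf)
            if n > 0 {
                if _, werr := conn.Write(buf[:n]); werr != nil {
                    return werr
                }
            }
            if rerr == io.EOF {
                return nil
            }
            if rerr != nil {
                return rerr
            }
        }
    }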


Even if the application is making 50-byte sends, why aren't these getting coalesced once the socket's buffer is full? I understand that Nagle's algorithm will send the first couple of packets "eagerly", but I would have expected that once the transmit window is full they start getting coalesced, since they are being buffered anyway.

Disabling Nagle's algorithm should be trading network usage for latency. But it shouldn't reduce throughput.


> Even if the application is making 50-byte sends, why aren't these getting coalesced once the socket's buffer is full?

Because maybe the 50 bytes are latency sensitive and need to be at the recipient as soon as possible?

> I understand that Nagle's algorithm will send the first couple packets "eagerly" […] Disabling Nagle's algorithm should be trading network usage for latency

No, Nagle's algorithm will delay outgoing TCP packets in the hope that more data will be provided to the TCP connection that can be shoved into the delayed packet.

The issue here is not Go's default setting of TCP_NODELAY. There is a use case for TCP_NODELAY, just like there is a use case for disabling TCP_NODELAY, i.e., enabling Nagle's algorithm (see RFC 896). So any discussion about the default behavior appears to be pointless.

Instead, I believe the application or an underlying library is to blame, because I don't see how an application performing a bulk transfer of data using "small" (a few bytes) writes is anything but bad design. Not writing large (e.g., page-sized) chunks of data into the file descriptor of the socket, especially when you know that many more chunks are to come, just kills performance on multiple levels.

If I understand the situation the blog post describes correctly, then git-lfs is sending a large (50 MiB?) file in 50-byte chunks. I suspect this is because git-lfs (or something between git-lfs and the Linux socket, e.g., a library) issues writes to the socket with only 50 bytes of data from the file at a time.
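For reference, Go exposes the per-connection knob directly. A minimal sketch (Go turns TCP_NODELAY on by default; passing false to SetNoDelay re-enables Nagle's algorithm for a pure bulk-transfer connection):

    package example

    import "net"

    // dialBulk opens a TCP connection intended for bulk transfers and
    // re-enables Nagle's algorithm on it.
    func dialBulk(addr string) (*net.TCPConn, error) {
        conn, err := net.Dial("tcp", addr)
        if err != nil {
            return nil, err
        }
        tc := conn.(*net.TCPConn)
        if err := tc.SetNoDelay(false); err != nil {
            tc.Close()
            return nil, err
        }
        return tc, nil
    }

But even with this knob available, the better fix in my opinion is still to issue larger writes rather than flip the flag.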


> Because maybe the 50 bytes are latency sensitive and need to be at the recipient as soon as possible?

The difference in latency between a 50-byte and a 1500-byte packet is minuscule. If you have the data available in the socket buffer, I don't see why you wouldn't want to send it in a single packet.

The latency benefit of TCP_NODELAY should be that it isn't waiting for user space to write more data, not that it is sending short packets.


It should be possible to install meson via pip install --user. Even though I prefer system-wide installations, I believe this weakens your argument for user-defined functions in your situation.
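For reference, Meson is published on PyPI, so the user-local install is a one-liner:

    pip install --user meson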


> I believe this weakens your argument for user defined functions in your situation.

It does not. This backport was not for a single user, but for deployment into the package repo used by multiple systems (including build systems). A one-off Meson build would be unsuitable for this case.

In any case, note that I didn't claim it was impossible without user-defined functions. It was simply messier and more painful, much as a special Meson build would be.


You can also run meson directly from its source tree with no installation step; that's what we do in some cases.
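Roughly like this, from memory (Meson ships a meson.py entry point at the top of its source tree that can be run without installing):

    git clone https://github.com/mesonbuild/meson.git
    ./meson/meson.py setup builddir
    ./meson/meson.py compile -C builddir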


QUIC does a lot more than "the one for TCP", although I also believe that modern TCP consists of more than just one RFC (which you already hinted at).

I guess the art in protocol design is to have as few mandatory-to-implement parts as possible, which are themselves minimal in complexity, so that a minimal implementation is doable with a reasonable amount of effort while already achieving a good result (and UX). Then the optional parts can be added piece by piece after the implementation has been published/released.


It appears that Gloox, a relatively low-level XMPP client library written in C++, rolled much of its Unicode and XML parsing itself, which made such vulnerabilities more likely. There may be good reasons not to re-use existing modules and rely on external libraries, especially if you target constrained low-end embedded devices, but you should always be aware of the drawbacks. And the Zoom client typically does not run on those.


One of the harder things with XMPP is that it is a badly-formed document up until the connection is closed. You need a SAX-style/event-based parser to handle it. That makes rolling your own understandable in some cases (e.g. dotnet's System.Xml couldn't do this prior to XLinq).

That being said, as you indicated Gloox is written in C++, and the reference implementation of SAX is in C and trivially usable from C++. There is no excuse.


Not only that, but before the TLS session starts you have to handle an invalid XML document (the starttls mechanism starts encrypting stuff right in the middle of the initial XML document). Also, some XML constructs are not valid in XMPP (like comments).
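To illustrate (excerpt loosely following RFC 6120, addresses made up), the opening stream element stays unclosed for the lifetime of the connection, and STARTTLS kicks in right inside it:

    <?xml version='1.0'?>
    <stream:stream xmlns='jabber:client'
                   xmlns:stream='http://etherx.jabber.org/streams'
                   to='example.org' version='1.0'>
      <starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>

Once the server answers with a proceed element, raw TLS handshake bytes follow on the same socket, while the document is still "open" as far as a strict XML parser is concerned.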

I think rolling your own XML parser for XMPP is a fairly reasonable thing to do. In the past at least, many, if not most, implementations had their own parser (often a fork of a proper XML parser). What is more surprising to me is why they would choose XMPP for their proprietary stuff. I don't think they want to interoperate or federate with anything?

(If I remember correctly, and if it hasn't changed since many years ago, when I last looked at that stuff.)


> One of the harder things with XMPP is that it is a badly-formed document up until the connection is closed. You need a SAX-style/event-based parser to handle it.

That is a common misconception, although I am not sure of its origin. I know plenty of XMPP implementations that use an XML pull parser.


It's possible by blocking the thread that's reading the XML, but now you're in thread-per-client territory, and that doesn't scale.


Smack uses an XML pull parser and non-blocking I/O. It does so by splitting the XMPP stream into top-level elements first and only feeding complete elements to the pull parser.
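Smack is Java, but the boundary-detection idea translates to other stacks. A rough Go sketch (purely an illustration using encoding/xml, not Smack's actual code) of recognizing complete top-level children of the stream element before handing them on:

    package example

    import (
        "encoding/xml"
        "io"
    )

    // readStanzas pulls tokens from a (never-terminated) XMPP stream and
    // calls handle once per completed top-level child of <stream:stream>.
    // Depth 1 is the stream element itself, depth 2 a stanza.
    func readStanzas(r io.Reader, handle func(stanza xml.Name)) error {
        dec := xml.NewDecoder(r)
        depth := 0
        var current xml.Name
        for {
            tok, err := dec.Token()
            if err == io.EOF {
                return nil // peer closed the stream
            }
            if err != nil {
                return err
            }
            switch t := tok.(type) {
            case xml.StartElement:
                depth++
                if depth == 2 {
                    current = t.Name // a stanza has started
                }
            case xml.EndElement:
                depth--
                if depth == 1 {
                    handle(current) // the stanza is now complete
                }
            }
        }
    }

Smack additionally does the element splitting on the raw byte stream fed by non-blocking I/O, so the pull parser only ever sees complete elements and never has to block waiting for more bytes.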


https://github.com/igniterealtime/Smack/blob/master/smack-xm...

I don't see any opportunity not to block when calling "next"


DOM-based XML parsers use SAX parsing under the hood.


Right, but if they don't give you access to the SAX parser then you are SOL.


I find that response a bit strange, since the whole reason the Zoom client has these particular vulnerabilities is that they didn't roll their own, and instead relied on layers of broken libraries.

It’s quite possible they’d have more bugs without doing that, but re-using existing modules could just as easily have been an even worse idea.


Using what everyone and their dog is using is just as prone to bugs, because software without bugs either doesn't exist or is not very useful, but it also has the benefit of many versatile eyeballs looking at it in many different contexts.

So if there's a bug found and fixed in libxml2, which is used by almost everything else, everyone else instantly benefits. Same with libicu, which is used, for example, by Node.js with its huge deployment footprint. Oh, and every freakin' WebKit-based browser out there.

OTOH, they rolled their own, so all bugs they hit are confined only to Zoom, and are only guaranteed to get Zoom all the bad press.

Choose your poison carefully.


If they roll their own it also becomes less interesting to actively exploit.

Obviously this doesn’t really work for Zoom any more, since their footprint is too large, but it can stop drive-by attackers in other situations. Nobody is going to expend too much effort figuring out Joe Schmuck’s homegrown solution, when they’d happily run a known exploit against an unpatched WordPress server.


Security by obscurity has been debated to hell and back. It only works if you stay obscure... and don't leak your code.


I think the point is that Unicode and XML parsing are known to be security critical components and you should take care that they are handled only by well tested code designed specifically for the purpose. You need to not roll your own and also ensure that any third party components didn’t roll their own.


> You need to not roll your own and also ensure that any third party components didn’t roll their own.

If you're not writing the code and somebody else isn't writing the code then who is writing the code?!


A well-tested Unicode library built for security should be doing your Unicode parsing in security critical components.

It’s just another way of saying you should be doing a security audit as part of selecting a library and integrating it into your product.


I get your confusion. But keep in mind that it is not just about picking the library that shows up as the first result of your Google search. My naive self thinks that a million-dollar company should do some research and evaluate different options when choosing an external codebase to build their flagship product on. There are dozens of XMPP libraries, and they picked one that does not seem to delegate XML and Unicode handling to other libraries, which should raise a flag.


I think that's a false dichotomy; IMO the best default choice is to rely on the most well-tested library in any given category. That suggests to me that they should have used expat on the client side.


IMO we should use external libraries and invest engineering time in the library rather than just take it as-is. Not using a good third-party library means you need to invest at least a few engineer-months to get the same result, and you will need to invest a lot more to do better than the third-party library. Instead, you can take the library and invest a few engineer-months in improving the open-source library.


Why? If anything, the client does the more reasonable interpretation of the XML-in-malformed-UTF-8: skipping to the next valid UTF-8 sequence start. It's the server that has the really weird UTF-8 handling, where it somehow special-cases multi-byte UTF-8 sequences but then does not handle invalid ones.
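A rough Go sketch of that "skip to the next valid sequence start" interpretation (just an illustration of the strategy, not Zoom's actual code):

    package example

    import (
        "strings"
        "unicode/utf8"
    )

    // sanitize drops bytes that do not form a valid UTF-8 sequence and
    // resumes decoding at the next byte that starts a valid rune.
    func sanitize(b []byte) string {
        var out strings.Builder
        for len(b) > 0 {
            r, size := utf8.DecodeRune(b)
            if r == utf8.RuneError && size == 1 {
                // invalid byte: skip it and try the next one
                b = b[1:]
                continue
            }
            out.WriteRune(r)
            b = b[size:]
        }
        return out.String()
    }

The trouble described in the thread is precisely that client and server do not agree on this handling.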


I've found this to be a very common issue across all of software engineering, but I really don't get why. If I were given the task of parsing Unicode or XML, I'd run and find a library as fast as possible, because that sounds terrible and tedious, and I'd rather do literally anything else!

Why aren't people more lazy, in other words?


It is not an RFC, it is an I-D (Internet Draft).


Using XML is one of XMPP's biggest strengths. XML is well designed, well documented, and has a rich set of supporting libraries. XML documents can be composed of other XML documents in a sound fashion, which is a major feature for an extensible protocol like XMPP, and XML documents compress well, making them suitable in low-bandwidth conditions (see [XEP-0365](https://xmpp.org/extensions/xep-0365.html)). I also never experienced considerable battery drain when battery-powered devices use XMPP compared to a binary protocol.


I've worked with XMPP for years, and while I'm not going to give XML a glowing review, I do think most alternatives would have been worse for XMPP's use-case (extensible interoperable messaging). I also think XMPP is one of a small handful of examples of XML at its best (most of the time when I see XML used extensively in other projects, it's usually a horrific experience).

It's worth noting that XMPP uses a subset of XML (no DTDs, comments, or processing instructions, and it is restricted to UTF-8, to give some examples).

Framing has been mentioned elsewhere, and it's worth noting that XMPP-over-websocket is defined, well-supported and widely deployed. The websocket layer adds framing.


I wonder why people jump so fast to the conclusion to "let XMPP die" when the protocol can also be iteratively improved. Presence is not required in XMPP; it's an optional feature. Everything you said has been considered in newer XMPP extension protocols (like XEP-0369: MIX).


Why improve when you can start over? Modern tools for pub/sub and websockets are so much better. We don’t need to use XML any more. (For the love of god, please let XML die.) The old code and implementations are more of a liability than an asset, especially if you want to take out major functionality. Unless interop with AOL Instant Messenger’s install base is a key feature or something. (I don’t know if AIM uses XMPP, but the timing is about right and it's representative of what you’d get.)


Why start over when you can improve? You'll most likely end up with the second-system effect, probably slowly rediscover all the valid reasons for the things you disliked, and probably re-make all the mistakes.


I wouldn't be sure that this is authentic, i.e. actually from John Nagle.


It is amusing that you doubt that "John Nagle" on Stack Overflow is M. Nagle, but don't express any doubt that "Animats" here on this WWW site, and indeed in this very discussion, is M. Nagle. Surely the reverse stance is the more logical, if one has no idea what "Animats" is.

It is odd that some people give more credence to a pseudonym, or in this case a company name, than to using one's own name.


Among people who would pretend to be a given well-known person, surely more of them know that person's real name than know something that would make a plausible pseudonym. Or in any case, more of them would choose it.

And if someone thinks a pseudonym is a particular famous person, they must have a reason, which is unlikely to be weaker than just assuming a normal name is accurate.

I'm not up to doing a Bayesian analysis right now, but I feel like one could show it makes more sense to doubt an unverified name.


It still shows signs of ongoing development though: https://github.com/psi-im/psi/graphs/contributors?from=2016-...


Wow, looks like it's back from the dead!

