In most scenarios, you are no longer running with multiple users on the same machine.
Either this is a server, which has an admin team, or a client machine, which _usually_ has a single user.
That isn't 100% true, and local privilege escalation matters, but it is a far cry from remote code execution or remote privilege escalation.
User privilege separation is a foundation that allows many container implementations to work, and that sandboxing software like Tor relies on; Android, however unlikely it is that you're running atop it, uses it too.
If someone is running Tor to avoid ending up in prison or dead, dismissing local privilege escalation leaves their Tor sandbox open for anyone to own, for example.
I once was in charge of a legacy C codebase that passed around data files, including to customers all around the world. The “file format” was nothing more than a memory dump of the C data structures. Oh, and the customers were running this software on various architectures (x86, SPARC [big-endian], Alpha, Itanium), and these data files all had to be compatible.
Occasionally, we had to create new programs that would do various things and sit in the middle of this workflow - read the data files, write out new ones that the customers could use with no change to their installed software. Because the big bosses could never make up their minds, we at various times used C#, Go, Python, and even C.
They all work just fine in a production environment. Seriously, it’s fine to choose any of them. But C# stands out as having the ugliest and sketchiest code for the job, even though it works just fine!
More telling, though: I used this scenario in interview questions many times. How would you approach it with C#? 99% of the time I would get a blank stare, followed by “I don’t think it’s possible”. If you want developers who can maintain code that can do this, perhaps don’t choose C# :)
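To make the shape of the problem concrete, here's a minimal sketch in Go (one of the languages mentioned above) of reading such a memory-dump format with the byte order spelled out explicitly. The `Rec` layout, its padding, and the `legacy.dat` filename are all invented for illustration; the real structs were whatever the legacy C code happened to dump.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"os"
)

// Rec mirrors a hypothetical C struct such as:
//   struct rec { int32_t id; double value; char name[16]; };
// The blank field stands in for the 4 bytes of padding a typical
// C compiler inserts between the int32 and the double.
type Rec struct {
	ID    int32
	_     [4]byte
	Value float64
	Name  [16]byte
}

// readDump decodes a whole file of records written by a big-endian
// machine (e.g. SPARC). The byte order is stated explicitly instead
// of being whatever the host architecture happens to use.
func readDump(r io.Reader) ([]Rec, error) {
	var recs []Rec
	for {
		var rec Rec
		switch err := binary.Read(r, binary.BigEndian, &rec); err {
		case nil:
			recs = append(recs, rec)
		case io.EOF:
			return recs, nil
		default:
			return nil, err
		}
	}
}

func main() {
	f, err := os.Open("legacy.dat")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()

	recs, err := readDump(f)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("read %d records\n", len(recs))
}
```

Whichever language you pick, byte order and struct padding have to be handled by hand somewhere; that bookkeeping is the actual interview question.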
Two things they have brought up in interviews: they don't seem to believe that AOT-compiled C# is mature enough to give the best possible performance on all their supported platforms, and their current codebase consists of more or less pure functions acting on simple data structures; since they want the port to be as close to 1:1 as possible, idiomatic Go is closer to that style than idiomatic C#.
Requiring all TypeScript users to install the .NET runtime will probably kill adoption, especially on Linux build servers. It still requires custom Microsoft repos, if they're even available for your distro, and is barely upstreamed.
For Go, you just run the binary without any bullshit. This can easily be wrapped in an npm package to keep the current experience (`npm install` and it works) on all platforms.
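As a sketch of that wrapping (all package names here are hypothetical; the pattern mirrors how esbuild ships its Go binary through npm, with one optional dependency per platform):

```json
{
  "name": "hypothetical-tool",
  "version": "1.0.0",
  "bin": { "hypothetical-tool": "bin/run.js" },
  "optionalDependencies": {
    "hypothetical-tool-linux-x64": "1.0.0",
    "hypothetical-tool-darwin-arm64": "1.0.0",
    "hypothetical-tool-win32-x64": "1.0.0"
  }
}
```

`bin/run.js` is then a few lines that find and exec whichever platform-specific binary the install actually pulled in.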
The spec for HTTP GET is of course in no way similar to the spec for read(). On the other hand, I have to concede that (as I’ve just learned) an HTTP server is actually within its rights[1] to return only part of the requested range(s) and expect the client to redo the request if it needs the rest:
> A server that supports range requests (Section 14) will usually attempt to satisfy all of the requested ranges, since sending less data will likely result in another client request for the remainder. However, a server might want to send only a subset of the data requested for reasons of its own, such as temporary unavailability, cache efficiency, load balancing, etc. Since a 206 response is self-descriptive, the client can still understand a response that only partially satisfies its range request.
The only thing I'd note is that the spec seems to be pretty clear that the `Content-Length` response header needs to match how many bytes are actually in the response, and the 206 from Chrome is not returning a number of bytes matching its `Content-Length` header. From the spec:
> A Content-Length header field present in a 206 response indicates the number of octets in the content of this message, which is usually not the complete length of the selected representation.
Meanwhile, in the article (and in the mailing list discussion), Chrome is responding with a `Content-Length` of 1943504 while the body of the response contains only 138721 octets. Unless there's some even more obscure part of the spec, that definitely seems like a bug, as it makes detecting the need to re-request more annoying.
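For illustration, here's roughly what the client-side check looks like in Go (the URL and range values are placeholders); Go's HTTP client surfaces a body shorter than `Content-Length` as an unexpected EOF, which is exactly the mismatch described above:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

// fetchRange asks for bytes [start, end] of url and verifies that the
// body carries as many octets as the Content-Length header promised.
func fetchRange(url string, start, end int64) ([]byte, error) {
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	// 206 means the server honored at least part of the range.
	if resp.StatusCode != http.StatusPartialContent {
		return nil, fmt.Errorf("expected 206, got %s", resp.Status)
	}

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		// Fewer octets on the wire than Content-Length advertised
		// shows up here as io.ErrUnexpectedEOF.
		return nil, fmt.Errorf("short body: got %d of %d octets: %w",
			len(body), resp.ContentLength, err)
	}
	return body, nil
}
```

A server legitimately sending only a subset, per the paragraph quoted above, would shrink `Content-Range` and `Content-Length` together, so this check wouldn't fire; it only trips on the header/body mismatch.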
The problem with IVF is that you need to find the right centroids.
And that doesn't work well if your data grows and mutates over time.
Splitting a centroid is a pretty complex issue.
So is dealing with new data that clusters in one area. For example, let's assume that you hold StackOverflow questions & answers. Now you have a massive amount of additional data (> 25% of the existing dataset) that talks about Rust.
You either need to re-calculate the centroids globally, or find a good way to split.
Posting lists are easy to use, but if they become unbalanced, it gets really bad.
Hi, I'm the author of the article. Meta have conducted some experiments on dynamic IVF with datasets of several hundred million records. The conclusion was that recall can be maintained through simple repartitioning and rebalancing strategies. You can find more details here: DEDRIFT: Robust Similarity Search under Content Drift https://arxiv.org/pdf/2308.02752. Additionally, with the help of GPUs, KMeans can be computed quickly, making the cost of rebuilding the entire index acceptable in many cases.
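For what the local repair can look like, here's a toy sketch (mine, not from the article or the paper) of splitting one overgrown posting list with a quick 2-means pass, leaving the rest of the index untouched; `Vector`, the initialisation, and the iteration count are all simplified:

```go
package ivf

// Vector is a toy embedding type; a real index stores ids alongside.
type Vector []float32

// dist2 is squared Euclidean distance.
func dist2(a, b Vector) float32 {
	var d float32
	for i := range a {
		diff := a[i] - b[i]
		d += diff * diff
	}
	return d
}

// mean averages a non-empty set of vectors.
func mean(vs []Vector) Vector {
	out := make(Vector, len(vs[0]))
	for _, v := range vs {
		for i, x := range v {
			out[i] += x
		}
	}
	for i := range out {
		out[i] /= float32(len(vs))
	}
	return out
}

// splitList replaces one oversized centroid with two via a few
// iterations of 2-means over just that posting list.
func splitList(vecs []Vector, iters int) (c1, c2 Vector, m1, m2 []Vector) {
	c1, c2 = vecs[0], vecs[len(vecs)/2] // crude initialisation
	for it := 0; it < iters; it++ {
		m1, m2 = m1[:0], m2[:0]
		for _, v := range vecs {
			if dist2(v, c1) <= dist2(v, c2) {
				m1 = append(m1, v)
			} else {
				m2 = append(m2, v)
			}
		}
		if len(m1) > 0 {
			c1 = mean(m1)
		}
		if len(m2) > 0 {
			c2 = mean(m2)
		}
	}
	return c1, c2, m1, m2
}
```

The hard judgment call the parent describes remains, though: deciding when a local split like this is enough versus when the centroids have drifted globally and the KMeans needs re-running.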
Usually the pro-union business owners who don't have unions justify it like this: "I'm such a good boss that my workers don't need a union. If I ever hear that they want to unionize, I would be shocked and would think I failed" - Linus Sebastian of Linus Tech Tips.
In European countries, you make an agreement with an industry union, and everyone in the building gets the agreement, regardless of how they feel about it.
On this side of the pond, usually the large majority is happy with the benefits, even without being union members themselves.
That is why countries with a strong union culture have negotiation rounds between industry leaders and the main union groups, to discuss how the sector will be handled for the current fiscal year.
As I suggested in the first comment here, yes. I believe the GP believed that you could have one PRNG that did not have global state for its consumers.
I'm saying that if you have one PRNG, then you have global state no matter how it's designed. This is true whether you write it so that you get decent statistics, or you write it so you get tons of duplicate values.
And many of the fixes remove global state. Per-thread PRNGs are one option, but so are PRNGs that are used by specific batches of objects.
So, the straightforward broken option has global state, and the non-broken options might or might not have global state.
Which means I have no idea what you're talking about when you use the phrase "need to have global state". What is the PRNG API where that need exists, and what does the version without global state look like?
Every PRNG algorithm has state that is shared among all of its consumers. That is basically true by definition. Put another way, a PRNG is an iterative algorithm that compresses a very long random-looking (the mathematical term is "equidistributed") sequence. That iterative algorithm needs to maintain its position in the sequence, and there's no alternative. There is no algorithm for a PRNG that does not have state that is shared among all of its consumers.
The only way to make a PRNG that does not share state between its consumers is to shard per-consumer. The PRNG in libc doesn't do that, it just uses a lock internal to the PRNG to keep things safe.
You could attempt to make a concurrent PRNG that has atomic-sized state and use CAS loops when generating numbers to keep the consumers in sync, but that's probably worse than using a lock.
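For concreteness, a minimal Go sketch of the sharding option (the language is incidental): each consumer owns its own generator, so nothing is shared and no lock is needed. The naive sequential seeding here is only for illustration; real code should seed the shards more carefully.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	for worker := 0; worker < 4; worker++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// Per-consumer state: this generator is owned by one
			// goroutine, unlike rand's global functions, which all
			// feed from a single locked source.
			rng := rand.New(rand.NewSource(int64(id) + 1))
			fmt.Printf("worker %d drew %d\n", id, rng.Intn(100))
		}(worker)
	}
	wg.Wait()
}
```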
> Every PRNG algorithm has state that is shared among all of its consumers. That is basically true by definition.
Right, but again why did you say it needs to have global state "if you want to get decent statistics"?
What is the alternative to having global state that doesn't have decent statistics?
That "if" is the part I'm confused about.
The part of your comment directly after that sounds like you're describing the situation where clients are data-racing the PRNG and generating repeat numbers and other problems. But that's a problem specifically because of global state being used incorrectly, not something that happens as an alternative to global state.
The alternative is not being random, and "decent statistics" means "being pseudorandom" - the statistics are inseparable from the randomness. You can make a PRNG that has data races, but from a randomness perspective, that is equivalent to just using /dev/null as your random number generator.
Okay, I think I understand. You're being extremely harsh in how you consider brokenness, so if it's broken it might as well not be generating anything at all, no state needed.
I was considering "fairly broken PRNG" as its own significant group, and those need state too if you treat them as separate from "this literally gives the same number every time". But if those all go in the same bucket, then the comparison you made makes sense.