ayende's comments

In most scenarios, you are no longer running with multiple users on the same machine. Either this is a server, which has an admin team, or a client machine, which _usually_ has a single user.

That isn't 100% true, and local privilege escalation matters, but it is a far cry from remote code execution or remote privilege escalation.


User privilege separation is a foundation that many container implementations rely on to work, as does sandboxed software like Tor and, however unlikely it is that you're running atop it, Android, etc.

If someone is running Tor so as not to end up in prison/dead, their Tor sandbox being open for anyone to break into and own is a real problem, for example.


Root privileges allow for a much wider attack surface for escaping out of a VM. Not using root everywhere still helps with defense in depth.


That is a well-structured system, yes. Both cleanup for errors and deallocation happen in the same place.

That means you won't forget to call it, and the success flag is an obvious way to handle it.


Starting in v7.0, RavenDB has vector search and AI integration. Runs on Windows, Linux, and Mac.


C# has both of those capabilities and more. The answer doesn't make sense.


I once was in charge of a legacy C codebase that passed around data files, including to customers all around the world. The “file format” was nothing more than a memory dump of the C data structures. Oh, and the customers were running this software on various architectures (x86, sparc - big endian, alpha, itanium) and these data files had to all be compatible.

Occasionally, we had to create new programs that would do various things and sit in the middle of this workflow - read the data files, write out new ones that the customers could use with no change to their installed software. Because the big bosses could never make up their minds, we at various times used C#, Go, Python, and even C.

They all work just fine in a production environment. Seriously, it’s fine to choose any of them. But C# stands out as having the ugliest and sketchiest code to deal with it. But it works just fine!

More telling, though: I used this scenario in interview questions many times. How would you approach it with C#? 99% of the time I would get a blank stare, followed by “I don’t think it’s possible”. If you want developers that can maintain code that can do this, perhaps don’t choose C# :)
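For what it's worth, the Go version of this exercise fits in a few lines. The record layout below is made up (the real files held whatever the C structs held), and real dumps also carry each source architecture's padding, which is where most of the ugliness lives:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Record is a hypothetical layout standing in for the real C structs.
type Record struct {
	ID    uint32
	Value float64
}

// decodeRecord reads one record from a dump, using the byte order of
// the machine that wrote it (big-endian here, as a sparc box would).
func decodeRecord(raw []byte) (Record, error) {
	var rec Record
	err := binary.Read(bytes.NewReader(raw), binary.BigEndian, &rec)
	return rec, err
}

func main() {
	raw := []byte{
		0x00, 0x00, 0x00, 0x2a, // ID = 42
		0x40, 0x09, 0x21, 0xfb, 0x54, 0x44, 0x2d, 0x18, // Value = pi
	}
	rec, err := decodeRecord(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(rec.ID, rec.Value)
}
```

Note that binary.Read assumes a packed layout; to read a dump made by an actual C compiler you would have to mirror that ABI's padding with explicit filler fields, one variant per architecture.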


Ask anyone maintaining even a remotely interesting project in C# on GitHub and the answers would likely surprise you.


Two things have come up in interviews. They don't seem to believe that AOT-compiled C# is mature enough to give the best possible performance on all their supported platforms. And their current codebase consists of more or less pure functions acting on simple data structures; since they want the port to be as 1:1 as possible, idiomatic Go is closer to that style than idiomatic C#.

See this thread https://news.ycombinator.com/item?id=43332830 for much more discussion.


When you code using regular OOP it's definitely not the case. Go struct memory layout is very straightforward.


Requiring all typescript users to install the .net runtime will probably kill adoption, especially on linux build servers. It still requires custom microsoft repos, if they're even available for your distro, and is barely upstreamed.

For Go, you just run the binary without any bullshit. This can easily be wrapped in an npm package to keep the current experience (`npm install` and it works) on all platforms.


Modern .NET usually ships the runtime or embeds it inside the binary. This is very different from the old Windows-only .NET Framework.



C# also has a runtime you have to ship with any binaries.


So does Java.


Chrome gives you what data it has, and you are expected to issue the next request to get the rest of the data.

Consider a read() call on Linux: if you ask to read 16kb and the cache has a 4kb page ready, it may give you just that.

You'll need another call to get the rest, and if there is a bad disk sector, that first read() may not notice it.


The spec for HTTP GET is of course in no way similar to the spec for read(). On the other hand, I have to concede that (as I’ve just learned) an HTTP server is actually within its rights[1] to return only part of the requested range(s) and expect the client to redo the request if it needs the rest:

> A server that supports range requests (Section 14) will usually attempt to satisfy all of the requested ranges, since sending less data will likely result in another client request for the remainder. However, a server might want to send only a subset of the data requested for reasons of its own, such as temporary unavailability, cache efficiency, load balancing, etc. Since a 206 response is self-descriptive, the client can still understand a response that only partially satisfies its range request.

[1] https://www.rfc-editor.org/rfc/rfc9110.html#name-206-partial...


The only thing I'd note is that the spec seems to be pretty clear about the content-length response header needing to match how many bytes are actually in the response, and the 206 from Chrome is not returning a number of bytes matching the content-length header. Spec:

> A Content-Length header field present in a 206 response indicates the number of octets in the content of this message, which is usually not the complete length of the selected representation.

While in the article (and in the mailing group discussion) it seems that Chrome is responding with a `content-length` of 1943504 while the body of the response only contains 138721 octets. Unless there's some even more obscure part of the spec, that definitely seems like a bug as it makes detecting the need to re-request more annoying.
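Even under the RFC-9110-sanctioned behavior, the headers have to stay self-consistent. A small sketch of an honest partial responder (toy server and data, not Chrome's actual code):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchPartial stands up a toy server that satisfies only part of the
// requested range -- which RFC 9110 permits -- while keeping
// Content-Length equal to the octets actually sent.
func fetchPartial() (status int, declared int64, received int) {
	full := []byte("0123456789abcdefghij")
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		part := full[:8] // only a subset of the range the client asked for
		w.Header().Set("Content-Range", fmt.Sprintf("bytes 0-7/%d", len(full)))
		w.Header().Set("Content-Length", fmt.Sprint(len(part)))
		w.WriteHeader(http.StatusPartialContent)
		w.Write(part)
	}))
	defer srv.Close()

	req, _ := http.NewRequest("GET", srv.URL, nil)
	req.Header.Set("Range", "bytes=0-19")
	resp, _ := http.DefaultClient.Do(req)
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	return resp.StatusCode, resp.ContentLength, len(body)
}

func main() {
	status, declared, received := fetchPartial()
	// The reported bug is declared != received; here they agree.
	fmt.Println(status, declared, received)
}
```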


I’m not sure I follow. The second request is for an overlapping range and can’t be satisfied because of the 403. There’s just no way around that.


The problem with IVF is that you need to find the right centroids. And that doesn't work well if your data grows and mutates over time.

Splitting a centroid is a pretty complex issue.

The same goes for clusters forming in one area. For example, let's assume that you hold StackOverflow questions & answers. Now you have a massive amount of additional data (> 25% of the existing dataset) that talks about Rust.

You either need to re-calculate the centroids globally, or find a good way to split.

The posting lists are easy to use, but if they are unbalanced, it gets really bad.
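A split, when you do attempt one, is essentially a local 2-means over the overgrown posting list. A toy sketch with 1-D "vectors" to keep it readable (real IVF centroids are high-dimensional, and the hard part is re-routing the postings afterwards):

```go
package main

import (
	"fmt"
	"math"
)

// splitCentroid runs a few rounds of 2-means over one posting list --
// the local alternative to recomputing all centroids globally.
func splitCentroid(points []float64) (c1, c2 float64) {
	c1, c2 = points[0], points[len(points)-1] // crude initial guesses
	for iter := 0; iter < 10; iter++ {
		var sum1, sum2 float64
		var n1, n2 int
		for _, p := range points {
			// Assign each point to the nearer of the two candidates.
			if math.Abs(p-c1) <= math.Abs(p-c2) {
				sum1 += p
				n1++
			} else {
				sum2 += p
				n2++
			}
		}
		// Move each candidate to the mean of its assigned points.
		if n1 > 0 {
			c1 = sum1 / float64(n1)
		}
		if n2 > 0 {
			c2 = sum2 / float64(n2)
		}
	}
	return c1, c2
}

func main() {
	// An overloaded posting list with two obvious clusters.
	points := []float64{1, 1.1, 0.9, 10, 10.2, 9.8}
	fmt.Println(splitCentroid(points))
}
```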


Hi, I'm the author of the article. Meta have conducted some experiments on dynamic IVF with datasets of several hundred million records. The conclusion was that recall can be maintained through simple repartitioning and rebalancing strategies. You can find more details here: DEDRIFT: Robust Similarity Search under Content Drift https://arxiv.org/pdf/2308.02752. Additionally, with the help of GPUs, KMeans can be computed quickly, making the cost of rebuilding the entire index acceptable in many cases.


Do you have a union in your business? If not, why not? If yes, what is the experience like?


Usually the pro-union business owners who don't have unions justify it like this: "I'm such a good boss that my workers don't need a union. If I ever hear that they want to unionize I would be shocked and would think I failed" - Linus Sebastian of Linus Tech Tips.


Genuine question, if I was a business owner, how would I force my employees to join a union?

Isn’t it a non-managerial activity (on purpose)?

Seems like there would be very limited influence on creating one

The position of Linus and other people like him is indefensible.


In European countries, you make an agreement with an industry union, and everyone in the building gets the agreement, regardless of how they feel about it.

From this side of the pond, usually the large majority is happy with the benefits, even without being union members themselves.

That is why countries with a strong union culture have negotiation rounds between industry leaders and the main union groups, to discuss how the sector will be handled for the current fiscal year.


If you wanted to share a bit of music, let's say a disc in MP3 format

You could either schlep the hard disk or try to carry 20 - 30 floppies around

With a high chance of at least some of them being bad.


Yes, I addressed it a bit further in the post

I was trying to explain how my 12-year-old self saw that code

And I had no other resource to turn to at the time


Thread-local state

You don't need it global

Seed it once per thread from /dev/random and you are done
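In Go terms, that's roughly: give each goroutine its own math/rand generator, seeded once from the OS entropy pool (the moral equivalent of /dev/random), so nothing is shared and no lock is needed:

```go
package main

import (
	crand "crypto/rand"
	"encoding/binary"
	"fmt"
	mrand "math/rand"
)

// newLocalRand builds a PRNG intended to be owned by one goroutine:
// no shared state, no lock, one seed drawn from the OS entropy pool.
func newLocalRand() *mrand.Rand {
	var seed [8]byte
	if _, err := crand.Read(seed[:]); err != nil {
		panic(err) // no entropy available; nothing sensible to do
	}
	return mrand.New(mrand.NewSource(int64(binary.LittleEndian.Uint64(seed[:]))))
}

func main() {
	done := make(chan int)
	for i := 0; i < 4; i++ {
		go func() {
			r := newLocalRand() // seeded once, then used lock-free
			done <- r.Intn(1000)
		}()
	}
	for i := 0; i < 4; i++ {
		fmt.Println(<-done)
	}
}
```

math/rand's Rand is not safe for concurrent use, which is exactly the point: each goroutine owns its generator outright.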


As I suggested in the first comment here, yes. I believe the GP believed that you could have one PRNG that did not have global state for its consumers.


No, I'm not saying that.

I'm saying that if you have one PRNG, then you have global state no matter how it's designed. This is true whether you write it so that you get decent statistics, or you write it so you get tons of duplicate values.

And many of the fixes remove global state. Per-thread PRNGs are one option, but so are PRNGs that are used by specific batches of objects.

So, the straightforward broken option has global state, and the non-broken options might or might not have global state.

Which means I have no idea what you're talking about when you use the phrase "need to have global state". What is the PRNG API where that need exists, and what does the version without global state look like?


Every PRNG algorithm has state that is shared among all of its consumers. That is basically true by definition. Put another way, a PRNG is an iterative algorithm that compresses a very long random-looking (the mathematical term is "equidistributed") sequence. That iterative algorithm needs to maintain its position in the sequence, and there's no alternative. There is no algorithm for a PRNG that does not have state that is shared among all of its consumers.

The only way to make a PRNG that does not share state between its consumers is to shard per-consumer. The PRNG in libc doesn't do that, it just uses a lock internal to the PRNG to keep things safe.

You could attempt to make a concurrent PRNG that has atomic-sized state and use CAS loops when generating numbers to keep the consumers in sync, but that's probably worse than using a lock.
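For the record, that CAS version is only a few lines. The sketch below keeps the whole state in one atomic word and borrows SplitMix64's output function, so every successful CAS claims a distinct state (and hence a distinct output); the retry loop is exactly the contention cost being alluded to:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// casRand: the entire PRNG state is a single atomic 64-bit word.
type casRand struct{ state atomic.Uint64 }

// Next advances the state with a CAS loop, then applies the SplitMix64
// output mix. Each successful CAS claims a unique state value, so
// concurrent callers never observe duplicate outputs.
func (r *casRand) Next() uint64 {
	for {
		old := r.state.Load()
		next := old + 0x9e3779b97f4a7c15 // golden-ratio increment
		if r.state.CompareAndSwap(old, next) {
			z := next
			z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9
			z = (z ^ (z >> 27)) * 0x94d049bb133111eb
			return z ^ (z >> 31)
		}
		// Lost the race: retry. Under heavy contention this can
		// indeed be worse than just taking a lock.
	}
}

func main() {
	var r casRand
	fmt.Println(r.Next(), r.Next(), r.Next())
}
```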


> Every PRNG algorithm has state that is shared among all of its consumers. That is basically true by definition.

Right, but again why did you say it needs to have global state "if you want to get decent statistics"?

What is the alternative to having global state that doesn't have decent statistics?

That "if" is the part I'm confused about.

The part of your comment directly after that sounds like you're describing the situation where clients are data-racing the PRNG and generating repeat numbers and other problems. But that's a problem specifically because of global state being used incorrectly, not something that happens as an alternative to global state.


The alternative is not being random, and "decent statistics" means "being pseudorandom" - the statistics are inseparable from the randomness. You can make a PRNG that has data races, but from a randomness perspective, that is equivalent to just using /dev/null as your random number generator.


Okay, I think I understand. You're being extremely harsh in how you consider brokenness, so if it's broken it might as well not be generating anything at all, no state needed.

I was considering "fairly broken PRNG" as its own significant group, and those need state too if you treat them as separate from "this literally gives the same number every time". But if those all go in the same bucket, then the comparison you made makes sense.

