IPv6 doesn't require modifications to IPv4 devices, applications, networks, etc. either. You just cannot reach IPv6 networks and devices from them, and the same applies to IPv8. 8to4 is nothing innovative because 6to4 already exists. In the end this proposal has all the disadvantages of IPv6 with fewer advantages.
Even if you tried something like that, it will eventually break when other commits are added to main that are not present in PR B, even if those commits don't conflict with either PR A's or PR B's changes.
> Try to rebase it and you're going to be manually looking at every non-conflicting change that ever happened on that branch, for no apparent reason
My "fix" is to do an interactive rebase of PR B on main and drop all of PR A's commits from PR B in the process.
I remember seeing a way to do this automatically, but it requires an option that I never remember. IMO this is kind of the issue with git: a lot of improved workflows sit behind some flags that most people never learn. Interactive rebases work for me because they are one primitive, always working in the same way.
I think older processors used to have a slower implementation for shifts, which made this slower.
Nowadays SwissTable and other similar hash tables use the top bits and SIMD/SWAR techniques to quickly filter out collisions after determining the starting bucket.
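To make the SWAR filtering idea concrete, here is a minimal sketch, assuming a group of eight one-byte tags packed into a `u64` with lane 0 in the lowest byte. The names `match_tag` and `candidate_lanes` are made up for illustration and are not any library's API. The bit trick can occasionally flag a lane that doesn't match, which is fine for a filter because the full key comparison afterwards settles it.

```rust
/// 0x01 and 0x80 repeated in every byte lane of a 64-bit word.
const LO: u64 = 0x0101_0101_0101_0101;
const HI: u64 = 0x8080_8080_8080_8080;

/// Returns a mask with the high bit set in each byte lane of `group`
/// that may equal `tag`. Rare false positives are possible, false
/// negatives are not.
pub fn match_tag(group: u64, tag: u8) -> u64 {
    // Lanes equal to `tag` become 0x00 after the XOR.
    let cmp = group ^ LO.wrapping_mul(tag as u64);
    // Classic "find a zero byte" trick, applied to all lanes at once.
    cmp.wrapping_sub(LO) & !cmp & HI
}

/// Lane indices (0..8) that are worth a full key comparison.
pub fn candidate_lanes(group: u64, tag: u8) -> Vec<usize> {
    let mut mask = match_tag(group, tag);
    let mut lanes = Vec::new();
    while mask != 0 {
        lanes.push(mask.trailing_zeros() as usize / 8);
        mask &= mask - 1; // clear the lowest set bit
    }
    lanes
}
```

A SIMD version does the same job with one byte-wise compare instruction over 16 lanes; the SWAR form is the portable fallback.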
In Rust? The two I'm big fans of, CompactString and ColdString, do not use unions, although historically CompactString did, and it still has a dependency on smallvec's union feature.
ColdString is easier to explain: the whole trick here is the "Maybe this isn't a pointer?" trick. A ColdString might be a single raw pointer into your heap with the rest of the data structure at the far end of the pointer. This case is expensive because nothing about the text lives inline, but... the other case is that your entire text is hidden in the pointer itself. On modern hardware that's 8 bytes of text, at no overhead, awesome.
CompactString is more like a drop-in replacement. It's much bigger, the same size as String, so 24 bytes on modern hardware, but that's all SSO, so text like "This will all fit nicely" fits inline, yet the out-of-line case has the usual affordances such as capacity and length in the data structure. This isn't doing the "Maybe this isn't a pointer?" trick but is instead relying on knowing that the last byte of a UTF-8 string can't have certain values by definition.
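The last-byte idea can be sketched roughly as follows. This is an illustration under assumptions, not necessarily the crate's actual layout: the final byte of valid UTF-8 is always below 0xC0, so values from 0xC0 upward in the 24th byte are free to carry the inline length. The names `pack_inline`, `inline_len` and `unpack_inline` are hypothetical.

```rust
const CAP: usize = 24;
/// Valid UTF-8 never ends in a byte >= 0xC0, so these values are spare.
const LEN_BASE: u8 = 0xC0;

/// Store up to 24 bytes of text inline. If the text is shorter than the
/// buffer, the last byte holds LEN_BASE + length instead of text.
pub fn pack_inline(text: &str) -> Option<[u8; CAP]> {
    let bytes = text.as_bytes();
    if bytes.len() > CAP {
        return None; // would need the out-of-line representation
    }
    let mut buf = [0u8; CAP];
    buf[..bytes.len()].copy_from_slice(bytes);
    if bytes.len() < CAP {
        buf[CAP - 1] = LEN_BASE + bytes.len() as u8;
    }
    Some(buf)
}

/// Length of a buffer produced by `pack_inline`.
pub fn inline_len(buf: &[u8; CAP]) -> usize {
    let last = buf[CAP - 1];
    if last >= LEN_BASE {
        (last - LEN_BASE) as usize
    } else {
        CAP // the last byte is ordinary text, so all 24 bytes are text
    }
}

/// Recover the text from a buffer produced by `pack_inline`.
pub fn unpack_inline(buf: &[u8; CAP]) -> &str {
    std::str::from_utf8(&buf[..inline_len(buf)]).expect("buffer was packed from a &str")
}
```

Lengths 0 through 23 only use 0xC0 through 0xD7, which leaves further last-byte values free to mark the out-of-line case.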
I realise that I don't do the best job of explaining ColdString here. After all, most 8-byte strings of UTF-8 text could equally be a pointer, so why can this work?
All ColdStrings which look like 8 bytes of UTF-8 text really are 8 bytes of UTF-8 text, just the type label on those 8 bytes isn't "[u8; 8]", an array of 8 bytes, but instead "*mut u8", a raw pointer. "Validate" for example is 8 bytes of ASCII, thus UTF-8, and Rust is OK with us just saying we want a pointer on a 64-bit machine with those bytes. It's not a valid pointer, but it is a pointer and Rust is OK with that, we just need to be careful never to [unsafely] dereference the pointer because it's invalid.
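A minimal sketch of that relabelling, assuming a 64-bit target (it won't compile where `usize` isn't 8 bytes). The helper names are made up for illustration, and a plain integer-to-pointer cast stands in for whatever the crate really does. No `unsafe` is needed because the pointer is never dereferenced.

```rust
/// Put eight bytes of text into a raw pointer value.
/// Assumes a 64-bit target, where usize is exactly 8 bytes.
pub fn text_as_ptr(text: [u8; 8]) -> *mut u8 {
    // An integer-to-pointer cast is safe; only dereferencing would be unsafe.
    usize::from_ne_bytes(text) as *mut u8
}

/// Read the eight bytes back out of the pointer value without ever
/// following the pointer.
pub fn ptr_as_text(p: *mut u8) -> [u8; 8] {
    (p as usize).to_ne_bytes()
}
```

Round-tripping `*b"Validate"` through these two functions gives back the same eight bytes, which still parse as the string "Validate".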
OK, so there are two cases left: First, what if there are fewer bytes of text? Zero even?
Since there are fewer than 8 bytes of text we can use the whole first byte to signal how many of the remainder are text. For this we use the UTF-8 over-long prefix indicator, in which the top five bits of the byte are all set: bytes 0xF8 through 0xFF. There are eight of these bytes, corresponding to our 8 lengths 0 through 7 inclusive. Because it's over-long this indicator isn't itself a valid UTF-8 prefix. Again we can pretend this is a pointer while knowing it's invalid.
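Roughly, the short case could look like this. It's a sketch with hypothetical names, assuming the length sits in the low three bits of the tag byte and the text follows it directly; the real crate may arrange the bytes differently.

```rust
/// Pack 0..=7 bytes of text into an 8-byte word: a tag byte 0xF8 | len,
/// followed by the text, with any unused bytes left as zero.
pub fn pack_short(text: &str) -> Option<[u8; 8]> {
    let bytes = text.as_bytes();
    if bytes.len() > 7 {
        return None; // 8 bytes is the "plain text" case, more needs the heap
    }
    let mut word = [0u8; 8];
    word[0] = 0xF8 | bytes.len() as u8;
    word[1..1 + bytes.len()].copy_from_slice(bytes);
    Some(word)
}

/// Recover the text if the word carries the short-string tag.
pub fn unpack_short(word: &[u8; 8]) -> Option<&str> {
    if (word[0] & 0xF8) != 0xF8 {
        return None; // top five bits not all set, so not a short string
    }
    let len = (word[0] & 0x07) as usize;
    std::str::from_utf8(&word[1..1 + len]).ok()
}
```

Since no valid UTF-8 text starts with a byte in 0xF8 through 0xFF, a packed word can't be confused with the 8-bytes-of-text case.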
Lastly, the seemingly trickiest problem, what if the string didn't fit inline? We use a heap allocation to store the text prefixed by a variable size integer length and we insist this allocation is aligned to 4 bytes. This means a valid pointer to our allocation has zeroes for the bottom two bits, then we rotate that pointer so those bottom two bits are at the top of the first byte position (depending on machine word layout) and we set the top bit. This is now always invalid UTF-8 because it has the continuation marker - the top bit is set but the next is not, which cannot happen in the first byte of any UTF-8 text, and so our code can detect this and reverse the transformation to get back a valid pointer using the strict provenance APIs if this marker is present.
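The rotate-and-mark step can be sketched on plain integers. This assumes a little-endian 64-bit layout, so the "first byte" is the lowest byte of the word, and the function names are hypothetical. Real code would do this on the pointer itself through the strict provenance APIs rather than on a bare `u64`.

```rust
/// Turn a 4-byte-aligned address into a word whose first (lowest) byte
/// reads 0b10xx_xxxx, i.e. looks like a UTF-8 continuation byte.
pub fn encode_heap_addr(addr: u64) -> u64 {
    assert_eq!(addr & 0b11, 0, "address must be 4-byte aligned");
    // The two zero bits at the bottom move to bits 7 and 6 of the low byte,
    // then the top bit of that byte is set.
    addr.rotate_left(6) | 0x80
}

/// True if the first byte has the top bit set and the next bit clear.
pub fn is_heap_marker(word: u64) -> bool {
    (word & 0xC0) == 0x80
}

/// Undo the transformation to get the original address back.
pub fn decode_heap_addr(word: u64) -> Option<u64> {
    if !is_heap_marker(word) {
        return None;
    }
    Some((word & !0x80).rotate_right(6))
}
```

Neither of the inline cases produces a first byte of that shape: valid text never starts with a continuation byte, and the short-string tags 0xF8 through 0xFF have the second bit set.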
> Localization files for every language on Earth - [...] - Samsung really wanted to make sure everyone on the planet could experience this suffering equally
Why are you considering localization as bloat? I bet your reaction wouldn't be positive if your native language(s) were missing instead.
The alternative would be the installer only installing the languages that match the system settings. Which yes is imperfect, but not nearly as bad as separate downloads or god forbid the two tier base language and modification pack system Microsoft came up with.