That comment doesn’t give the reason for the function’s existence; it gives the details of how the function interacts with the GC.
The reason for the function’s existence is that it allows typed arrays to dynamically switch between a fast/compact representation for the case that the JSVM owns the data, and a slightly slower and slightly less compact version for when the JSVM allows native code to share ownership of the data.
This function, slowDownAndWasteMemory, switches to the less efficient version that allows aliasing with native code.
Of course the name is sarcastic. The actual effect of having this is that JSC can handle both the owned case and the aliased case, and get a small win if you’re in the owned case while being able to switch to the aliased case at any time. Since there’s no way to support the aliased case without some slightly higher memory/time usage, we are sort of literally slowing down and wasting memory when we go aliased.
Source: I’m pretty sure I wrote most of this code.
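Roughly, the switch being described might look like this; a hedged sketch in which everything except the name slowDownAndWasteMemory is made up for illustration and is not JSC's actual API:

    #include <cstdlib>
    #include <cstring>

    struct TypedArray {
        char inlineStorage[64] = {};   // fast/compact form: data owned by the JSVM
        void* sharedBuffer = nullptr;  // aliased form: malloc'd, shareable with native code

        bool hasSharedBuffer() const { return sharedBuffer != nullptr; }

        // One-way switch to the aliasable representation: move the data out
        // to a malloc'd buffer that native code may share ownership of.
        void slowDownAndWasteMemory() {
            sharedBuffer = std::malloc(sizeof inlineStorage);
            std::memcpy(sharedBuffer, inlineStorage, sizeof inlineStorage);
        }
    };

    // Handing the backing store to native code forces the switch.
    void* backingStoreForNativeCode(TypedArray& array) {
        if (!array.hasSharedBuffer())
            array.slowDownAndWasteMemory();
        return array.sharedBuffer;
    }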
I often wonder how much more productive I could be if I didn't need to split changes like this into 50 small commits, where every intermediate commit is working and has passing tests...
I understand the desire to make larger patches, but how do you effectively manage them in the review process? For super large commits in the past, I’ve had other engineers hop on a screen share to review the diff together and answer questions, but it feels inefficient.
> how do you effectively manage them in the review process
The overriding principle is "no surprises at final review time."
For a big impactful change, discuss the work in progress with your reviewers beforehand (specifically: architecture decisions, database changes, the impact this will have on other code) as much as possible before final review time. There's no other sane way to do it.
Make sure you're all in agreement on the approach you're taking or, at least, make sure they're aware of it. Also a good way to (hopefully) make sure you're not going to conflict with somebody else's work!
Dumping a big commit full of surprises on reviewers is AT BEST poor practice, and at worst kind of shady and a sign you don't want to collaborate, don't respect your coworkers, and/or are hiding something.
I definitely prefer avoiding surprises. In this case the review was upgrading and migrating from Vue 2 to Vue 3 and, while the rest of the team was in the loop and aware of coming changes, the change set itself was massive. I definitely would do it differently the next time around, and this is an edge case. I will say that that position saw several “larger than any reasonable person should submit” change sets; glad it’s behind me.
Y'all are very thorough. Just make sure the function has tests and won't blow up prod. You needn't spend forever reviewing. It's code. It can be iterated upon.
In this particular case, we're talking some tricky code around array handling in the JS implementation of a major browser. It's pretty much the archetypical location for a VM escape bug. That's most definitely not something you want to be cavalier about.
In general, the large, sprawling diffs we're talking about here, the sort that actually justify commits of this size, are almost always going to also be the sort that justify closer scrutiny at review time.
No. In any actually complex piece of code, even adding significant amounts of testing over the new code is not going to cover every possible code path.
It's also immensely difficult to write tests to find errors in code you're writing - most patches I see with "extensive tests" test all the correct code paths, and _maybe_ a few error cases. It's a very easy trap to get yourself into.
The purpose of review is not to catch obvious issues - though it hopefully will - but the subtle ones.
I’ve had both thorough and cursory reviews of changes like this.
It makes no difference for finding actual bugs.
Reviews are great for spreading knowledge about the change and making sure the style is dialed in. Reviews aren’t great for ensuring test coverage or finding security bugs.
Test coverage should be ensured by your coverage tools in the CI. (At least basic test coverage.)
You are right that reviews aren't necessarily there to catch bugs. Reviews are there to tell you to write code that's simple and understandable. If your coworkers can't understand what's going on, how could you expect anyone else to do so in a year or so?
Coverage tools only tell you that you have tests hitting each line, but most serious bugs aren't of the "you didn't hit this one line" kind; they're the result of specific interactions between multiple paths. For example (note this is a super simple and trivial example):
    int size = 0;
    char *buffer;
    if (a) {
        buffer = new char[1];
        size = 1;
    } else if (b) {
        buffer = new char[2];
        size = 2;
    }
    ...
    if (b) {
        buffer[1]++;   // out of bounds when a and b are both true: buffer has size 1
    }
    ...
Coverage tools will happily show that every line is hit, without the tests necessarily hitting the buffer overflow when a and b are both true.
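To make that concrete: a hypothetical pair of tests that wraps the snippet above as run(a, b) can execute every line, so line coverage reads 100%, while never running the combination that overflows:

    // The snippet above wrapped as a function under test (hypothetical).
    void run(bool a, bool b) {
        int size = 0;
        char* buffer = nullptr;
        if (a) { buffer = new char[1]; size = 1; }
        else if (b) { buffer = new char[2]; size = 2; }
        if (b) buffer[1]++;   // overflows when a && b
        delete[] buffer;
        (void)size;
    }

    int main() {
        run(true,  false);  // hits the 'a' branch
        run(false, true);   // hits the 'b' branch and the trailing 'if (b)'
        // Line coverage now reads 100%, but run(true, true) -- the
        // combination that overflows -- was never exercised.
    }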
Indeed. I always push my teams not to hunt for spelling mistakes. Instead, look for things that will be hard to fix in the future ('did you realise that this bootloader config stops us doing field updates??') or where the overall direction is wrong ('Not sure about os.system('grep') for this XML - why can't we just use the xml.etree here?').
I worked for a while with Henrik Frystyk Nielsen, who worked on the early HTTP spec and httpd. We would give him a hard time for this — among other things.
I believe Phillip Hallam-Baker was probably the first to use that spelling in a document.
This is code running on client machines which you have no control over. Once it's out there, good luck getting everyone to upgrade to your latest version because "hey, we didn't want this one to behave like that. oops". Especially when you're Google and don't want certain freedoms in your software.
You make sure that more than one person is aware of the changes and design, and have all people frequently discuss the progress before the mega patch hits PR.
Then the mega patch PR is "one last look", and not thousands of changes storming the gates of your sanity.
Yeah, it's best to go over something like this in a meeting/zoom or a series of meetings. And most likely not every single line will be gone over; at some point the engineer writing this code is likely pretty senior and shouldn't need that many eyes on their boilerplate, and it's the nasty stuff that really needs talking about and focusing on.
Code review doesn't magically catch all bugs even with tiny PRs. At best it's a basic sanity check. In fact I'd say it's more of a design sanity check than a bug catching one.
To prevent bugs you need tests, fuzzing, static analysis, static & strong type checking, etc.
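For instance, a minimal libFuzzer-style harness over the hypothetical run(a, b) wrapper from upthread; built with clang++ -fsanitize=fuzzer,address, the a && b overflow would fall out almost immediately:

    #include <cstdint>
    #include <cstddef>

    void run(bool a, bool b);  // the hypothetical wrapper from above

    // Minimal libFuzzer entry point: the fuzzer picks the branch
    // combination, and ASan reports the heap overflow as soon as it
    // tries an input where both low bits are set.
    extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
        if (size < 2)
            return 0;
        run(data[0] & 1, data[1] & 1);
        return 0;
    }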
We missed an OS update cycle and had to consume all the changes in one giant patch. But upstream had migrated from Python 2 to 3 in between and rewritten half their build scripts to accommodate. Each of those changes needed to be reviewed manually, because we had modified things to support reproducible builds and the resulting merge conflict was monstrous. I contributed maybe 2300 of those before my sanity failed and I called for help.
Now we have a whole team to do that job more regularly.
I find this to be more of a reflection of how well aligned you and the code reviewer are. When both of you know the code base inside out and trust each other, it's fine. But sometimes the author is a noob and the code reviewer has to review every little detail, much like an examiner looking at a student's paper. Or sometimes they might have a personal beef.
> I often wonder how much more productive I could be if I didn't need to split changes like this into 50 small commits, where every intermediate commit is working and has passing tests...
This is down to a trade-off between ease of writing and ease of review.
Yes, if you didn't have to make your work presentable, you could bang it out faster. But then no one could read it.
Native code wants to think it’s holding a ref-counted buffer that could be accessed from any thread, including ones that don’t participate in GC. So it’s gotta be allocated in the malloc heap and it’s gotta have a ref count.
JS code wants the buffer to be garbage collected. If native code doesn’t ref it (count is zero) and it’s not reachable via JS heap then it’s gotta die. So, it needs a GC cell and associated machinery (JSC object headers, mark bits, possibly other stuff).
So, the reason why the aliased case is not as awesome as the owned case is that you need double the things. You need some malloc memory and you need a GC cell. You need a ref count and you need a GC header and some mark bits.
Having both malloc/RC and cell/GC overhead isn’t like the end of the world or anything, but it means that allocations are slower and waste more memory than if it was either just a buffer owned by native code or just an object in the GC heap.
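As a rough sketch of that “double the things” point, with illustrative layouts that are not JSC’s actual classes:

    #include <atomic>
    #include <cstddef>
    #include <cstdint>

    // Owned case: one GC allocation, one header; the storage lives inline
    // and dies when the collector finds it unreachable.
    struct OwnedTypedArray {
        uint64_t gcHeader;   // object header + mark bits, GC heap only
        size_t length;
        // element data follows inline in the same GC cell
    };

    // Aliased case: the same GC cell *plus* a malloc'd, ref-counted buffer
    // that native code can hold from any thread.
    struct SharedBuffer {
        std::atomic<size_t> refCount;  // native code's claim on the data
        size_t length;
        // malloc'd element data follows here
    };

    struct AliasedTypedArray {
        uint64_t gcHeader;     // still pay the GC-side costs...
        SharedBuffer* buffer;  // ...plus the malloc/ref-count costs
    };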
With a project that could live on for many years, I’d object to this code, but only because it doesn’t really explain _why_ it’s necessary despite wasting time/memory. The humorous name is great, though. It might be explained in the VCS history, but people aren’t likely to look it up there and I’ve noticed the history tends to get lost for a lot of old projects.
The name isn’t just funny. It’s also accurate. And it’s quite evidently stood the test of time.
In a VM, often the most important kind of documentation is about labeling what is the fast path and what is the slow path. This method name alerts the code reader that they’re looking at one of the slow paths.
Isn't part of the goal with deliberately weird names like this to alert callers that HERE BE DRAGONS, so, like, don't call this unless you've read _and understood_ the source code?