Hacker News

That comment describes not the reason for the function’s existence, but the details of how the function interacts with the GC.

The reason for the function’s existence is that it allows typed arrays to dynamically switch between a fast/compact representation for the case that the JSVM owns the data, and a slightly slower and slightly less compact version for when the JSVM allows native code to share ownership of the data.

This function, slowDownAndWasteMemory, switches to the less efficient version that allows aliasing with native code.

Of course the name is sarcastic. The actual effect of having this is that JSC can handle both the owned case and the aliased case, and get a small win if you’re in the owned case while being able to switch to the aliased case at any time. Since there’s no way to support the aliased case without some slightly higher memory/time usage, we are sort of literally slowing down and wasting memory when we go aliased.
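For illustration, a toy sketch of that owned-to-aliased switch (all names and layouts invented here; this is nothing like the real JSC data structures):

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Invented stand-in: an owned typed array keeps its bytes in GC-managed
// storage; the "slow down" path moves them into a ref-counted malloc'd
// buffer that native code may alias from any thread.
struct SharedBuffer {
    size_t refCount;
    char* data;
};

struct TypedArraySketch {
    bool aliased = false;
    size_t length = 0;
    char* gcStorage = nullptr;      // owned case: fast, compact
    SharedBuffer* shared = nullptr; // aliased case: extra header + ref count

    // The moral equivalent of slowDownAndWasteMemory(): once native code
    // wants to share the bytes, copy them out of GC storage into a
    // ref-counted malloc allocation and point both worlds at it.
    SharedBuffer* slowDownAndWasteMemory() {
        if (aliased) {
            shared->refCount++;
            return shared;
        }
        shared = new SharedBuffer{2, static_cast<char*>(std::malloc(length))};
        std::memcpy(shared->data, gcStorage, length);
        gcStorage = nullptr;  // a real GC would reclaim the old copy
        aliased = true;
        return shared;        // one ref held by the array, one by the caller
    }
};
```

The key property: arrays that never get aliased pay none of the malloc/ref-count cost, and the switch can happen lazily at any time.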

Source: I’m pretty sure I wrote most of this code.




Indeed you did, 10 years ago. This is the (mega!) commit:

https://github.com/WebKit/WebKit/commit/93a48aa964f25ce8ec9f...

I often wonder how much more productive I could be if I didn't need to split changes like this into 50 small commits, where every intermediate commit is working and has passing tests...


Yeah sometimes you just have to write a megapatch. I’d be very sad if I had to work on a project that didn’t allow that.


I understand the desire to make larger patches, but how do you effectively manage them in the review process? For super large commits in the past, I’ve had other engineers hop on a screen share to review the diff together and answer questions, but it feels inefficient.


    how do you effectively manage them in the review process
The overriding principle is "no surprises at final review time."

For a big impactful change, discuss the work in progress (specifically: architecture stuff, database changes, impact this will have on other code) with your reviewers as much as possible before final review time. There's no other sane way to do it.

Make sure you're all in agreement on the approach you're taking or, at least, make sure they're aware of it. Also a good way to (hopefully) make sure you're not going to conflict with somebody else's work!

Dumping a big commit full of surprises on reviewers is AT BEST poor practice and is at worst kind of shady and a sign you don't want to collab, don't respect your coworkers, and/or are hiding something.


I definitely prefer avoiding surprises. In this case the review was upgrading and migrating from Vue 2 to Vue 3 and, while the rest of the team was in the loop and aware of the coming changes, the change set itself was massive. I definitely would do it differently the next time around, and this is an edge case. I will say that that position saw several “larger than any reasonable person should submit” change sets; glad it’s behind me.


Sounds like you did well! Congrats on the big upgrade - not easy.


It’s simple. Find a reviewer who isn’t that thorough.


I call this "review shopping". If someone is pushing back on a change, find a weaker link.


I laughed


Just don't tell Oliver


Oliver is the best reviewer ever


Y'all are very thorough. Just make sure the function has tests and won't blow up prod. You don't need to spend forever reviewing. It's code. It can be iterated upon.


In this particular case, we're talking some tricky code around array handling in the JS implementation of a major browser. It's pretty much the archetypical location for a VM escape bug. That's most definitely not something you want to be cavalier about.

In general, the large, sprawling diffs we're talking about here, the sort that actually justify commits of this size, are almost always going to also be the sort that justify closer scrutiny at review time.


I doubt that a thorough review is going to find security bugs in a change like this.

Good tests, a great fuzzer, and a highly motivated security research community - that’s how you find the great security bugs!


No. In any actually complex piece of code, even adding significant amounts of testing over new code is not going to cover every possible code path.

It's also immensely difficult to write tests to find errors in code you're writing - most patches I see with "extensive tests" test all the correct code paths, and _maybe_ a few error cases. It's a very easy trap to get yourself into.

The purpose of review is not to catch obvious issues - though it hopefully will - but the subtle ones.


> and _maybe_ a few error cases.

Also it's often unit tests only, with no hint of any integration tests. I'd settle for tests that check integration of at least two units, but alas.


The purpose of review is to ask questions that help you write more tests to catch more bugs.


I’ve had both thorough and cursory reviews of changes like this.

It makes no difference for finding actual bugs.

Reviews are great for spreading knowledge about the change and making sure the style is dialed in. Reviews aren’t great for ensuring test coverage or finding security bugs.


Test coverage should be ensured by your coverage tools in the CI. (At least basic test coverage.)

You are right that reviews aren't necessarily there to catch bugs. Reviews are there to tell you to write code that's simple and understandable. If your coworkers can't understand what's going on, how could you expect anyone else to do so in a year or so?


Coverage tools only tell you that you have tests that hit each line, but most serious bugs aren't of the "you didn't hit one line" kind; they're the result of specific interactions between multiple paths. e.g. (note this is a super simple and trivial example)

    int size = 0;
    char *buffer;
    if (a) {
      buffer = new char[1];
      size = 1;
    } else if (b) {
      buffer = new char[2];
      size = 2;
    }
    ...
    if (b) {
      buffer[1]++;
    }
    ...
Coverage tools will happily show that every line is hit, without tests necessarily hitting the buffer overflow when a and b are both true.
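To make that concrete, here's a sketch (function names invented) that restates the snippet above as checkable functions. The two calls (true, false) and (false, true) execute every line of the original, yet the only unsafe combination, (true, true), never runs:

```cpp
// Which arm of `if (a) ... else if (b) ...` allocated, and how big
// was the buffer? Mirrors the snippet above.
int allocatedSize(bool a, bool b) {
    if (a) return 1;   // buffer = new char[1]
    if (b) return 2;   // buffer = new char[2]
    return 0;
}

// The later `if (b) buffer[1]++;` writes index 1, which is only in
// bounds when the buffer has at least 2 bytes.
bool incrementIsSafe(bool a, bool b) {
    if (!b) return true;              // increment never executes
    return allocatedSize(a, b) >= 2;  // index 1 must exist
}
```

Running only the (true, false) and (false, true) cases reports 100% line coverage while leaving the (true, true) overflow untested.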


I agree. Hence my parenthetical remark about '(At least basic test coverage.)'

Code review can prod you to find ways to simplify your code, so that you have fewer paths to consider.


Indeed. I always push my teams into not hunting for spelling mistakes. Instead look for things that will be hard to fix in the future ('did you realise that this bootloader config stops us doing field updates??') or where the overall direction is wrong ('Not sure about os.system('grep') for this XML - why can't we just use the xml.etree here?').


> not hunting for spelling mistakes

Unless it's the name of an HTTP header?


Not sure what you’re refering to.



You might want to check how that word was spelled in the comment you're replying to :)


Woosh. Also Henrik is Danish, he can claim that English isn't his first language


Henrik?


I worked for a while with Henrik Frystyk Nielsen, who worked on the early HTTP spec and httpd. We would give him a hard time for this — among other things.

I know that Phillip Hallam-Baker probably first used that spelling in a document.


Well done :-D


This kind of mentality is absurdly inappropriate when working on something as security sensitive as a browser's JS engine.


This is code running on client machines which you have no control over. Once it's out there, good luck getting everyone to upgrade to your latest version because "hey, we didn't want this one to behave like that, oops". Especially when you're Google and don't want certain freedoms in your software.


That's fair. I mostly work on servers and web FE which is also easy to redeploy.


You make sure that more than one person is aware of the changes and design, and have all people frequently discuss the progress before the mega patch hits PR.

Then the mega patch PR is "one last look", and not thousands of changes storming the gates of your sanity.


How do you know if the "discussion" made it into the code?


You can formalise that discussion by breaking the patch into smaller commits.


Yeah, it's best to go over something like this in a meeting/zoom, or a series of meetings. And most likely not every single line will be gone over; at some point, the engineer writing this code is likely pretty senior and shouldn't need that many eyes on their boilerplate, and it's the nasty stuff that really needs talking about and focusing on.


You just don't review it as well. It's not the end of the world.


It's just your bank accounts, your crypto, your Gmail with all your passwords to everything....


Code review doesn't magically catch all bugs even with tiny PRs. At best it's a basic sanity check. In fact I'd say it's more of a design sanity check than a bug catching one.

To prevent bugs you need tests, fuzzing, static analysis, static & strong type checking, etc.


I don't know what my largest commit has been, but I do find that merge conflicts end up being a real pain when you're touching so many files...


Mine was a very memorable ~2900 files changed.

We missed an OS update cycle and had to consume all the changes in one giant patch. But upstream had migrated from Python 2 to 3 in between and rewritten half their build scripts to accommodate. Each of those changes needed to be reviewed manually because we had modified things to support reproducible builds, and the resulting merge conflict was monstrous. I contributed maybe 2300 of those before my sanity failed and I called for help.

Now we have a whole team to do that job more regularly.


I find this to be more of a reflection of how well aligned you and the code reviewer are. When both of you know the code base inside out and trust each other, it's fine. But sometimes the author is a noob and the code reviewer has to review every little detail, much like an examiner looking at a student's paper. Or sometimes they might have a personal beef.


I don't want a personal review, I want a code reviewer.

"Trusting" someone means normalization of deviance, where bugs slip in.


> I often wonder how much more productive I could be if I didn't need to split changes like this into 50 small commits, where every intermediate commit is working and has passing tests...

This is down to a trade-off between ease of writing and ease of review.

Yes, if you didn't have to make your work presentable, you could bang it out faster. But then no one could read it.


This is classic HN.

"Look at this crazy, obscure code everybody!"

*ORIGINAL DEVELOPER APPEARS*


Obscurity is mostly a property of the project it's in, so I don't know if it can really apply to WebKit.


Out of curiosity, what makes the aliased version less compact? Naively I'd think there's not much room to optimize the representation of a byte array.


Native code wants to think it’s a ref counted buffer that could be accessed from any thread including ones that don’t participate in GC. So it’s gotta be allocated in the malloc heap and it’s gotta have a ref count.

JS code wants the buffer to be garbage collected. If native code doesn’t ref it (count is zero) and it’s not reachable via JS heap then it’s gotta die. So, it needs a GC cell and associated machinery (JSC object headers, mark bits, possibly other stuff).

So, the reason why the aliased case is not as awesome as the owned case is that you need double the things. You need some malloc memory and you need a GC cell. You need a ref count and you need a GC header and some mark bits.

Having both malloc/RC and cell/GC overhead isn’t like the end of the world or anything, but it means that allocations are slower and waste more memory than if it was either just a buffer owned by native code or just an object in the GC heap.
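As a toy illustration of that doubling (invented layouts, not JSC's real structures): in the owned case one GC cell carries everything, while the aliased case needs the GC cell plus a separate malloc'd, ref-counted header.

```cpp
#include <cstddef>
#include <cstdint>

// Owned case: one GC cell whose payload lives inline in the GC heap.
struct OwnedArrayCell {
    uint64_t gcHeader;   // stand-in for the object header / mark bits
    size_t length;
    char bytes[16];      // payload follows inline
};

// Aliased case: the GC cell survives, but the payload moves behind a
// separately malloc'd, ref-counted header that threads outside the GC
// can hold onto.
struct SharedBufferHeader {
    size_t refCount;     // keeps the buffer alive for native code
    size_t length;
    // payload follows in the malloc heap
};

struct AliasedArrayCell {
    uint64_t gcHeader;           // still need the GC side...
    SharedBufferHeader* buffer;  // ...plus an indirection on every access
};
```

Every aliased array pays for two headers and an extra pointer chase; every owned array pays for neither.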


Did you choose the name? Have there been any reactions to it?

I like the name, but can easily imagine somebody preferring a more serious alternative.


That’s why it’s important to find a team and coworkers that aren’t so uptight.


With a project that could live on for many years, I’d object to this code, but only because it doesn’t really explain _why_ it’s necessary despite wasting time/memory. The humorous name is great, though. It might be explained in the VCS history, but people aren’t likely to look it up there and I’ve noticed the history tends to get lost for a lot of old projects.


The name isn’t just funny. It’s also accurate. And it’s quite evidently stood the test of time.

In a VM, often the most important kind of documentation is about labeling what is the fast path and what is the slow path. This method name alerts the code reader that they’re looking at one of the slow paths.


Fortunately, git helps you stitch together histories (without changing hashes), if you later recover it: https://git-scm.com/book/en/v2/Git-Tools-Replace
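A sketch of what that stitching can look like (toy two-repo setup, names invented): graft a recovered old history underneath a repo that was restarted from scratch, without rewriting any hashes.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# "old": the recovered history; "new": the repo that lost it.
git init -q old
git -C old -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "ancient work"
old_tip=$(git -C old rev-parse HEAD)

git init -q new
cd new
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "fresh start"
root=$(git rev-parse HEAD)

# Pull the old objects in, then declare the old tip to be the parent
# of our root commit. Existing hashes are untouched; the graft lives
# under refs/replace/ and is applied on the fly.
git fetch -q ../old HEAD
git replace --graft "$root" "$old_tip"
git log --oneline   # both histories now appear as one line of descent
```

Because the replacement is just a ref, it can be shared by pushing `refs/replace/*`, or dropped at any time with `git replace -d`.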


Isn't part of the goal with deliberately weird names like this to alert callers that HERE BE DRAGONS so, like, don't call this unless you've read _and understood_ the source code?



