When I've talked to people running this kind of AI scraping/agent workflow, the costs of the AI parts dwarf those of the web browser parts, which makes the computational cost of the browser irrelevant. I'm curious what situation you got yourself into where optimizing the browser yields meaningful savings. I'd also like to be in that place!
I think your RAM usage benchmark is deceptive. I'd expect a minimal browser to have much lower peak memory usage than Chrome on a minimal website, but it should even out or get worse as the websites get richer. The nature of web scraping is that the worst sites take up the vast majority of your CPU cycles. I don't think lowering the RAM usage of the browser process will have much real-world impact.
The cost of the browser part is still a problem. At our previous startup, we were scraping >20 million webpages per day, with thousands of instances of headless Chrome in parallel.
Regarding the RAM usage, it's still ~10x better than Chrome :) It seems to come mostly from V8; I suspect we could do better with a lightweight JS engine alternative.
Yes but WebKit is not a browser per se, it's a rendering engine.
It's less resource-intensive than Chrome, but here we are talking orders of magnitude between Lightpanda and Chrome. If you are ~10x faster while using ~10x less RAM, you are using ~100x less resources.
Careful: as you implement missing features, your RAM usage might grow too. It has happened to many projects: lean at the beginning, just as slow once they deal with real-world messiness.
Yeah, it could be nice to let the user select the ECMAScript engine that fits their use case / performance requirements (balancing the resources available).
Generally, for consumer use cases, it's best to A) do it locally, preserving some of the original web contract B) run JS to get actual content C) post-process to reduce inference cost D) get latency as low as possible
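Steps B and C can be as small as this sketch (assuming Puppeteer as the local headless browser; the URL handling and whitespace cleanup here are illustrative, not a definitive pipeline):

    import puppeteer from "puppeteer";

    async function fetchForInference(url: string): Promise<string> {
      const browser = await puppeteer.launch({ headless: true });
      try {
        const page = await browser.newPage();
        // B) run JS so client-rendered content actually exists
        await page.goto(url, { waitUntil: "networkidle2" });
        // C) post-process: keep visible text, drop markup and scripts,
        //    so the model sees far fewer tokens
        const text = await page.evaluate(() => document.body.innerText);
        return text.replace(/\s+/g, " ").trim();
      } finally {
        await browser.close();
      }
    }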
Then, as the article points out, the Big Guns making the LLMs are a big use case for this because they get a 10x speedup and can begin contemplating running JS.
It sounds like the people you've talked to are in a messy middle: no incentive to improve efficiency of loading pages, simply because there's something else in the system that has a fixed cost to it.
I'm not sure why that would rule out improving anything else; it doesn't seem like they should be stuck doing nothing but flailing around for cheaper LLM inference.
> I think your RAM usage benchmark is deceptive. I'd expect a minimal browser to have much lower peak memory usage than Chrome on a minimal website.
I'm a bit lost: the RAM usage benchmark says it's ~10x less, and you feel it's deceptive because you'd expect RAM usage to be less? Steelmanning: 10% of Chrome's usage is still too high?
The benchmark shows lower RAM usage on a very simple demo website. I expect that if the benchmark ran on a random set of real websites, RAM usage would not be meaningfully lower than Chrome's. Happy to be impressed and wrong if it remains lower.
Seems like Hyundai owns 33% of Kia, rather than it just being a brand under the same company like Lexus/Toyota. They share some things and compete on others.
It's a lot more complicated than that. They also have some common owners, and Kia owns parts of some Hyundai subsidiaries. Chaebols are complicated beasts.
I think Hacker News might appreciate some of the behind the scenes of this post.
Getting this page to load quickly was not trivial. The initial dataset of books' starting sentences was over 20 megabytes. By only sending the unique prefix of each book, I was able to get that much smaller. Using a custom format, sorting the prefixes, and gzipping got the size down to 114 KB, about 3 bytes per book. The full first sentences are downloaded on demand as the books are filtered down.
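The unique-prefix trick looks roughly like this (a toy TypeScript sketch of the idea, not my actual format):

    // In a sorted list, a string's shortest unique prefix only has to
    // differ from its two immediate neighbors.
    function lcp(a: string, b: string): number {
      let n = 0;
      while (n < a.length && n < b.length && a[n] === b[n]) n++;
      return n;
    }

    function uniquePrefixes(sentences: string[]): string[] {
      const sorted = [...sentences].sort();
      return sorted.map((s, i) => {
        const prev = i > 0 ? sorted[i - 1] : "";
        const next = i + 1 < sorted.length ? sorted[i + 1] : "";
        // keep one char past the longest prefix shared with a neighbor
        const keep = Math.max(lcp(s, prev), lcp(s, next)) + 1;
        return s.slice(0, Math.min(keep, s.length));
      });
    }

Sorting helps twice: neighbors are all you need to check for uniqueness, and adjacent prefixes share leading bytes, which gzip compresses well.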
Rendering the books requires 5 million triangles. I used WebGL 2's drawArraysInstanced method. This allows me to define the book geometry only once; each book is then just defined by its rotation/position/color. Then it's just a matter of keeping the fragment shader simple.
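The instancing setup has roughly this shape (an illustrative TypeScript sketch; the attribute locations and per-instance layout are made up, not my actual code):

    // Per-instance data: vec3 position, float rotation, vec3 color = 7 floats.
    function drawBooks(gl: WebGL2RenderingContext, instanceBuf: WebGLBuffer,
                       vertsPerBook: number, bookCount: number): void {
      gl.bindBuffer(gl.ARRAY_BUFFER, instanceBuf);
      const stride = 7 * 4; // 7 floats, 4 bytes each
      gl.enableVertexAttribArray(1);
      gl.vertexAttribPointer(1, 3, gl.FLOAT, false, stride, 0);  // position
      gl.vertexAttribDivisor(1, 1); // advance per instance, not per vertex
      gl.enableVertexAttribArray(2);
      gl.vertexAttribPointer(2, 1, gl.FLOAT, false, stride, 12); // rotation
      gl.vertexAttribDivisor(2, 1);
      gl.enableVertexAttribArray(3);
      gl.vertexAttribPointer(3, 3, gl.FLOAT, false, stride, 16); // color
      gl.vertexAttribDivisor(3, 1);
      // Geometry (attribute 0) is uploaded once; the GPU repeats it per book.
      gl.drawArraysInstanced(gl.TRIANGLES, 0, vertsPerBook, bookCount);
    }

The vertexAttribDivisor calls are what make it instanced: those attributes step once per book instead of once per vertex.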
Going into this project, I wasn't sure if it was possible. But I came away really impressed with how capable the web is these days if you're willing to push a bit.
It would make competitive chess even more drawish. It is much easier to see when you've accidentally got into a losing position than when you've missed a winning idea, so the takeback would be used defensively.
On the full set of 1000 questions, the language models are getting 30-35% correct. With patience, humans can do 40-50%.
The language models were prompted with the text + each candidate answer, and the one with the lowest perplexity was picked. I tried to avoid instruction tuned models wherever possible to avoid the "voice" problem.
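The selection rule is simple once you have per-token log probabilities (a sketch; tokenLogProbs is a hypothetical stand-in for whatever your model API exposes):

    // Hypothetical: returns the log probability of each token in `text`.
    declare function tokenLogProbs(text: string): number[];

    function pickAnswer(prompt: string, candidates: string[]): string {
      let best = candidates[0];
      let bestPpl = Infinity;
      for (const c of candidates) {
        const lp = tokenLogProbs(prompt + c);
        const avgNll = -lp.reduce((a, b) => a + b, 0) / lp.length;
        const ppl = Math.exp(avgNll); // perplexity of the full string
        if (ppl < bestPpl) { bestPpl = ppl; best = c; }
      }
      return best;
    }

Averaging over all tokens of the full string keeps the comparison fair across answers that tokenize into different lengths.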
i'm curious, how did you arrive at "40-50%" possible human performance?
the task of "predicting the next word" can be understood as either "correctly choosing the next word in the hidden context", or "predicting the likelihood of each possible word".
the quiz is evaluating against the former, but humans are still far from being able to express a percentile likelihood for each possibility.
i only consciously arrive at a vague feeling of confidence, rather than being able to weigh the prediction of each word with fractional precision.
one might say that LLMs have above human introspective ability in that regard.
Are you smarter than a language model?
There are a lot of benchmarks that try to see how good language models are at human tasks. But how good are you at the quintessential language model task of predicting the next word?
And then a list of questions.
How am I supposed to know it has anything to do with HN?
Temperature doesn't play a role here, because the LLM is not being sampled (other than to generate the candidate answers). Instead, the answer the LLM picks is decided by computing the perplexity of the full prompt + answer string.
The language model generating the candidate answers generates tokens until a full word is produced. The language models picking their answer choose the completion that results in the lowest perplexity independent of the tokenization.
I'd say the test is still not quite valid, and more of an in-between of the original "valid" task and "guess what the LLM would say" as suggested in another comment here. The reason: it might be easier for LLMs to choose the completion out of their own generated variants (1) than out of the real token distribution.
1. perhaps even out of variants generated by other LLMs