You are correct. This project is "on the CPU", so it will not utilize your GPU for computation. If you would like to try out a Rust framework that does support GPUs, Candle (https://github.com/huggingface/candle/tree/main) may be worth exploring.
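For a taste, here's essentially the hello-world from Candle's README, pointed at a CUDA device instead of the CPU (a sketch; assumes you build with Candle's `cuda` feature and have a CUDA-capable card):

    use candle_core::{Device, Tensor};

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        // Select the first CUDA GPU (requires the `cuda` feature at build time).
        let device = Device::new_cuda(0)?;

        let a = Tensor::randn(0f32, 1.0, (2, 3), &device)?;
        let b = Tensor::randn(0f32, 1.0, (3, 4), &device)?;

        // The matmul runs on the GPU because both tensors live on `device`.
        let c = a.matmul(&b)?;
        println!("{c}");
        Ok(())
    }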
It's all implemented on the CPU, yes; there's no GPU acceleration whatsoever (at the moment, at least).
> if I have a good GPU, I should look for alternatives.
If you actually want to run it, even just on the CPU, you should look for an alternative (and the alternative is called llama.cpp). This is more of an educational resource about how things work when you remove all the layers of complexity in the ecosystem.
LLMs are somewhat magical in how effective they can be, but in terms of code they're really simple.
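To give a sense of the "really simple" part: strip away the tensor math and the top-level inference loop is just repeated next-token prediction. A toy sketch (the `model` closure here is a hypothetical stand-in for a real transformer forward pass, not any actual crate's API):

    // Greedy decoding: repeatedly ask the model for logits and take the argmax.
    fn generate(model: &impl Fn(&[u32]) -> Vec<f32>, prompt: &[u32], max_new: usize) -> Vec<u32> {
        let mut tokens = prompt.to_vec();
        for _ in 0..max_new {
            // Forward pass: one score (logit) per vocabulary entry.
            let logits = model(&tokens);
            // Pick the highest-scoring next token and append it.
            let next = logits
                .iter()
                .enumerate()
                .max_by(|a, b| a.1.total_cmp(b.1))
                .map(|(i, _)| i as u32)
                .unwrap();
            tokens.push(next);
        }
        tokens
    }

    fn main() {
        // Toy "model": always scores token 2 highest (vocabulary of 4).
        let model = |_tokens: &[u32]| vec![0.1, 0.3, 0.9, 0.2];
        println!("{:?}", generate(&model, &[0, 1], 3)); // [0, 1, 2, 2, 2]
    }

Everything hard lives inside that forward pass, and even that is mostly matrix multiplications.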
For Rust you have llama.cpp wrappers like llm_client (mine), and the Candle-based projects mistral.rs and Kalosm.
Although my project does try to provide a mistral.rs implementation, I haven't fully migrated from llama.cpp. A full-Rust implementation would be nice for quick install times (among other reasons). Right now my crate has to clone and build llama.cpp; it's automated for Mac, PC, and Linux, but it adds about a minute of build time.
An RTX 3090 (as one example) has nearly 1 TB/s of memory bandwidth. You'd need at least 12 channels of the fastest proof-of-concept DDR5 on the planet to equal that.
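Back-of-the-envelope, with approximate numbers (the 3090's GDDR6X is spec'd at about 936 GB/s, and a 64-bit DDR5 channel moves MT/s x 8 bytes):

    fn main() {
        // RTX 3090: 19.5 Gbps/pin * 384-bit bus / 8 bits = ~936 GB/s.
        let gpu_bw = 936.0_f64;
        // One DDR5 channel at 9800 MT/s (roughly the fastest demoed parts):
        // 9800 MT/s * 8 bytes = 78.4 GB/s.
        let ddr5_channel = 9800.0 * 8.0 / 1000.0;
        // How many channels to match the GPU?
        println!("{}", (gpu_bw / ddr5_channel).ceil()); // 12
    }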
If you have a discrete GPU, use an implementation that utilizes it; it's a completely different story.
Apple Silicon boasts impressive numbers on LLM inference because it has a unified high-bandwidth CPU-GPU memory architecture (400 GB/s IIRC).
Depends. Good models are big and require a lot of memory. Even the 4090 doesn't have that much memory in an LLM context. So your GPU will be faster, but likely can't fit the big models.
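Rough math (weights only; the KV cache adds more on top):

    fn main() {
        // Weight memory is roughly parameter count * bytes per parameter.
        // A 4090 has 24 GB of VRAM.
        for (name, params_b) in [("7B", 7.0_f64), ("13B", 13.0), ("70B", 70.0)] {
            let fp16 = params_b * 2.0; // 2 bytes/param
            let q4 = params_b * 0.5;   // ~4 bits/param after quantization
            println!("{name}: ~{fp16:.0} GB at fp16, ~{q4:.1} GB at 4-bit");
        }
    }

A 70B model needs ~140 GB at fp16 and ~35 GB even at 4-bit, so it won't fit in the 4090's 24 GB, while a quantized 7B or 13B fits comfortably.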
I am building something that would probably benefit from this, but with that price tag (solo indie dev) that's going to be a big ask! It might be worth it; there's just no way of knowing without trying it first.
Theirs is a commercial project with an MIT-licensed client. I made an open-source version of their project. (Their backend is closed source; I have no knowledge of it.)