Show HN: Hyperbrowser – Scalable Browser Infrastructure for AI Apps

blumberg · 2024-12-11T21:48:44 1733953724

Nice one - We have a similar service provided from Charity Engine, our "crowdsourced cloud service" - https://www.charityengine.com/marketplace . (Relevant docs here: https://www.charityengine.com/docs/Smart+Proxy+Service )

Of note: as Charity Engine is a general-purpose distributed compute platform, it's possible to run applications to post-process web content "at the edge" of the network, as the data is streamed back. For example, we're now running small LLMs along with the browsers, so it's possible to do things like run sentiment analysis, generate image descriptions, summarize content, etc. – all at scale, enabled by hundreds of thousands of processors.

For more, see this talk: "Distributed Intelligence for Distributed Data" - https://youtu.be/YQe--AZFUuQ?si=_HoqkdooNPSR7dDQ (13mins)

AznHisoka · 2024-12-11T00:00:28 1733875228

Your pricing is too confusing. Either price it based on number of requests like ScrapingBee does, or base it entirely on bandwidth. Instead, your pricing is based on both bandwidth and “minutes” (which are also hard to predict, since some sites load slower than others).

Second, what you get for the price also seems way, way below what ScrapingBee offers.

You charge $100 for 60K credits. So assuming 18 credits for 1 page (3 MB per page), that allows me to crawl 3,333 web pages. $250 would allow me to crawl under 10k pages.

In comparison, ScrapingBee charges $249 for 3 million credits, which lets me crawl 100,000 webpages (with JS and proxies enabled).
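To make the comparison concrete, here is a rough per-page calculation using only the figures quoted above. The 18 credits per page assumes a 3 MB page, and the 30 credits per page for ScrapingBee is simply what 3 million credits for 100,000 pages implies; the `Plan` type and function names are made up for illustration.

```typescript
// Per-page cost comparison using only the figures quoted in this comment.
// Real pages vary in size, so treat the output as a rough estimate.
interface Plan {
  name: string;
  priceUsd: number;       // plan price in USD
  credits: number;        // credits included in the plan
  creditsPerPage: number; // assumed credits consumed by one page
}

// Whole pages that fit into the plan's credit allowance.
function pagesPerPlan(plan: Plan): number {
  return Math.floor(plan.credits / plan.creditsPerPage);
}

// Effective price of one page on that plan.
function costPerPage(plan: Plan): number {
  return plan.priceUsd / pagesPerPlan(plan);
}

const hyperbrowserPlan: Plan = {
  name: "Hyperbrowser", priceUsd: 100, credits: 60000, creditsPerPage: 18,
};
// 3,000,000 credits for 100,000 pages implies 30 credits per page.
const scrapingBeePlan: Plan = {
  name: "ScrapingBee", priceUsd: 249, credits: 3000000, creditsPerPage: 30,
};

for (const plan of [hyperbrowserPlan, scrapingBeePlan]) {
  console.log(
    `${plan.name}: ${pagesPerPlan(plan)} pages, $${costPerPage(plan).toFixed(4)} per page`
  );
}
```

On these assumptions that is about 3 cents per page versus about a quarter of a cent, roughly a 12x gap.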

Am I missing something obvious here?

ashekhawat · 2024-12-11T00:18:15 1733876295

Hey! Completely hear you on that. Right now we just have endpoints for headless browsers: each one is an instance running Chrome with dedicated resources that you have full access to, and it comes with proxy services and captcha solving. This comes at a higher cost, but you will get blocked less and can do much more, like browser automations.

Dedicated endpoints for just scraping, like ScrapingBee's, are coming next week at a discounted price! In the meantime, let me know if there is anything you would like that ScrapingBee is missing :)

AznHisoka · 2024-12-11T00:33:08 1733877188

Thanks, ScrapingBee has everything I am looking for in terms of functionality. My honest opinion is that most people who are looking for an alternative just want something cheaper. That's 99% of what I am looking for, at least.

Things like captcha solving are nice, no doubt, but for many people the use cases just aren’t compelling enough to justify the cost. I would love captcha solving, but definitely not at your price point.

shrisukhani · 2024-12-11T00:51:05 1733878265

Other cofounder of Hyperbrowser here - thanks for the feedback, and noted!

We aim to be pretty competitive on price across the board. Today we're just launching our headless browser service, and I think we're significantly less expensive than competitors on that, as dbmikus mentioned below.

Totally hear you on the scraping-specific use case without captcha solving etc. though - we'll probably launch competitive pricing for this specific use case in the next few days. I'll make a note to leave a comment here when we do, or feel free to email me at shri@hyperbrowser.ai and I can follow up there :)

AznHisoka · 2024-12-11T12:22:36 1733919756

OK, I am waiting to hear what you have in store!

HyprMusic · 2024-12-10T23:32:16 1733873536

Shame I already rolled this out in-house (and spent way too long trying to achieve it) - this would have been perfect for my use case. If you don't mind me asking, how do you get Chrome to boot so quickly? GCP is unbelievably good at launching containers quickly, but Chrome is slow to launch.

ashekhawat · 2024-12-10T23:55:43 1733874943

We actually keep a pool of browsers ready, so when a request comes in we just assign it to an already-running browser. This way, in most situations there is no actual waiting for Chrome to start up!
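For anyone rolling their own, the pooling idea can be sketched roughly like this. This is not Hyperbrowser's implementation, just a minimal illustration: `WarmPool` and the `launch` callback are hypothetical names, and `launch` stands in for whatever actually starts a browser.

```typescript
// Minimal warm-pool sketch. The pool only decides *when* launch() is called;
// a production pool would also handle launch failures and health checks.
class WarmPool<T> {
  private idle: Promise<T>[] = [];
  private launch: () => Promise<T>;
  private targetSize: number;

  constructor(launch: () => Promise<T>, targetSize: number) {
    this.launch = launch;
    this.targetSize = targetSize;
    this.refill();
  }

  // Start launches in the background until the pool is back at target size.
  private refill(): void {
    while (this.idle.length < this.targetSize) {
      this.idle.push(this.launch());
    }
  }

  // Hand out the oldest pre-launched instance and start a replacement right away.
  // Falls back to a cold launch if the pool happens to be empty.
  acquire(): Promise<T> {
    const ready = this.idle.shift() ?? this.launch();
    this.refill();
    return ready;
  }

  // Number of instances currently launched (or launching) and unassigned.
  get size(): number {
    return this.idle.length;
  }
}

// Demo with a fake launcher that just hands out incrementing ids.
let launched = 0;
const fakeLaunch = async (): Promise<{ id: number }> => ({ id: ++launched });
const pool = new WarmPool(fakeLaunch, 3);

pool.acquire().then((browser) => {
  console.log("got browser", browser.id, "idle:", pool.size);
});
```

The trade-off is that you pay for idle browsers in exchange for near-instant session starts, so the target size is something to tune against traffic.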

HyprMusic · 2024-12-11T00:01:29 1733875289

Ah of course, makes total sense. Thanks for answering!

dmarti · 2024-12-11T16:21:53 1733934113

Is each browser process dedicated to a single customer, or do you clear cookies and other state (localStorage, cache, Chrome advertising features...) between customers?

ashekhawat · 2024-12-11T17:47:36 1733939256

We use a new browser for each session you create! We don't share the browsers between different customers.

AznHisoka · 2024-12-11T00:02:48 1733875368

What challenges did you face in rolling this out? I did as well, and it didn't take long, but that's only because I didn't aim for 100% perfection and was OK with being able to crawl 90% of the webpages I encountered.

lmeyerov · 2024-12-11T03:25:59 1733887559

Super cool. Any way to know if the endpoints are 'ethically sourced' and how they are secured wrt confidentiality & integrity?

(We have a 100K+/day scraping workload, and TBD full interactive automation)

ashekhawat · 2024-12-11T03:34:45 1733888085

Thank you! We do make sure we are using legitimate proxy providers. Also happy to manually set it to the provider of your choice if you are interested in doing that :)

skilbjo · 2024-12-10T21:46:26 1733867186

Sweet! trying this out now.

one quick nit on your docs: https://docs.hyperbrowser.ai/guides/scrape-site

```
import Hyperbrowser from "@hyperbrowser/sdk";

const client = new Hyperbrowser({ apiKey: process.env.HYPERBROWSER_API_KEY, });

(async () => {
  const job = await client.startScrapeJob({ url: "https://example.com" });
  console.log(job);
  const job = await client.getScrapeJob(job.jobId);
  console.log(job);
})();
```

should be:

```
import Hyperbrowser from "@hyperbrowser/sdk";

const client = new Hyperbrowser({ apiKey: process.env.HYPERBROWSER_API_KEY, });

(async () => {
  let job = await client.startScrapeJob({ url: "https://example.com" }); // s/const/let
  console.log(job);
  job = await client.getScrapeJob(job.jobId); // remove const
  console.log(job);
})();
```

shrisukhani · 2024-12-10T21:57:31 1733867851

thanks! fixing now

shrisukhani · 2024-12-10T21:58:27 1733867907

update: fixed now :)

eulercoder · 2024-12-11T01:31:54 1733880714

Hi,

This looks very interesting and I’d love to try it.

We have a data extraction and enrichment platform. We process over 12 million automations for our users; half of this is done on Lambda, and the rest we try to process in containers.

We also utilize 10 GB+ in proxies.

I tried to do all the math on how much it would cost on your platform if we did a pilot project, but I’m so confused by the pricing.

Can you please explain what the cost would be for 10,000 hours and 10 GB of proxy usage?

shrisukhani · 2024-12-11T02:00:41 1733882441

Hey - thanks!

For your specific example, this is how the math works out:

- Browser hours: 10K hours * 60 minutes per hour * 1 credit / minute = 600K credits

- Proxy usage: 10GB proxy data * 1024 MB / GB * 6 credits / 1MB proxy data = 61,440 credits

That's 661,440 credits in total. The scale plan includes 60K credits for $100/mo and overage credits are ~0.2c each, so 601,440 credits are overage and the total would be $100 + (601,440 * $0.002) = $1,302.88
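If it helps to plug in your own numbers, the same arithmetic can be written as a small function. The rates are the ones quoted in this comment and may well change; `estimateMonthlyCost` and the constant names are made up for this sketch and are not part of any SDK.

```typescript
// Rates as quoted in this comment; treat them as assumptions, not a price list.
const CREDITS_PER_BROWSER_MINUTE = 1;
const CREDITS_PER_PROXY_MB = 6;
const PLAN_PRICE_USD = 100;
const PLAN_INCLUDED_CREDITS = 60000;
const OVERAGE_USD_PER_CREDIT = 0.002;

function estimateMonthlyCost(browserHours: number, proxyGb: number) {
  // Browser time is billed per minute, proxy traffic per MB (1 GB = 1024 MB).
  const browserCredits = browserHours * 60 * CREDITS_PER_BROWSER_MINUTE;
  const proxyCredits = proxyGb * 1024 * CREDITS_PER_PROXY_MB;
  const totalCredits = browserCredits + proxyCredits;
  // Only credits beyond the plan's included allowance are charged as overage.
  const overageCredits = Math.max(0, totalCredits - PLAN_INCLUDED_CREDITS);
  const totalUsd = PLAN_PRICE_USD + overageCredits * OVERAGE_USD_PER_CREDIT;
  return { browserCredits, proxyCredits, totalCredits, overageCredits, totalUsd };
}

// The example from this thread: 10,000 browser hours and 10 GB of proxy data.
const estimate = estimateMonthlyCost(10000, 10);
console.log(
  `credits: ${estimate.totalCredits}, overage: ${estimate.overageCredits}, ` +
    `total: $${estimate.totalUsd.toFixed(2)}`
);
```

Usage that stays within the included 60K credits comes out at the flat $100, since overage is clamped at zero.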

Sorry about the confusing pricing btw! As I wrote this out, clearly we have our work cut out for us to simplify pricing - we'll try to make it simpler over the next few days and add a calculator that lets you see how much this would cost.

Feel free to email me at shri@hyperbrowser.ai if I can help with anything!

eulercoder · 2024-12-11T11:07:10 1733915230

This seems like a good option for us. I will write an email to you to discuss this in more detail. Thank you!

cr125rider · 2024-12-11T05:52:44 1733896364

That seems like an incredible value on the surface. Great job.

shrisukhani · 2024-12-11T06:03:20 1733897000

thank you! :)

lapumawan · 2024-12-11T04:27:52 1733891272

Incredible that you managed to achieve such a short start time for Chrome. Although for only scraping data, I think something like ScrapingBee or Crawlbase gets the job done better, faster and cheaper. But I will consider your solution for browser automation.

ashekhawat · 2024-12-11T05:02:28 1733893348

Of course, would love for you to take a look! I mentioned this in another comment, but we are working on lower-cost, faster, scraping-focused endpoints. They're coming next week.

dbmikus · 2024-12-10T22:40:28 1733870428

Just did a pricing check, and you give twice as much concurrency/browsing/data-transfer as Browserbase. Nice!

Do you have any restrictions on sites we can scrape? I am likely going to do some public LinkedIn scraping (logged out, legally compliant).

ashekhawat · 2024-12-10T22:46:21 1733870781

Hey! So we have to use a different proxy service for some sites (LinkedIn included). Just ask in the app support and we can do that. We are also making custom endpoints for LinkedIn that should be ready soon, I'll shoot you an email once we have that!

anieve01 · 2024-12-13T20:21:54 1734121314

Will Hyperbrowser soon offer a live session view like Browserbase does (e.g., https://docs.browserbase.com/features/session-live-view)? It would be great to let end users view and control the remote browser session. Btw, Hyperbrowser seems great! Thanks for building this.

dcx · 2024-12-11T09:48:00 1733910480

I'm quite curious, what are people doing with AI-powered browser automation at the moment?

This is such a new capability that I'm having a hard time getting a sense of interesting use cases. I'm quite sure this is more than just a shim for web services which don't expose APIs. But I also wonder whether LLMs are good enough to be trusted with more open-ended tasks at this stage.

shrisukhani · 2024-12-11T10:28:28 1733912908

Pretty wide variety of use cases tbh - everything from classic scraping on the very simple side to integrating with third-party services that don’t offer APIs, agentic QA testing, etc.

The companies we’ve seen automate agentic workflows well typically send a bunch of context to the LLM and somewhat constrain the actions that the model can take. Actually works better than you’d expect :)
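As a rough illustration of what constraining the model's actions can look like (one possible approach, not a description of any particular company's setup): parse whatever the LLM proposes and only accept it if it fits a small, known set of actions. The `Action` shapes, `ALLOWED_HOSTS` and `parseAction` are all hypothetical names for this sketch.

```typescript
// Hypothetical action shapes; not taken from any particular agent framework.
type Action =
  | { type: "click"; selector: string }
  | { type: "type"; selector: string; text: string }
  | { type: "goto"; url: string };

// Navigation is only allowed to hosts on this list (example value).
const ALLOWED_HOSTS = new Set(["example.com"]);

// Parse the model's raw JSON output and reject anything outside the allowed set.
function parseAction(raw: string): Action | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;

  const fields = data as Record<string, unknown>;
  const kind = fields["type"];
  const selector = fields["selector"];
  const text = fields["text"];
  const url = fields["url"];

  if (kind === "click" && typeof selector === "string") {
    return { type: "click", selector };
  }
  if (kind === "type" && typeof selector === "string" && typeof text === "string") {
    return { type: "type", selector, text };
  }
  if (kind === "goto" && typeof url === "string") {
    try {
      return ALLOWED_HOSTS.has(new URL(url).hostname) ? { type: "goto", url } : null;
    } catch {
      return null; // not a parseable URL
    }
  }
  return null; // unknown or malformed action
}

console.log(parseAction('{"type":"click","selector":"#submit"}'));
console.log(parseAction('{"type":"goto","url":"https://not-allowed.test/"}'));
```

Anything that comes back as `null` can be fed back to the model as an error instead of being executed in the browser.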

ATechGuy · 2024-12-10T22:43:41 1733870621

Congrats on launching!

> spin up hundreds of browser sessions in secure, isolated environments, with sub-second launch times

What tech do you use for this, and where is this hosted?

ashekhawat · 2024-12-10T22:49:22 1733870962

Just Docker containers in GCP and AWS. We do all the instance management ourselves. It seems like you are working on infra for this, judging from your bio! Send me an email, would love to chat :)

bomewish · 2024-12-11T13:03:36 1733922216

Wouldn’t Hetzner bare metal allow you to do that an order of magnitude cheaper?

ashekhawat · 2024-12-11T15:44:33 1733931873

We will definitely look into it! As we are adding more users, scalability is our primary concern right now though.

devops000 · 2024-12-11T08:42:21 1733906541

Is it possible to run the Chrome browser with a Chrome extension installed that uses a content script?

shrisukhani · 2024-12-11T08:52:19 1733907139

Yup! We need to add it to the docs in a day or two but happy to help do it myself. Drop me a line at shri@hyperbrowser.ai!

khromem · 2024-12-11T15:53:56 1733932436

Is this a cheaper alternative to Browserbase? I am a bit confused, since you mention creating endpoints for scraping many times and I just want to do browser automation. Is that a use case you support?

ashekhawat · 2024-12-11T16:35:03 1733934903

Sorry for the confusion, yes we do! https://docs.hyperbrowser.ai/guides/connect-with-puppeteer

Happy to help with anything :)

Dingway98 · 2024-12-10T21:37:55 1733866675

Seems pretty cool for getting some data off web pages. How many browsers can I open concurrently with hyperbrowser?

ashekhawat · 2024-12-10T21:39:16 1733866756

Hey, 100 max on our paid plan but I can enable 1k concurrently if you want!

gregpr07 · 2024-12-11T04:33:55 1733891635

Creator of Browser Use here. This looks super cool! How do you spin them up so fast? Firecracker?

ashekhawat · 2024-12-11T05:00:53 1733893253

We keep "prewarmed" browsers ready for new sessions. Looking into Firecracker though!

choilive · 2024-12-11T04:25:20 1733891120

Do you guys support handling 2FA workflows/scenarios?

ashekhawat · 2024-12-11T05:00:06 1733893206

We don't have that yet, but you could definitely do it in your own scripts right now using our browser. We'll try to think of ways to make it easier though :)

shivasurya · 2024-12-11T02:55:11 1733885711

Had a use case of keeping visa appointment slots and got instantly blocked by Cloudflare :sad:

shrisukhani · 2024-12-11T03:06:50 1733886410

Sorry - Cloudflare bypass is coming this weekend!

This is also something I need btw so if you built a product on top of it, I'd be user #2 :)