First, someone has to develop those models, and that's currently being done with VC backing. Second, running those models is still not profitable, even if you self-host (which has to be true, since every model is ultimately self-hosted by someone).
Burning VC money isn't a long-term business model, and unless your business is somehow both profitable on Llama 8B (or some similarly small model) _and_ your secret sauce can't be easily duplicated, you're in for a rough ride.
The only moat separating AI startups at this point is access to the best models, and that depends on being able to run unprofitable models on someone else's money.
Investing in a startup that's basically just a clever prompt is gambling on first-mover advantage, because that's the only advantage it can have.
Neural networks are highly parallelizable. If I scale up my AI service to handle double the number of users by buying twice as many GPUs, it is theoretically possible to also serve each user in half the time.
To do so, you need to split the matrix multiplies across the new machines. You also need more inter-machine network bandwidth, but with GPT-3 that works out to roughly 48 kilobytes per predicted token, gathered from every processing node and broadcast back to every processing node. Even if Bard is 100x as big, that is still very doable with datacenter-scale networking.
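A rough sketch of where that figure could come from (my assumption: 48 KB is GPT-3's hidden state of 12,288 floats in fp32), plus a toy illustration of splitting one layer's matrix multiply column-wise across two "devices":

```python
import numpy as np

# Back-of-the-envelope: bytes exchanged per predicted token, assuming the
# 48 KB figure is GPT-3's hidden state (d_model = 12288) stored in fp32.
D_MODEL_GPT3 = 12288
BYTES_PER_FP32 = 4
print(D_MODEL_GPT3 * BYTES_PER_FP32 / 1024, "KiB per token")  # -> 48.0

# Toy tensor parallelism with a shrunken hidden size so the demo runs fast:
# split a layer's weight matrix column-wise across two "devices", multiply
# each shard separately, then gather the halves. Each device only has to
# ship its slice of the output to the others.
d_model = 1024
rng = np.random.default_rng(0)
x = rng.standard_normal((1, d_model), dtype=np.float32)           # one token's activations
W = rng.standard_normal((d_model, 4 * d_model), dtype=np.float32)  # one feed-forward weight

W_dev0, W_dev1 = np.hsplit(W, 2)   # each device holds half the columns
y_dev0 = x @ W_dev0                # computed on device 0
y_dev1 = x @ W_dev1                # computed on device 1, ideally in parallel
y = np.concatenate([y_dev0, y_dev1], axis=1)   # gather the shards

assert np.allclose(y, x @ W, atol=1e-3)        # same result as the unsplit multiply
```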
However, OpenAI doesn't seem to have done this; I suspect an individual request is simply routed to one of n machine clusters. As they scale up, they just increase n, which gives no latency benefit for individual requests.
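A toy comparison of the two scaling strategies, with made-up numbers purely to show the throughput-vs-latency difference:

```python
# Hypothetical numbers: one cluster serves a request in 10 seconds.
# Adding replicas multiplies throughput but leaves per-request latency alone;
# splitting each request's work across n machines can, ideally, cut latency by n.
base_latency_s = 10.0

for n in (1, 2, 4, 8):
    replica_latency = base_latency_s       # request still runs on a single cluster
    replica_throughput = n                 # n clusters -> n requests in flight
    parallel_latency = base_latency_s / n  # ideal model-parallel speedup
    print(f"n={n}: replicas -> {replica_latency:.1f}s/request, "
          f"{replica_throughput}x throughput; "
          f"model-parallel -> {parallel_latency:.2f}s/request")
```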