Ologn's comments | Hacker News

It sure wasn't when AlexNet won the ImageNet challenge 13 years ago.

https://news.ycombinator.com/item?id=4611830


Wow, look at the crowd of NN doubters in the comments there. Judging by the state of this thread, the quality of foresight in the commentariat hasn't improved, either.

The ten most valuable S&P 500 companies are, in order of market cap:

Nvidia, Microsoft, Apple, Alphabet/Google, Amazon, Meta/Facebook, Broadcom, Tesla, Berkshire Hathaway, and Walmart.

Something most of them (and, to a lesser extent, all of them) have in common is that they write software.

If a company not on the list like Ford has an F-150 truck come off the assembly line, some of that $40,000 cost is in the capital expenditure for the plant, any automation it has, the software in the car and so on. But Ford has to pay for the aluminum, steel and glass for each truck. It has to pay for thousands of workers on the assembly line to attach and assemble parts for each truck.

Meanwhile, at Apple a team writes iOS 18, mostly based on iOS 17, and it ships with the devices. Once it is written, that's essentially it for what ships on the iPhone 16; there may be some additional tweaks up until iOS 18.6. The relatively small team working on iOS sees its work go out with tens of millions of units. Their work is not as tied to the process of production as the assembly line workers attaching and assembling parts for the F-150 truck. If some inessential feature is not done as a phone is being made, it gets punted to the next release. This can't be done with an F-150 truck.

Software properly done is just much more profitable than non-software work. We can see this here. Yes, some of the latest boost is due to AI hype (which may or may not come to fruition in the near future), but these companies got to this position before all of that.

I was watching a speech by Gabe Newell talking about the (smaller) software industry of the 1990s, and the idea back then of outsourcing to try to save on salary costs. He said he and his partners went the other way and decided to look for the most expensive and best programmers they could find, and Valve has had great success with that.

Over the past 2 1/2 years we've seen a lot of outsourcing to cheaper foreign labor, FAANG layoffs (including Microsoft's recent Xbox layoffs), and more recently attempts to lower costs by having software produced by less experienced vibe coders using "AI". I have seen for myself at Fortune 100 companies, especially non-tech ones, that the lessons of the late 1960s NATO software engineering conferences, and the lessons learned by Fred Brooks while managing the OS/360 project in the 1960s, haven't been learned. Software can be a very, very profitable enterprise, and it is sometimes done right, but companies are still often running projects the same way they were attempted in the early 1960s. Even attempts to fix things, like agile and scrum, get twisted into window dressing for doing things in the old-fashioned corporate way.


> I've been reading this website for probably 15 years, its never been this bad.

People here were pretty skeptical about AlexNet, when it won the ImageNet challenge 13 years ago.

https://news.ycombinator.com/item?id=4611830


Ouch, that thread makes me quite sad about the state of discourse on HN today. It's a lot better than this thread.


That thread was skeptical, but it's still far more substantive than what you find here today.


I think that's because the announcement there actually told you something technically interesting. This just presents a result (which is cool), but the actual method is what is really cool!


Cisco stock (which I thought about buying in 1992 and didn't, unfortunately) doubled in 1990, tripled in 1991, doubled in 1992, and kept going up every year - in 1995 it doubled, in 1998 it doubled, in 1999 it doubled. So it had a long run (and is also still worth over $250 billion).
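
Just compounding the moves listed above gives a sense of the run (a rough sketch - the unlisted years also went up, so this understates the full multiple):

    # Rough compounding of only the moves mentioned above; the other years
    # also gained, so the true multiple over the decade was even larger.
    listed_moves = [2, 3, 2, 2, 2, 2]  # 1990, 1991, 1992, 1995, 1998, 1999
    multiple = 1
    for move in listed_moves:
        multiple *= move
    print(multiple)  # 96, i.e. roughly a 96x multiple from just those six years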

The monetary push is very LLM-based. One thing being pushed that I am familiar with is LLM-assisted programming. LLMs are being pushed to do other things as well. If LLMs don't improve more, or if companies don't see the monetary benefits of using them in the short/medium term, that would drag Nvidia down.

Nvidia has a lot of network effects. Probably only Google has some immunity to that (with its TPUs). I doubt Nvidia will have competition in training LLMs for a while. It is possible a competitor could start taking market share on the low end for inference, but even that would take a while. People have been talking about AMD competition for over two years, and I haven't seen anything that even seems like it might have potential yet, especially on the high end.


There's a lot of push for inference hardware now (e.g. Ironwood TPUs). How does Nvidia maintain an edge there?

Also, I think the market has to expand beyond LLMs to areas like robotics and self driving cars (and they need to have real success) for Nvidia to maintain this valuation. I don't think only LLMs are enough because I don't see code assist/image generation/chatbots as a massive market.


That's because Nvidia is offering a full ecosystem stack with HW, SW and networking clusters.

And that gives customers the most flexibility. Nvidia dominates training and is highly competitive in inferencing. At the same time, SW improvements keep speeding up single-node and networking performance. The H100, released 3 years ago, is today several times faster than it was at release thanks to constant SW updates.

Customers who buy Nvidia for training today can use the older GPUs from Nvidia for inferencing later. And Nvidia still supports even the V100 with SW updates and speed improvements. And since everything is based on the same SW ecosystem, operations are more seamless for customers. You can mix different Nvidia GPU clusters, but you can't easily mix Nvidia solutions with other vendors' solutions.

That is also why Nvidia has always been dominant: flexibility. NVFP4 is a good example of what they do to stay ahead. And it is even supported by Hopper, so any existing customer can use Nvidia's new format to further improve model training performance. Suddenly old Hopper clusters become more valuable with some SW releases by Nvidia.

Nvidia has a track record which no competitor can match. Going with Nvidia is no mistake today, while going with any competitor is a risky bet. If you are spending billions, you think twice about making bets.


Nvidia's trailing P/E ratio is 53 (stock hitting a new high today). Its forward P/E ratio is 38.

A year ago, both its trailing and forward P/E were higher. So the stock is, relatively speaking, a bargain compared to what it was a year ago.

The price implies that revenues and profits are expected to continue to grow.
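
As a back-of-the-envelope illustration of what that gap implies (a rough sketch using the figures above, ignoring share-count changes and buybacks):

    # Trailing P/E 53 vs. forward P/E 38 at the same share price implies the
    # market expects earnings per share to grow by roughly the ratio of the two.
    trailing_pe = 53   # price / last 12 months' EPS
    forward_pe = 38    # price / expected next 12 months' EPS
    implied_eps_growth = trailing_pe / forward_pe - 1
    print(f"{implied_eps_growth:.0%}")  # ~39% expected EPS growth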

> My intuition is that the absence of the rapid, generationally transformative, advances in tech and industry that were largely seen in the latter half of the 20th-century (quickly followed with smartphones and social networking), stock market investors seem content to force similar patterns onto any marginally plausible narrative that can provide the same aesthetics of growth

I wouldn't disagree with this.


Thanks for the layman's explanation of the logic involved; that was precisely what I was confused about.


Yes. In 2021, Nvidia was actually making more revenue from its home/consumer/gaming chips than from its data center chips. Now 90% of its revenue is from its data center hardware, and less than 10% is from home GPUs. The home GPUs are an afterthought to them; they take up resources that could be devoted to the data center business.

Also, there is some fear that 5090s could cannibalize the data center hardware in some respects - my desktop has a 3060 and I have trained models locally, run LLMs locally, etc. It doesn't make business sense at this time for Nvidia to fully meet consumer demand.


The book Androids by Chet Haase talks about how the early Android team had a lot of ex-Palm people on it.


Most human ten-year-olds in school can add two large numbers together. If a connectionist network is supposed to model the human brain, it should be able to do that. Maybe LLMs can do a lot of things, but if they can't do that, then they're an incomplete model of the human brain.


No LLM or other modern AI architecture I'm aware of is supposed to model the human brain. Even if they were, LLMs can add large numbers with the level of skill I'd expect from a 10-year-old:

----

What's 494547645908151+7640745309351279642?

ChatGPT said: The sum of 494,547,645,908,151 and 7,640,745,309,351,279,642 is:

7,641,239,857,997,187,793

----

(7,641,239,856,997,187,793 is the correct answer)


I tried it on gpt-4-turbo and it seems to give the right answer:

> Let's calculate: 494,547,645,908,151 + 7,640,745,309,351,279,642 = 7,641,239,856,997,187,793
> Answer: 7,641,239,856,997,187,793


If I were to guess, most (adult) humans could not add two 3-digit numbers together with 100% accuracy. Maybe 99%? Computers can already do it with 100% accuracy, so we should probably be trying to figure out how to use language to extract the numbers from stuff and send them off to computers to do the calculations, especially since in the real world most arithmetic that matters isn't just two-digit addition.
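
A minimal sketch of that approach (a hypothetical helper, not something from the thread): extract the operands from the text and hand them to the computer, whose arbitrary-precision integer arithmetic handles the example above exactly:

    import re

    def add_numbers_in(text: str) -> int:
        # Hypothetical helper: pull comma-formatted or plain integers out of the
        # text and return their exact sum. A real system would let the model
        # decide when to call something like this.
        numbers = [int(tok.replace(",", "")) for tok in re.findall(r"\d[\d,]*", text)]
        return sum(numbers)

    print(add_numbers_in("What's 494547645908151+7640745309351279642?"))
    # 7641239856997187793 - matches the correct answer quoted above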


Artificial neural nets are pretty far from brains. We don’t use them because they are like brains, we use them because they can approximate arbitrary functions given sufficient data. In other words, they work.

For what it’s worth, people are also pretty bad at math compared to calculators. We are slow and error prone. That’s ok.

What I was (poorly) trying to say is that I don't care whether the neural net solves the problem itself, as long as it can outsource it to a calculator. People do the same thing. What is important is reliably accomplishing the goal.


Most human ten-year-olds can add two large numbers together with the aid of a scratchpad and a pen. You need tools other than a one-dimensional vector of text to do some of these things.


AI apologists need to decide whether they are claiming LLMs are almost-AGI, or not.

This backlash of pointing out LLM failures is a reaction to the overblown hype. We don't expect a statistical-language-processing-gadget to do math well, but then people need to stop claiming they're something other than statistical-language-processing-gadgets.


People have different opinions about this, but I think one problem is that there are different questions being asked.

One is - Google, Facebook, OpenAI, Anthropic, Deepseek, etc. have put a lot of capital expenditure into training frontier large language models, and are continuing to do so. There is a current bet that growing the size of LLMs, with more or maybe even synthetic data, and with some minor breakthroughs (nothing as big as the AlexNet deep learning breakthrough, or transformers), will have a payoff for at least the leading frontier model. Similar to Moore's law for ICs, the bet is that more data and more parameters will yield a more powerful LLM - without that much more innovation needed. So the question here is whether the capital expenditure for this bet will pay off.

Then there's the question of how useful current LLMs are, whether we expect to see breakthroughs at the level of AlexNet or transformers in the coming decades, and whether non-LLM neural networks will become useful - text-to-image, image-to-text, text-to-video, video-to-text, image-to-video, text-to-audio and so on.

So there's the business-side question of whether the bet - spending a lot of capital expenditure training a frontier model, with the method being more data, perhaps synthetic data, and more parameters, without much major innovation expected - will be worth it for the winner in the next few years. Then there's every other question around this. All of the questions may seem important, but the first one is the one that matters to business, and it is connected to a lot of the capital spending being done on all of this.


If I look at Nvidia stock since mid-June of last year, or the IYW index (Apple, Microsoft, Facebook, Google) - NVDA is down 10% and IYW is down maybe 2-3%. It doesn't feel like I'm in the middle of a huge bubble like, say, the beginning of 2000.

