
> On a single multi-GPUs server, even with the highest-end A100 80GB GPU, PyTorch can only launch ChatGPT based on small models like GPT-L (774M), due to the complexity and memory fragmentation of ChatGPT. Hence, multi-GPUs parallel scaling to 4 or 8 GPUs with PyTorch's DistributedDataParallel (DDP) results in limited performance gains.

Where are these numbers coming from? An 80GB A100 GPU is certainly more than capable of hosting a 1.5B GPT. We were running 774M on rinky-dink cards back in 2019 for our inference purposes.

I don’t understand how they went from talking about 175B params across 32 cards to 774M on one card. 175B divided by 32 is 5.4B.

In fact, I’m not sure what they’re saying in general. They seem to be confusing data parallelism with model parallelism with memory fragmentation, while namedropping a bunch of training techniques.

The hard part of ChatGPT isn’t the size. It’s the training process. It took a small army of contractors rating outputs as good or bad. Once that dataset gets replicated, we can start talking about size. Hopefully LAION will deliver.



I think they are correctly referring to ChatGPT as GPT-3 + RLHF. So an 80GB A100 GPU would be required for both GPT-L and RLHF (the PyTorch version). And it looks to me from TFA that the main thing that takes a lot of space is actually the RLHF.

>I don’t understand how they went from talking about 175B params across 32 cards to 774M on one card. 175B divided by 32 is 5.4B.

They claim 774M is the size of GPT-L, which, if run in conjunction with their RLHF, would require an 80GB A100 GPU to train (using their PyTorch RLHF implementation). They additionally claim that training GPT-3 (175B params) plus RLHF would need 64 * 80 GB = 5,120 GB of memory with the PyTorch implementation of RLHF, or 32 * 80 GB = 2,560 GB if going the Colossal-AI route.
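
For a rough sanity check on those figures, here's a back-of-envelope sketch (assuming the usual ~16 bytes of model/optimizer state per parameter for mixed-precision Adam, and ignoring activations and the extra reward/reference models that RLHF adds, so real usage would be higher):

    # Rough model-state memory for mixed-precision Adam training:
    # fp16 weights (2) + fp16 grads (2) + fp32 master weights (4) + Adam m/v (4+4)
    # = ~16 bytes per parameter. Activations, RLHF's extra models, and
    # fragmentation are NOT included.
    BYTES_PER_PARAM = 16
    A100_GB = 80

    def model_state_gb(params_billion):
        # (params_billion * 1e9 params) * 16 bytes / 1e9 bytes-per-GB
        return params_billion * BYTES_PER_PARAM

    for name, b in [("GPT-L", 0.774), ("GPT-3", 175.0)]:
        gb = model_state_gb(b)
        print(f"{name}: ~{gb:,.0f} GB of model state, ~{gb / A100_GB:.1f}x 80GB A100s")

That puts GPT-3 at roughly 2,800 GB of state alone, which is at least in the same ballpark as the 2,560-5,120 GB range above.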

To be honest, these numbers do look to me like a bit of a cheesy ad for their product, but hey, they need to put food on their table too. I'm not sure the dataset would be such a huge problem; otherwise Britannica would still be ahead of Wikipedia. Given an army of volunteers willing to produce it, OpenAI's brigade of contractors has no chance.


If someone created a folding@home to crowd train an actually open ChatGPT, I'd gladly donate my spare resources to the cause.


That's unlikely to work. The memory has to be fast with low latency; even switching from on-board VRAM to system RAM slows performance by at least 10-100x. The bottleneck isn't computing power, it's I/O. Total bus bandwidth of a common small AI cluster is around 1 terabyte per second.
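
To put some rough numbers on that (the bandwidth figures below are nominal ballpark values assumed for the sketch, not benchmarks):

    # Time to move the fp16 weights of a 175B-param model (~350 GB) once
    # over different links; real training moves data many times per step.
    weights_gb = 175e9 * 2 / 1e9  # ~350 GB in fp16

    links_gb_per_s = {
        "A100 HBM2e (~2,000 GB/s)": 2000,
        "NVLink 3.0 (~600 GB/s)": 600,
        "PCIe 4.0 x16 (~32 GB/s)": 32,
        "Gigabit internet (~0.125 GB/s)": 0.125,
    }

    for link, bw in links_gb_per_s.items():
        print(f"{link}: {weights_gb / bw:,.1f} s per full weight pass")

The last line is why folding@home-style training over consumer internet connections doesn't pencil out: roughly 47 minutes just to ship the weights once.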

We really shouldn't be building an "open source" AI in the first place though, and it's going to be illegal to do so soon. The weaponization power will be made clear soon and that will justifiably spook everyone.


There's a significant number of people working hard on making certain tech illegal or at least heavily restricted. E2EE and Onion Routing come to mind. That doesn't mean we should abandon them. In fact, in many cases it's an indicator that we should keep going.

Why do you think we should avoid an open source AI?


How do you plan to have differential technological development and careful alignment research if anyone is allowed to build Skynet in their garage?

I use and generally support E2EE and onion routing. E2EE and onion routing aren't inherently existential risks to the continued existence of life on Earth.


Please stop with the flagrant "AI" fearmongering over LLMs and other current-generation ML software. Not only are they not Skynet now; I do not believe it will be possible for simple iteration on this type of ML software to create anything remotely like Skynet.


LLMs are not going to pose an existential risk to anyone. Also, making AI development less accessible to the general public will not make it any safer.

I am willing to bet all this fearmongering singularity bullshit is just being peddled by large corporations with a vested interest in keeping AI development out of reach of the general public.


These failure modes have been recognized since long before the current crop of AI developments.

You're spreading both incorrect information ("making AI development less accessible to the general public will not make it any safer") and conspiracy theories ("this fear mongering singularity bullshit is just being peddled by large corporations with a vested interest").

There is no alarm bell that tells us when we've reached the point of no return. Even if we don't end up with agentic AI and a sharp left turn, we don't want to live in a world where every organization with a few million dollars can build swarms of flying drones that flood a target area and stab to death anyone out in the open.

Some Nvidia hardware is already export controlled in the same manner as other dual use technologies. More restrictions are coming, not less.


> we don't want to live in a world where every organization with a few million dollars can build swarms of flying drones that flood a target area and stab to death anyone out in the open.

We already can? Just take a look at the maker community; there is so much information and open-source software available about building and controlling rockets, drones, etc. at home.

Even for stuff like DeepFakes, it only makes things that were possible before a lot cheaper. This certainly won't be the end of humanity.


Biohacking and minor isotope enrichment projects are par for the course in garages nowadays. Three-letter agencies don't care about me, so why should they care about ML 101 skynet adventures?


For this reason alone (corpos making AI illegal for mere mortals to maintain), we should strive to make as much progress on truly open AI as possible.

The current dystopia is fairly dystopian as it is.


>We really shouldn't be building an "open source" AI in the first place though, and it's going to be illegal to do so soon. The weaponization power will be made clear soon and that will justifiably spook everyone.

Encryption was illegal not that long ago for the same reasons. Now it's the basis of the entire digital economy. If we made it illegal again, of the top 10 tech companies by market cap, only Nvidia and TSMC would not be outright illegal to operate.

The timid cowardice that's taken over tech will not serve it well in the coming 20 years.


How do you plan to have differential technology development and thoughtful and cautious alignment research if we go building these things without a speed limit?

Giving a baby a hand grenade would be more responsible.


> How do you plan to have differential technology development and thoughtful and cautious alignment research if we go building these things without a speed limit?

We aren’t going to have those things anyway; the closest we’ll get is if development is relatively public and open and thus subject to outsider critique. The only interest the closed corporate restricted approach has in alignment is in controlling the research, suppressing unwelcome avenues of inquiry, and generating PR to assuage public fears.


Caution is for losers.


> We really shouldn't be building an "open source" AI in the first place though, and it's going to be illegal to do so soon.

How do you make that illegal while still allowing private corporations to build AI? How do you legally define AI without applying it to all kinds of existing applications and without stopping all research on AI? And while staying broad enough that simply using a slightly different technique would still qualify under that definition?


Replace "AI" with "uranium enrichment and nuclear research" and the answers fill themselves in.


Yes, and if you replace "uranium enrichment" with "teddy bears" it's a bedtime story for kids. That argument makes no sense.


Yeah... Having spent a lot of cycles replicating ML work, I can say it's much more difficult than just taking a stab at replicating a paper. It's typically doable (results really do replicate), but it can take a few good brains a year to pull it off. There are typically a lot of small decisions that add up, and a lot of hyperparameter sweeps needed to land in a good region of the optimization space.


> Once that dataset gets replicated, we can start talking about size. Hopefully LAION will deliver.

Is LAION starting a community project to rate model outputs? I didn't see anything on their site.



For reference, GPT-NeoX is a 20B parameter model, and it runs on 45 GB of VRAM. On an 80 GB A100 you could probably run a 35B parameter model. Maybe 8 A100 cards to do inference on ChatGPT?

Or 32 3090 cards, which would run you under $40k total.


20B GPT-NeoX runs on a 3090 in 8-bit mode.
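
The weights-only arithmetic behind both of those figures looks roughly like this (a sketch; activation and KV-cache overhead is why NeoX-20B actually lands around 45 GB in fp16 rather than 40):

    # VRAM needed just for the weights: parameter count x bytes per parameter.
    # Activations and KV cache come on top of this.
    def weights_only_gb(params_billion, bytes_per_param):
        return params_billion * bytes_per_param

    for params_b, prec, bpp in [(20, "fp16", 2), (20, "int8", 1),
                                (175, "fp16", 2), (175, "int8", 1)]:
        print(f"{params_b}B @ {prec}: ~{weights_only_gb(params_b, bpp):.0f} GB for weights")

So 20B in int8 is ~20 GB, which squeezes onto a 24 GB 3090, while 175B is ~350 GB in fp16 or ~175 GB in int8 before any overhead.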



