A vanilla U-Net is around 7-8M parameters; this one is 100M(?), so the model itself is an order of magnitude larger. There are larger models too, as pointed out in the other Hacker News thread.
The fine-tuning datasets are much smaller, but that's the point: they don't need to be large, because of the foundation model underneath.
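For intuition on why small fine-tuning sets can work, here's a minimal sketch of frozen-backbone fine-tuning in PyTorch. The `FoundationModel` and `SegmentationHead` classes are hypothetical stand-ins (the toy backbone is nowhere near 100M parameters); the point is just that freezing the pretrained weights leaves only a tiny head to fit, so little data is needed:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for whatever pretrained backbone and
# task-specific head are actually used.
class FoundationModel(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.encoder(x)

class SegmentationHead(nn.Module):
    def __init__(self, dim=256, n_classes=2):
        super().__init__()
        self.proj = nn.Conv2d(dim, n_classes, 1)

    def forward(self, feats):
        return self.proj(feats)

backbone = FoundationModel()   # pretrained (toy here; ~100M in the real thing)
head = SegmentationHead()      # tiny, task-specific

# Freeze the foundation model: only the head receives gradients, so a
# small fine-tuning dataset only has to constrain a small parameter count.
for p in backbone.parameters():
    p.requires_grad = False

trainable = sum(p.numel() for p in head.parameters())
frozen = sum(p.numel() for p in backbone.parameters())
print(f"trainable: {trainable:,}  frozen: {frozen:,}")

optimizer = torch.optim.Adam(head.parameters(), lr=1e-4)
```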