GH200 is nowhere near the $343,000 number. You can get a single-server order for around 45k (with inception discount). If you are buying in bulk, it goes down to sub-30k-ish. This comes with an H100's performance and an insane amount of high-bandwidth memory.
For traditional LLMs this might be true (especially large MoEs at bs=1), but I strongly disagree with the "multi-modal models" phrase, since most models that output other modalities are compute bound. That means fewer FLOPS will make the experience much worse (imagine waiting a couple of minutes for an image and hours for a video).
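To make the compute-bound vs. memory-bound point concrete, here is a rough roofline-style sketch. Everything in it is an assumption for illustration: the peak FLOP/s and bandwidth figures are round placeholders rather than any vendor's spec, the model sizes and token count are made up, and the cost model ignores attention FLOPs and activation traffic.

```python
# Rough roofline check: a step is limited by whichever takes longer,
# doing the math or streaming the weights from memory.
# All constants below are hypothetical placeholders, not real hardware specs.
PEAK_FLOPS = 1.0e15       # FLOP/s (assumed)
PEAK_BANDWIDTH = 4.0e12   # bytes/s (assumed)


def roofline_bound(flops, bytes_moved,
                   peak_flops=PEAK_FLOPS, peak_bandwidth=PEAK_BANDWIDTH):
    """Return which resource dominates the step time."""
    compute_seconds = flops / peak_flops
    memory_seconds = bytes_moved / peak_bandwidth
    return "compute-bound" if compute_seconds >= memory_seconds else "memory-bound"


def llm_decode_step(params, bytes_per_param=2):
    """One token at bs=1: ~2 FLOPs per weight, every weight read once."""
    return 2.0 * params, params * bytes_per_param


def diffusion_step(params, tokens, bytes_per_param=2):
    """All latent tokens go through together, so each weight read is reused."""
    return 2.0 * params * tokens, params * bytes_per_param


# Hypothetical sizes: a 70B-parameter LLM and a 13B-parameter video model
# denoising 50,000 latent tokens per step.
llm = roofline_bound(*llm_decode_step(params=70e9))
video = roofline_bound(*diffusion_step(params=13e9, tokens=50_000))
print("LLM decode at bs=1:", llm)
print("video diffusion step:", video)
```

Under these assumptions the LLM decode step is dominated by memory traffic while the diffusion step is dominated by arithmetic, which is why extra memory bandwidth helps the first case far more than the second.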
For anyone who wants to test the original (non-distilled) HunyuanVideo (which is an amazing model), we have a 580p version taking under a minute and a 720p version taking around 2.5-3 minutes in our playground: https://fal.ai/models/fal-ai/hunyuan-video (it requires GitHub login and is pay-per-use, but new accounts get some free credits).
Open source video models are going to beat closed source. Ecosystem and tools matter.
Midjourney has name recognition, but nobody talks about Dall-E anymore. The same will happen to Sora. Flux and Stable Diffusion won images, and Hunyuan and similar will win video.
Hunyuan, LTX-1, Mochi-1, and all the other open models from non-leading foundation model companies will eventually leapfrog Sora and Veo. Because you can program against them and run them locally or in your own cloud. You can fine tune them to do whatever you want. You can build audio reactive models, controllable models, interactive art walls, you name it.
Sora and Veo just aren't interesting. They're at one end of the quality spectrum, and open models will quickly close that gap and then some.
> Open source video models are going to beat closed source. Ecosystem and tools matter.
> Midjourney has name recognition, but nobody talks about Dall-E anymore. The same will happen to Sora. Flux and Stable Diffusion won images, and Hunyuan and similar will win video.
Neither Flux (except the distilled Flux Schnell model) nor Stable Diffusion has open-licensed weights. Stable Diffusion and Flux Dev are weights-available with limited, non-open licenses; Flux Pro is hosted-only.
Just because the OSI doesn't like Open RAIL doesn't make it not open source unless you're strictly talking about the OSD. The OSI can't even figure out where the boundaries of open models lie - data, training code, weights, etc.
The RAIL licenses do have usage restrictions (e.g. against harming minors, use in defamation, etc.), but they're completely unenforced.
> Just because the OSI doesn’t like Open RAIL doesn’t make it not open source unless you’re strictly talking about the OSD.
If you aren’t talking about the OSD, you end up reducing “open source” to a semantically-null buzzword. But, in any case, I intentionally didn’t mention “open source”. The weights are under a use-restrictive license, not an open license, even leaving out the debates over what “source” is. And that’s just SD1.x, SD2.x, and SDXL, which have the CreativeML OpenRAIL-M license (SD1.x) or CreativeML OpenRAIL++M licenses (SD2.x/SDXL). SD3.x has a far more restrictive license, as does Flux Dev.
> Flux Schnell is Apache.
Huh. It’s almost like I should have explicitly excepted Flux Schnell from the other Stable Diffusion and Flux models when I said they didn’t have open licenses.
Oh, I did.
> LTX-1 is Apache.
Yes, it is. LTX-1 is “neither Flux (except the distilled Flux Schnell model) nor Stable Diffusion”. AuraFlow (an image model) is also Apache. While it’s behind Flux (Dev or Schnell) and SDXL in current mindshare, it got picked, largely for licensing reasons, as the basis for the next version of Pony Diffusion, a community model series popular largely (though not exclusively) for its NSFW capabilities, whose previous versions were based on SD1.5 and SDXL. That gives it a good chance of becoming a major player.
Statements that begin like this are nearly always rhetorical attempts to subvert the standard usage of the terminology.
> but they're completely unenforced
Utterly irrelevant from a legal perspective. Also entirely circumstantial in that it depends entirely on the license holder and can easily vary between end users.
I'm also rather confused how RAIL entered into this to begin with. Unless I've missed something significant, most variants (or at least high end variants) of Stable Diffusion [0] and Flux [1] are under non-commercial licenses.
Not that I take issue with that. I've no delusion that a company is going to spend hundreds of thousands of dollars on compute and then open the floor to competitors who literally clone their data.
Easily: Gimp, and Krita for painting (you can buy the latter on Steam, if you want to support open source).
Photoshop is a well-rounded and mature product, but since I don't do any print, I can do everything with Gimp (perhaps you can do print too, no experience here).
Creative Cloud, or whatever it is called today, is a non-starter for me. Also, I can integrate Gimp into image pipelines more easily. I also use Blender for modelling.
Maybe I am not entirely up to date, but today you can use these tools to make things that were just not possible a few years ago, at a quality that is competitive with high-quality media products.
For me it is a hobby, and in a professional environment I get the advantage of using the same tools, which fit into long and complicated pipelines. But if you just want to create high-quality art, the tooling is readily available.
It’s not comparable because GIMP has never had the effort put into it to compete with Photoshop’s most basic features. 15-20 years ago they were arguing that adjustment layers were not needed, and they only managed to ship some form of them this year.
Blender vs commercial 3D software is a better example.
Hunyuan at other providers like fal.ai is cheaper than Sora for the same resolution (at 720p and 5 seconds, $20 gets you ~15 videos on Sora vs almost 50 videos at fal). It is slower than Sora (~3 minutes for a 720p video) but faster than Replicate's Hunyuan (by 6-7x for the same settings).
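For anyone who wants the per-video numbers spelled out, here is the arithmetic as a small sketch. It only uses the approximate figures quoted above (about 15 vs. almost 50 videos for $20, about 3 minutes at fal, 6-7x slower on Replicate); the variable names are mine, and real pricing may differ.

```python
# Per-video cost from the rough figures quoted in the comment above.
BUDGET_USD = 20.0

# Approximate number of 720p, 5-second videos that budget buys.
videos_per_budget = {"sora": 15, "fal_hunyuan": 50}

cost_per_video = {
    name: BUDGET_USD / count for name, count in videos_per_budget.items()
}
price_ratio = cost_per_video["sora"] / cost_per_video["fal_hunyuan"]

# Latency: ~3 minutes at fal, with Replicate quoted as 6-7x slower.
FAL_MINUTES = 3.0
replicate_minutes = (FAL_MINUTES * 6, FAL_MINUTES * 7)

for name, cost in cost_per_video.items():
    print(f"{name}: ${cost:.2f} per video")
print(f"price ratio: {price_ratio:.1f}x")
print(f"implied Replicate time: "
      f"{replicate_minutes[0]:.0f}-{replicate_minutes[1]:.0f} minutes")
```

That works out to roughly $1.33 vs. $0.40 per video, and an implied 18-21 minutes per 720p video on Replicate if the 6-7x figure holds.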
I know this is more of a throwaway cynical quip, but this is a biased line of thinking. CEOs are, for obvious reasons, more likely to do things they wouldn't be held liable for than things they would likely be punished for. So executives might, for example, get away with things by successfully skirting the line of legality.
Say an AI CEO blatantly crosses this line, now who is liable?
It is already used widely across industries where one would think people should be more conservative (healthcare transcription services come to mind, but it is hardly the only example of this). As always in America, only lawsuits will show us how the dust has settled.
It particularly excels at text and scene composition, as well as being able to generate vector graphics. You can use it through their website or through fal.ai https://fal.ai/models/fal-ai/recraft-v3/playground.