That is great, but did anyone ask the programmers about it? Or did someone else see the higher FLOPs per dollar and buy in? If your job is literally programming a supercomputer and you can get AMD to fix stuff for you, maybe it ain't so bad. That is not where most software comes from, though.
SYCL is a terrible option. The performance of any application on SYCL is currently quite poor. Why drop ROCm (used on the world's largest supercomputer?) for an even more unproven product whose roadmap AMD has no control over?
> Banding together with Intel to support SYCL would in my opinion
Except that Intel has more control over SYCL and has repeatedly hurt AMD products with anticompetitive behavior in the past. Why would AMD permit their software to be controlled by a competitor?
AMD is executing with ROCm NOW, with multiple Top500 supercomputer wins and MI300X deployments at major cloud vendors. Yes, the software needs to improve, but throwing out your software to start over is not a good strategy.
AMD has far more momentum in the data center than INTC at present.
When it comes to comparison with SYCL, HIP is much closer to the spirit of SYCL than ROCm is. Both aim to let you write a single codebase that runs across multiple vendors' hardware. For now, though, the trajectory of SYCL appears much more promising to me than HIP's. HIP is already split into two parts for CPU and GPU, which is baffling, and neither part seems to receive much love from AMD.
> The performance of any application on SYCL is currently quite poor.
SYCL kernels can get pretty much equivalent performance to e.g. CUDA. Try looking at SYCL performance papers on arXiv, e.g. see [1].
That isn't to say that SYCL code is optimised on every platform without tweaking - you still need to put effort into target-specific optimisations to get the best performance, just as you would in CUDA or HIP.
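For a sense of what that looks like, here is a minimal SYCL 2020 vector-add sketch (a generic illustration, not taken from the papers cited above); the same source can be built for Nvidia, AMD, or Intel backends with a compiler such as DPC++ or AdaptiveCpp, and the kernel body reads much like a CUDA kernel:

```cpp
// Minimal SYCL 2020 vector-add sketch (illustrative only).
#include <sycl/sycl.hpp>
#include <vector>

int main() {
    constexpr size_t n = 1 << 20;
    std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n);

    sycl::queue q;  // picks a default device (a GPU if one is available)
    {
        sycl::buffer<float> A(a), B(b), C(c);
        q.submit([&](sycl::handler& h) {
            sycl::accessor ra(A, h, sycl::read_only);
            sycl::accessor rb(B, h, sycl::read_only);
            sycl::accessor wc(C, h, sycl::write_only, sycl::no_init);
            h.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
                wc[i] = ra[i] + rb[i];  // kernel body looks much like a CUDA kernel
            });
        });
    }  // buffer destruction copies the result back into c
    return 0;
}
```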
> Why drop ROCm (used on the world's largest supercomputer?)
Some of the world's largest supercomputers / HPC applications do use SYCL for AMD! The application I'm most aware of for this is GROMACS. As to why? Because having three versions of the same code using different programming APIs is a big maintenance burden.
> Some of the world's largest supercomputers / HPC applications do use SYCL for AMD! The application I'm most aware of for this is GROMACS. As to why? Because having three versions of the same code using different programming APIs is a big maintenance burden.
The fact that GROMACS is unwilling to drop CUDA support to stand fully behind SYCL is very telling.
I wouldn't expect them to drop CUDA support, even if SYCL is a viable alternative:
* The CUDA backend is mature, featureful, and significant effort has been invested into optimising it on Nvidia hardware. One does not simply throw away a performant, well validated HPC code!
* Nvidia GPUs dominate the GPGPU market - unlike AMD's.
* The SYCL backend is still very new in comparison (they even state in the docs to pay extra attention to validation), and doesn't have Nvidia-specific optimisations yet. Why prioritise reimplementing what already exists?
> AMD has the hardware but the support for HPC is non-existent outside of the joke that is BLIS and AOCL.
You are probably two years behind the state of the art. The world's largest supercomputer, OLCF's Frontier, runs AMD CPUs and GPUs. It's emphatically using ROCm, not just BLIS and AOCL. See for example: https://docs.olcf.ornl.gov/systems/frontier_user_guide.html
Agreed... the main gap is support on consumer and workstation cards, which is where Nvidia made headway, but that has started to erode very recently. ROCm works pretty well for me; I have had a lot more problems with specific packages than with the ROCm layer itself.
HIP is an API equivalent to CUDA. The kernel code is identical. The host code is the same API with a change from the CUDA to HIP namespace. This seems to be an extremely minimal form of 'porting'.
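To make that concrete, here is a hedged saxpy sketch (illustrative names and sizes only): the `__global__` kernel is the same source you would compile with nvcc, and the host calls are the CUDA ones with the cuda prefix swapped for hip:

```cpp
// Hedged sketch of the CUDA -> HIP mapping described above; not a complete program.
#include <hip/hip_runtime.h>

// Kernel code is unchanged from the CUDA version: __global__, blockIdx, threadIdx, etc.
__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

void run(int n, float a, const float* hx, float* hy) {
    float *dx, *dy;
    // Host API: cudaMalloc/cudaMemcpy/cudaFree become hipMalloc/hipMemcpy/hipFree.
    hipMalloc((void**)&dx, n * sizeof(float));
    hipMalloc((void**)&dy, n * sizeof(float));
    hipMemcpy(dx, hx, n * sizeof(float), hipMemcpyHostToDevice);
    hipMemcpy(dy, hy, n * sizeof(float), hipMemcpyHostToDevice);

    saxpy<<<(n + 255) / 256, 256>>>(n, a, dx, dy);  // same launch syntax as CUDA

    hipMemcpy(hy, dy, n * sizeof(float), hipMemcpyDeviceToHost);
    hipFree(dx);
    hipFree(dy);
}
```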
Right. I can't speak to its correctness/completeness as I've only done a quick installation and smoke test of the ROCm/HIP/MIOpen stack, but there's even a tool that automates the translation [1].
Exactly. I heard recently that the Lumi supercomputer team converted loads of scientific code from CUDA to HIP to run on MI250, and apparently it was pretty seamless, so that's a strong sign this approach works.
> In office jobs, by contrast, productivity remains rooted in notions of busyness and multi-faceted activity. The most productive knowledge workers are those who stay on top of their inboxes and somehow juggle the dozens of obligations, from the small tasks to major projects, hurled in their direction every week.
I disagree. I think this looks productive, but people who only do this will hit a terminal level and find themselves stuck.
Getting above this, into Principal/Director/Executive ranks, certainly requires being a great communicator, but also typically involves __success__. I see this as shipping a successful product, closing a key customer deal, etc.
It's easy to look at productive people and assume that being on top of email and product plans is what made them successful, but this is only an effect of their productivity and mastery of product development.
Note that John Wick did have great production values: it needed much more than 'just' Keanu. That logistical work happened in the background.
You might be misunderstanding what Cal Newport is saying here, because he agrees with you. He is saying that the people who stay on top of their inboxes and multitask are perceived as being the most productive, not that they are in fact the most productive. He also wrote an entire book that might as well be about his hatred of email, titled "A World Without Email," if you need further evidence of his position.
Cal’s writing isn’t clear on where he lands. But he’s probably around to clarify, somewhere. Let’s all send him emails, tweets, etc and see how quickly he replies.
Yeah, I’m all against gatekeeping too. If I called myself a scientist and I didn’t have a degree (yet), would I still be recognized as a scientist? That’s all I’m saying. So long as you identify as one and pursue it, you’ll become one. If you do the courses and earn it, then welcome to the fold. I am not about to call myself an astrophysicist simply because I have an affinity for space and know the classifications of stars. That’s my point.
I think there's a lot of fuzziness about all of this.
Calling yourself a surgeon without all the medical degrees and certifications is ... alarming.
Calling yourself a scientist -- while it does imply a stereotype academic type in a white coat -- to me means you have an inquiring mind about the natural and physical world, that you observe carefully, that you experimentally test things, and you are prepared to change your mind based on evidence. Anyone can be a scientist: even a 10 year old.
It impressed me that on radio station Triple J in Australia, every Thursday they'd do a phone-in show with Dr Karl Kruszelnicki who would answer science questions. He would dignify everyone with a question by calling them "Doctor" in a kind of tongue-in-cheek way, and would give a prize if they'd done an experiment to test aspects of their question, using the phrase "you've done an experiment".
He really did a lot to spread the idea that science was not some far-away esoteric thing you need a degree for, but something that was part of everyday life and that anyone could do.
Nvidia's NCCL and AMD's RCCL provide parallelism constructs that really are hidden at the framework level (such as PyTorch).
However, I don't think that you would want to hide model, data, or tensor parallelism. It's too important a consideration for performance and training convergence impact.
At least in scientific computing, I've never observed effective means of automatic parallelism expressed across many nodes despite decades of research. I'm not optimistic this will be effective anytime soon.
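For reference, this is roughly the collective that frameworks issue under the hood for data parallelism. A hedged single-process sketch using NCCL's C API (RCCL on ROCm exposes the same calls); buffer sizes and setup are illustrative:

```cpp
// Hedged sketch: the kind of gradient all-reduce a framework's data-parallel
// wrapper performs via NCCL (or RCCL on AMD, same API). Single process, all local GPUs.
#include <nccl.h>
#include <cuda_runtime.h>
#include <vector>
#include <cstdio>

int main() {
    int ndev = 0;
    cudaGetDeviceCount(&ndev);

    std::vector<ncclComm_t> comms(ndev);
    ncclCommInitAll(comms.data(), ndev, nullptr);   // one communicator per local GPU

    const size_t count = 1 << 20;                   // e.g. a flattened gradient bucket
    std::vector<float*> buf(ndev);
    std::vector<cudaStream_t> streams(ndev);
    for (int i = 0; i < ndev; ++i) {
        cudaSetDevice(i);
        cudaMalloc((void**)&buf[i], count * sizeof(float));
        cudaStreamCreate(&streams[i]);
    }

    // Sum the "gradients" across all GPUs in place.
    ncclGroupStart();
    for (int i = 0; i < ndev; ++i)
        ncclAllReduce(buf[i], buf[i], count, ncclFloat, ncclSum, comms[i], streams[i]);
    ncclGroupEnd();

    for (int i = 0; i < ndev; ++i) {
        cudaSetDevice(i);
        cudaStreamSynchronize(streams[i]);
        cudaFree(buf[i]);
        cudaStreamDestroy(streams[i]);
        ncclCommDestroy(comms[i]);
    }
    printf("all-reduce across %d GPU(s) done\n", ndev);
    return 0;
}
```

The point stands, though: the collective itself is hidden, but how you shard the model and data across those GPUs is a decision you still make explicitly.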
The world's largest supercomputer, Frontier at Oak Ridge National Laboratory, runs AMD GPUs. AMD is indisputably the #2 in the GPU space.
https://www.top500.org/lists/top500/2023/11/