My understanding is that LLaMa's architecture is open, so the most difficult part is:
1. Getting data of equal or better quality
2. Securing the funding/hardware required for training
3. Learning/figuring out the training challenges needed to tune the process (the PhD part)
It seems #1 is the lowest-hanging fruit and a prerequisite for the other two, and that's what the project is (rightfully) tackling at this stage.
#2 could be solved in many ways and doesn't require much innovation if the project and the team are solid.
That takes me to #3, which seems to be the make-or-break part of the project.
I'm not one to doubt the technical prowess of the RedPajama team and its contributors; I see it as an economic problem. How can an open-source AI project compete with big tech in attracting the brilliant minds of our generation? It's enough to look at levels.fyi to see the battle is not ... level.
There's a serious economic challenge here for any sort of sustainable open-source initiative in AI.