Given that a computer should be able to simulate at least some applicable aspects and processes of reality billions of times faster than our own universe runs: yes, I think it is entirely reasonable to have these agents follow at least some kind of from-scratch evolutionary history. It might also be valuable, because it could further research into what the word "applicable" even means here: which parts of our evolutionary history matter for inductively reasoning your way toward a diamond in Minecraft? Which parts don't? How does that generalize?
If you code a reward function for each step necessary to get a diamond, you are teaching the AI how to do it. There is no other way to look at it. It's extremely unethical of Nature to claim that the agent did this without "being taught", and in my eyes it is academic malpractice for the paper to claim it did so "without human data or curricula", though this is mitigated by the fact that the authors do acknowledge the reward structure in the paper itself. At least, I believe that is the case; I am still digesting the paper, as it is quite technical.
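For concreteness, here is a minimal sketch of the kind of per-milestone reward shaping I mean. The item names and reward values are modeled on the MineRL ObtainDiamond benchmark's milestone schedule, but treat everything here as an illustrative assumption, not the paper's actual code: the point is only that the entire diamond tech tree is spelled out in the reward signal.

```python
# Minimal sketch of milestone-based reward shaping (assumed values,
# loosely following the MineRL ObtainDiamond schedule, not the paper's code).
# Each entry is a step on the path to a diamond, paid out once.
MILESTONES = [
    ("log", 1.0),
    ("planks", 2.0),
    ("stick", 4.0),
    ("crafting_table", 4.0),
    ("wooden_pickaxe", 8.0),
    ("cobblestone", 16.0),
    ("stone_pickaxe", 32.0),
    ("furnace", 32.0),
    ("iron_ore", 64.0),
    ("iron_ingot", 128.0),
    ("iron_pickaxe", 256.0),
    ("diamond", 1024.0),
]

def milestone_reward(inventory: dict, already_rewarded: set) -> float:
    """Pay a one-time bonus the first time each tech-tree item appears
    in the agent's inventory. Encoding the tech tree in the reward this
    way is the 'teaching' being argued about above."""
    reward = 0.0
    for item, value in MILESTONES:
        if inventory.get(item, 0) > 0 and item not in already_rewarded:
            reward += value
            already_rewarded.add(item)
    return reward

rewarded = set()
print(milestone_reward({"log": 1}, rewarded))  # 1.0 on the first log
print(milestone_reward({"log": 3}, rewarded))  # 0.0 afterwards
```

Note what this structure does: the agent never has to discover that logs lead to planks lead to pickaxes lead to diamonds, because the increasing payouts along that exact sequence are baked into the environment by a human who already knows the answer. That is the sense in which calling it "without curricula" is doing a lot of work.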
This isn't an LLM, I'm aware of that, but I am at the point where, if I could bet on the following statement being true, I'd go in at five figures: every major AI benchmark result, advancement, or similar accomplishment of the past two years can be almost entirely explained by polluted training data. These systems are not nearly as autonomously intelligent as anyone making money on them claims.