That “probability distribution” is just a mathematical function that assigns numbers to tokens. It is defined by a model that both the model's creator and the omniscient entity know, and it is computed by applying deterministic mathematical functions to a sequence of observed inputs that both of them also know.
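To make the point concrete, here is a minimal sketch of such a function. Everything in it is an assumption made up for illustration: the three-token vocabulary `VOCAB`, the fixed weight table `WEIGHTS`, and the function name `next_token_distribution` are all hypothetical, and a real language model is vastly larger. The structure is the same, though: known parameters, known inputs, deterministic arithmetic, and a list of numbers that happen to be non-negative and sum to one.

```python
import math

# Hypothetical toy "model": a fixed table of weights, fully known in advance.
VOCAB = ["a", "b", "c"]
WEIGHTS = {
    "a": [0.5, -1.0, 2.0],
    "b": [1.5, 0.0, -0.5],
    "c": [-1.0, 1.0, 0.25],
}


def next_token_distribution(context):
    """Map a sequence of observed tokens to one number per vocabulary token."""
    # Deterministic step 1: add up the weight rows of the observed tokens.
    logits = [0.0] * len(VOCAB)
    for token in context:
        for i, weight in enumerate(WEIGHTS[token]):
            logits[i] += weight

    # Deterministic step 2: softmax, which rescales the scores so that they
    # are positive and sum to one.
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return {token: e / total for token, e in zip(VOCAB, exps)}


first = next_token_distribution(["a", "b"])
second = next_token_distribution(["a", "b"])
print(first == second)  # the same model and inputs always give the same numbers
```

Nothing in this sketch is random: anyone who knows `WEIGHTS` and the context can reproduce the output exactly. Randomness would only enter if a separate sampling step drew a token from these numbers.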