I'm sorry, but even if I am maximally charitable and assume that everything you are saying is meaningful and makes sense, it still has essentially nothing to do with the original article. The original article is about imposing constraints on the weights of a neural network, during training, so that they lie on a particular manifold inside the overall weight space. The "modular" part is about being able to specify these constraints separately for individual layers or modules of a network and then compose them together into a meaningful constraint for the global network.
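To make the distinction concrete, here is a minimal sketch of the kind of construction the article is about; the specific manifold (unit-norm rows per Linear layer) and all the names are my own illustrative choices, not the article's actual algorithm:

```python
import torch

# Sketch only: after each optimizer step, project each layer's weights
# back onto a chosen manifold. The manifold here (rows of each Linear
# weight on the unit sphere) is a stand-in, not the article's choice.

@torch.no_grad()
def project_rows_to_unit_sphere(weight: torch.Tensor) -> None:
    # In-place projection: every row ends up with L2 norm 1.
    weight.div_(weight.norm(dim=1, keepdim=True).clamp_min(1e-12))

def constrain_model(model: torch.nn.Module) -> None:
    # The "modular" part: each submodule declares its own constraint,
    # and the global constraint is their composition across modules.
    for module in model.modules():
        if isinstance(module, torch.nn.Linear):
            project_rows_to_unit_sphere(module.weight)

model = torch.nn.Sequential(
    torch.nn.Linear(64, 64), torch.nn.ReLU(), torch.nn.Linear(64, 10)
)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

x, y = torch.randn(8, 64), torch.randint(0, 10, (8,))
loss = torch.nn.functional.cross_entropy(model(x), y)
loss.backward()
opt.step()
constrain_model(model)  # weights stay on the manifold throughout training
```

The point is that the constraint acts on the weights, during training, and composes layer by layer.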
You are talking about latent space during inference, not weight space during training, and you are talking about interleaving tokens with random Gaussian tokens, not constraining values to lie on a manifold within a larger space. Whether or not the thing you are describing is meaningful or useful, it is basically unrelated to the original article, and you are not using the term "modular manifold" to refer to the same thing.
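To spell out the contrast in code, here is a rough sketch of what you seem to be describing, with every name and detail an assumption on my part:

```python
import torch

# Hedged sketch of the inference-time scheme as I read it: interleave
# the input token embeddings with random Gaussian vectors in latent
# space. Note: this touches activations at inference time; nothing is
# constrained to a manifold, and no weights are involved.

def interleave_with_gaussian(token_embeds: torch.Tensor,
                             scale: float = 1.0) -> torch.Tensor:
    # token_embeds: (seq_len, d_model) sequence of embedding vectors
    seq_len, d_model = token_embeds.shape
    noise = scale * torch.randn(seq_len, d_model)  # random Gaussian "tokens"
    out = torch.empty(2 * seq_len, d_model)
    out[0::2] = token_embeds  # original tokens at even positions
    out[1::2] = noise         # Gaussian tokens at odd positions
    return out

embeds = torch.randn(5, 16)  # stand-in for real token embeddings
assert interleave_with_gaussian(embeds).shape == (10, 16)
```

Whatever this does, it is a different object acting on a different space at a different stage.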
hmm / hear you. my point wasn't that we're applying modular manifolds in the same way; it's that we're working on model reliability from two extremal ends using the same principle. there are various ways to induce modular manifolds in a model at various levels of resolution / power. we started at the outside-working-in level, so it works with any black-box model out of the box, zero knowledge needed; you don't even need to know the token dictionary to show the effect.
We're already working on pushing the construction deeper into the model, both architecture and training. currently that's for fine-tuning, and ultimately full architecture shrinkage / pruning and raw training rather than just fine-tuning.
& it was just great to see someone else using modular manifolds, even if they're using them at the training stage vs. the inference stage. they're exploiting modular form at training, we're doing it at inference. cool to see.