Now that's a blast from the past - T5! I always thought it was underused, but it's also from so long ago now, and even Google has moved past it to UL2 etc., AFAIK. What's the use case here for reproducing it? Aren't there already many good code models?
A lot of embedding models are built on top of T5's encoder; this offers a new option.
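To make the encoder-as-embedder point concrete, here's a minimal sketch using the `transformers` library's `T5EncoderModel`, which loads just the encoder stack. The `"t5-small"` checkpoint and mean-pooling strategy are illustrative choices, not what any particular embedding model actually uses:

```python
import torch
from transformers import AutoTokenizer, T5EncoderModel

# Load only T5's encoder (no decoder weights needed for embeddings).
tokenizer = AutoTokenizer.from_pretrained("t5-small")
encoder = T5EncoderModel.from_pretrained("t5-small")

texts = ["a blast from the past", "encoder-decoder models"]
batch = tokenizer(texts, padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**batch).last_hidden_state  # (batch, seq_len, d_model)

# Mean-pool over non-padding tokens to get one fixed-size vector per text.
mask = batch.attention_mask.unsqueeze(-1)
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # (2, d_model); d_model is 512 for t5-small
```

Real T5-based embedders (e.g. the Sentence-T5/GTR family) train the encoder further with contrastive objectives, but the extraction step looks roughly like this.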
The modularity of the enc-dec approach is useful - you can insert additional models in between (e.g. a diffusion model), use different encoders for different modalities, etc.
I've downloaded the data for every model on Huggingface and, if nothing else, flan-t5-xxl is the largest encoder (discriminative) model by total file size. To be clear, T5 is an encoder-decoder model, but the field's shift to decoder-only architectures has left encoder and encoder-decoder models, for lack of a better word, neglected in terms of training focus.