The Google API models support 1M+ tokens of context, but these only support 8K. Is there a fundamental difference in architecture, training data, or something else?