
ML Noob here. Went through the tutorial[0] and the readability of the API is impressive. Like composing a whole model layer by layer.

Stupid question - can this also be used for composing transformer-based LLMs?

[0]. https://keras.io/getting_started/intro_to_keras_for_engineer...



Yes, Keras can be used to build LLMs. In fact, this is one of the main use cases.

There are some tutorials about how to do it "from scratch", like this: https://keras.io/examples/nlp/neural_machine_translation_wit...

Otherwise, if you want to reuse an existing LLM (or just see how a large one would be implemented in practice), you can check out the models from KerasNLP. For instance, here is BERT, which is basically just a stack of TransformerEncoders: https://github.com/keras-team/keras-nlp/blob/master/keras_nl...
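To make the "stack of encoders" idea concrete, here is a rough sketch using only core Keras layers. It is not the KerasNLP BERT code: the helper names (encoder_block, build_encoder) and all the sizes are made up for illustration, and it leaves out positional embeddings, dropout, and masking to stay short.

```python
import keras
from keras import layers


def encoder_block(x, num_heads=2, key_dim=16, ff_dim=64):
    """One simplified transformer encoder block (hypothetical helper)."""
    # Self-attention: the sequence attends to itself (query = value = x).
    attn = layers.MultiHeadAttention(num_heads=num_heads, key_dim=key_dim)(x, x)
    # Residual connection followed by layer normalization.
    x = layers.LayerNormalization()(layers.Add()([x, attn]))
    # Position-wise feed-forward network, projected back to the input width.
    ff = layers.Dense(ff_dim, activation="relu")(x)
    ff = layers.Dense(x.shape[-1])(ff)
    return layers.LayerNormalization()(layers.Add()([x, ff]))


def build_encoder(vocab_size=1000, seq_len=16, embed_dim=32, num_layers=2):
    """Token embedding, then a stack of encoder blocks (illustrative sizes)."""
    inputs = keras.Input(shape=(seq_len,), dtype="int32")
    x = layers.Embedding(vocab_size, embed_dim)(inputs)
    # The "stack" is literally a loop that chains blocks together.
    for _ in range(num_layers):
        x = encoder_block(x)
    # Per-token logits over the vocabulary.
    outputs = layers.Dense(vocab_size)(x)
    return keras.Model(inputs, outputs)


model = build_encoder()
model.summary()
```

The same layer-by-layer composition from the intro tutorial carries over: each block is just a few layer calls, and depth is a matter of how many times you loop.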



