1) Retro, which is essentially attention over chunks retrieved from a large database, and fast as hell.
2) S4 layers, explicitly designed for handling long-range dependencies.
These are orthogonal approaches to memory, and both are very effective at what they do.
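
To make the contrast concrete, here is a toy sketch of the two ideas in plain Python. It is an illustration under heavy assumptions, not how either model is implemented: `retrieve_and_attend` and `ssm_scan` are made-up names, the retrieval step is a brute-force dot-product search rather than a real nearest-neighbour index, and the state space part is a scalar linear recurrence without the structured parameterization that S4 relies on.

```python
import math


def retrieve_and_attend(query, keys, values, k=2):
    """Toy retrieval attention (hypothetical helper, not Retro's actual code).

    Picks the k database keys with the highest dot product against the
    query, then returns a softmax-weighted average of their values.
    """
    scores = [sum(q * x for q, x in zip(query, key)) for key in keys]
    # Brute-force top-k; a real system would use an approximate index.
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the retrieved entries only, shifted for stability.
    m = max(scores[i] for i in top)
    weights = [math.exp(scores[i] - m) for i in top]
    total = sum(weights)
    dim = len(values[0])
    out = [
        sum(w / total * values[i][d] for w, i in zip(weights, top))
        for d in range(dim)
    ]
    return top, out


def ssm_scan(a, b, c, inputs):
    """Scalar linear state space recurrence (hypothetical helper).

    x_k = a * x_{k-1} + b * u_k,  y_k = c * x_k
    The state x carries information about all past inputs.
    """
    x = 0.0
    ys = []
    for u in inputs:
        x = a * x + b * u
        ys.append(c * x)
    return ys


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    values = [[1.0], [2.0], [3.0]]
    print(retrieve_and_attend([2.0, 1.0], keys, values, k=2))
    # An impulse at step 0 keeps echoing through the state.
    print(ssm_scan(0.5, 1.0, 1.0, [1.0, 0.0, 0.0]))
```

The first function keeps memory outside the model and looks it up on demand; the second keeps memory in a recurrent state that is updated at every step. That is the sense in which the two are orthogonal, and nothing stops you from using both.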