Hi, this was a post I worked on to better understand how tokenizers are used in the latest deep learning models, such as BERT, and what impact they have on how those models learn. It was a really interesting and difficult topic to research, so I would love any thoughts or feedback on it.
Really interesting article. What's fascinating is the way embeddings, like word2vec's word embeddings, can be used for many NLP tasks such as translation, but can also be applied to image tasks like this.
For image tasks I read that adding more layers to the model helps it find more complicated features of the images, e.g. the first layer identifies edges, the next layer finer outlines, and so on. Would adding more layers to this model mean you need fewer epochs to make it accurate, or would there be a trade-off like that?
If there is more information in the embedding than the baseline Generator is extracting, then adding Convolutions between the Transpose Convolution blocks will certainly add more parameters the model could use to learn it.
Since the question is about training speed, though: in general, larger models require more data, which in turn requires more time to train. So you'd actually increase both your data requirements and the time required to train.
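To make that concrete, here is a minimal sketch of the idea, assuming a DCGAN-style Generator in PyTorch. The layer sizes, embedding dimension, and the `extra_convs` flag are all placeholder assumptions for illustration, not code from the post:

```python
import torch
import torch.nn as nn

# Minimal sketch: a DCGAN-style generator where an extra Conv2d can be
# inserted between the ConvTranspose2d (upsampling) blocks.
# All channel sizes here are placeholder assumptions, not the post's code.
class Generator(nn.Module):
    def __init__(self, embed_dim=128, extra_convs=False):
        super().__init__()

        def up_block(in_ch, out_ch):
            layers = [
                nn.ConvTranspose2d(in_ch, out_ch, 4, stride=2, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            ]
            if extra_convs:
                # Extra 3x3 convolution adds parameters and depth
                # without changing the spatial resolution.
                layers += [
                    nn.Conv2d(out_ch, out_ch, 3, padding=1),
                    nn.BatchNorm2d(out_ch),
                    nn.ReLU(inplace=True),
                ]
            return layers

        self.net = nn.Sequential(
            # Project the embedding (N, embed_dim, 1, 1) to a 4x4 feature map.
            nn.ConvTranspose2d(embed_dim, 256, 4, stride=1, padding=0),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            *up_block(256, 128),   # 4x4  -> 8x8
            *up_block(128, 64),    # 8x8  -> 16x16
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),  # 16x16 -> 32x32
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)

# The deeper variant has more parameters, which is the trade-off above:
# more capacity to use the embedding, but more data and training time.
small = Generator(extra_convs=False)
deep = Generator(extra_convs=True)
print(sum(p.numel() for p in small.parameters()))
print(sum(p.numel() for p in deep.parameters()))
print(deep(torch.randn(2, 128, 1, 1)).shape)  # torch.Size([2, 3, 32, 32])
```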
Hi, my name is Cathal and I am the author of this post. Sentence embedding is still a relatively new technology, and I wanted to explore it a bit to try to understand how it could be used in a typical business scenario. I still think there is a lot more to discover about sentence embeddings and other ways they can be used, so I am happy to answer any questions about the post or sentence embeddings in general.