GPT uses multi-head attention, so of course it's not as simple as pasting a few texts together, but I was still interested in finding similar texts in the training data (which is feasible here because the training data is only 1 MB).
That's a really interesting idea. Could you go into detail about how you're searching for similar texts using GPT?
It's true that the probability distribution acts as a sort of "edit distance". And GPT has already been used for text compression (https://bellard.org/nncp/gpt2tc.html), so it doesn't seem too much of a stretch to use it for similarity matching.
(Sure, perhaps there are more efficient or more effective techniques than using GPT for this, but I like the idea and am curious how it works.)
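To make the compressor-as-similarity idea concrete, here's a rough sketch using normalized compression distance, with zlib standing in for a GPT-based compressor like gpt2tc. The `ncd` function and the sample texts are my own illustration, not anything from the linked page:

```python
import zlib

def ncd(a: str, b: str) -> float:
    """Normalized compression distance: near 0 for very similar texts,
    approaching 1 for unrelated ones. Intuition: if b shares structure
    with a, compressing their concatenation costs little extra."""
    ca = len(zlib.compress(a.encode()))
    cb = len(zlib.compress(b.encode()))
    cab = len(zlib.compress((a + b).encode()))
    return (cab - min(ca, cb)) / max(ca, cb)

query = "First Citizen: Before we proceed any further, hear me speak."
near = "First Citizen: Before we proceed any further, hear me out."
far = "import zlib  # an unrelated snippet of Python source code"

# Rank candidates by distance to the query; the most similar comes first.
ranked = sorted([near, far], key=lambda t: ncd(query, t))
```

Swapping zlib for a GPT-based compressor would, in principle, give the same kind of ranking but driven by the model's learned statistics rather than by literal byte repeats.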