But if the rest of your model is frozen, the head will never see actual words, just contextual vectors from the LM.
It feels like we are in strong agreement but using slightly different terms or something.
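To make the point concrete, here is a rough sketch of what I mean. Everything in it is made up for illustration: `frozen_lm` is a toy stand-in for the pretrained LM (it just averages each token's embedding with its neighbours'), and `head` is a plain linear layer. It is not how any real model computes its states, only a picture of what the head receives.

```python
import random

random.seed(0)
VOCAB = {"the": 0, "bank": 1, "river": 2, "loan": 3}  # hypothetical toy vocabulary
DIM = 4

# Frozen toy embedding table: stands in for the pretrained LM's weights.
EMBED = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in VOCAB]


def frozen_lm(tokens):
    """Return one contextual vector per token.

    Each vector is the mean of the token's own embedding and its
    immediate neighbours', so it depends on the surrounding words.
    """
    ids = [VOCAB[t] for t in tokens]
    vectors = []
    for i in range(len(ids)):
        window = ids[max(0, i - 1): i + 2]
        vectors.append(
            [sum(EMBED[j][d] for j in window) / len(window) for d in range(DIM)]
        )
    return vectors


def head(vector, weights, bias=0.0):
    """Trainable linear head: its input is a vector, never a token id."""
    return sum(w * x for w, x in zip(weights, vector)) + bias


head_weights = [0.1] * DIM  # the only parameters that would be trained

# Same word, two contexts: the head is handed two different vectors.
a = frozen_lm(["the", "river", "bank"])[2]
b = frozen_lm(["the", "loan", "bank"])[2]
print(a != b)
print(head(a, head_weights), head(b, head_weights))
```

The head's input for "bank" differs between the two sentences, so whatever it learns, it learns over the LM's contextual representations and not over word identities.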