Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Any body have any good readings they read and liked to kind of understand what is going on with how this works?



hey, creator of spreadsheets-are-all-you-need.ai here. Thanks for mentioning!

I now have a web version of GPT2 implemented in pure JavaScript for web developers at https://spreadsheets-are-all-you-need.ai/gpt2/.

The best part is that you can debug and step through it in the browser dev tools: https://youtube.com/watch?v=cXKJJEzIGy4 (100 second demo). Every single step is is in plain vanilla client side JavaScript (even the matrix multiplications). You don't need python, etc. Heck, you don't even have to leave your browser.

I recently did an updated version of my talk with it for JavaScript developers here: https://youtube.com/watch?v=siGKUyTk9M0 (52 min). That should give you a basic grounding on what's happening inside a Transformer.


This is a faithful reproduction of the original Transformer paper [1]. Except, these days we use trainable parameters for positional embedding. The paper used a static calculation for positional embedding using sine and cosine.

Figure 1 in the paper can be seen implemented in the forward() method of the GPT class in model.py. Here are the rough steps:

1. Tokens are embedded using a nn.Embedding layer. 2. Tokens are positionally embedded using a nn.Embedding layer. 3. The two embedding values are added to make the input x. 4. A sequence of N number of transformer blocks are then executed. This is the grey box in the left of the Figure 1. This is where all the magic happens. Chiefly in the self attention calculation. You can see this in the forward() method of the CausalSelfAttention class. 5. A regular nn.Linear layer is executed. 6. Finally the output token probabilities are calculated using F.cross_entropy (shown as softmax in the figure).

I hope this helps a little. Please feel free to suggest improvements and additions.

[1] https://arxiv.org/pdf/1706.03762




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: