Probably, but "two Python books" mentioned in this thread are even more suited t...

hedgehog0 · on Feb 16, 2023

Thank you. Maybe it's not modest, but I do not consider myself to be a "general beginner". I majored in math and CS, so I'm thinking about getting something more hard-core and serious...

barrenko · on Feb 16, 2023

Great. Have you tried just jumping into Karpathy's video series on YouTube? Maybe that plus reading some papers?

havercosine · on Feb 20, 2023

I second this. Given your background (Lisper), maybe work through the Little Learner book once it is out, plus Karpathy's video series. Follow it up by building a slightly complicated application in your favourite domain (text, images, videos, time series).

Also, a word of advice from my experience (I'm not an expert in DL either): think of the DL field as a game of Lego blocks. The ideas in this book / Karpathy's videos are the basic Lego blocks: parameterised linear functions, non-linearities, auto-grad, cross-entropy / KL divergence loss and gradient descent. Then there is an entire body of more complex blocks discovered simply by practice (alchemy!): transformer blocks, layer norm, max-pooling, etc. It is impossible to derive the second kind from first principles. The trick is to not beat yourself up about the advanced blocks too much; just play around with them and read up on them in papers. Focus on the fundamental blocks.
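
To make the "basic blocks" concrete, here is a rough sketch in plain Python, not taken from the book or the videos. It wires together a parameterised linear function, a non-linearity (sigmoid), a binary cross-entropy loss and gradient descent. The toy data and the names `linear`, `sigmoid`, `cross_entropy`, `mean_loss` and `train` are made up for illustration, and the gradient is derived by hand where an auto-grad library would normally compute it for you.

```python
import math

# Toy data, made up for illustration: label is 1 when x1 + x2 > 1, else 0.
DATA = [((0.0, 0.0), 0), ((0.2, 0.3), 0), ((0.9, 0.8), 1),
        ((1.0, 0.6), 1), ((0.1, 0.6), 0), ((0.7, 0.9), 1)]

def linear(w, b, x):
    """Parameterised linear function: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def sigmoid(z):
    """Non-linearity that squashes a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(p, y):
    """Binary cross-entropy for predicted probability p and label y."""
    eps = 1e-12  # avoids log(0)
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

def mean_loss(w, b, data):
    """Average cross-entropy over the whole dataset."""
    return sum(cross_entropy(sigmoid(linear(w, b, x)), y)
               for x, y in data) / len(data)

def train(data, lr=0.5, steps=500):
    """Plain full-batch gradient descent on w and b."""
    w, b = [0.0, 0.0], 0.0
    n = len(data)
    for _ in range(steps):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in data:
            p = sigmoid(linear(w, b, x))
            # Hand-derived: d(loss)/dz = p - y for sigmoid + cross-entropy.
            # An auto-grad system would produce this automatically.
            dz = p - y
            for i, xi in enumerate(x):
                gw[i] += dz * xi / n
            gb += dz / n
        # Gradient descent step: move parameters against the gradient.
        w = [wi - lr * gi for wi, gi in zip(w, gw)]
        b -= lr * gb
    return w, b

initial_loss = mean_loss([0.0, 0.0], 0.0, DATA)
w, b = train(DATA)
final_loss = mean_loss(w, b, DATA)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

Stack a few of these linear + non-linearity pairs and let a library handle the gradients, and you have the core of what the more elaborate blocks are built on.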