"The email including them got lost to Meta's two-year auto-delete policy by the ...

pengaru · on Feb 3, 2023

> If it's any consolation, it sounds like the list is at least three years old by now.

In my experience when it comes to learning technical subjects from a position of relative total ignorance, it's the older resources that are the easiest to bootstrap knowledge from. Then you basically work your way forward through the newer texts, like an accelerated replay of a domain's progress.

I think it's kind of obvious that this would be the case when you think about it. History textbooks can't keep growing in size to give all past events equal treatment, and neither can technical references as a domain matures.

You're forced to toss out stuff deemed least relevant to today, and in technical domains that's often stuff you've just started assuming as understood by the reader... where early editions of a new space would have prioritized getting the reader up to speed in something totally novel to the world.

chfritz · on Feb 3, 2023

"considering that 2016 is generally regarded as the date of the deep learning revolution" --

I thought it was 2012, when AlexNet took the ImageNet crown?

sillysaurusx · on Feb 3, 2023

That's probably fair. But you'd be hard-pressed to find a DL stack to try out your ideas with prior to 2016, since that's when Tensorflow launched. :)

(Gosh, it's been less than a decade. Time sometimes doesn't fly, considering how much it's changed the world since then...)

abrichr · on Feb 3, 2023

Theano was first released in 2007.

sillysaurusx · on Feb 3, 2023

That’s actually fascinating. Were there many experiments done in it back in the 00’s?

I’m just trying to imagine the things you could do with it back then. 2007 had relatively fast GPUs for the time, but certainly nothing compared to today. Yet it’d certainly be enough for MNIST training, which makes me wonder what else could be done.

abrichr · on Feb 3, 2023

You can look at Yoshua Bengio's Google Scholar profile [1] and scroll down to see what they were working on around that time.

Here are some papers with many citations:

- An empirical evaluation of deep architectures on problems with many factors of variation [2]

- Extracting and composing robust features with denoising autoencoders [3]

- Scaling learning algorithms towards AI [4]

[1] https://scholar.google.com/citations?hl=en&user=kukA0LcAAAAJ...

[2] https://scholar.google.com/citations?view_op=view_citation&h...

[3] https://scholar.google.com/citations?view_op=view_citation&h...

[4] https://scholar.google.com/citations?view_op=view_citation&h...

ladberg · on Feb 3, 2023

FWIW in 2016 I was at an ML team at Apple that had been shipping production neural networks on-device for a while already. At the time everyone used an assortment of random tools (Theano, Torch, Caffe). I worked on an internal tool that originally started as a Theano fork but was closer to a modern-day Tensorflow XLA (and has since been axed in favor of Tensorflow for most teams).

jebarker · on Feb 4, 2023

Yep, I worked on a production DL system based on Theano in ~2014

vtantia · on Feb 6, 2023

Whoops, Carmack referenced the thread and tagged Ilya in it, a veiled request to publish the list - https://twitter.com/ID_AA_Carmack/status/1622673143469858816

mellosouls · on Feb 4, 2023

Sorry - where is that sourced from? Or do you mean it was a personal communication to you? Or is it a joke?

sillysaurusx · on Feb 4, 2023

He told me.

mellosouls · on Feb 5, 2023

Thank you. I missed the clue when I checked your Twitter profile before asking. Made sense on the revisit. :)

webmaven · on Feb 9, 2023

Which "he", Carmack or Sutskever?