Converged to something similar after spending 2 days bisecting a repo to reproduce a training run, with a 3-hour wait per commit before conclusive results. I couldn't get myself to use Hydra though; it felt like a lot of bloat vs. just loading a YAML with pydantic.
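Something like this, roughly (a sketch; the field names are made up, but the pattern is the whole thing):

    import yaml
    from pydantic import BaseModel

    # Hypothetical schema; pydantic validates types and fails fast on typos.
    class TrainConfig(BaseModel):
        lr: float = 3e-4
        batch_size: int = 32
        seed: int = 0

    with open("config.yaml") as f:
        cfg = TrainConfig(**yaml.safe_load(f))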
Alex Nichol worked on "Gotta Learn Fast" in 2018, which Carmack mentions in his talk, and he also worked on foundational deep learning methods like CLIP, DDPM, GLIDE, etc. Reducing him to a "seething openai insider" seems a bit unfair.
My issue with regexes is that the formal definition of a regex I learned at university is clear and simple [0], but using them in programming languages is always a mess.
The issue is that the formal definition only deals with whether a string belongs to the language recognized by the regex (a boolean accept/reject), but regexes in practice usually ask "find the substring (if any) that matches". That causes problems because a regex is equivalent to an NFA, so a given string can be matched in multiple ways, which forces you to bring in the notion of "greedy" vs "non-greedy" matching to disambiguate. Then add on top of that the desire to define sub-matches via capturing groups, and it's a complete mess. And that's before you even get to the non-regular PCRE extensions like lookaround, backreferences, etc.
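A quick illustration in Python of how greediness disambiguates between the multiple possible matches (the NFA point), plus a backreference that no NFA can express:

    import re

    s = "<a><b>"
    # Greedy: .* consumes as much as possible, so the group spans both tags.
    print(re.search(r"<(.*)>", s).group(1))   # a><b
    # Non-greedy: .*? consumes as little as possible.
    print(re.search(r"<(.*?)>", s).group(1))  # a
    # Backreferences leave regular languages entirely: (a+)b\1 recognizes
    # a^n b a^n, which no NFA can.
    print(bool(re.fullmatch(r"(a+)b\1", "aaabaaa")))  # True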
Overall, the LLMs I've tested don't know how to use a search engine: their queries are bad and naive, probably because how to use a search engine isn't part of the training data; it's something people learn by actually using one. Maybe Google has the data to make LLMs good at using search engines, but would doing so serve their business?
> A revealing anecdote shared at one panel highlighted the cultural divide: when AI systems reproduced known mathematical results, mathematicians were excited, while AI researchers were disappointed
This seems like a caricature. One thing I've often heard in the AI community is that it would be interesting to train models with an old data cutoff (say, 1900) and see whether the model can reinvent modern science.
TBH, only one author out of four has a Google affiliation, and their personal webpage [1] says "part-time (20%) staff research scientist at Google DeepMind", so it's a stretch to call this a "Google technique". I've noticed this is common when discussing research papers: people associate a paper with the first company name they can find in the affiliations.