
https://arxiv.org/pdf/2503.02113

This paper shows that polynomials exhibit most features of deep neural nets, including double descent and the ability to memorize the entire dataset.

It connects the dots there: the polynomials are regularized to be as simple as possible, and the author argues that the hundreds of billions of parameters in modern neural networks work as a regularizer too, attenuating decisions that are "too risky."

I really enjoyed that paper, a gem that sheds light everywhere.


This is fascinating!

If I understand correctly, they approximate the language of inputs of a function to discover minimal (in some sense, like "shortest description length") inputs that violate the relations between inputs and outputs of the function under scrutiny.


> It is straightforward to write a fast parser generator for languages that require just one character of lookahead...

Then you get VHDL.

https://news.ycombinator.com/item?id=15017974

You need (at least an approximation to) the symbol table for correct lexing.

Or MySQL/MariaDB's SQL with the DELIMITER statement that can change the semicolon to something else.


As you mentioned "improvement of an existing language," I'd like to mention that Haskell has green threads that are most probably lighter (1K initial stack) than goroutines (2K minimum stack).

Haskell also has software transactional memory, where one can implement one's own channels (they are already implemented [1]) and atomically synchronize between arbitrarily complex reading/sending patterns.

[1] https://hackage.haskell.org/package/stm-2.5.3.1/docs/Control...
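
To illustrate, here is a toy unbounded channel built directly on a TVar; this is only a sketch of the idea (the stm package's TChan is the real implementation), and all names are mine:

  import Control.Concurrent.STM

  newtype Chan' a = Chan' (TVar [a])

  newChan' :: STM (Chan' a)
  newChan' = Chan' <$> newTVar []

  -- Append at the end; O(n), fine for a sketch.
  send :: Chan' a -> a -> STM ()
  send (Chan' v) x = modifyTVar' v (++ [x])

  -- Block (retry) until a value is available.
  recv :: Chan' a -> STM a
  recv (Chan' v) = do
    xs <- readTVar v
    case xs of
      []       -> retry
      (x:rest) -> writeTVar v rest >> pure x

Composition is where STM shines: atomically ((,) <$> recv a <*> recv b) commits only once both channels have a value, taking them in one indivisible step.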

In my not so humble opinion, Go has been a library in Haskell from the very beginning.


> Symbolic AI like SAT solvers and planners is not trying to learn from data and there's no context in which it has to "scale with huge data".

Actually, they do. Conflict-Driven Clause Learning (CDCL) learns from conflicts encountered while working on the data. And the space of inputs they deal with is often on the order of the number of atoms in the Universe, which is huge.


"Learning" in CDCL is a misnomer: the learning process is Resolution and it's deductive (reasoning) not inductive (learning).


You invented a new kind of learning that somewhat contradicts the usual definition [1] [2].

[1] https://www.britannica.com/dictionary/learning
[2] https://en.wikipedia.org/wiki/Learning
"Learning" in CDCL is perfectly in line of "gaining knowledge."


I'm pretty sure most "industrial scale" SAT solvers involve both deduction and heuristics to decide which deductions to make and which to keep. At a certain scale, the heuristics have to be adaptive and then you have "induction".


I don't agree. The derivation of new clauses by Resolution is well understood as deductive, and the choice of which clauses to keep doesn't change that.

Resolution can be used inductively, and also for abduction, but that's going into the weeds a bit; it's the subject of my PhD thesis. Let me know if you're in the mood for a proper diatribe :)


Take a look at Satisfaction-Driven Clause Learning [1].

[1] https://www.cs.cmu.edu/~mheule/publications/prencode.pdf


I'd love a diatribe if you're still following this post.


As would I.

You know, this seems like yet another reason to allow HN users to direct message each other, or at least receive reply notifications. Dang, why can't we have nice things?


Oh, hi guys. Sorry just saw this.

Oh gosh I gotta do some work today, so no time to write what I wanted. Maybe watch this space? I'll try to make some time later today.


> LLMs did scale with huge data, symbolic AI did not.

Symbolic AI has not had the privilege of being applied to, or "trained" with, huge data. 30 million assertions is not a big number.


This is correct. Those 30M assertions were basically entered by hand.


"It allows AI to understand its physical environment and also to self-improve over time, without a human having to tell it exactly what to do."


In my view, the 'exactly' is crucial here. They do implicitly tell the model what to do by encoding it in the reward function:

> In Minecraft, the team used a protocol that gave Dreamer a ‘plus one’ reward every time it completed one of 12 progressive steps involved in diamond collection — including creating planks and a furnace, mining iron and forging an iron pickaxe.
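
As a toy illustration of how that protocol encodes the answer into the reward (a sketch; the milestone names are my invention, not the paper's):

  import qualified Data.Set as S

  data Milestone = Planks | Furnace | IronOre | IronPickaxe | Diamond
    deriving (Eq, Ord, Show)

  -- 'Plus one' the first time each progressive step is completed, zero after.
  reward :: S.Set Milestone -> Milestone -> (Double, S.Set Milestone)
  reward seen m
    | m `S.member` seen = (0, seen)
    | otherwise         = (1, S.insert m seen)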

This is also why I think the title of the article is slightly misleading.


It's kind of fair; humans also get rewarded for those steps when they learn Minecraft.


But they don't learn that way at all; my 7yo learns by watching YouTubers. There's a whole network of people teaching each other the game, and that's almost more fun than playing it alone.


> there is no probabilistic link between the words of a text and the gist of the content

Using an n-gram/skip-gram model over the long text, you can predict probabilities of word pairs and/or word triples (effectively collocations [1]) in the summary.

[1] https://en.wikipedia.org/wiki/Collocation

Then, by using (beam search and) an n-gram/skip-gram model of summaries, you can generate the text of a summary, guided by a preference for the word pairs/triples predicted in the first step.
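
A minimal Haskell sketch of the first step, assuming whitespace tokenization and a fixed skip window (all names are mine):

  import Data.List (tails)
  import qualified Data.Map.Strict as M

  type Counts = M.Map (String, String) Int

  -- Count ordered pairs (w1, w2) where w2 occurs within 'window' words after w1.
  skipGramPairs :: Int -> [String] -> Counts
  skipGramPairs window ws =
    M.fromListWith (+)
      [ ((w1, w2), 1) | (w1:rest) <- tails ws, w2 <- take window rest ]

The second step would then score candidate summary text by how frequent its pairs are in these counts, e.g. inside a beam search over the summary-language model.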


Because the circles there also need operations over them (union, intersection, or subtraction), it is a good example of low-complexity art [1].

[1] https://en.wikipedia.org/wiki/Low-complexity_art

My son is a big fan of bytebeat [2], which is also low-complexity art, but in music.

[2] https://dollchan.net/bytebeat/#4AAAA+kUryC/X0CixswNhQyM1Q01N...
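
For a taste, here is one classic bytebeat one-liner, t*(t>>5|t>>8), as a Haskell sketch; the samples are meant to be played back as raw unsigned 8-bit audio at 8 kHz:

  import Data.Bits (shiftR, (.|.))
  import Data.Word (Word8)

  -- fromIntegral truncates to the low byte, i.e. the usual "& 255".
  sample :: Int -> Word8
  sample t = fromIntegral (t * (t `shiftR` 5 .|. t `shiftR` 8))

  -- Thirty seconds of samples, e.g. to write to a file and play back.
  song :: [Word8]
  song = map sample [0 .. 8000 * 30 - 1]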


Big Flash.

https://www.youtube.com/watch?v=eSSDi22NVFo

As if matter existed first and then there was a creation of light at some moment. Redshift is explained by the interaction of light with the gravitational field: the more distant the source of light, the longer it travels under the influence of gravity, and the redder it becomes.

