ZeljkoS's comments


It is interesting that the HN community discussed why it wouldn't work all the way back in 2010: https://news.ycombinator.com/item?id=1701724


Interesting to note that, according to the author (DHH), this article was removed by LinkedIn: https://www.linkedin.com/posts/david-heinemeier-hansson-374b...


We have a partial understanding of why distillation works: it is explained by the Lottery Ticket Hypothesis (https://arxiv.org/abs/1803.03635). But if I understand correctly, that doesn't mean you can train a smaller network from scratch. You need a lot of randomness in the initial large network for some neurons to have "winning" states. Then you can distill those winning subsystems to a smaller network.

Note that a similar process happens in the human brain; it is called synaptic pruning (https://en.wikipedia.org/wiki/Synaptic_pruning). Relevant quote from Wikipedia (https://en.wikipedia.org/wiki/Neuron#Connectivity): "It has been estimated that the brain of a three-year-old child has about 10^15 synapses (1 quadrillion). This number declines with age, stabilizing by adulthood. Estimates vary for an adult, ranging from 10^14 to 5x10^14 synapses (100 to 500 trillion)."
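
To make the "distill" step above a bit more concrete, here is a minimal sketch of the usual soft-target distillation loss: the student is trained to match the teacher's temperature-softened output distribution. This illustrates distillation only, not the Lottery Ticket pruning procedure, and the function names and the default temperature are my own assumptions for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over logits divided by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions.

    The T^2 factor is the conventional scaling that keeps gradient
    magnitudes comparable across temperatures. Hypothetical helper,
    not taken from any specific library.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

The loss is zero when the student reproduces the teacher's logits exactly and grows as the two distributions diverge; in a real setup it would typically be combined with the ordinary hard-label loss.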


So, can a distilled 8B model (say, DeepSeek-R1-Distill-Llama-8B or whatever) be "trained up" into a larger 16B-parameter model after distillation from a superior model, or is it forever stuck at 8B parameters that can only be fine-tuned?


So more 'mature' models might arise in the near future with fewer params and better benchmarks?


That's been happening consistently for over a year now. Small models today are better than big models from a year or two ago.


"Better", but not better than the model they were distilled from, at least that's how I understand it.


I think this is how the "child brain" works too. The better the parents and the environment are, the better the child's development is :)


Not at all. How many people were geniuses while their parents were not? I can name several, and I'm sure with a quick search you can too.


How is that relevant? A few examples do not disprove anything. It's pretty common knowledge that the more successful/rich etc. your parents were, the more likely you'll be successful/rich etc.

This does not directly prove the theory your parent comment posits, being that better circumstances during a child's development improve the development of that child's brain. That would require success being a good predictor of brain development, which I'm somewhat uncertain about.


They might also be more biased and less able to adapt to new technology. Interesting times.



Not mentioned in the article: one of the main reasons to ALWAYS do this is SEO. Regardless of whether users play the video, web crawlers will not play it, and Google will penalize your SEO ranking if you use Google's official YouTube embed :D

We implemented our own thumbnail image trick on the TestDome homepage a few years ago (https://www.testdome.com/). The thumbnail is from: https://img.youtube.com/vi/gPQQg4yZqt8/sddefault.jpg
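
For anyone who wants to try the same trick, here is a rough sketch of the idea, not the actual TestDome implementation: render only a static thumbnail, and swap in the real YouTube iframe when the visitor clicks. The thumbnail URL pattern is the one from the link above; the function names, the `yt-facade` class, and the video ID check are assumptions for illustration.

```python
import re
from html import escape

# Assumption: video IDs contain only these characters, so they are safe
# to place inside a URL and inside the inline click handler below.
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]+")

def thumbnail_url(video_id: str, quality: str = "sddefault") -> str:
    """Build the static thumbnail URL for a video."""
    if not _VIDEO_ID.fullmatch(video_id):
        raise ValueError(f"unexpected video id: {video_id!r}")
    return f"https://img.youtube.com/vi/{video_id}/{quality}.jpg"

def facade_html(video_id: str, title: str) -> str:
    """Return markup that loads the YouTube iframe only after a click."""
    thumb = thumbnail_url(video_id)  # also validates the id
    # On click, replace the button with the real embed.
    onclick = (
        "var f=document.createElement('iframe');"
        f"f.src='https://www.youtube.com/embed/{video_id}?autoplay=1';"
        "f.allow='autoplay; encrypted-media';"
        "f.allowFullscreen=true;"
        "this.replaceWith(f);"
    )
    return (
        f'<button type="button" class="yt-facade" onclick="{onclick}">'
        f'<img src="{thumb}" alt="{escape(title, quote=True)}" loading="lazy">'
        "</button>"
    )
```

Calling `facade_html("gPQQg4yZqt8", "Intro video")` gives a button wrapping the thumbnail image; nothing from the YouTube player is fetched until the click handler runs.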




Wow, to my surprise, the sound is incredibly annoying and stress-inducing.

At first I wasn't sure if it's some kind of poor-taste soundtrack, but that is only the background tune. The tap tap tap is the stainless steel device.

Edit: @hombre_fatal - Steel on steel will always clang.


It's supposed to be alarming, so you can go turn the heat down once something starts boiling hard enough to boil over. Similar to a stove-top pressure cooker, you have to wait for the sound to know where to set the heat.


I use a glass milk watcher while boiling milk for making yogurt, and the clak (not clank) sound is not that stress inducing. On the contrary, it's a heartbeat which says that your milk is safe from boiling over and tasty yogurt prospects are still strong.

I think it's a matter of getting used to it.


Do you still have to stir to keep the milk from sticking to the bottom of the pot?


If your milk has around 3.3% fat, no, you don't. I haven't tried it with milk above 5% fat yet.


Thank you!


Stress-inducing: that's the right word to describe it.



Thank you for the video. The sound would definitely make me come and stop it.


If you like this video, you will probably also like Peter Pringle's The Epic Of Gilgamesh In Sumerian: https://www.youtube.com/watch?v=QUcTsFe1PVs


Legal protection is nice, but it can be circumvented, as the Lufthansa fiasco showed: https://svedic.org/travel/screwed-by-lufthansa-german-govern...

Since then, I always try to book plane tickets with PayPal. It is a bit ironic that, as an EU citizen, I was screwed by an EU company (Lufthansa) and EU politicians (the German government), but saved by a private US company (PayPal) :D


Selling vouchers where you know some of them will go unused should be straight up illegal. It's fraud imo - taking money without actually providing a service. At the very least they should be automatically refunded after a reasonable time period.

