Neurons that fire together, wire together, but how? (dissociativediaries.com)
127 points by Anon84 on June 29, 2020 | 45 comments



Hebb actually talked about causation, not synchrony (firing together):

"When an axon of cell A is near enough to excite cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A ‘s efficiency, as one of the cells firing B, is increased”

Synchrony is extremely important, particularly for the formation of cortical columns and for neural pruning. But in spike-timing-dependent plasticity, where growth is potentiated if the presynaptic neuron fires just before the postsynaptic one, the connection is actually depressed if the upstream and downstream neurons fire exactly synchronously. (There is a huge amount of variation in this across the brain, though.)
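
If it helps to see the usual pair-based STDP curve as code, here's a tiny sketch (the amplitudes and time constants are made-up illustrative values; as noted, the real curves vary a lot across the brain):

  import math

  # pair-based STDP: the weight change depends on the spike-timing difference
  # dt = t_post - t_pre (positive means the presynaptic spike came first)
  A_plus, A_minus = 0.010, 0.012    # illustrative amplitudes
  tau_plus, tau_minus = 20.0, 20.0  # illustrative time constants (ms)

  def stdp_dw(dt_ms):
      if dt_ms > 0:     # pre before post -> potentiation
          return A_plus * math.exp(-dt_ms / tau_plus)
      if dt_ms < 0:     # post before pre -> depression
          return -A_minus * math.exp(dt_ms / tau_minus)
      return 0.0        # exact synchrony: no potentiation in this toy rule
                        # (some measured curves show depression here, as noted above)

  print(stdp_dw(5.0), stdp_dw(-5.0), stdp_dw(0.0))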


Note that there is also a mechanism for association between two presynaptic neurons. Probabilistically, when those upstream neurons fire synchronously, the downstream neuron will be more likely to actually fire. When that occurs, the postsynaptic neuron will, as a result of Hebb's postulate, increase its connectivity to the synchronously firing neurons. So "cells that fire together wire together" is more true of presynaptic neurons than of pre-to-postsynaptic pairs (and the wiring together occurs through the postsynaptic neuron).
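
A toy numerical version of that argument (all numbers made up, nothing biophysical): two presynaptic cells that tend to fire together make the postsynaptic cell fire (here via a cartoon coincidence rule), and a plain Hebbian update with decay then leaves their synapses stronger than the synapse from an uncorrelated cell:

  import random

  random.seed(0)
  w = [0.3, 0.3, 0.3]   # weights from 3 presynaptic cells onto one postsynaptic cell
  lr, decay = 0.05, 0.05

  for _ in range(2000):
      together = random.random() < 0.5
      x = [float(together), float(together),   # cells 0 and 1 fire together
           float(random.random() < 0.5)]       # cell 2 fires independently
      # cartoon firing rule: the postsynaptic cell needs at least two coincident inputs
      post = 1.0 if sum(x) >= 2 else 0.0
      # Hebbian update: strengthen synapses whose spikes coincided with postsynaptic
      # firing, with a slow decay so uncorrelated synapses stay weak
      w = [wi + lr * xi * post - decay * wi for wi, xi in zip(w, x)]

  print([round(wi, 2) for wi in w])   # the correlated pair ends up roughly twice as strong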


I find it strange that the author couldn't find this in a textbook. This is rather common material in a developmental neuroscience textbook or lecture. I've looked through two such books during my (only) course on the topic, and all three sources covered this material.


What did they say? Perhaps you misread what was in the textbooks? My understanding is that the author's questions are legitimate and still not fully answered.


Yep, I agree that there are legitimate questions raised which still lack answers. However, I'm responding to the claim that the information the author provides in the summary is not found in any textbook they saw:

> So there you have it, a quick summary of one part of neural connectivity I’ve yet to see described in a textbook about the brain, but which really should be given out there, along with the classic Hebbian principle


The question of how neurons find each other to connect was recently studied with experimental connectomics--altering neurons and then mapping their synaptic circuits with electron microscopy--in this paper by Javier Valdes Aleman et al. 2019 https://www.biorxiv.org/content/10.1101/697763v1 , using Drosophila's somatosensory axons and central interneurons as a model.

If Disqus worked on the OP's website (I can never get the "post" button for comments to appear after login), the above could have gone straight into the page.


The connectedness of neurons in neural nets is usually fixed from the start (i.e. between layers, or somewhat more complicated in the case of CNNs etc.). If we could eliminate this and let neurons "grow" towards each other (like this article shows), would that enable smaller networks with similar accuracy? There's some ongoing research to prune weights by finding "subnets" [1], but I haven't found any method yet where the network grows connections itself. The only counterpoint I can come up with is that it probably wouldn't generate a significant speedup, because it defeats the use of SIMD/matrix operations on GPUs. Maybe we would need chips that are designed differently to speed up these self-growing networks?

I'm not an expert on this subject, does anybody have any insights on this?

1. https://www.technologyreview.com/2019/05/10/135426/a-new-way...
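
To make the question concrete, here's a toy sketch (made-up sizes, not an existing method I'm pointing at) of the kind of thing I mean: start with the weights zeroed, and use the gradients at the absent connections to decide which ones to grow.

  import torch

  torch.manual_seed(0)
  W = torch.zeros(16, 16, requires_grad=True)   # start with no effective connections
  x, target = torch.randn(8, 16), torch.randn(8, 16)

  loss = ((x @ W.T - target) ** 2).mean()
  loss.backward()

  # "grow" the absent connections whose gradients are largest in magnitude,
  # i.e. the ones the loss currently wants most
  absent = (W.detach() == 0).float()
  scores = W.grad.abs() * absent
  new_idx = scores.flatten().topk(8).indices
  with torch.no_grad():
      W.view(-1)[new_idx] = 0.01                # give new connections a tiny initial weight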


I think this is a really interesting area of machine learning. Some efforts have been made on ideas tangential to this one. Lots of papers in neuroevolution deal with evolving topologies. NEAT is probably the prime example (http://nn.cs.utexas.edu/downloads/papers/stanley.ec02.pdf), and another paper I read recently, called PathNet, is different but very interesting: https://arxiv.org/abs/1701.08734


This is very cool! Thanks!


I experimented with networks where weights were removed if they did not contribute much to the final answer.

My conclusion was that I could easily set >99% of the weights in my (fully connected) layers to zero with minimal performance impact after enough training. But training time went up a lot (effectively, after removing a bunch of connections, you have to do more training before removing more), and inference speed wasn't really improved because sparse matrices are sloooow.
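
Roughly, the prune-a-bit / train-a-bit loop looked like this (a from-memory sketch with made-up sizes, not my actual code):

  import torch
  import torch.nn as nn

  layer = nn.Linear(512, 512)           # made-up layer size
  mask = torch.ones_like(layer.weight)

  def prune_step(mask, fraction=0.2):
      # zero out the smallest-magnitude surviving weights
      w = (layer.weight * mask).abs()
      k = int(fraction * mask.sum().item())
      threshold = w[mask.bool()].kthvalue(k).values
      new_mask = (w > threshold).float()
      with torch.no_grad():
          layer.weight *= new_mask      # pruned weights go to zero
      return new_mask

  # ... train for a while, then: mask = prune_step(mask); train some more; repeat.
  # (Re-apply the mask after each optimizer step so pruned weights stay at zero.)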

Overall, while it works out for biology, I don't think it will work for silicon.


Would you say you found a result similar to the lottery ticket hypothesis? https://arxiv.org/abs/1803.03635


Not really - I had to do multiple steps of 'prune a bit, train a bit' to be able to prune to 99%. If I had done all the pruning in one big step as they do, I don't think it would have trained well, even if I had been able to see the future and remove the same weights.


Here is a relevant paper, which was the coolest thing I saw at this past NeurIPS: https://weightagnostic.github.io/

It is based on NEAT (as other commenters mentioned) and also ties in some discussion of the Lottery Ticket Hypothesis as you mentioned.


(See sibling comment; NEAT is awesome.)

The only reason we architect ANNs the way we do is optimization of computation. The bipartite graph structure is optimized for GPU matrix math. Systems like NEAT have not been used at scale because they are a lot more expensive to train and to run once trained. ASICs and FPGAs have a chance of utilizing a NEAT-generated network in production, but we still don't have a computer well suited to training a NEAT network.


So this might be an enormous opportunity for low-cost, more performant AI if someone were able to build an FPGA of some sort that could handle these types of computations efficiently, right?


Running the post-training network is a solved problem (FPGAs and ASICs can do it just fine). TRAINING the network is the difficulty. The problem is that the structure of the network is arbitrary and is a result of the learning process; you can't optimize a computation for a structure you don't know yet. Bipartite layer networks have the benefit of never changing structure, but they can approximate other, sparser structures. I don't know how easily we could tell where we sit on that tradeoff: bipartite graphs train efficiently, but in practice they may just be inefficiently simulating a smaller network.
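
To make that last point concrete (sizes made up): the topology can be as sparse as you like, but the GPU still pays for the full dense matmul.

  import torch

  # an "arbitrary" topology encoded as a binary mask over a dense layer: only ~1%
  # of the possible connections exist, yet the computation is still one full
  # 512x512 matrix multiply, most of which is multiplying zeros
  mask = (torch.rand(512, 512) < 0.01).float()
  W = torch.randn(512, 512) * mask
  x = torch.randn(64, 512)              # a batch of 64 inputs
  y = x @ W.T                           # dense matmul regardless of sparsity
  print(f"{(W != 0).float().mean().item():.1%} of possible connections exist")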


NEAT just doesn't have good, modern GPU powered implementations.

NEAT would totally be competitive if someone actually got a version running in PyTorch/TensorFlow.


It's not that simple. Backpropagating a bipartite graph of nodes works out to a series of matrix operations that parallelize efficiently on a GPU, as long as the matrices fit into the GPU's working memory. Running a GA (part of NEAT) doesn't normally work well on a GPU. The good NEAT algorithms even allow different neurons to have different firing response curves, which inherently defies the "same operation, multiple values" style of parallelization in GPUs. The way GPUs work just fundamentally isn't well suited to speeding up NEAT.


You may be interested in this implementation [1] which builds the networks using PyTorch.

[1] https://github.com/uber-research/PyTorch-NEAT


It uses PyTorch (and I'm probably going to use it), but it doesn't effectively leverage a GPU for training.


What do you think is the best way to accomplish this?


You don't. You need a different parallelism model than a GPU provides. It could work well on machines with very high CPU count, but the speedup on GPUs is the main reason bipartite graph algorithms have seen such investment.


Funnily enough over this last weekend, I read a great review on this subject from earlier this year:

“Synaptic Specificity, Recognition Molecules, and Assembly of Neural Circuits” by Sanes and Zipursky

https://doi.org/10.1016/j.cell.2020.04.008

For me, the hard part has always been understanding how this whole thing is orchestrated on a cellular and molecular level.


When Hebb talks about "reverberation" in neural circuits, his thinking is still ahead of our current knowledge of oscillatory neurodynamics. Here he speculates about the short-term memory trace that is held dynamically, prior to physical changes at the synapse:

"It might be supposed that the mnemonic trace is a lasting pattern of reverberatory activity without fixed locus like a cloud formation or eddies in a millpond"

From Hebb's 1949 "The Organization of Behavior"


Main point: “(..) if the target neuron already has too many connections, it will tend to remove the weakest ones, and this includes the most recent ones. The scaling goes both ways after all – it goes for more synapses when it starts with too few, but for less, if it starts with too many.

But synaptic scaling is not everything. As it turns out, the tips of the growth cone constantly produce structures called filopodia, and these react to specific chemical attractants and repellents. These chemicals are produced by both cells at the target area, and by so-called guidepost cells along the way. There are suggestions that the system for such targeting is fairly robust, especially in early development (and its limitations in later life might explain why spinal cord injuries and the like are so hard to fix).“
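
For what it's worth, the "scaling goes both ways" part of the quote can be written down as a very small cartoon (made-up numbers, not a model of the actual biology): the neuron drifts toward a target number of synapses, dropping the weakest when it has too many and sprouting new weak ones when it has too few.

  target_synapses = 5
  weights = [0.8, 0.05, 0.5, 0.3, 0.6, 0.1, 0.02]   # made-up synaptic strengths

  while len(weights) > target_synapses:
      weights.remove(min(weights))       # too many connections: drop the weakest
                                         # (which tend to be the newest)
  while len(weights) < target_synapses:
      weights.append(0.05)               # too few: sprout a new, initially weak synapse

  print(weights)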


This makes me want to know how quickly this type of growth happens. Is it on the order of seconds? Minutes? Hours? Days? Is this why, when you learn something, take a break, and come back later, everything makes more sense?


I've noticed that when learning a physical skill there's a strange growth curve. You suck at first, then quickly get to some kind of milestone, then get worse before you get better. It feels like my brain is attempting to delegate some of the motor activity to lower levels before they are 'ready', but in fact that might be an essential part of training those neurons.


See Mastery by George Leonard, which is a great book and highly recommended even if you are not into karate or martial arts.

You have echoed his sketch of punctuated plateaus (p. 14):

The Mastery Curve

  There's really no way around it. Learning any new skill 
  involves relatively brief spurts of progress, each of
  which is followed by a slight decline to a plateau
  somewhat higher in most cases than that which preceded it.
[pdf] http://index-of.co.uk/Social-Interactions/Mastery%20-%20The%...


Will definitely check this out thank you


A lot of learning involves interaction between the cerebellum and the rest of the brain, and the cerebellum has a different structure. "How the brain works" type descriptions should mention the cerebellum quite a lot, but usually don't, partly because, when I was looking at this stuff a while back, much less was known about this interaction.


I think I get what you're saying. You _do_ get worse, then get better. It's like the brain is taking the proof of concept code and re-writing it with the necessary abstractions to make it a more performant routine. And then suddenly I start to experience the skills from new_task feeding into tasks I already know, and it feels like it's happening subconsciously.


This is a bit of an open problem, to the point of being controversial. I'd hesitate to say anyone has a real answer even though we certainly have real experimental data. The philosophical takes range from:

* You never enter the same room twice.

* Your brain partially re-wires every time you sleep.

* Your brain rewires, but the way it rewires is surprisingly predictable and we can track the dynamics.

* Your brain is rewiring literally every second, but not every rewiring is functional - does this imply an implicit robustness?

Etc, etc.


Growth cones are only relevant during development and in regenerating neurons, which are not common. Everyday neurons do, however, continuously extend (and retract) filopodia, which may reach nearby axon terminals and eventually form a synapse, thus causing synaptic rewiring. "Synaptic scaling" usually refers to a homeostatic, uniform up- or down-scaling of synaptic weights and is not really relevant to rewiring.


What about hippocampal neurogenesis? Those are spewing out at a nearly constant rate all the time


It's still a tiny number of neurons being turned over: about 1.75% of the dentate gyrus is renewed per year.


But that's the piano roll that is recording our sense of time!


The Neuronal Gene Arc Encodes a Repurposed Retrotransposon Gag Protein that Mediates Intercellular RNA Transfer https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5884693/


How do neurons fire less?


Memory RNA


What if something about the electrical signal attracts growth in certain direction, toward other signals firing at the same time?

Or what if some neuron pairs that are not yet connected share quantum entangled structures, that if activated simultaneously ... but still how does direction occur?

What if neurons emit light, that's why you can stimulate them with light...and what if they can somehow detect the faint light from other neurons and get the direction the light comes from, and grow towards that?


You are getting downvoted for speculating on Quantum entanglement, but I think all of your speculations are useful here and to be encouraged.


Thank you. So you work in this area? What do you think about the software available for your research? Could it be better, or does software not play much of a role?


Yes the software in this area is critical -- and there are major challenges.


If it's not too much trouble could you point me to some resources, compilations of relevant software? Maybe I could add some value...


Hebbian Learning is just applying the function:

enhance transitive closure on a temporal window

plus the dual negation, whatever that is

under the space-time corollary of De Morgan's Laws:

atrophy atemporal uncorrelated direct connection



