Hacker News | marmaduke's comments

similar to RWKV7’s new (sub-quadratic) attention mechanism, which models values from keys as v ≈ kS’ and does an in-context descent on ||v - kS’||^2/2 (where the state matrix S is one attentional head), explained more by the author here https://raw.githubusercontent.com/BlinkDL/RWKV-LM/main/RWKV-...

and i tried to unpack it a bit here https://wdmn.fr/rank-1-take-on-rwkv7s-in-context-learning/
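the gist, per head, is roughly this (my own toy sketch of the idea, not BlinkDL’s code; it uses column vectors so the readout is v ≈ Sk, and the learning rate, dimensions, and the absence of decay/normalization are all simplifications):

    import jax.numpy as jnp

    def state_update(S, k, v, lr=0.5):
        # one in-context gradient step on L(S) = ||v - S k||^2 / 2
        # dL/dS = (S k - v) k^T, a rank-1 correction to the state matrix
        err = S @ k - v
        return S - lr * jnp.outer(err, k)

    d = 4
    S = jnp.zeros((d, d))
    k = jnp.array([1.0, 0.0, 0.0, 0.0])   # key
    v = jnp.array([0.0, 2.0, 0.0, 0.0])   # value to be stored
    for _ in range(8):
        S = state_update(S, k, v)
    print(S @ k)  # approaches v, i.e. the head has "learned" v ≈ Sk in context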


looks like a nice overview. i’ve implemented neural ODEs in JAX for low-dimensional problems and it works well, but I keep looking for a fast, CPU-first implementation suited to models that fit in cache and don’t require a GPU or the big Torch/TF machinery.



Anecdotally, I used diffrax (and equinox) throughout last year after jumping between a few differential equation solvers in Python, for a project based on Dynamic Field Theory [1]. I only scratched the surface, but so far, it's been a pleasure to use, and it's quite fast. It also introduced me to equinox [2], by the same author, which I'm using to get the JAX-friendly equivalent of dataclasses.

`vmap`-able differential equation solving is really cool (rough sketch below).

[1]: https://dynamicfieldtheory.org/ [2]: https://github.com/patrick-kidger/equinox
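Rough sketch of what that looks like, with a toy damped-oscillator vector field standing in for the actual model (solver choice, step size, and save points here are arbitrary, not a recommendation):

    import jax
    import jax.numpy as jnp
    import diffrax

    def vector_field(t, y, args):
        # damped harmonic oscillator: y = (position, velocity)
        pos, vel = y
        return jnp.array([vel, -pos - 0.1 * vel])

    def solve(y0):
        term = diffrax.ODETerm(vector_field)
        saveat = diffrax.SaveAt(ts=jnp.linspace(0.0, 10.0, 100))
        sol = diffrax.diffeqsolve(term, diffrax.Tsit5(),
                                  t0=0.0, t1=10.0, dt0=0.01,
                                  y0=y0, saveat=saveat)
        return sol.ys

    # one jax.vmap and the same solver runs a whole batch of initial conditions
    y0_batch = jnp.array([[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]])
    trajectories = jax.vmap(solve)(y0_batch)   # shape (3, 100, 2)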


Thanks, that looks neat.

Kidger's thesis is wonderful https://arxiv.org/abs/2202.02435


no, wrote it by hand to go with my own Heun implementation, since it’s for stochastic delayed systems (rough sketch below).

jax is fun but as effective as i’d like for CPU
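(the Heun-plus-delay bit, sketched very roughly; the delayed-relaxation drift, additive noise, and buffer-based delay handling are placeholders for illustration, not the actual code:)

    import numpy as np

    def heun_sdde(drift, y0, dt, n_steps, delay_steps, sigma, rng):
        # buffer of past states so the drift can read y(t - tau)
        buf = [np.asarray(y0, dtype=float)] * (delay_steps + 1)
        out = [buf[-1]]
        for _ in range(n_steps):
            y, y_lag = buf[-1], buf[0]
            dW = rng.normal(scale=np.sqrt(dt), size=y.shape)
            f0 = drift(y, y_lag)
            y_pred = y + f0 * dt + sigma * dW              # Euler-Maruyama predictor
            f1 = drift(y_pred, y_lag)
            y_new = y + 0.5 * (f0 + f1) * dt + sigma * dW  # Heun correction of the drift
            buf = buf[1:] + [y_new]
            out.append(y_new)
        return np.array(out)

    # toy delayed, noisy relaxation: dy = (-y(t) + tanh(y(t - tau))) dt + sigma dW
    rng = np.random.default_rng(0)
    traj = heun_sdde(lambda y, y_lag: -y + np.tanh(y_lag),
                     y0=np.array([0.1]), dt=0.01, n_steps=1000,
                     delay_steps=50, sigma=0.05, rng=rng)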


Not as effective as I'd like?


ha, yeah, thanks.


How would you describe what a neural ODE is in the simplest possible terms? Let's say I know what an NN and a DE are :).


classic NN takes a vector of data through a stack of layers to make a prediction; each layer takes the current hidden state and nudges it a bit, and backprop adjusts the network weights till the predictions are right. You can picture the data tracing a path through the layers from input to output.

Neural ODE takes that picture to the limit: instead of a fixed number of discrete layers, the hidden state evolves continuously, dh/dt = f(h, t, θ), where f is a small neural network. To make a prediction you hand that ODE to a solver and integrate from the input state to the output state; the solver’s steps play the role of layers, and it can take as many or as few as accuracy demands. Training still adjusts θ by backpropagating through the solver (or via the adjoint method) till the integrated output matches the training data.
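In code it ends up surprisingly small. A minimal sketch with jax + diffrax (the tiny MLP vector field, the fixed [0, 1] time span, and the sizes are arbitrary choices for illustration, not anyone’s actual model):

    import jax
    import jax.numpy as jnp
    import diffrax

    def vf(params, t, h):
        # tiny MLP playing the role of f(h, t, theta)
        w1, b1, w2, b2 = params
        return w2 @ jnp.tanh(w1 @ h + b1) + b2

    def neural_ode(params, h0):
        term = diffrax.ODETerm(lambda t, h, args: vf(params, t, h))
        sol = diffrax.diffeqsolve(term, diffrax.Tsit5(),
                                  t0=0.0, t1=1.0, dt0=0.1, y0=h0)
        return sol.ys[-1]   # state at t1 plays the role of the last layer's output

    dim, hidden = 3, 16
    k1, k2 = jax.random.split(jax.random.PRNGKey(0))
    params = (0.1 * jax.random.normal(k1, (hidden, dim)), jnp.zeros(hidden),
              0.1 * jax.random.normal(k2, (dim, hidden)), jnp.zeros(dim))

    # gradients flow through the solver, so an ordinary training loop applies
    loss = lambda p, h0, target: jnp.sum((neural_ode(p, h0) - target) ** 2)
    grads = jax.grad(loss)(params, jnp.ones(dim), jnp.zeros(dim))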


Pretty cool approach, looking more into it, thank you!


i like how the contraction it’s and abbreviation T’is are anagrams


It is not T'is, it is Tis. No apostrophe.



I've only ever seen it spelled 'tis


It's a contraction of "it is" so " 'tis " is correct.


The apostrophe is in the wrong spot, but 'tis is the correct spelling, and the only one I've ever seen.


I didn’t look it up, but at first glance, it reminded me of discussions like this one

https://youtu.be/qnT48wO0UL0


that’s not what the story says. in any case, the point is to explain, in terms of dualistic if-then logic, that the if (you practice now) and the then (you will wake up) are a single non-dual thing. but to communicate it in terms which make sense to the dual, if-then mind, one needs to use dualistic language.


how far from your previous experience is the game work?


Very far. I'm a fullstack web developer. Independent game dev has been my hobby for 10 years, and game dev is what got me interested in tech when I was a kid. I started teaching myself at a young age with qbasic.

Building interfaces and menu systems etc feels very similar to frontend web development. Much of the domain knowledge transfers when it comes to programming. Mainly my skills transfer to gameplay development and debugging. I make small games and only just recently released my first commercial game, which has only sold two copies.


do you have a link for your game? how was the release process?


i wonder if there are any semi-automated approaches to finding outliers or “things worth investigating” in these traces, or is it just eyeballs all the way down?


This is possible via semi-automatic detection of anomalies over time for some preset of fields used for grouping the events (aka dimensions) and another preset of fields used in stats calculations (aka metrics). In the general case this is a hard task to solve, since it is impossible to check for anomalies across all the possible combinations of dimensions and metrics for wide events with hundreds of fields.

This is also complicated by the possibility of applying various filters to the events before and after the stats calculations.
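A toy version of the dimensions/metrics idea, to make it concrete (the field names, bucket size, and z-score threshold are made up, and real tools are much smarter about choosing which dimensions to scan):

    from collections import defaultdict
    from statistics import mean, stdev

    def bubble_up(events, dimension, metric, bucket_s=60, z_thresh=3.0):
        # group events by one dimension, aggregate one metric per time bucket
        series = defaultdict(lambda: defaultdict(list))
        for e in events:
            series[e[dimension]][int(e["ts"] // bucket_s)].append(e[metric])

        anomalies = []
        for group, buckets in series.items():
            means = {b: mean(vals) for b, vals in buckets.items()}
            if len(means) < 3:
                continue  # not enough history to call anything an outlier
            mu, sd = mean(means.values()), stdev(means.values())
            for b, m in means.items():
                if sd > 0 and abs(m - mu) / sd > z_thresh:
                    anomalies.append((group, b, m))
        return anomalies

    # e.g. flag (service, minute) pairs whose mean latency is a 3-sigma outlier:
    # bubble_up(events, dimension="service", metric="latency_ms")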


honeycomb "bubble up"


That seems like a good use case for AI: it's trivial to have it suggest some queries and test whether they give interesting results.


having worked on whole brain modeling for the last 15 years, and on european infra for supporting this kind of research: this is a terrible buzzword salad. the pdf is on par with a typical master’s project.



https://www.biorxiv.org/content/10.1101/2024.10.25.620245v1....

covers some of the recent perspectives on this modeling approach if you’re interested.


Awesome, made this worth it from my pov


yeah, i did my undergrad research on biological neuron emulation, and most of the research in the area is hilariously moronic stuff done just for pumping "research" out and getting students their pieces of paper.


HH is kinda the opposite of LIF on the abstraction spectrum.


I mean HH is an elaboration of the LIF with the addition of several equations for the various ion channels, but yeah I see what you mean.
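For concreteness, the LIF end of that spectrum fits in a few lines, one voltage equation plus a reset rule, versus Hodgkin-Huxley's voltage equation coupled to the m/h/n gating equations for the ion channels (parameters here are just illustrative):

    import numpy as np

    def lif(I, dt=0.1, tau=10.0, v_rest=-65.0, v_thresh=-50.0, v_reset=-65.0):
        v, spikes, trace = v_rest, [], []
        for t, i_t in enumerate(I):
            v += dt * (-(v - v_rest) + i_t) / tau   # leaky integration
            if v >= v_thresh:                        # fire and reset
                spikes.append(t * dt)
                v = v_reset
            trace.append(v)
        return np.array(trace), spikes

    trace, spikes = lif(I=np.full(1000, 20.0))   # constant drive -> regular spiking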


The pigeon experiment is a great one to learn from not just about programming or software, but about life in general. Where are you getting your next dopamine hit? Is it random? Maybe that’s where our idiosyncrasies come from.

