colinnordin's comments

colinnordin · 2025-04-29T18:56:41 1745953001

Your app looks cool! I've tried a few other apps doing something things, Clearspace is the one I'm using now. Will give yours a try!

I'm in a similar situation as you (developer having to do marketing) but have not gotten as far, so far I've only posted on a few subreddits and here on HN. Have you found any nice learning resources?

nullderef · 2025-04-29T19:02:38 1745953358

Thank you for the support! I actually wrote an article about it here: https://speedbumpapp.com/en/blog/mobile-app-promotion/. It has the most useful resources I found so far (mostly B2C). I'm also happy to have a chat and help out: https://chat.nullderef.com :)

colinnordin · 2025-03-30T21:29:18 1743370158

Working on a tool to remember things I read.

You select some text with your phone and share it with my app, then the shared text is reformulated to a flashcard (with the help of a llm).

You can then browse your flashcards in the app, but I’m also working on ways to show the cards to you with less friction: Like on the phone lock screen or on the face of your watch.

https://komihag.com

redrove · 2025-04-02T06:01:50 1743573710

This is really cool. Any chance you can make the backend selfhostable with ollama as an option for the LLM?

colinnordin · 2025-02-28T17:25:27 1740763527

What do you have in mind? I’ve thought about affirmations, that could maybe fit this approach as well.

colinnordin · 2025-02-28T17:24:04 1740763444

Not yet ;)

colinnordin · 2025-01-28T14:20:20 1738074020

R1 and O1 points towards a world where training models will be a small bucket of the overall compute. That doesn't mean the total amount of AI compute will stop accelerating, just that interconnected mega clusters is not the only or most efficient way to run a majority of future workloads. That should be negative news for the company that is currently the only one that is capable of making chips for these clusters, and positive news for the players that can run inference on a single chip, as they will be able to grab more parts of the compute pie.

colinnordin · 2025-01-27T14:01:53 1737986513

Great article.

>Now, you still want to train the best model you can by cleverly leveraging as much compute as you can and as many trillion tokens of high quality training data as possible, but that's just the beginning of the story in this new world; now, you could easily use incredibly huge amounts of compute just to do inference from these models at a very high level of confidence or when trying to solve extremely tough problems that require "genius level" reasoning to avoid all the potential pitfalls that would lead a regular LLM astray.

I think this is the most interesting part. We always knew a huge fraction of the compute would be on inference rather than training, but it feels like the newest developments is pushing this even further towards inference.

Combine that with the fact that you can run the full R1 (680B) distributed on 3 consumer computers [1].

If most of NVIDIAs moat is in being able to efficiently interconnect thousands of GPUs, what happens when that is only important to a small fraction of the overall AI compute?

[1]: https://x.com/awnihannun/status/1883276535643455790

tomrod · 2025-01-27T14:46:47 1737989207

Conversely, how much larger can you scale if frontier models only currently need 3 consumer computers?

Imagine having 300. Could you build even better models? Is DeepSeek the right team to deliver that, or can OpenAI, Meta, HF, etc. adapt?

Going to be an interesting few months on the market. I think OpenAI lost a LOT in the board fiasco. I am bullish on HF. I anticipate Meta will lose folks to brain drain in response to management equivocation around company values. I don't put much stock into Google or Microsoft's AI capabilities, they are the new IBMs and are no longer innovating except at obvious margins.

stormfather · 2025-01-27T14:58:07 1737989887

Google is silently catching up fast with Gemini. They're also pursuing next gen architectures like Titan. But most importantly, the frontier of AI capabilities is shifting towards using RL at inference (thinking) time to perform tasks. Who has more data than Google there? They have a gargantuan database of queries paired with subsequent web nav, actions, follow up queries etc. Nobody can recreate this, Bing failed to get enough marketshare. Also, when you think of RL talent, which company comes to mind? I think Google has everyone checkmated already.

shwaj · 2025-01-27T18:00:59 1738000859

Can you say more about using RL at inference time, ideally with a pointer to read more about it? This doesn’t fit into my mental model, in a couple of ways. The main way is right in the name: “learning” isn’t something that happens at inference time; inference is generating results from already-trained models. Perhaps you’re conflating RL with multistage (e.g. “chain of thought”) inference? Or maybe you’re talking about feeding the result of inference-time interactions with the user back into subsequent rounds of training? I’m curious to hear more.

stormfather · 2025-01-27T19:43:14 1738006994

I wasn't clear. Model weights aren't changing at inference time. I meant at inference time the model will output a sequence of thoughts and actions to perform tasks given to it by the user. For instance, to answer a question it will search the web, navigate through some sites, scroll, summarize, etc. You can model this as a game played by emitting a sequence of actions in a browser. RL is the technique you want to train this component. To scale this up you need to have a massive amount of examples of sequences of actions taken in the browser, the outcome it led to, and a label for if that outcome was desirable or not. I am saying that by recording users googling stuff and emailing each other for decades Google has this massive dataset to train their RL powered browser using agent. Deepseek proving that simple RL ca be cheaply applied to a frontier LLM and have reasoning organically emerge makes this approach more obviously viable.

shwaj · 2025-01-28T07:02:47 1738047767

Makes sense, thanks. I wonder whether human web-browsing strategies are optimal for use in a LLM, e.g. given how much faster LLMs are at reading the webpages they find, compared to humans? Regardless, it does seem likely that Google’s dataset is good for something.

stormfather · 2025-02-04T16:19:27 1738685967

Take this example:

A human googles "how much does a tire cost?"

They pick out a website from search results, then nav within it to the correct product page and maybe scroll until the price is visible on screen.

Google captures a lot of that data on third party sites. From Perplexity:

Google Analytics: If the website uses Google Analytics, Google can collect data about user behavior on that site, including page views, time on site, and user flow.

Google Ads: Websites using Google Ads may allow Google to track user interactions for ad targeting and conversion tracking.

Other Google Services: Sites implementing services like Google Tag Manager or using embedded YouTube videos may provide additional tracking opportunities

So you can imagine that Google has a kajillion training examples that go: search query (which implies task) -> pick webpage -> actions within webpage -> user stops (success), or user backs off site/tries different query (failure)

You can imagine that even if an AI agent is super efficient, it still needs to learn how to formulate queries, pick out a site to visit, nav through the site, do all that same stuff to perform tasks. Google's dataset is perfect for this, huge, and unparalleled.

_DeadFred_ · 2025-01-27T17:58:26 1738000706

How quickly the narrative went from 'Google silently has the most advanced AI but they are afraid to release it' to 'Google is silently catching up' all using the same 'core Google competencies' to infer Google's position of strength. Wonder what the next lower level of Google silently leveraging their strength will be?

stormfather · 2025-01-27T19:45:27 1738007127

Google is clearly catching up. Have you tried the recent Gemini models? Have you tried deep research? Google is like a ship that is hard to turn around but also hard to stop once in motion.

moffkalast · 2025-01-27T17:52:44 1738000364

Never underestimate Google's ability to fall flat on their face when it comes to shipping products.

onlyrealcuzzo · 2025-01-27T16:19:32 1737994772

If you watch this video, it explains well what the major difference is between DeepSeek and existing LLMs: https://www.youtube.com/watch?v=DCqqCLlsIBU

It seems like there is MUCH to gain by migrating to this approach - and it theoretically should not cost more to switch to that approach than vs the rewards to reap.

I expect all the major players are already working full-steam to incorporate this into their stacks as quickly as possible.

IMO, this seems incredibly bad to Nvidia, and incredibly good to everyone else.

I don't think this seems particularly bad for ChatGPT. They've built a strong brand. This should just help them reduce - by far - one of their largest expenses.

They'll have a slight disadvantage to say Google - who can much more easily switch from GPU to CPU. ChatGPT could have some growing pains there. Google would not.

wolfhumble · 2025-01-27T17:01:38 1737997298

> I don't think this seems particularly bad for ChatGPT. They've built a strong brand. This should just help them reduce - by far - one of their largest expenses.

Often expenses like that are keeping your competitors away.

onlyrealcuzzo · 2025-01-27T17:20:32 1737998432

Yes, but it typically doesn't matter if someone can reach parity or even surpass you - they have to surpass you by a step function to take a significant number of your users.

This is a step function in terms of efficiency (which presumably will be incorporated into ChatGPT within months), but not in terms of end user experience. It's only slightly better there.

ReptileMan · 2025-01-27T17:41:22 1737999682

One data point but my subscription for chatgpt is cancelled every time. So I made every month decision to resub. And because the cost of switching is essentially zero - the moment a better service is up there I will switch in an instant.

onlyrealcuzzo · 2025-01-27T18:28:09 1738002489

There are obviously people like you, but I hope you realize this is not the typical user.

tomrod · 2025-01-27T20:34:03 1738010043

That is a fantastic video, BTW.

danaris · 2025-01-27T14:55:21 1737989721

This assumes no (or very small) diminishing returns effect.

I don't pretend to know much about the minutiae of LLM training, but it wouldn't surprise me at all if throwing massively more GPUs at this particular training paradigm only produces marginal increases in output quality.

tomrod · 2025-01-27T15:30:22 1737991822

I believe the margin to expand is on CoT, where tokens can grow dramatically. If there is value in putting more compute towards it, there may still be returns to be captured on that margin.

simpaticoder · 2025-01-27T18:46:39 1738003599

>Imagine having 300.

Would it not be useful to have multiple independent AIs observing and interacting to build a model of the world? I'm thinking something roughly like the "councelors" in the Civilization games, giving defense/economic/cultural advice, but generalized over any goal-oriented scenario (and including one to take the "user" role). A group of AIs with specific roles interacting with each other seems like a good area to explore, especially now given the downward scalability of LLMs.

JoshTko · 2025-01-27T20:11:44 1738008704

This is exactly where Deepseeks enhancements come into play. Essentially deepseek lets the model think out loud via chain of thought (o1 and Claude also do this) but DS also does not supervise the chain of thought, and simply rewards CoT that get the answer correctly. This is just one of the half dozen training optimization that Deepseek has come up with.

tomrod · 2025-01-27T19:00:36 1738004436

Yes; to my understanding that is MoE.

tw1984 · 2025-01-27T16:26:22 1737995182

> If most of NVIDIAs moat is in being able to efficiently interconnect thousands of GPUs

nah. it moat is CUDA and millions of devs using CUDA aka the ecosystem

mupuff1234 · 2025-01-27T17:16:31 1737998191

But if it's not combined with super high end chips with massive margins that moat is not worth anywhere close to 3T USD.

ReptileMan · 2025-01-27T17:43:12 1737999792

And then some chineese startup create an amazing compiler that takes cuda and moves it to X (AMD, Intel, Asic) and we are back at square one.

So far it seems that the best investment is in RAM producers. Unlike compute the ram requirements seem to be stubborn.

01100011 · 2025-01-27T18:49:06 1738003746

Don't forget that "CUDA" involves more than language constructs and programming paradigms.

With NVDA, you get tools to deploy at scale, maximize utilization, debug errors and perf issues, share HW between workflows, etc. These things are not cheap to develop.

Symmetry · 2025-01-27T20:56:58 1738011418

It might not be cheap to develop them but if you can save $10B in hardware costs by doing so you're probably looking at positive ROI.

01100011 · 2025-01-28T05:56:13 1738043773

Yeah, I mean, 9 women can make a baby in a month so why not?

Oh wait, it takes years to do all that and in the meantime you're wasting energy on not staying at the forefront of a hot tech trend.

a_wild_dandan · 2025-01-27T17:26:19 1737998779

Running a 680-billion parameter frontier model on a few Macs (at 13 tok/s!) is nuts. That'a two years after ChatGPT was released. That rate of progress just blows my mind.

qingcharles · 2025-01-27T23:27:16 1738020436

And those are M2 Ultras. M4 Ultra is about to drop in the next few weeks/months, and I'm guessing it might have higher RAM configs, so you can probably run the same 680b on two of those beasts.

The higher performing chips, with one less interconnect, is going to give you significantly higher t/s.

bn-l · 2025-01-27T15:38:44 1737992324

Link has all the params but running at 4 bit quant.

qingcharles · 2025-01-27T23:28:32 1738020512

4-bit quant is generally kinda low, right?

I wonder how badly this quant affects the output on DeepSeek?

neuronic · 2025-01-27T15:05:13 1737990313

> NVIDIAs moat

Offtopic, but your comment finally pushed me over the edge to semantic satiation [1] regarding the word "moat". It is incredible how this word turned up a short while ago and now it seems to be a key ingredient of every second comment.

[1] https://en.wikipedia.org/wiki/Semantic_satiation

mikestew · 2025-01-27T15:14:55 1737990895

It is incredible how this word turned up a short while ago…

I’m sure if I looked, I could find quotes from Warren Buffet (the recognized originator of the term) going back a few decades. But your point stands.

kccqzy · 2025-01-27T16:48:24 1737996504

The earliest occurrence of the word "moat" that I could find online from Buffett is from 1986: https://www.berkshirehathaway.com/letters/1986.html That shareholder letter is charmingly old-school.

Unfortunately letters before 1977 weren't available online so I wasn't able to search.

It also helps that I've been to several cities with an actual moat so this word is familiar to me.

mikeyouse · 2025-01-27T15:30:29 1737991829

Yeah, he's been talking about "economic moats" since at least the 1990s. At least since 1995;

https://www.berkshirehathaway.com/letters/1995.html

pillefitz · 2025-01-27T20:08:29 1738008509

Nobody claimed it's a new word. Still, the frequency increased 100x over the last days, subjectively speaking.

fastasucan · 2025-01-27T19:39:58 1738006798

The word moat was first used in english in the 15th century https://www.merriam-webster.com/dictionary/moat

neuronic · 2025-01-28T06:58:46 1738047526

Yes my wording was rubbish I should have said "tuned up" in the HN bubble. Quick ctrl-f shows 35 uses in this thread without loading all comments.

I did not mean that it was literally invented a short while ago - a few months ago I had to look up what it means though (not native English).

cwmoore · 2025-01-27T21:47:21 1738014441

https://en.wikipedia.org/wiki/Frequency_illusion

ljw1004 · 2025-01-27T21:35:53 1738013753

I'm struggling to understand how a moat can have a CRACK in it.

nateglims · 2025-01-27T23:51:34 1738021894

perhaps if the moat is kept in place by some sort of berm or quay

colinnordin · on May 28, 2023

If you haven’t already: Start to store question and answer pairs and reuse the answer if the same question is asked multiple times.

You could also compute embeddings for the questions (don’t have to be OpenAI embeddings), and reuse the answer if the question is sufficiently similar to a prevously asked question.

serial_dev · on May 28, 2023

I'm not sure it's practical and if it will result in any savings.

Wouldn't it be almost impossible to hit a duplicate when the users each form their own question?

Another issue I see is that these chat AIs usually have "history", so the question might be the same, but the context is different: the app might have received "when was he born", but in one context, the user talks about Obama and in another, she talks about Tom Brady.

If there are ways around these issues, I'd love to hear it, but it sounds like this will just increase costs via cache hardware costs and any dedup logic instead of saving money.

Silasdev · on May 28, 2023

>Wouldn't it be almost impossible to hit a duplicate when the users each form their own question?

The embeddings approach would increase the likelyhood of finding the same question, even if phrased slightly differently.

rjtavares · on May 28, 2023

With embeddings you can compute distance. The questions don't have to be the same, they just have to be sufficiently close.

Regarding context, that should be a part of the input for the embeddings.

dxhdr · on May 28, 2023

This removes half the magic of interacting with ChatGPT. Users will quickly realize they're interacting with a dumb database rather than an AI.

cloogshicer · on May 28, 2023

I don't see what the problem is if it's only on the exact same prompt.

I assume only a small percentage of users would put in the same prompt twice, and even then, why would they be upset at getting the same response?

colinnordin · on April 18, 2020

Assuming we had millions of 3D-scans of different kinds of animals and their full DNA sequence. Wouldn't it be interesting to try to use generative ML methods to generate a dinosaur DNA based on a 3D model?

netjiro · on April 18, 2020

Fun idea, but I think you're underestimating the total systemic information density between DNA and final body.

But I'm sure you can start there and find some interesting tangent that can very well turn out to be worth pursuing further.

malux85 · on April 18, 2020

There’s just no way this would work, the DNA codes organs, proteins and all of the complex molecules and molecular interactions, there’s no way a generative model is going to glean all that from 3D scans.

andbberger · on April 18, 2020

I suppose everything looks like a nail when you've got a few GPUs.

Besides, the sequence alone is not the full state. You have to also consider expression and regulation and all that; epigenetics.

DavidSJ · on April 18, 2020

I wonder:

Could you normally have two viable lineages, each with the same DNA, and each with a different expression of that DNA which is stably conserved over many generations, purely due to womb environment, etc.? In other words, if you cloned a woolly mammoth using an elephant surrogate, might the great great great great grandchildren of this clone still have some characteristics that are due to having an elephant surrogate ancestor?

Or would there be a tendency to converge to a single stable expression due entirely to genetics?

It‘s obvious that in principle the answer could go either way, but I’m not sure whether that’s true in practice, with naturally occurring organisms and naturally occurring DNA sequences. For the sake of this question, one is also tempted to exclude post-natal “cultural” transmission, but it’s not clear that can be easily distinguished.

EmilioMartinez · on April 18, 2020

Sounds like you want to make dinosaur-shaped dogs

antonzabirko · on April 18, 2020

seems oddly familiar. sure let's do it

moobsen · on April 18, 2020

Are you referring to a movie? Which one?

Or something like this? https://arxiv.org/abs/1712.06148

colinnordin · on Jan 31, 2019

> The TV Meter needs to hear the TV clearly so it can work, and people watching TV will need to see the screen while they’re watching TV.

And for some reason it's important that you can see the device even though the camera is turned off?

colinnordin · on March 25, 2018

I also work with non-speech audio and I'm curious: Do you use pure DFT:s as inputs to your models or do you use mel-energies or MFCC:s? What kind of models do you use? Since there is not that much variation in the sound of a chainsaw I suppose either a regular fully connected or convolutional neural network?

Love what you are doing and I would love to see a technical blog post about how you work with audio!