AI systems will generate ever more synthetic content, because current tools still can't determine with certainty whether a piece of content was produced by AI. And AIs will, intentionally or unintentionally, train on synthetic content produced by other AIs.
AI generators don't have a strong incentive to add watermarks to synthetic content. They also don't provide reliable AI-detection tools (or any tools at all) to help others detect content generated by them.
Maybe some of them already embed some simple, secret marker to identify their own generated content. But people outside the organization wouldn’t know. And this still can’t prevent other companies from training models on synthetic data.
Once synthetic data becomes pervasive, it’s inevitable that some of it will end up in the training process. Then it’ll be interesting to see how the information world evolves: AI-generated content built on synthetic data produced by other AIs. Over time, people may trust AI-generated content less and less.
I really hope SynthID becomes a widely adopted standard - at the very least, Google should implement it across its own products like NotebookLM.
The problem is becoming urgent: more and more so-called “podcasts” are entirely fake, generated by NotebookLM and pushed to every major platform purely to farm backlinks and run blackhat SEO campaigns.
Beyond SynthID or similar watermarking standards, we also need models trained specifically [0] to detect AI-generated audio. Otherwise, the damage compounds - people might waste 30 minutes listening to a meaningless AI-generated podcast, or worse, absorb and believe misleading or outright harmful information.
Earlier this year, we at Listen Notes switched to Better Stack [0], replacing both Datadog and PagerDuty, and we couldn't be happier :) Datadog offers a rich set of features, and as a public company, it makes sense for them to keep expanding their product and pushing larger contracts. But as a small team, we don't have a strong demand for constant new features. By switching to Better Stack, we cut our monitoring and alerting costs by 90%, while keeping essentially the same functionality we previously used in Datadog.
Let me guess - the keyword here is "Section 174", just from the title alone :)
Dealing with Section 174 amortization in those first one to three years is a real headache (and your tax bill ends up higher than if it didn't apply). Once your startup survives the first few years under Section 174, things do get easier... but, sadly, most don't make it that far.
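To illustrate why the early years hurt, here's a rough sketch (not tax advice, numbers are hypothetical): under the Section 174 amortization rules, domestic R&D costs are spread over 5 years with a mid-year convention, so only about 10% is deductible in year one instead of the full amount.

```python
# Hypothetical illustration of first-year Section 174 amortization.
# Assumes $1M of domestic R&D spend; 5-year schedule, mid-year
# convention means year one deducts half of one-fifth, i.e. 10%.
rd_spend = 1_000_000
immediate_deduction = rd_spend           # old treatment: expense it all
year_one_amortized = rd_spend * 0.10     # amortized treatment, year one
extra_taxable_income = immediate_deduction - year_one_amortized
print(extra_taxable_income)  # 900000.0 more taxable income in year one
```

So a startup that spent every dollar on engineering can still show a large taxable profit on paper, which is exactly the cash crunch described above.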
One issue I've had with read-it-later apps is that I end up accumulating far too many articles and never actually reading them. Now my approach is simple: if I see something I want to read, I either read it immediately or never.
There could be a better system (for me): a read-it-later app with a strict limit—say, 3 or 5 slots. If it's full and you add a new article, the oldest one gets automatically deleted to make room. That way, you always have something to read, but never feel overwhelmed by an endless backlog.
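The slot-limited app described above is basically a fixed-capacity queue that evicts the oldest entry. A minimal sketch in Python (names are made up for illustration):

```python
from collections import deque

# A read-it-later list capped at 5 slots: appending past capacity
# silently drops the oldest saved article from the left end.
reading_list = deque(maxlen=5)

for url in ["a", "b", "c", "d", "e", "f"]:
    reading_list.append(url)

print(list(reading_list))  # ['b', 'c', 'd', 'e', 'f'] -- "a" was evicted
```

`deque(maxlen=...)` gives exactly the "always full, never overwhelming" behavior for free, with no manual cleanup step.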
For me, this is just browser tabs. Modern browsers seem to do a good enough job with both persisting tab state between sessions and not expending tons of resources on idle tabs. So I just pop the article of interest up in a new tab and leave it there until I get back to it.