I made field recordings during my last stay in Tokyo. From those, I made a song for each station of the Yamanote line, using the Jingle in the prompt. The visuals were made similarly.
I have done quite a bit of game dev with LLMs and have very rarely run into the problem you mention. I've been surprised by how easily LLMs will create even harmful narratives if I ask them to code them as a game.
I have the impression that the thinking helps even if the actual content of the thinking output is nonsense. It awards more cycles to the model to think about the problem.
That would be strange. There's no hidden memory or data channel, the "thinking" output is all the model receives afterwards. If it's all nonsense, then nonsense is all it gets. I wouldn't be completely surprised if a context with a bunch of apparent nonsense still helps somehow, LLMs are weird, but it would be odd.
This isn't quite right. Even when an LLM generates meaningless tokens, its internal state continues to evolve. Each new token triggers a fresh pass through the network, with attention over the KV cache, allowing the model to refine its contextual representation. The specific tokens may be gibberish, but the underlying computation can still reflect ongoing "thinking".
Attention operates entirely on hidden memory, in the sense that it usually isn't exposed to the end user. An attention head on one thinking token can attend to one thing and the same attention head on the next thinking token can attend to something entirely different, and the next layer can combine the two values, maybe on the second thinking token, maybe much later. So even nonsense filler can create space for intermediate computation to happen.
It's a bit more subtle though, if I understand correctly this only works for parallelizable problems. Which makes intuitive sense since the model cannot pass information along with each dot. So in that sense COT can be seen as some form of sampling, which also tracks with findings that COT doesn't boost the "raw intelligence" but rather uncovers latent intelligence, converting pass@k to maj@k. Antirez touches upon this in [1].
On the other hand, I think problems with serial dependencies require "real" COT since the model needs to track the results of subproblems. There's also some studies which show a meta-structure to the COT itself though, e.g. if you look at DeepSeek there are clear patterns of backtracking and such that are slightly more advanced than naive repeated samplings. https://arxiv.org/abs/2506.19143
Although thinking a bit more, even constrained to only output dots, there can still some amount of information passing between each token, namely in the hidden states. The attention block N layers deep will compute attention scores off of the residual stream for previous inputs at that layer, so some information can be passed along this way.
It's not very efficient though, because for token i layer N can only receive as input layer N-1 for tokens i-1, i-2... So information is sort of passed along diagonally. If handwavily the embedding represents some "partial result" then it can be passed along diagonally from (N-1, i-1) to (N, i) to have the COT for token i+1 continue to work on it. So this way even though the total circuit depth is still bounded by # of layers, it's clearly "more powerful" than just naively going from layer 1...n, because during the other steps you can maybe work on something else.
But it's still not as powerful as allowing the results at layer n to be fed back in, which effectively unrolls the depth. This maybe intuitively justifies the results in the paper (I think it also has some connection to communication complexity).
{
"assistant_response_preferences": {
"1": "User prefers concise responses for direct factual queries but detailed, iterative explanations when exploring complex topics. They often ask for more refinement or detail when discussing technical or business-related matters. User frequently requests TL;DR versions or more succinct phrasing for straightforward questions but shows a tendency toward iterative refinement for strategic or technical discussions, such as AI applications, monetization models, and startup valuation. Confidence=high.",
"2": "User prefers a casual, direct, and slightly irreverent tone, leaning towards humor and playfulness, especially in creative or informal discussions. Frequent use of humor and irony when naming projects, describing AI-generated images, and approaching AI personality descriptions. They also request ironic or edgy reformulations, particularly in branding and marketing-related discussions. Confidence=high.",
"3": "User enjoys back-and-forth discussions and rapid iteration, frequently refining responses in small increments rather than expecting fully-formed information at once. They give iterative feedback with short follow-up messages when structuring pitches, fine-tuning visual designs, and optimizing descriptions for clarity. Confidence=high.",
"4": "User highly values functional elegance and minimalism in coding solutions, favoring simplicity and efficiency over verbosity. In discussions related to Cloudflare Workers, caching scripts, and API endpoint structuring, the user repeatedly requested smaller, more functional code blocks rather than bloated implementations. Confidence=high.",
"5": "User prefers answers grounded in real-world examples and expects AI outputs to be practical rather than theoretically extensive. In business-related discussions, such as SAFE valuation and monetization models, they requested comparisons, benchmarks, and real-world analogies instead of hypothetical breakdowns. Confidence=high.",
"6": "User does not appreciate generic or overly safe responses, especially in areas where depth or nuance is expected. For AI model personality descriptions and startup pitch structures, they pushed for community insights, deeper research, and non-traditional perspectives instead of bland, default AI descriptions. Confidence=high.",
"7": "User frequently requests visual representations like ASCII diagrams, structured markdown, and flowcharts to understand complex information. In discussions on two-sided marketplaces, startup funding structures, and caching mechanisms, they explicitly asked for structured markdown, flowcharts, or diagrams to clarify concepts. Confidence=high.",
"8": "User is receptive to recommendations but dislikes suggestions that stray too far from the core query or add unnecessary complexity. They often responded positively to well-targeted suggestions but rejected tangents or off-topic expansions, particularly when troubleshooting backend infrastructure or streamlining code deployment. Confidence=medium.",
"9": "User appreciates references to biomimicry, organic structures, and futuristic aesthetics, particularly for branding and UI/UX discussions. Frequent requests for biological metaphors and design principles in visual design, AI monetization diagrams, and ecosystem branding (e.g., describing revenue flows in organic/cellular terms). Confidence=medium.",
"10": "User prefers a no-nonsense approach when discussing legal, technical, or startup funding topics, with little patience for vague or theoretical answers. They repeatedly asked for exact clauses, contract implications, or legal precedents when discussing SAFE agreements, founder equity, and residency requirements. Confidence=high."
},
"notable_past_conversation_topic_highlights": {
"1": "User has been actively engaged in startup pitching, AI monetization strategies, and investment discussions for Pollinations.AI. The user has explored traction-based startup valuation, SAFE agreements, equity distribution, and two-sided marketplace dynamics. They have particularly focused on ad embedding in generative AI content and optimizing affiliate revenue streams. Confidence=high.",
"2": "User conducted extensive testing and debugging of AI-powered APIs, particularly using Cloudflare, OpenAI-compatible APIs, and caching strategies with R2. They worked on optimizing SSE streaming, cache key generation, and request coalescing in Cloudflare Workers. Confidence=high.",
"3": "User explored AI-generated visual media and branding, developing a structured process for generating customized images for event flyers, product branding, and AI trading card concepts. Confidence=high.",
"4": "User implemented GitHub automation, API authentication strategies, and data visualization pipelines. Confidence=high.",
"5": "User engaged in community development strategies for Pollinations.AI, including youth involvement in AI, sourcing teenage developers, and integrating AI-powered tooling into social platforms. Confidence=high.",
"6": "User, Thomas Haferlach, is a German entrepreneur and AI technology expert with a background in computer science and artificial intelligence. Confidence=high.",
"7": "User has a strong technical background, with experience in cloud infrastructure, AI model deployment, and API development. Confidence=high.",
"8": "User blends AI-generated content with creative projects, aiming to make AI-generated media accessible to independent creators. Confidence=high.",
"9": "User is securing funding for Pollinations.AI, exploring investment opportunities with accelerators and evaluating different financial and equity models. Confidence=high.",
"10": "User is based in Berlin, Germany but has global connections, including experience living in São Paulo, Brazil. Confidence=high.",
"11": "User collaborates with his wife Saeko Killy, a Japanese musician, producer, and performer, on AI/art/music projects. Confidence=high.",
"12": "User is deeply involved in the open-source AI developer community and tracks AI advancements. Confidence=high.",
"13": "Pollinations.AI has a rapidly growing user base, reaching over 4 million monthly active users and processing 100 million API requests per month, with a 30% monthly growth rate. Confidence=high.",
"14": "User is considering monetization strategies including pay-per-use plans, subscriptions, and ad-supported models where generated AI content integrates ads. Confidence=high.",
"15": "User collaborates with Elliot Fouchy and Kalam Ali on Pollinations.AI projects. Confidence=high.",
"16": "User demonstrates experience in community-building, social engagement tracking, and youth-oriented creator ecosystems. Confidence=high."
},
"helpful_user_insights": {
"1": "Thomas Haferlach is a German entrepreneur and AI technology expert, founder and leader of Pollinations.AI.",
"2": "Strong technical background with experience in cloud infrastructure, AI deployment, and API development.",
"3": "Blends AI-generated content with creative projects; target audience includes digital artists, developers, musicians.",
"4": "Currently securing funding for Pollinations.AI, exploring accelerator options and financial models.",
"5": "Based in Berlin, Germany; has experience living in São Paulo, Brazil.",
"6": "Collaborates closely with wife Saeko Killy, Japanese musician/producer.",
"7": "Strong interest in biomimicry, organic systems, and decentralized platform models.",
"8": "Interest in electronic music, psychedelia, and underground music scenes.",
"9": "Pollinations.AI has 4M+ monthly active users, 100M+ API requests per month, 30% monthly growth.",
"10": "Explores monetization models including ad embedding, revenue sharing, and subscription models.",
"11": "Close collaboration network includes Elliot Fouchy and Kalam Ali.",
"12": "Deeply involved in open-source AI developer community and tracks latest AI model developments."
},
"user_interaction_metadata": {
"1": "User is currently on a ChatGPT Plus plan.",
"2": "User is using Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36.",
"3": "User's average message length is 13485.9 characters.",
"4": "User's average conversation depth is 4.9.",
"5": "User uses dark mode.",
"6": "User is active 26 days in the last 30 days.",
"7": "User's local hour is 14.",
"8": "User account is 141 weeks old.",
"9": "User often uses ChatGPT on desktop browser.",
"10": "47% of conversations were o3, 16% gpt-4o, 29% gpt4t_1_v4_mm_0116, etc.",
"11": "Device screen dimensions: 878x1352, pixel ratio: 2.0, page dimensions: 704x1352.",
"12": "Recent topics include API development, startup financing, AI monetization, creative AI applications, legal compliance, and community building."
}
}
Back in the day when everyone used to watch broadcast TV, and stations synchronised their add breaks, water consumption would spike with every add break.
The UK has a unique problem with demand spikes for electricity during commercial breaks, due to the British penchant for using high-power electric kettles to make tea. In the worst case, demand could rise and fall by gigawatts within a matter of minutes.
Google already invests a tremendous amount of resources into identifying and preventing fraudulent ad impressions -- I don't see that changing much until AI is so cheap that it makes sense to run a full agent for pennies per hour. Sadly.
Not talking about fraud per se - in the sense of trying to drive revenue for a particular video channel - just that if you wanted to train AI on youtube videos you are in effect getting the advertisers to pay for the serving of them.
Perhaps the difference here is the behaviour would be much more human and thus harder to detect using current fraud detection?
They only get the preprocesses stuff that is sent to their api. But if you want to do complex coding tasks, you need the whole user interaction with the project and not just bits and pieces.