Thanks for the input! I'm checking on Claude Code Max now - from what I'm seeing, even the $200/month plan has weekly rate limits (240-480 hours of Sonnet 4 and 24-40 hours of Opus 4 per week), so not quite unlimited tokens either, though definitely more predictable billing.
$638 over 6 weeks won't make me broke, but here's my main issue: the value-to-token ratio feels off.
What bugs me most is that many of those 340M tokens feel wasteful. The LLM will burn 50k tokens exploring dead ends before landing on a solution that could have been expressed in 5k. The productivity gain is real, but it feels like I'm paying 10x more than what would be "fair" for the actual value delivered.
Maybe this is just the current state of AI coding - the models need that exploration space to get to the answer. Or maybe I need to get better at constraining the context and being more surgical with my prompts.
For me as a founder, it's less "can I afford this" and more "does this pricing model make sense long-term?" If AI coding becomes a $5-6k/year baseline expense per developer, that changes a lot of unit economics, especially for early-stage companies.
Are you finding Claude Code Max more token-efficient for similar tasks, or is it just easier to stomach because the billing is flat?
I think when you're testing out ideas, you can't also be optimizing for efficiency - it doesn't make sense unless efficiency is the problem you're trying to solve. So I get your point, but I don't think anyone is wasting tokens: the LLM explores different solutions and arrives at ones that work. You seem to not want to pay for the tokens spent on bad solutions, but those were useful for finding the actual solution. I'd also note that at my work we pay for plenty of software licenses costing several times $5-6k/year, and all of it is still far cheaper than the salaries of the developers it supports. Good developer tools are always worth it, imo.
But that doesn't make sense to me. Why would they keep the cache persistent in the VRAM of the GPU nodes, which is needed for the model weights? Shouldn't they be able to swap your prompt's KV cache in and out when you actually use it?
Your intuition is correct and the sibling comments are wrong. Modern LLM inference servers support hierarchical caches (where data moves to slower storage tiers), often with pluggable backends. A popular open-source backend for the "slow" tier is Mooncake: https://github.com/kvcache-ai/Mooncake
OK, that's pretty fascinating - it turns out Mooncake includes a trick that can populate GPU VRAM directly from NVMe SSD, without the data having to pass through the host's CPU and RAM first:
> Transfer Engine also leverages the NVMeof protocol to support direct data transfer from files on NVMe to DRAM/VRAM via PCIe, without going through the CPU and achieving zero-copy.
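To make the tiering idea above concrete, here's a toy sketch (not Mooncake's actual API - all names are illustrative) of a two-tier KV-cache manager: hot entries live in a bounded "VRAM" tier, least-recently-used entries spill to a slower tier, and a cold hit swaps the entry back in instead of recomputing the prefix.

```python
from collections import OrderedDict

class TieredKVCache:
    """Toy two-tier KV cache: a bounded 'VRAM' tier backed by a 'slow' tier.

    Purely illustrative -- real servers (vLLM, Mooncake, etc.) manage paged
    GPU memory blocks and do DMA transfers, not Python dicts.
    """

    def __init__(self, vram_capacity: int):
        self.vram_capacity = vram_capacity
        self.vram = OrderedDict()  # hot tier, kept in LRU order
        self.slow = {}             # cold tier (stand-in for DRAM/NVMe)

    def put(self, prompt_id: str, kv_blocks: bytes) -> None:
        self.vram[prompt_id] = kv_blocks
        self.vram.move_to_end(prompt_id)  # mark as most recently used
        self._evict_if_needed()

    def get(self, prompt_id: str):
        if prompt_id in self.vram:        # hot hit
            self.vram.move_to_end(prompt_id)
            return self.vram[prompt_id]
        if prompt_id in self.slow:        # cold hit: swap back into VRAM
            self.put(prompt_id, self.slow.pop(prompt_id))
            return self.vram[prompt_id]
        return None                       # miss: prefix must be recomputed

    def _evict_if_needed(self) -> None:
        # Spill least-recently-used entries to the slow tier instead of
        # discarding them, so a later request can restore them cheaply.
        while len(self.vram) > self.vram_capacity:
            victim, blocks = self.vram.popitem(last=False)
            self.slow[victim] = blocks
```

The point of the sketch is the economics: eviction to a slower tier turns "your cache expired, pay full price to recompute" into "pay a swap-in cost", which is why providers can keep caches warm without pinning every prompt's KV blocks in VRAM.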
They fit different niches, IME: Node-RED is designed for IoT workloads, so it's a great fit for high-volume messaging; n8n, on the other hand, is more workflow-automation focused - like Zapier - with higher-level abstractions and less emphasis on performance efficiency.
Can they both fit some of the same use cases? Definitely.
The main reason is a fear - which as far as I know is unsubstantiated - that China has backdoors in tech sold in the West. But with so many hackers and national security agencies disassembling and analyzing these things, surely they'd have found one by now?
So I wouldn't be surprised if it's really about market protection. Which doesn't make much sense to me either, because is there a major US drone manufacturer that can't compete with DJI right now?
> with so many hackers and national security agencies disassembling and analyzing these things, surely they'd have found it by now
This is a common misconception. With OTA updates, a backdoor can be introduced at any time in a future software version - for example, right before an attack.
I'm excited to introduce codeplot, a tool I've been working on that's designed to revolutionize the way we interact with data visualizations in Python.
What is codeplot?
codeplot is an interactive spatial canvas that allows for dynamic data exploration. It's built to move beyond static images and fixed layouts, giving your data the interactive, engaging platform it deserves. With codeplot, you can easily integrate live data visualizations directly from your Python code or REPL into a flexible, interactive canvas hosted at codeplot.co.
Key Features:
- Dynamic Visualization: Say goodbye to static charts. Visualize your data in real-time on an interactive canvas.
- Easy Integration: Seamlessly plot from Python with just a few lines of code.
- Varied Visualizations: Support for a wide range of data representations, from basic charts to complex widgets.
- Flexible Layouts: Customize your data exploration space with draggable and resizable plots.
- Open Community: Whether you're a data scientist or a hobbyist, codeplot is designed for anyone passionate about data.

Getting Started is Simple:
Install codeplot with pip, connect to a room, and start plotting right away. We even support usage in Jupyter Notebooks for an integrated development experience.
Docker Support:
For those who prefer self-hosting, codeplot is Docker-ready, allowing you to run your own server and client locally with ease.
Join Our Community:
We're building a community of data enthusiasts and professionals on Discord. It's a place to share insights, ask questions, and collaborate on data visualization projects.
I'd love to get your feedback, suggestions, and hear about the visualizations you create with codeplot. Let's make data exploration more interactive and engaging together!