Hacker News | BoxFour's comments

Social media has proven to be quite an effective tool for mobilizing protests, among other things. I get how the short-sighted might see banning it as a tactical move to "cripple logistics."

But the reason I call it short-sighted is exactly what you said: removing those earlier pressure-release valves doesn't solve the underlying issue at all; it just increases the risk of a more volatile outcome.


> Social media has proven to be quite an effective tool for mobilizing protests

Gatherings, yes. Effective protest, I’m less convinced.

Effective protests “have clear strategic goals, use protest to broaden coalitions, seek to enlist more powerful individuals in their cause, and connect expressions of discontent to broader political and electoral mobilization” [1].

Social media helps enlist the elite. But it absolutely trashes clarity of goals and coalition broadening, often degrading into no-true-Scotsman contests. If a protest is well planned, social media can help it organize. But if a movement is still developing, social media will as often keep it in a leaderless, undisciplined, and thus ineffective state.

[1] https://www.brookings.edu/articles/the-power-of-protest-in-t...


Equities markets are largely driven by institutional investors, save for some notable exceptions ("meme stocks").

Unless the theory is that institutional investors are doing the same, it's not that surprising.


> I still think complaining about "hallucination" is a pretty big "tell".

The conversation around LLMs is so polarized. Either they’re dismissed as entirely useless, or they’re framed as an imminent replacement for software developers altogether.

Hallucinations are worth talking about! Just yesterday, for example, Claude 4 Sonnet confidently told me Godbolt was wrong about how clang would compile something (it wasn't). That doesn't mean I didn't benefit heavily from the session, just that it's not a replacement for your own critical thinking.

Like any transformative tool, LLMs can offer a major productivity boost, but only if the user is realistic about the output. Hallucinations are real and a reason to be skeptical of what you get back; they don't make LLMs useless.

To be clear, I’m not suggesting you specifically are blind to this fact. But sometimes it’s warranted to complain about hallucinations!


That's not what people mean when they bring up "hallucinations". What the author apparently meant was that they had an agent generating Terraform for them, and that Terraform was broken. That's not surprising to me! I'm sure LLMs are helpful for writing Terraform, but I wouldn't expect agents to be at the point of reliably handing off Terraform that actually does anything, because I can't imagine an agent being given permission to iterate on Terraform. Now have an agent write Java for you. That problem goes away: you aren't going to be handed code with API calls that literally don't exist (this is what people mean by "hallucination"), because that code wouldn't pass a compile or lint pass.


Are we using the same LLMs? I absolutely see cases of "hallucination" behavior when I'm invoking an LLM (usually Sonnet 4) in a loop of "1. generate code, 2. run linter, 3. run tests, 4. goto 1 if 2 or 3 failed".
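
To make that loop concrete, here's a minimal Python sketch. The LLM call, linter, and test runner are passed in as hypothetical callables (generate, lint, test); none of this is any particular scaffold's real API:

    from typing import Callable

    # Minimal sketch of the generate/lint/test loop. Each callable is a
    # hypothetical stand-in: generate(task, feedback) -> code,
    # lint(code) -> (ok, message), test(code) -> (ok, message).
    def agent_loop(
        generate: Callable[[str, str], str],
        lint: Callable[[str], tuple[bool, str]],
        test: Callable[[str], tuple[bool, str]],
        task: str,
        max_iters: int = 5,
    ) -> str | None:
        feedback = ""
        for _ in range(max_iters):
            code = generate(task, feedback)  # 1. generate code
            lint_ok, lint_msg = lint(code)   # 2. run linter
            test_ok, test_msg = test(code)   # 3. run tests
            if lint_ok and test_ok:
                return code
            # 4. goto 1, feeding the failures back to the model
            feedback = f"{lint_msg}\n{test_msg}"
        return None  # loop exhausted: the "failure spiral" case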

Usually, such a loop just works. When it doesn't, it's often because the LLM decided it would be convenient if some method existed, and therefore that method exists. It then tries to call that method, fails the linting step, decides the linter is wrong, and changes the linter configuration (or fails the test step and updates the tests). If, in this loop, I automatically revert all test and linter-config changes before running tests, the LLM will receive the failing test output yet report that the tests passed, ending the loop if it has control (or getting caught in a failure spiral if the scaffold automatically continues until tests pass).

It's not an extremely common failure mode; it generally only happens when you give the LLM a problem that is both automatically verifiable and too hard for that model. But it does happen, and I do think "hallucination" is an adequate term for the phenomenon (though perhaps "confabulation" would be better).

Aside:

> I can't imagine an agent being given permission to iterate Terraform

LocalStack is great, and I have absolutely given an LLM free rein over terraform config pointed at LocalStack. It has generally worked fine and written the same tf I would have written, but much faster.


With terraform, using a property or a resource that doesn't exist is effectively the same as an API call that does not exist. It's almost exactly the same, really: under the hood, terraform will try to make a gcloud/aws API call with your param, and it will fail because the param doesn't exist. You are making a distinction without a difference. Just because it can be caught at runtime doesn't make it insignificant.
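
As a toy Python illustration of that point (json.dumps_pretty is deliberately invented; the real json module has no such function): the file below byte-compiles cleanly and a plain lint pass won't flag it, but the hallucinated call only blows up when the code actually runs.

    import json

    def serialize(data: dict) -> str:
        # Hallucinated API: the real json module has no dumps_pretty().
        # This file still byte-compiles and passes a plain lint; only a
        # strict type checker or an actual run catches the mistake.
        return json.dumps_pretty(data, indent=2)

    try:
        serialize({"ok": True})
    except AttributeError as exc:
        # The failure surfaces only at runtime, analogous to a bogus
        # terraform param failing once it hits the real cloud API.
        print(f"runtime failure: {exc}")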

Anyway, I still see hallucinations in all languages, even javascript, attempting to use libraries or APIs that do not exist. Could you elaborate on how you have solved this problem?


> Anyway, I still see hallucinations in all languages, even javascript, attempting to use libraries or APIs that do not exist. Could you elaborate on how you have solved this problem?

Gemini CLI (it's free and I'm cheap) will run the build process after making changes. If an error occurs, it will interpret it and fix it. That takes care of cases where it uses functions that don't exist.

It can get stuck in a loop, but in general it'll get somewhere.


Yeah, again, zero trouble believing that agents don't reliably produce sane Terraform.


As if a compiler or linter is the sole arbiter of correctness.


Nobody said anything about "correctness". Hallucinations aren't bugs. Everybody writes bugs. People writing code don't hallucinate.

It's a pretty obvious rhetorical tactic: everybody associates "hallucination" with something distinctively weird and bad that LLMs do. Fair enough! But then they smuggle more meaning into the word, so that any time an LLM produces anything imperfect, it has "hallucinated". No. "Hallucination" means that an LLM has produced code that calls into nonexistent APIs. Compilers can and do in fact foreclose on that problem.


Speaking of rhetorical tactics, that's an awfully narrow definition of LLM hallucination designed to evade the argument that they hallucinate.

If, according to you, LLMs are so good at avoiding hallucinations these days, then maybe we should ask an LLM what hallucinations are. Claude, "in the context of generative AI, what is a hallucination?"

Claude responds with a much broader definition of the term than you have imagined -- one that matches my experiences with the term. (It also seemingly matches many other people's experiences; even you admit that "everybody" associates hallucination with imperfection or inaccuracy.)

Claude's full response:

"In generative AI, a hallucination refers to when an AI model generates information that appears plausible and confident but is actually incorrect, fabricated, or not grounded in its training data or the provided context.

"There are several types of hallucinations:

"Factual hallucinations - The model states false information as if it were true, such as claiming a historical event happened on the wrong date or attributing a quote to the wrong person.

"Source hallucinations - The model cites non-existent sources, papers, or references that sound legitimate but don't actually exist.

"Contextual hallucinations - The model generates content that contradicts or ignores information provided in the conversation or prompt.

"Logical hallucinations - The model makes reasoning errors or draws conclusions that don't follow from the premises.

"Hallucinations occur because language models are trained to predict the most likely next words based on patterns in their training data, rather than to verify factual accuracy. They can generate very convincing-sounding text even when "filling in gaps" with invented information.

"This is why it's important to verify information from AI systems, especially for factual claims, citations, or when accuracy is critical. Many AI systems now include warnings about this limitation and encourage users to double-check important information from authoritative sources."


What is this supposed to convince me of? The problem with hallucinations is (was?) that developers were getting handed code that couldn't possibly have worked, because the LLM unknowingly invented entire libraries to call into that don't exist. That doesn't happen with agents and languages with any kind of type checking. You can't compile a Rust program that does this, and agents compile Rust code.

Elsewhere in this thread, the author of the post says that when they said "hallucinate", they meant that, watching closely, they could see their async agent getting caught in loops trying to call nonexistent APIs, failing, and trying again. And? The point isn't that foundation models themselves don't hallucinate; it's that agent systems don't hand off code with hallucinations in it, because they compile the code before they hand it off.


If I ask an LLM to write me a skip list and it instead writes me a linked list and confidently but erroneously claims it's a skip list, then the LLM hallucinated. It doesn't matter that the code compiled successfully.


Get a frontier model to write an slist when you asked for a skip list. I'll wait.


> The others are very frenetic

I think the pace is because a lot of the episodes revolve around play and games, and any sort of play with children does tend to be a bit frenetic. There are a good number of episodes that aren't like that, including the two you mentioned, but it would be strange for a show about play and imagination not to be somewhat frenetic.

> It also often shows a lot of bad behaviour that kids can interpret as funny (the cousin running away with the phone after being told, the old lady buying the scooter).

There's bad behavior that's played as funny, sure, but almost all of those episodes demonstrate its consequences, even if in a humorous fashion: Muffin is constantly facing consequences for her actions, for example. I think that's an OK trade-off.


This strikes me as a somewhat unfair characterization of many of these communities. In my experience, a much more common issue is that the people who do have answers end up ignoring the group, and it becomes pointless. It rarely becomes a source of career hindrance or long-lasting judgement; it just ends up being useless because there's not a lot of incentive for the expert side of the equation.

People who are likely to judge others for dumb questions are rarely involved in those groups in the first place, for all the obvious reasons.

The more realistic outcome isn't that your boss ends up a drunk, puking slob (and for what it's worth, most of these groups don't include leadership anyway, so I'm not sure why anyone's boss would be involved), but that an intern floats a terrible idea ("I'm thinking of taking these 10 shots of 151"), nobody responds, they take silence as approval, and they end up causing a mess and then being judged for it.

A quick gut check from them with a healthy group might get a few eye rolls and a "here's why that's a bad idea", but not any lasting judgement unless they completely ignored the advice.

The only case I can think of where that might happen is if they already did something which has policy or legal implications ("hey i accidentally dumped the whole user base including PII to my phone"), in which case - good? There should be a review mechanism, including consequences if they ignored a bunch of roadblocks.


> It rarely becomes a source of career hindrance or long-lasting judgement; it just ends up being useless because there's not a lot of incentive for the expert side of the equation.

Yeah, the incentive structure here is too misaligned for this to work effectively outside of a very small, tight-knit team. (In which case... why the formality in the first place?)

For the "juniors": Why waste time digging through documentation, searching, or thinking--I can just post and get an answer with less effort.

For the "seniors": I'm already busy. Why waste time answering these same questions over and over when there's no personal benefit to doing so?

Sure, there are some juniors who will use it only as a last resort and some seniors who will try their best to be helpful because they're just helpful people... but I usually see those juniors drowned out by the ones described above, and those experts eventually turning into the ones described above.

I think we _could_ come up with something that better aligns incentives, though. Spitballing:

Juniors can ask a question. Once a senior answers, the junior then takes responsibility for making sure that question doesn't need to be answered there again: improving the documentation based on that answer, whether that's creating new documentation, adding links, or improving keywords to help with search. That change then gets posted for a quick edit/approval by the senior, mainly to ensure accuracy.

Now we're looking at something more like:

For the "juniors": If I ask a question, I will get an answer but it will create additional work on my end. If I ask something already answered in the documentation that I could have easily found, I basically have to publicly out myself as not having looked when I can't propose an improvement to the documentation. And that, fairly, is going to involve some judgement.

For the "seniors": Once I answer a question, someone is going to take responsibility for getting this from my head into documentation so I never need to answer this again.

This has the added benefit of shifting some documentation time off of the higher-paid, generally more productive employees onto the lower-paid, less productive ones, and requiring the latter to build out some understanding in order to put the answer into words. It may also produce better documentation: what a senior writes is more likely to assume knowledge that a junior writer would think to explain, because _they_ didn't know it yet. And it means that when you search the Slack (or other) channel, any question you find should end with a link to the documentation where it's answered, which helps you discover adjacent documentation, all of which should be the most up-to-date and canonical answer we have.


I’m on board with the overall point, though I’d actually flip the logic in this section:

> Once a senior answers, the junior then takes responsibility for making sure that question doesn't need to be answered there again.

That might make sense for simple questions. But for anything more complex, especially when the issue stems from something you have control over, having senior folks take ownership might make more sense. If they can tie the fix to visible impact, there's a strong incentive for them to actually solve the root problem. Otherwise, there's always the risk that experienced team members simply ignore the question 100% of the time (which also solves the "I've already answered this question" problem).

One way seniors might approach these types of groups is by treating them as a source of ideas. Repeated questions like “how do I use X?” might indicate that X needs a redesign or better onboarding. An experienced corporate climber could treat those questions as justification for "X 2.0 which is way easier to onboard to" and get backing to work on it.

Anyone who's spent time at a large tech company has likely seen this dynamic play out, because it's a common pathway to promotion. It's definitely taken to problematic extremes at times, but a slightly healthier version of that playbook still beats the alternative of relying on the arcane knowledge of a select few gatekeepers of information.


That point might hold more weight if we were talking about someone decades younger, but he's 80 and hasn't exactly lived a healthful life. As far as I know, no one's managed to beat Father Time yet.

Transitions of power in movements built around a cult of personality rarely keep the same momentum. There are a few exceptions, but in most of those the clear successor had their own charisma. That doesn’t appear to be the case here.


Even if you manage to sidestep the issues with payment processors mentioned elsewhere, you don’t end up as a “popular platform that just happens to take a principled stance and also hosts some controversial material.”

Instead, you become the hub for that kind of material, and that reputation drives away more mainstream creators who won't want their work associated with it. See also: Kick, Parler, etc.

Rather than building a principled broad competitor to something like Steam, you end up cornering yourself into a narrow, highly specific market segment.


One possibility for attracting developers of non-banned games is lower fees than Steam's 30% or Epic's 12%, but Itch.io already does that (you can choose the split from 0 to 100%).


>Rather than building a principled broad competitor to something like Steam, you end up cornering yourself into a narrow, highly specific market segment.

Yes, that's the point. Not everyone cares about financial censorship, but the few that do will be your customers.


When you start talking about a business that serves "the few", you've already removed the incentive for most entrepreneurs (unless those "few" are the extremely wealthy and you can charge them exorbitantly).


I've watched Hikaru on Kick and the only offensive thing about his stream is how he repeats himself. I don't really like how he says the same thing over and over. Chat, it's kinda starting to bother me how he repeats himself. Yeah I'm starting to think he repeats himself a bit too much for my taste.


After the third sentence, I was like, what's wrong with this guy? After the fourth, I was like, Oh. Lol.


I guess from the downvotes I'm getting that people don't have the full context here and won't seek it out on their own, something I should have foreseen.

I'm speaking of Hikaru Nakamura, who is one of the best chess players in the world. He is also a streamer on Kick, and he actually talks the way I demonstrated. It's not an exaggeration; he repeats the same thought ~5 times on the regular.

He is the only Kick streamer I know, so that's what I think of when I hear Kick.


If I were a VC or PE firm, this is exactly where I'd be putting a lot of money in the next year or two. Right now there's a lot of fear given the stance of the current administration, which makes it pretty ripe for smart money willing to play the long game ("greedy when others are fearful," as Buffett famously put it).

The technology keeps improving, clean energy is increasingly shaping up to be a new arms race with China, and politics these days tend to swing back and forth wildly. By 2028+, it's very plausible things will do a 180 and there'll be plenty of government attention given to clean energy. Even the current administration could change its tune if it's positioned as "beating China" (or even for no reason at all, because who knows with them).

Spending a couple years to prop it up and become a well-established player by then could be a huge advantage.


Do you mean investments in solar panel manufacturing, or something else? From what I understand, solar panels are somewhat commoditized, and China has massive subsidies for its manufacturers. I wouldn't want to get into that game. If you mean battery R&D plus manufacturing, I think that could be promising.


I wonder how much is being invested into reducing the costs of inverters and MPPTs and such. ("Balance of system" seems to be the term?)

I'm looking into a DIY install, and it looks like the microinverters are going to be basically as expensive as the panels themselves. A quick Google suggests it's similar for utility-scale installs: the BoS costs are less than, but still comparable to, the costs of the actual panels.

On one hand, I get it: panels are very simple, robust devices, while inverters need to interface with the grid and usually have network connectivity and so on. On the other hand, there's a lot less material in an inverter, and they're still relatively simple electronics, which we're pretty good at mass-producing cheaply. You'd think there's a lot of room there to get the cost down.


On solar panel investments: at the time TSMC got into the chip game, I think most people might have said something very similar about TSMC. Chips are commoditized, the existing entrants are highly capitalized, and why, TSMC, do you think you can outdo the likes of 1987 Intel, TI, Motorola, NEC, et al.?

Perovskites, to name one tech, will probably be a generational shift in solar panel technology; the US would be stupid to miss it if it wants to be a future world energy player amid the slow, inevitable decline of fossil fuels.

For whoever has the stomach to fund it, there's maybe another order of magnitude of cost-performance available in solar, and perhaps two or three orders of magnitude available in batteries.


Brookfield is investing in it:

https://www.youtube.com/watch?v=vTd02-0BiOM


Even in a place as dense as Manhattan (where I live), it's very common for your commute to still take 30 minutes between walking to/from the subway and waiting for the train. Without sounding too harsh, a 20-minute walk really shouldn't be a major hurdle for the vast majority of working adults.

Whenever I visit other parts of the US, I'm struck by how resistant people are to walking even half a mile in the best of circumstances: wide, well-lit sidewalks, etc. It's remarkable how often we default to driving for trips that clearly don't require it, and it's like I'm speaking heresy for even suggesting it when visiting somewhere that has pedestrian paths.

At the heart of the public transit debate, it seems, is a simple reality: much of the country simply doesn't want to move at all, even short distances. Suggesting someone walk half a mile sometimes feels like suggesting they run a marathon. All the pedestrian infrastructure in the world won't change that.


> a 20-minute walk really shouldn’t be a major hurdle for the vast majority of working adults

Most US adults live in suburban or non-dense cities. In the 20 minutes it takes to walk to and from the bus stop they can complete most of their trips in a car. Driving is the competition. Walking per se isn't the problem. The time it wastes is the hurdle.

> Even in... manhattan... it’s very common to spend 30 minutes... between walking to/from the

And that's an acceptable tradeoff in Manhattan because driving the same distance would take twice as long and be 4 times as expensive.


> I’m very torn at the moment if he was an incredible coach or just rode the wave or Brady talent.

Honestly, it’s hard to imagine they’d have been anywhere near that successful if the answer wasn't just "both."

You see plenty of examples of great coaches stuck with lousy rosters (Parcells with the Cowboys), and also great players on poorly run teams (Patricia-era Lions). Usually when a team only has one or the other, they continually flame out early in the playoffs.

> these next two seasons at UNC for ‘ol Bill will be really telling.

I wouldn't read too much into that. He's 73, the game's evolved a lot, and coaching college is a whole different thing from the NFL. It's incredibly rare for someone to excel at both; guys like Pete Carroll are the exception that proves the rule.


Exactly. It's such a stupid debate, given that Belichick coached and molded Brady into what he became.

Everyone has always said Belichick is basically an encyclopedia of football knowledge.


That’s my whole point. Brady went on to win a ring in Tampa. Bill did… what?

I don't give Belichick the credit for teaching Brady; you can't teach that. It's not stupid at all if you're a fan of the sport.

