Also, shutting down federal agencies without congressional approval and getting access to their data:
"Musk on the call also appeared to claim credit for the shutdown of the U.S. Agency for International Development ... though the executive branch’s legal authority to do so without congressional action is highly in doubt, as the agency’s existence is established in law. Trump on Monday said he didn’t need an act of Congress to shut down USAID.
USAID’s homepage has been shut down for days, as is its X account. Dozens of career staff at the agency have been put on leave, and hundreds have been shut out of agency computer systems. Musk aides have gained access to classified USAID information over the objection of agency security personnel, who were subsequently placed on leave, The Associated Press reported."
> The Trump administration took a variety of steps Monday... The most dramatic was naming Secretary of State Marco Rubio as acting administrator of USAID... The State Department said in its statement announcing Rubio’s acting role that he has also informed Congress that “a review of USAID’s foreign assistance activities is underway with an eye towards potential reorganization.” ... Rubio explained he was entrusting the duties of deputy administrator to Peter Marocco, the head of the State Department’s office of foreign assistance... Marocco is also leading the review into USAID activities.
> Trump and his colleagues are trying to reduce spending by cutting payments to things deemed unnecessary. Some feel the process in which they do this is questionable and opaque.
That's not quite accurate. I would say: "some feel the process in which they do this is unconstitutional as the executive branch does not have the authority to dissolve programs." And furthermore: "some feel entrusting these vital government functions to an un-vetted non-employee with no oversight will lead to abuse and conflicts of interest."
Can President Trump Dissolve USAID by Executive Order?
According to Just Security, Trump can drastically curtail USAID through executive actions alone; however, he “may not unilaterally override” a statute by executive order. USAID was first created by executive order by President John F. Kennedy in 1961 and was later established by Congress as an independent agency by statute in 1998. Since an act of Congress established it as an independent agency, an act of Congress would be necessary to dissolve it.
- Forbes
Counterexample: I've been able to complete more side projects in the last month leveraging LLMs than I ever have in my life. One of them I believe has potential as a viable product, and another involved complicated Rust `no_std` and linker setup for compiling Rust code onto bare-metal RISC-V from scratch.
I think the key to being successful here is to realize that you're still at the wheel as an engineer. The LLM is there to rapidly synthesize the universe of information.
You still need to 1) have solid fundamentals so you have an intuition to check that synthesis against, and 2) be experienced enough to translate that synthesis into actionable outcomes.
If you're lacking in either, you're subject to the same whims of copypasta that have always existed.
I’ve found LLMs to basically be a more fluent but also more lossy way of interfacing with Stack Overflow and tutorials.
If a topic is well represented in those places, then you will get your answer more quickly, and it can be shaped to your use case to some extent.
If the topic is not well represented there, then you will get circular nonsense.
You can say “obviously, that’s the training data”, and that’s true, and I do find it obvious personally, but the reaction to LLMs as some kind of second coming does not align with this reality.
Is it possible that you’re both using LLMs the same way you’d use SO, and that’s the reason you see such similarities? The reason I ask is because it doesn’t match my experience. It feels more like I’m able to Matrix-upload docs into my brain like Trinity learning to fly a helicopter.
I am using it like stack overflow in the sense that I’m solving a problem and I’m using it to answer questions when I’m in an unfamiliar or non-obvious place in the problem space.
If I have a question about a first order language or framework feature or pattern, it works great. If I have a question about a second order problem, like an interaction between language or framework features, or a logical inconsistency in feature behavior, then it usually has no idea what’s going on, unless it turns out to be a really common problem such as something that would come up when working through a tutorial.
For code completion, I’ve just turned it off. It saves time on boilerplate typing for sure, but the actual content pieces are so consistently wrong that on balance I find it distracting.
Maybe I have a weird programming style that doesn’t mesh well with the broader code training corpus, not sure. Or maybe a lot of people spend more time in the part of problem-space that intersects with tutorial-space? I am not very junior these days.
That being said I definitely do use LLMs to engage with tutorial type content. For that it is useful. And outside of software it is quite a bit better for interfacing with Wikipedia type content. Except for the part where it lies to your face. But it will get better! Extrapolating never hurt anyone.
Same. I used it to bootstrap writing a React Native app from pretty low familiarity with React.
It's pretty good at writing screens in broad strokes. You will have to fill in some details.
The exact details of correctly threading data through, prop drilling vs. alternatives, the rules around wrapping screens to use them in React Navigation? It's terrible at those.
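To make "prop drilling vs. alternatives" concrete, here's a rough sketch in plain React + TypeScript of the two patterns I mean (the component and prop names are made up for illustration, not from my actual app):

```tsx
import React, { createContext, useContext } from "react";

type User = { name: string };

// Prop drilling: every intermediate component has to accept and forward
// the prop, even if it never uses the value itself.
function Header({ user }: { user: User }) {
  return <Greeting user={user} />;
}

function Greeting({ user }: { user: User }) {
  return <span>Hello, {user.name}</span>;
}

// One alternative: a context lets deeply nested components read the value
// directly, without every layer threading it through.
const UserContext = createContext<User>({ name: "anonymous" });

function GreetingFromContext() {
  const user = useContext(UserContext);
  return <span>Hello, {user.name}</span>;
}

function App({ user }: { user: User }) {
  return (
    <UserContext.Provider value={user}>
      <GreetingFromContext />
    </UserContext.Provider>
  );
}
```

The LLM can spit out either pattern fine in isolation; it's deciding which one a given screen should use, and wiring it consistently through React Navigation, where it falls over for me.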
> I’m able to Matrix-upload docs into my brain like Trinity learning to fly a helicopter.
So you're using it like Wikipedia? I find when learning something new (and non-coding-related), YouTube is infinitely better than an LLM. But then I prefer visual demonstration to a tutorial or verbal explanation.
> It feels more like I’m able to Matrix-upload docs into my brain like Trinity learning to fly a helicopter
Did you puzzle over this sentence specifically? Imagine you don't know jack about flying helicopters, and then Tank uploads the Helicopter Pilot Program (TM) directly to your brain; it would feel like magic.
Conversely, if you know a lot about helicopters, just not enough to fly a B-212, and the program includes instructions like "Press (Y) and Left Stick to hover", you'd know it's conflating real-world piloting with videogames.
It's the same with LLMs: you need to know a lot of the field to ask the right questions and to recognize/correct slop or confabulation; otherwise they seem much more powerful and smart than they really are.
I find that LLMs are almost comically bad at projects that have a hardware component, like Raspberry Pi, Pico, or Arduino.
I think it's because the libraries you use are often niche or exist in a few similar versions; the LLM really commonly hallucinated solutions and would continually insist that library X did have a given capability. In hardware projects you often hit a point where you can't do something or you need to modify a library, but the LLM tries to be "helpful" and makes up a solution.
Based on my own modestly successful forays into that world, I have to imagine one problem the LLMs have in that space is terrible training data. A good three quarters of the results for any search you run in that space will be straight-up out of date and won't work on your system. Then you've got all the tiny variations between dozens of chipsets, all the confidently wrong people on the internet telling you to do nonsensical things, and entire ecosystems that basically poofed out of existence three years ago but are still full of all kinds of juicy search terms and Google juice. If I can hardly paw through this stuff with years of experience and the physical hardware right in front of me to verify claims with, I don't know how an LLM is supposed to paw through all that and produce a valid answer to any question in that space without its own hardware to fiddle with directly.
I've lost count of the projects where my notes were the difference between hours of relearning The Way and instant success. Google doesn't work because some niche issue is blocking the path.
ESP32, Arduino, Home Assistant
And various media server things.
They’re also pretty bad at TypeScript generics. They’re quite good at explaining concepts (like mapped types), but when push comes to shove they generate all sorts of stuff that looks convincing but doesn’t pass the type checker.
And then you’ll paste in the error, and they’ll just say “ok I see the problem” and output the exact same broken code lol.
I’m guessing the problem is lack of training data. Most TS codebases are mostly just JS with a few types and zod schemas; all of the neat generic stuff happens in libraries or a few utilities.
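For a concrete flavor of the "neat generic stuff" I mean, here's a rough TypeScript sketch (the type names are made up for illustration). Models explain the first kind fine; it's the second kind, with key remapping and conditional types, where they start generating convincing-looking code that the type checker rejects:

```typescript
// A straightforward mapped type: make every property optional and nullable.
type PartialNullable<T> = {
  [K in keyof T]?: T[K] | null;
};

// A second-order one: keep only the keys whose values are assignable to V,
// using key remapping to drop the rest.
type PickByValue<T, V> = {
  [K in keyof T as T[K] extends V ? K : never]: T[K];
};

interface User {
  id: number;
  name: string;
  active: boolean;
}

type OptionalUser = PartialNullable<User>; // { id?: number | null; ... }
type UserStrings = PickByValue<User, string>; // { name: string }
```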
Actually, it's because many of the people writing tutorials and sharing answers about that stuff don't know what the hell they're doing or grasp the fundamentals of how those systems work, so most of the source material the LLMs are trained on is absolute garbage.
Public Arduino, RPi, Pico communities are basically peak cargo cult, with the blind leading the blind through things they don't understand. The noise is vastly louder than the signal.
There's basically a giant chasm between experienced or professional embedded developers, who mostly have no need to ever touch those things or visit their forums, and the confused hobbyists on those forums randomly slapping together code until something sorta works while trying to share their discoveries.
Presumably, those communities and their internal knowledge will mature eventually, but it's taking a long long time and it's still an absolute mess.
If you're genuinely interested in embedded development and IoT stuff, and are willing to put in the time to learn, put those platforms away and challenge yourself to at least learn how to work directly with production-track SoCs from Nordic or ESP or whatever. And buy some books or take some courses instead of relying on forums or LLMs. You'll find yourself rewarded for the effort.
> Presumably, those communities and their internal knowledge will mature eventually, but it's taking a long long time and it's still an absolute mess.
It won't, because the RPis are all undocumented, closed-source toys.
It would be an interesting experiment to see which chips an LLM is better at helping out with: the RPi, with its hallucinatory ecosystem, or something like the BeagleY-AI, which has thousands of pages of actual TI documentation for its chips.
It would be really nice if LLMs could cover for this and work around the fact that RPis keep getting used because they were dumped below cost to bootstrap a network effect.
> Presumably, those communities and their internal knowledge will mature eventually, but it's taking a long long time and it's still an absolute mess.
I'm not sure they will. There's a kind of evaporative cooling effect where once you get to a certain level of understanding you switch around your tools enough that there's not much point interacting with the community anymore.
I was just today trying to fix some errors in an old Linux kernel version 3.x.x .dts file for some old hardware, so that I could get a modern kernel to use it. ChatGPT seemed very helpful at first - and I was super impressed. I thought it was giving me great insight into why the old files were now producing errors … except the changes it proposed never actually fixed anything.
Eventually I read some actual documentation and realised it was just spouting very plausible-sounding nonsense - and confidently at that!
The same thing happened a year or so ago when I tried to get a much older ChatGPT to help me with USB protocol problems in some microcontroller code. It just hallucinated APIs and protocol features that didn’t actually exist. I really expected more by now - but I now suspect it’ll just never be good at niche tasks (and these two things are not particularly niche compared to some).
For the best of both worlds, make the LLM first 'read' the documentation, and then ask for help. It makes a huge difference in the quality and relevance of the answers you get.
I find LLMs to be decent unblockers. I only turn to them from time to time though, unless I'm in a playful mood and poking at various ideas. As a coder I also ask for snippets when I'm lazy. I tried growing a slightly larger solution a few times and it failed in dumb ways. It was clear it doesn't really comprehend the way we do; it's not aware it's moving in circles, and so on. All these things will probably see a lot of incremental improvements, and LLMs as a tool are definitely here to stay, but fundamentally they can't really think, at least not the way we do, and expecting that is also foolish.
> which involved complicated Rust `no_std` and linker setup for compiling Rust code onto bare-metal RISC-V from scratch.
That's complicated, but I wouldn't say the resulting software is complex. You gave an LLM a repetitive, translation-based job, and you got good results back. I can also believe that an LLM could write up a dopey SAAS in half the time it would take a human to do the same.
But having the right parameters only takes you so far. Once you click generate, you are trusting that the model has some familiarity with your problem and can guide you without needing assistance. Most people I've seen rely entirely on linting and runtime errors to debug AI code, not "solid fundamentals" that can fact-check a problem they needed ChatGPT to solve in the first place. And the "experience" required to iterate and deploy AI-generated code basically boils down to your copy-and-paste skills. I like my UNIX knowledge, but it's not a big enough gate to keep out ChatGPT Andy and his cohort of enthusiastic morons.
We're going to see thousands of AI-assisted success stories come out of this. But we already had those "pennies on the dollar" success stories from hiring underpaid workers out of India and Pakistan. AI will not solve the unsolved problems of our industry and in many ways it will exacerbate the preexisting issues.
If the summary goal of your existence is to be the most delirious waste of resources that humanity has yet known, sure. It's the hammer and nail of spoiled burnouts everywhere that need a credible ruse to help them out of the bottle.
Some of us are capable of wanting for things better than a coin-operated REST API. The kind of imagination used to put people on the moon, that now helps today's business leaders imagine more profitable ways to sell anime pornography on iPhone. (Don't worry, AI will disrupt that industry too.)
Now, you are paying a Taiwanese or American company to produce GPUs for you. This allows you to use open-source models like DeepSeek R1, significantly reducing your reliance on Indian tech labor.
Yep. I think a default state of skepticism is an absolute necessity when working with these tools.
I love LLMs. I agree with OP about them expanding my hobby capacity as well. But I am constantly saying (in effect) “you sure…?” and tend to have a pretty good BS meter.
I’m still working to get my partner to that stage. They’re a little too happy to accept an answer without pushback or skepticism.
I think being ‘eager to accept an answer’ is the default mode of most people anyway. These tools are likely enabling faster disinformation consumption for the unaware.
Yes, you essentially have an impossibly well-read junior engineer you can task with quick research questions like, "I'm trying to do x using lib y, can you figure that out for me?" This is incredibly productive because the answer typically contains all the pieces you need, just not always assembled right.
Getting the LLM to pull out the well-known names of concepts is, for me, the skill you can't get anywhere else. You can describe a way to complete a task, ask what it's called, and you'll be heading down arXiv links right away. Like, yes, the algorithm to find the substring of a haystack closest to a needle string in edit distance and length is called Needleman–Wunsch; of course, Claude, everyone knows that.
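(To unpack that last one: here's a minimal sketch in TypeScript of the idea, assuming simple unit costs, in which case the Needleman–Wunsch global alignment score collapses to plain edit distance. The function names are mine, not from any library.)

```typescript
// Needleman–Wunsch global alignment score with unit costs:
// match = 0, mismatch = 1, gap = 1 (equivalent to Levenshtein distance).
function alignmentCost(a: string, b: string): number {
  const rows = a.length + 1;
  const cols = b.length + 1;
  // dp[i][j] = cost of aligning the first i chars of a with the first j chars of b
  const dp: number[][] = Array.from({ length: rows }, () =>
    new Array<number>(cols).fill(0)
  );
  for (let i = 0; i < rows; i++) dp[i][0] = i; // all gaps in b
  for (let j = 0; j < cols; j++) dp[0][j] = j; // all gaps in a
  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const sub = a[i - 1] === b[j - 1] ? 0 : 1;
      dp[i][j] = Math.min(
        dp[i - 1][j - 1] + sub, // align a[i-1] with b[j-1]
        dp[i - 1][j] + 1,       // gap in b
        dp[i][j - 1] + 1        // gap in a
      );
    }
  }
  return dp[rows - 1][cols - 1];
}

// Brute force: score every needle-length window of the haystack, keep the best.
function closestWindow(needle: string, haystack: string): { window: string; cost: number } {
  let best = { window: haystack.slice(0, needle.length), cost: Infinity };
  for (let start = 0; start + needle.length <= haystack.length; start++) {
    const window = haystack.slice(start, start + needle.length);
    const cost = alignmentCost(needle, window);
    if (cost < best.cost) best = { window, cost };
  }
  return best;
}

console.log(closestWindow("kitten", "the sitting cat")); // finds "sittin" as the closest window
```

Sliding the scorer over every window like this is the brute-force version; the point is just that once you have the name, the rest is an afternoon of reading rather than an open-ended search.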
Junior devs can get plenty of value out of them too, if they have discipline in how they use them - as a learning tool, not as a replacement for thinking about projects.
Senior devs can get SO much more power from these things, because they can lean on many years of experience to help them evaluate if the tool is producing useful results - and to help them prompt it in the most effective way possible.
A junior engineer might not have the conceptual knowledge or vocabulary to say things like "write tests for this using pytest, include a fixture that starts the development server once before running all of the tests against it".
IMO experience provides better immunity to common hangups. Generated code tends to be directionally pretty good, but with lots of minor/esoteric failures. The experience to spot those fast and tidy them up makes all the difference. Copilot helps me move 10x faster with tedious Arduino stuff, but I can easily see that if I didn't have decent intuition around debugging and troubleshooting, there'd be almost zero traction, since it'd be hard to clear that last 10% hurdle needed to even run the thing.
I wouldn't assume that at all. Most of the senior devs I talk to on a regular basis think commercial* LLMs are ridiculous and the AI hype is nonsensical.
* I put commercial there as a qualifier because there's some thought that in the future, very specifically-trained smaller models (open source) on particular technologies and corpuses (opt-in) might yield useful results without many of the ethical minefields we are currently dealing with.
It depends. I think it's less about how senior they are and more about how good they are at writing requirements, and at knowing what directives should be explicitly stated and what can be safely inferred.
Basically if they are good at utilizing junior developers and interns or apprentices they probably will do well with an LLM assistant.
Ya. I think people that have better technical vocabulary and an understanding of what should be possible with code do better. That’s usually a senior engineer, but not always
It's the LLM paradox: seniors get more senior with them while juniors get more junior, creating a bimodal distribution in the future. Juniors will start depending on them too much to learn how to code properly, while seniors (some of whom may also exhibit the previous trait) will by and large be able to rapidly synthesize information from LLMs with their own understanding.
I had a couple of the most capable senior developers reach out to me to tell me how GitHub Copilot accelerated their productivity, which surprised me initially. So I think there's something to it.
I agree with his point about asking AI to “fix” problems though. It’s really nice that you don’t have to fully understand something to use it, but that becomes a problem if you lean on it too much
Exactly this. OP, credit where credit is due, appears to be someone who “hacks things together”, copy-pasting solutions blindly from the internet - with little intuition gained along the way.
IME, engineers who find LLMs useful have misunderstood their reason for existence outside of being salaried.
What is your main project? Do you LLM that?
I wager you're not a Rust expert and should maybe reconsider using Rust in your main project.
FWIW, asking an LLM whether you should use Rust is like asking it about the meaning of life. Important questions that need answers, but not right away! (A week or two, tops.)
If you need to synthesize the universe of information with an LLM... that is not the universe you want to live or play in.
Indeed, LLMs are useful as an intern; they are at the “cocky grad” stage of their careers. If you don’t understand the problem, can’t steer the solution, and, worse, have only a limited understanding of the code they produce, you are unlikely to be productive.
On the other hand if you understand what needs to be done, and how to direct the work the productivity boost can be massive.
Claude 3.5 Sonnet and o1 are awesome at code generation, even with relatively complex tasks, and they have long enough context and attention windows that the code they produce, even on relatively large projects, can be consistent.
I also found a useful method of using LLMs to “summarize” code in an instructive manner which can be used in future prompts. For example, summarizing a large base class that may be reused in multiple other classes can be more effective than having to overload a large part of your context window with a bunch of code.
Claude is like having my own college professor. I've learned more in the past month with Claude than I learned in the past year. I can ask questions repeatedly and get clarification as fine-grained as I need it. Granted, Claude has limits, but it's a game-changer.
> I think the key to being successful here is to realize that you're still at the wheel as an engineer. The LLM is there to rapidly synthesize the universe of information.
Bingo. OP is like someone who complains about the tools when they should be working on their talent. I have a LOT of hobbies (circuits, woodworking, surfing, playing live music, cycling, photography), and there will always be people who buy the best gear and complain that the gear sucks. (NOTE: I'm not implying Claude is "the best gear", but it's a big, big help.)
I think the only problem with LLMs is that synthesis of new knowledge is severely limited. They are great at explaining things others have explained, but suck hard at inventing new things. At least that's my experience with Claude: it's terrible as a "greenfield" dev.
I don't use Claude, so maybe there's a huge gap in reliability between it and ChatGPT 4o. But with that disclaimer out of the way, I'm always fairly confused when people report experiences like these—IME, LLMs fall over miserably at even very simple pure math questions. Grammatical breakdowns of sentences (for a major language like Japanese) are also very hit-or-miss. I could see an LLM taking the place of, like, an undergrad TA, but even then only for very well-trod material in its training data.
(Or maybe I've just had better experiences with professors, making my standard for this comparison abnormally high :-P )
EDIT: Also, I figure this sort of thing must be highly dependent on which field you're trying to learn. But that decreases the utility of LLMs a lot for me, because it means I have to have enough existing experience in whatever I'm trying to learn about so that I can first probe whether I'm in safe territory or not.
"Major" in the context of Japanese is rough; I can see a significant drop in quality when interacting with the same model in, say, Spanish vs. English.
For as rich a culture as the Japanese have, there are only about 1XX million speakers, and the size of the text corpus really matters here. The couple billion English speakers are also highly motivated to choose English over anything else, because a lingua franca has home-field advantage.
To use LLMs effectively you have to work with knowledge of their weaknesses. Math is a good example: you'll get better results from Wolfram Alpha even for the simple things, which is expected.
Broad reasoning and explanations tend to work better than overly specific topics, and the more common a language, the better the response.
If a topic has a billion tutorials online, an LLM has a really high chance of figuring it out on the first try.
Be smart with the context you provide, the more you actively constrain an LLM, the more likely it is to work with you
I have friends who just feed it class notes to generate questions and probe it for blind spots until they're satisfied. The improvements in their grades make it seem like a good approach, but they know that just feeding responses to the LLM isn't trustworthy, so they do that and then also check things themselves; the extra time is valuable by itself, if only to improve familiarity with the subject.
> LLMs fall over miserably at even very simple pure math questions
They are language models, not calculators or logic languages like Prolog or proof languages like Coq. If you go in with that understanding, it makes a lot more sense as to their capabilities. I would understand the parent poster to mean that they are able to ask and rapidly synthesize information from what the LLM tells them, as a first start rather than necessarily being 100% correct on everything.
I think a lot of these people who object to AI probably see the gross amounts of energy it is using, or the trillions of dollars going to fewer than half a dozen men (mostly American, mostly white).
But, once you've had AI help you solve some gnarly problems, it is hard not to be amazed.
And this is coming from a gal who thinks the idea of self-driving cars is the biggest waste of resources ever.
(EDIT: Upon rereading this, it feels unintentionally blunt. I'm not trying to argue, and I apologize if my tone is somewhat unfriendly—that's purely a reflection of the fact that I'm a bad writer!)
Sorry, maybe I should’ve been clearer in my response—I specifically disagree with the "college professor" comparison. That is to say, in the areas I’ve tried using them for, LLMs can’t even help me solve simple problems, let alone gnarly ones. Which is why hearing about experiences like yours leaves me genuinely confused.
I do get your point about people disagreeing with modern AI for "political" reasons, but I think it's inaccurate to lump everyone into that bucket. I, for one, am not trying to make any broader political statements or anything—I just genuinely can't see how LLMs are as practically useful as other people claim, outside of specific use cases.
> My college professor has certifications and has passed tests that weren't in their training data.
Granted, they are not (can't be) as rigorous as the tests your professor took, but new models are run through test suites before being released, too.
That being said, I saw my college professors making things up, too (mind you, they had all graduated from very good schools). One example I remember was an argument with a professor who insisted that there is a theoretical limit on the coefficient of friction, and that it is 1. That can potentially be categorised as a hallucination, as it was completely made up and didn't make sense. Maybe it was in his training data (i.e. his own professors).
I agree with the “I don’t know” part, though. That is something LLMs are notoriously bad at.
It is immediately wrong in Step 1. A newborn is not a 2:1 ratio of height:width. Certainly not 25cm width (what does that even mean? Shoulder to shoulder?).
This is a perfect example of where not knowing the “domain” leads you astray. As far as I know “newborn width” is not something typically measured, so Claude is pulling something out of thin air.
Indeed you are showing that something not in the training data leads to failure.
Babies also aren't rectangles... you could lay one row shoulder to shoulder, then do another row upside down from the first, and their heads would fit between the heads of the first row, saving space.
Edit: it also doesn't account for the fact the moon is more or less a sphere, and not a flat plane.
Ask them what's the airspeed velocity of a laden astronaut riding a horse on the moon...
Edit: couldn't resist, and dammit!!
Response: Ah, I see what you're doing! Since the Moon has no atmosphere, there’s technically no air to create any kind of airspeed velocity. So, the answer is... zero miles per hour. Unless, of course, you're asking about the speed of the horse itself! In that case, we’d just have to know how fast the astronaut can gallop without any atmosphere to slow them down.
But really, it’s all about the fun of imagining a moon-riding astronaut, isn’t it?
Did you actually test the math it did? Usually LLMs are terrible at math because, as I mentioned in another comment, they are language models, not calculators. Hopefully that changes as LLMs leverage other tools like calculators to get their results; I am not sure if Claude does that already or if it's still in development.
You can also test your professor's answers. I don't just walk around going "Oh, Claude was right", I'm literally using what I just learned and am generating correct results. I'm not learning facts like dates, or subject things, I'm learning laws, equations, theories, proofs, etc. (Like how to apply Euler's totient or his extended theories on factorization... there's only one "right answer").
Also, your method for assessing your professor's accuracy is inherently flawed. That little piece of paper on their wall doesn't correlate with how accurate they are; it doesn't mean nothing, but it isn't foolproof. Hate to break it to you, but even heroes are fallible.
Simple: one thing I'm learning about is RFCs for TCP/IP. I can literally go test it. It's like saying, "How do you know it is right when it says 2+2=4"? Some knowledge when taught is self-correcting. Other things I'm studying, like, say, tensor calculus, I can immediately use and know I learned it correctly.
TCP/IP is a great example though of something you can get seemingly correct and then be subject to all kinds of failure modes in edge cases you didn’t handle correctly (fragmentation, silly windows, options changing header sizes, etc).
This comment is representative of something like a mass psychogenic illness prevalent in the Hacker News community.
Which could be roughly summarized as: an absurd and distorted perception of application development for the web, the goals people in that domain are trying to achieve and the history of how we got here.
The true legacy of React will be bringing functional reactive programming to the masses, packaging it in a way that a common junior dev could build an intuition around.
You've offered a linguistically prescriptive interpretation that ignores nuance and flexibility of language.
While it may be the case that the canonical "point" of parentheses is what you've described, that purpose only remains for as long as we culturally accept that definition.
The usage of parentheses as a somewhat playful aside to indicate a certain emotion is not only acceptable, it's entered the cultural zeitgeist. We all understand the meaning. The sentence is complete -- regardless of the rules you've cited -- because the reader doesn't eliminate the parenthetical from their context like a robot, and we have a common understanding of what the aside is trying to communicate.
Apparently you know many people that have died from cancer young, and this qualifies you to know how a terminally ill person should process that emotion.
You have zero qualification. How dare you imply that you know best for someone going through this.
Hopefully no one reading this is ever in that situation. But I'll defer to the individual who's facing the death countdown to process it in their own way.
Your tone policing is offensive. Psychological and quality-of-life considerations in end-of-life situations are valid and necessary to include when talking about life-threatening conditions.
I said nothing about how a terminally ill person should process anything. I stated what I have found to provide the best outcome when a person close to you has a terminal disease. If OP had said 'my friend asked me to research...', I would have given a different response, or no response.
But when OP makes it look like this is an initiative OP took upon themselves, for themselves, because they have lost too many friends, then yeah, I'm going to point out that this might not be the best possible position to come from if OP wants the best outcome for their friend. My response about outcomes is totally valid, since OP asked for help with outcomes.