Yes, exactly, that's the problem. They're trying to mechanically evaluate individual sentences, or even fragments of sentences: the AI is asked to rewrite one, hallucinates some context that was never actually stated, and then they claim the sentence is ambiguous based on the different random hallucinated contexts they get back. In reality there was probably surrounding context that pinned down exactly what the LLM is otherwise inventing, but in this academic exercise it has been removed.

> "He entered the room. When he pushed the button, the lights came on."

This statement doesn't assert causality directly; causality is just a very plausible inference by the listener, based on what they already know about buttons and lights, and on the assumption that the sentence wasn't preceded by some sort of modifying information (e.g. maybe the button isn't connected to anything, in which case the lights came on for some other reason).

In artificial cases like this it doesn't matter if your inference is wrong, but it wouldn't be fair to label the speaker's statement as ambiguous. The statement is not ambiguous. If it later turns out that your assumption of causality was incorrect, you can't turn around and blame whoever wrote the sentence, because it was your assumption to start with.



Maybe a different example would help. If I support abortion, I can say "I support women's health care". Without context you'd think the statement had nothing to do with abortion, so you've got to add context to understand it. It's also ambiguous because what I'm saying and what I mean are different.

What do you think the speaker intended to say with "the stock market crashed when Obama was elected"? What was the context? Surely not two random unrelated facts placed in the same sentence for no reason.


I do understand what you're getting at; more examples aren't required.

What you're arguing is that if someone says one thing but actually means something else, that's ambiguity.

What I'm saying is that this situation is something else. Call it duplicity, talking out of both sides of your mouth, vagueness, innuendo, whatever; there are plenty of labels you could use. But it's not the same thing as ambiguity.

There are good examples of genuinely ambiguous statements in the paper, like "John and Anna are married", which could mean they are married to each other or be a statement about each individual independently. Neither interpretation is obviously more correct than the other, and which one is right has to be disambiguated through context or further questioning.

The difference is important because ambiguity is usually accidental, often a matter of poor syntax or of natural language being evolved rather than designed. If a speaker says something unclear and someone else finds it ambiguous, asking them to clarify which of two interpretations they meant won't normally risk causing offense or conflict.

But if you think someone is engaging in (let's call it) innuendo, there's no way to ask them to correct it without causing conflict, because the accusation is inherently hostile. Moreover it's extremely listener-specific. Genuine linguistic ambiguity almost never has that problem.

The paper is a bit odd because, if you check their jsonl files, most of the examples marked as having an ambiguous premise are cases of genuine ambiguity caused by the way English works. A few are ambiguous only in written form and would be unambiguous to a native speaker if spoken inflection were available (this is noted in their limitations section). Yet when it gets to the part where PolitiFact is suddenly treated as a reliable source, one of the three examples isn't genuinely ambiguous at all.

Probably the problem here is that PolitiFact, being amateurs with no interest in linguistics or logic, tend to ignore genuine ambiguity. The authors wanted to show that PolitiFact aren't entirely useless, so they took this claim, which was marked "barely true" because PolitiFact didn't like the implication of causality, and declared that in that case they had found ambiguity. It makes me wonder how often PolitiFact identifies genuine ambiguity as a result (maybe never). I couldn't find these examples in their jsonl files, so it's hard to say what's going on there.
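For anyone who wants to poke at this themselves, tallying the labels is only a few lines of Python. This is just a sketch: the file name and the "ambiguity_type" field are guesses on my part, so swap in whatever keys the released jsonl actually uses.

    import json
    from collections import Counter

    # Tally how the released examples are labelled.
    # "examples.jsonl" and the "ambiguity_type" key are placeholders
    # for whatever the paper's data release actually calls them.
    counts = Counter()
    with open("examples.jsonl") as f:
        for line in f:
            record = json.loads(line)
            counts[record.get("ambiguity_type", "unlabelled")] += 1

    for label, n in counts.most_common():
        print(f"{label}: {n}")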


In isolation, the statement that the market crashed at the time of Obama's election is about as meaningful as mentioning that the election happened on a Tuesday. To make sense of it, the reader has to assume there is some point being made.

Maybe what's being implied is that the election result caused the crash; on the other hand, the text could go on to say that the financial crisis was the result of the previous administration's failings and that Obama did a good job of handling it.


Yeah, but this is one of the flaws of the study methodology. Many of the sentences given to the LLM to classify aren't really meaningful on their own, so the LLM or human labeler invents secondary, assumed meanings that aren't strictly speaking there. It's totally understandable why they do that, but it makes the results hard to interpret.
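A cheap way to see how much of the "ambiguity" comes from stripping context would be to score the same sentence with and without its surroundings and compare the verdicts. A minimal sketch of what I mean, assuming the OpenAI Python client purely for illustration (the prompt wording and the example context are mine, not the study's):

    from openai import OpenAI  # assumption: using the OpenAI Python client only as an example

    client = OpenAI()

    def judge(sentence, context=None):
        # The only difference between the two calls is whether the
        # surrounding context is included in the prompt.
        preamble = f"Context: {context}\n" if context else ""
        prompt = (preamble
                  + f"Sentence: {sentence}\n"
                  + "Is this sentence ambiguous? List the possible readings.")
        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    sentence = "When he pushed the button, the lights came on."
    context = "He entered the room. The wall switch was broken, so a button by the door controlled the lights."

    print(judge(sentence))           # isolated, as in the study
    print(judge(sentence, context))  # with the surrounding context restored
    # If the two verdicts differ, the "ambiguity" was manufactured by removing context.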


OK, I think I get your definition. That doesn't appear to be what the paper is talking about: "recognize ambiguity and disentangle possible meanings".

Sure, PolitiFact could flag "Donald Trump and Melania Trump are married" as linguistically ambiguous, consistent with your John and Anna example, but I'm not sure who that would help.

Calling out mudslinging like "Donald Trump was an associate of Jeffrey Epstein. Epstein was arrested for sex trafficking minors." feels like the point of these tools. Arguing that the statements are individually true and "syntactically unambiguous" ignores that putting them together is intended to make the reader hear something that wasn't actually said and to think something bad about Trump.

Since the goal is to "disentangle possible meanings", using statements that are syntactically unambiguous but carry a hidden meaning seems like a great choice.


Well, most of the examples in the paper are ones I agree are linguistically ambiguous, so I'm not sure what I'm talking about is so different; the John and Anna example comes from the paper. Maybe I'm being harsh, because really only one of the examples in the paper is not an example of ambiguity, but the connection to politics is how they chose to justify why their work is useful, so it's fair to call that out given there are only three such examples to begin with.

If someone wants to show how to use LLMs to make the news less biased, or to auto-delete mudslinging or hidden meanings or whatever, then great. I'd be super interested in that, because I think LLMs give us the tools to radically reimagine news and political reporting, but for better or worse that's not this study.



