I've tried GPT 4.5. It seems a bit better, but couldn't magically solve some of ...

duskwuff · 2025-03-01T00:00:34 1740787234

It's not just that paragraph, either - the word starts showing up more and more frequently as the chat goes on, including a phase of using the word at every possible juncture, like:

> Explicitly at each explicit index ii, explicitly record the largest deleted element explicitly among all deletions explicitly from index ii explicitly to the end nn. Call this explicitly retirement_threshold(i) explicitly.

If I were you, I'd treat the entire conversation with extreme suspicion. It's unlikely that the echolalia is the only problem.

jquery · 2025-03-01T01:36:00 1740792960

It’s like dialogue out of a horror novel.

redox99 · 2025-03-01T04:19:19 1740802759

Wow, lol. That's pretty hilarious.

It seems like they need to increase the repetition penalty.

eru · 2025-03-01T04:53:43 1740804823

I also had frequent occurrences of errors where I needed to hit the 'please regenerate' button. Sometimes I kept hitting errors, until I gave up and changed the prompt.

JohnKemeny · 2025-03-01T07:14:53 1740813293

While that was interesting to read, now I'm more curious about where laminar matroids appear. Are they graphic?

eru · 2025-03-03T01:33:43 1740965623

No, I don't think laminar matroids are a subset of graphic matroids (nor vice versa).

I think laminar matroids occur fairly naturally in lots of places where a computer scientist would use a heap.

> Hierarchical or Nested Constraints:

> When problems involve hierarchical resource constraints—say, scheduling with nested deadlines or quotas that are imposed at different levels—a laminar family naturally describes these relationships. The corresponding laminar matroid models the “at most so many” restrictions on various overlapping (but nested) groups.

> Network Design and Resource Allocation:

> In network design or resource allocation, you may encounter situations where resources are grouped in a hierarchy (for example, a network might have capacities on various subnetworks that nest within one another). Laminar matroids capture this kind of structure.

One of my original motivating examples was something like this: suppose you have a compiler that's supposed to give good error messages. Now it gets input with mismatched parens like "a = (b * (c + d);"

Both "a = (b * c + d);" and "a = b * (c + d);" are ways to remove the fewest number of characters to get a well-balanced expression. Assume some other parts of your parser give you some weights to tell you which parens are less suspicious / more likely to be intended by the user. You want to select the subsequence of characters that has the highest total weight, or equivalently: delete the most suspicous parens only.

In any real world scenario, you would just use a normal heap to do this in O(n log n) time. But I was investigating whether we could do it in linear time, or prove that linear time ain't possible.

Well, I figured out that linear time is possible.

I'm actually working on writing it down and publishing it as a paper or at least blog post.

mattigames · 2025-03-01T10:31:27 1740825087

You could say this explicitly shows how little it has improved.

eru · 2025-03-03T01:38:15 1740965895

I don't think so; it only shows a lack of polish: you can avoid one word repetition like this with some fairly straightforward software engineering (by tweaking the sampling from the networks probability distribution over tokens), it has not much to do with progress or lack thereof in LLMs more generally.

However, I am somewhat surprised that whatever they are doing to avoid repetition seems to be crafted again for each model (or at least for 4.5) instead of using the same system that successfully avoids repetition for their more established models.

To say things another way: if 4.5 had successfully avoided repetition, I wouldn't have taken it as a sign of progress, so I'm not taking the opposite as a big sign of lack of progress.