There's the technique of model orthogonalization which can often zero out certain tendencies (most often, refusal), as demonstrated by many models on HuggingFace. There may be an existing open weights model on HuggingFace that uses orthogonalization to zero out positivity (or optimism)--or you could roll your own.
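The linear algebra behind orthogonalization is simple to sketch. This is a minimal numpy illustration, not any particular library's implementation: it assumes a "positivity direction" `d` has already been extracted (e.g. from a difference of mean activations), and `ablate_direction` is a hypothetical helper name. Real ablation tooling applies this to the model's actual weight matrices.

```python
import numpy as np

def ablate_direction(W, d):
    """Remove the component of each row of weight matrix W that lies
    along direction d, by projecting onto d's orthogonal complement."""
    d = d / np.linalg.norm(d)          # normalize to a unit vector
    return W - np.outer(W @ d, d)      # subtract each row's projection onto d

# Toy demonstration with random data standing in for real weights/directions.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))        # stand-in weight matrix
d = rng.standard_normal(8)             # stand-in "positivity" direction
W_ablated = ablate_direction(W, d)

# After ablation, W can no longer write anything along d:
print(np.allclose(W_ablated @ (d / np.linalg.norm(d)), 0))
```

Applied to every matrix that writes into the residual stream, this guarantees the model can no longer represent that direction, which is why the technique suppresses the targeted tendency rather than merely discouraging it.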
Honestly, I think the most likely outcome is that incumbent businesses never successfully adopt AI, but are simply outcompeted by their AI-native competitors.
Sears also did everything it could to annihilate itself while dot-com was happening.
Their CEO was a believer in making his departments compete for resources, leading to a brutal, dysfunctional clusterfuck: rent-seeking behavior on the inside as well as the outside.
IMO the lack of real version control and the lack of reliable programmability have been significant impediments to impact and adoption. The control surfaces are more brittle than, say, regex, which isn’t a good place to be.
I would quibble that there is a modicum of design in prompting; RLHF, DPO, and ORPO are explicitly designing the models to be more promptable. But the methods don’t yet adequately scale to the variety of user inputs, especially in a customer-facing context.
My preference would be for the field to put more emphasis on control over LLMs, but it seems like the momentum is again on training LLM-based AGIs. Perhaps the Bitter Lesson has struck again.
People are trying to design how to prompt, but it’s very different in both implementation and result than designing a programming language or a visual language, ofc.
Exactly. In other threads on hacker news people have bemoaned the loss of the old weird web. I don't think anyone believed me that the same spirit exists in some sides of TikTok.
My belief is that while eng manager empire building was the easier path to get promoted before 2022, it's not anymore, for two main reasons:
1. HC (headcount) doesn't accrue like that anymore.
2. Many organizations are looking to delayer; it's harder to promote up to director when your org went from 9 rungs to 5.
I hear a lot of the focus going to Tech Lead Manager roles--fewer reports but more hands-on-keyboard time than EM roles of the past.
As I understand it, the Q-hypothesis is often situated within the hypothesis of Marcan priority (Mark was the source for Luke and Matthew), and Q is a way of explaining agreements between Luke and Matthew that are not also found in Mark. The hypothesis would be that Luke and Matthew each combined text from Mark with Q.
I think (but cannot prove) that along the way, it was decided to explicitly measure ability to 'study to the test'. My theory goes that certain trendsetting companies decided that ability to 'grind at arbitrary technical thing' measures on-job adaptability. And then many other companies followed suit as a cargo cult thing.
If it were otherwise, and those trendsetting companies actually believed LeetCode tested programming ability, then why isn't LeetCode used in ongoing employee evaluation? Surely programming ability a) varies over an employee's tenure at a firm and b) is a strong predictor of employee impact over the near term. So I surmise that such companies don't believe this, and that LeetCode therefore serves some other purpose, in some semi-deliberate way.
I do code interviews because most candidates cannot declare a class or variable in a programming language of their choice.
I give a very basic business problem with no connection to any useful algorithm, and explicitly state that there are no gotchas: we know all the inputs, and here’s what they are.
Almost everyone fails this interview, because somehow there are a lot of smooth tech talkers who couldn’t program to save their lives.
I think I have a much lazier explanation. LeetCode-style questions were a good way to test expertise in the past, but by the time everyone starts to follow suit, the test becomes ineffective. What's the saying? When everyone is talking about a stock, it's time to sell. Same thing.
> If it were otherwise, and those trendsetting companies actually believed LeetCode tested programming ability, then why isn't LeetCode used in ongoing employee evaluation?
Probably recent job performance is a stronger predictor of near future job performance.