The short and unsatisfying answer is that LLM generation is a Markov chain, except that instead of counting n-grams to produce the posterior distribution over the next token, the training process compresses the statistics into the LLM's weights.
There was an interesting paper a while back which investigated using unbounded n-gram models as a complement to LLMs: https://arxiv.org/pdf/2401.17377 (I found the implementation to be clever and I'm somewhat surprised it received so little follow-up work)
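To make the "counting n-grams" half of that comparison concrete, here is a toy sketch of a count-based next-token model. The corpus and the function names are made up for illustration, and this is not how the linked paper implements its unbounded n-gram lookup; it only shows the fixed-order counting that an LLM replaces with learned weights.

```python
from collections import Counter, defaultdict


def train_ngram(tokens, n=2):
    """Count how often each token follows each (n-1)-token context."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        counts[context][tokens[i + n - 1]] += 1
    return counts


def next_token_distribution(counts, context):
    """Normalize the raw counts for one context into probabilities."""
    following = counts.get(tuple(context), Counter())
    total = sum(following.values())
    if total == 0:
        return {}  # unseen context: a plain n-gram model has nothing to say
    return {tok: c / total for tok, c in following.items()}


# Hypothetical toy corpus, whitespace-tokenized.
corpus = "the cat sat on the mat and the cat ran".split()
model = train_ngram(corpus, n=2)
dist = next_token_distribution(model, ["the"])
print(dist)
```

The empty result for an unseen context is the sparsity problem that grows quickly with n, which is roughly where compressing the statistics into weights pays off.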
When countries like North Korea, which depends on cybercrime to fund itself, are signatories, you have to wonder whether this agreement means what its title says.
They have also had the longest ongoing embargo on earth, right after they were nearly wiped out by a genocidal war on behalf of the US.
I don't doubt their history explains the shape of their economy.
This may seem like I am defending North Korea, but in reality I am putting in perspective who they are and why. These are facts which nearly amount to propaganda to Western nations.
I don't think it's right to blame ordinary North Koreans for the state of their country like that. Clearly it has more to do with the paranoid authoritarianism of 1 guy, rather than the collective war trauma of the people. Just look at South Korea, the other party of that "genocidal war". They moved on a long time ago, because their national politics allowed them to.
Not really. Squid Game is loosely based on for-profit concentration camps run by US-backed dictators in South Korea. The true story is actually worse than the fictional show.
All US colonies have had terrible dictatorships, the news of which purposefully does not reach the western core. Saying "they've moved on" is ignoring what actually happened, and I'm not bringing this up in a "let's focus on the past" sort of way, but to point at how it actually got there.
Is a murderous dictatorship that forcefully bashes the economy into the shape it wants the same as "getting over it"?
And blaming how North Korea is on one guy is a cop-out. And, like I said, I'm not even defending the guy, but you can't simplify history too much either, else it starts to look like cartoons. There are 4 other guys who lead North Korea; Kim Jong Un just has a family legacy involved in its founding and is the general secretary of state. The real President, coincidentally, died yesterday, Nov 3rd.
The reality of meetings in most places I've seen is that key stakeholders have already formed an opinion beforehand; the meeting is a place to disseminate decisions that have already been made and align the organization.
When I read "51% fewer false positives" followed immediately by "Median comments per pull request cut by half" it makes me wonder how many true positives they find. That's maybe unfair, as my reference is automated tooling in the security world, where the true-positive/false-positive ratio is so bad that a 50% reduction in false positives is a drop in the bucket.
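A quick back-of-the-envelope check of that last point, using made-up numbers (10 true positives buried in 990 false positives; neither figure comes from the announcement being discussed):

```python
def precision(true_pos, false_pos):
    """Fraction of reported findings that are real."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts for a noisy security scanner.
tp, fp = 10, 990

before = precision(tp, fp)        # 10 / 1000
after = precision(tp, fp * 0.5)   # 10 / 505, false positives halved

print(f"before: {before:.2%}, after: {after:.2%}")
```

Under these assumed numbers, halving the false positives still leaves roughly 98% of the findings as noise, which is why the true-positive count matters more than the headline reduction.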
The DeepSeek v3 model had a net training cost of >$5m for the final training run, and the paper lists over 100 authors[1], meaning highly paid engineers. This is also one of a sequence of models (v1, v2, math, coder) trained in order to build the institutional knowledge necessary to get to the frontier, and this ends up still far above the $10m mark. It's hardly a "trio of super-smart engineers".
Incidentally, Altman's comments were in response to a question about a hypothetical startup with $10m. So you've made the argument even more cogent.
That's really not true; see e.g. the Wikipedia page on population transfer in the Ottoman empire[1]. This dates way back to the Assyrian and Persian empires explicitly moving conquered peoples around in their empires in order to safeguard their rule. This book on population transfer in the Ottoman empire[2] explicitly states, with references, that the Ottomans' habits were inherited from the steppe Turks, the Byzantines (=the Romans) and the Arabs.
Anecdotally, a pro-audio software company I worked with had to fire 1/3 of the company when their copy-protection was cracked and sales tanked immediately afterwards, and recovered once a new copy-protection scheme was developed and applied. And just to be clear, software licenses in direct-to-user sales are not that company's only revenue stream (they sell hardware and software to OEMs).
This is to say, the evidence in this natural experiment points towards piracy reducing sales by a lot.
If it was professional audio, then your main concern would be acquiring business sales, right? If certain companies stopped paying after a new crack comes out then that sounds like a rather blatant example of piracy that could have been pursued legally.
Under the leaderboard tab, if the "Solution" column has an icon, it's clickable. 2nd place solution is by Jeremy Howard (of fast.ai fame), which I'd summarize as TrueSkill Through Time (Microsoft Research paper) + some overfitting on the public leaderboard (1st place was #26 in the public leaderboard).
The CLIP plot (Fig. 2) is damning; however, some of the generative models show flat responses in Fig. 3 (e.g. Adobe GigaGAN, DALL-E-mini). While those are technically linear relationships, they are also exactly what we'd want: an image-generation aesthetic score that doesn't care about concept frequency. Maybe the issue is with the contrastive training target used in CLIP?