Hacker News | pptr's comments

They only had to comply with EU laws when they were already a big player in China. EU manufacturers need their new vehicles to be compliant on day one. That is, if they want to launch in the EU market first. Audi recently launched a China-only car (AUDI E5 Sportback).


What is different about Deepseek's use of MoE vs all the other MoE models that makes training more efficient?

FP8 training and GRPO make sense to me, but that only gets you a 4x improvement total, right?


They slightly restructure their MoE [1], but I think the main difference is that other big models (e.g. Llama 405B) are dense and have higher FLOP requirements. MoE should represent a ~5x improvement. FP8 should be about a ~2x improvement.
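As a rough back-of-the-envelope sketch of how those two estimates combine: the snippet below assumes the gains are independent and multiply, which the comment does not state. The function name `combined_speedup` is made up for illustration.

```python
import math

def combined_speedup(*factors: float) -> float:
    """Multiply individual speedup factors.

    Assumption: the gains are independent and compose multiplicatively.
    """
    return math.prod(factors)

MOE_SPEEDUP = 5.0  # the "~5x" estimate for MoE vs. a dense model
FP8_SPEEDUP = 2.0  # the "~2x" estimate for FP8 training

print(combined_speedup(MOE_SPEEDUP, FP8_SPEEDUP))
```

Under that assumption the two estimates together come to roughly 10x, before counting whatever GRPO contributes.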

We don’t know how much of a speed improvement GRPO represents. They didn’t say how many GPU hours went into RLing DeepSeek-r1 and we don’t have o1 numbers to compare against.

There’s definitely lots of misinformation spreading though. The $5.5m number refers to Deepseek-v3, not Deepseek-r1. I don't want to take away from HighFlyer's accomplishment, though. I think a lot of these innovations were forced to work around H800 networking limitations, and it's impressive what they've done.

[1] https://arxiv.org/abs/2401.06066


It's interesting that only having access to less powerful hardware motivated/necessitated more efficient training--like how tariffs can backfire if left in place too long.


If you don't, your geopolitical adversary might be the first to build AGI.

So in this scenario I could see it become necessary from a military perspective.


The mere existence of regulation is part of the problem. Without precise understanding of the law, you don't know if your use cases are fine/exempted. The safe default assumption is that your site is not compliant with regulations until you can prove otherwise, which involves a lawyer.


That regulation would not have come into existence if there were no privacy problems caused by the ones that have to comply with the regulations.


Right, and that regulation has a cost. I hope it's worth it.


There were certainly problems.

But the GDPR and ePrivacy directives don't protect us from nefarious cross-site tracking cookies.

Prior to the GDPR, websites just tracked us.

Now they track us AND present an irritating warning that users have learnt to mindlessly "accept"


This is simply not true.


Once you have a technological breakthrough that requires lots of exploration to figure out which products will succeed, regulation is a competitive disadvantage.

For established industries regulation just increases prices. ... Until a new technology comes up that allows new competitors to enter the market. Like Tesla, SpaceX.


Hmm, both SpaceX and Tesla are heavily subsidised by the government.

Funnily enough many of the basic breakthroughs in AI were done in Europe, but at universities.

Sadly, people don't invest in startups in Europe the way they do in the US, so the companies were formed in the US, even though the research is done here.


If you declare using an invalidated iterator as UB, the compiler can optimize as if the container was effectively immutable during the loop.


Ignoring it means you need to get explicit consent, which is what the websites are already doing.


Legally they are required as of now to gather informed consent that is given freely.

Contrary to popular belief, the EU has somewhat defined what that means (just read the law), and surprise: the way many data hogs wish it to be isn't how the law was written.

E.g. if you trick or extort users into agreeing, the consent was neither informed nor freely given. In the eyes of the law it is as if you hadn't asked for consent at all, and GDPR fines can be up to 4% of the global turnover of the previous fiscal year. But yeah.


But then you don't have a company that can afford to build a 100b training cluster for example. If that becomes the new sota you are again left behind.


>If that becomes the new sota you are again left behind.

It's why Europe is being left behind technologically compared to the US and is also losing worldwide share of GDP. Thousands of small local companies scattered across several countries can't compete with the scale the likes of FAANG can achieve in the US and worldwide.


When I do a Google search for the example query "postgresql query analysis" on mobile chrome (no ad block), I get 0 ads. Same thing if I select the "Example" filter as shown in the demo image.


Ideally you can only retry error codes where it is guaranteed that no backend logic has executed yet. This prevents retry amplification. It also has the benefit that you can retry all types of RPCs, including non-idempotent ones. One example is if the server reports that it is overloaded and can't serve requests right now (loadshedding).

Without retry amplification you can do retries ASAP, which has much better latency. No exponential backoff required.

Retrying deadline exceeded errors seems dangerous. You are amplifying the most expensive requests, so even if you only retry 20% of all RPCs, you could still 10x server load. Ideally you can start loadshedding before the server grinds to a halt (which we can retry without risk of amplification). Having longer RPC deadlines helps the server process the backlog without timeouts. That said, deadline handling is a complex topic and YMMV depending on the service in question.
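A minimal sketch of the policy described above, assuming a made-up `Status` enum and `RpcError` type rather than any real RPC library's API. Only a status that guarantees the request was rejected before any backend logic ran (here a hypothetical `OVERLOADED` load-shedding code) is retried, immediately and without backoff; deadline-exceeded errors are passed straight to the caller.

```python
import enum
from typing import Callable, TypeVar

T = TypeVar("T")

class Status(enum.Enum):
    # Hypothetical status codes for illustration only.
    OVERLOADED = "overloaded"                # request shed before any backend logic ran
    DEADLINE_EXCEEDED = "deadline_exceeded"  # work may already have been done
    INTERNAL = "internal"

# Only codes where the server guarantees nothing was executed are retried,
# so retries cannot amplify expensive work and are safe for non-idempotent RPCs.
SAFE_TO_RETRY = frozenset({Status.OVERLOADED})

class RpcError(Exception):
    def __init__(self, status: Status) -> None:
        super().__init__(status.value)
        self.status = status

def call_with_retries(rpc: Callable[[], T], max_attempts: int = 3) -> T:
    """Call rpc(), retrying immediately (no backoff) on safe-to-retry errors."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return rpc()
        except RpcError as err:
            # Give up on unsafe errors, or once the attempt budget is spent.
            if err.status not in SAFE_TO_RETRY or attempt == max_attempts:
                raise
    raise AssertionError("unreachable")
```

Capping `max_attempts` still bounds the extra traffic a single caller can generate, even though each retried request is cheap for the server to reject.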

