In exactly what sense? Who is the "old guard" you're thinking of here? Peter Schwabe got his doctorate 16 years after Bernstein. Peikert got his 10 years after.
For the most part LLMs choose "the most common" tokens; so regardless of whether the content was "AI content" or not, maybe you are getting tired of mediocrity.
And of course also that mediocrity has now become so cheap that it is now the overwhelming majority.
This is similar to how the average number of children per household is 2.5, yet no one has 2.5 children. The most common tokens end up yielding patterns that no one actually uses together in practice.
LLMs have a tendency to really like comparisons / contrasts between things, which is likely due to the nature of neural networks (eg “Paris - France + Italy” ≈ “Rome”). This is because when these concepts are represented as embeddings, such relations can be computed very straightforwardly in vector space.
So no, it’s not all due to human language, LLMs do really write content in a specific style.
One recent study also showed something interesting: AIs aren’t very good at recognizing AI generated content either, which is likely related; they’re unaware of these patterns.
One of my favourite books; I read it a couple of times and even hacked around with xv6 (the x86 reimplementation of the Sixth Edition kernel[0][1]). If you do hack around with it in a VM, make sure to add HLT to the idle() function; if nothing else it will save your fans.
One of a small number of books (such as TCP/IP Illustrated[2]) that progressed me from the larval hacker stage.
I also met Lions when I was a kid, but didn't put 2 + 2 together until 20 years later!
For Linux, one of my top 5 favourite books of all time is “The Linux Core Kernel Commentary”, which is a Linux version of Lions, printed on pages so big it doesn’t fit on the shelf.
It's a pity that the section on exceptions didn't do a more detailed analysis of the criticism of exceptions; personally I've never liked exceptions, in much the same way that I'm not a huge fan of how POSIX signals or setjmp / longjmp are implemented.
Although I very much see why they were developed, in effect they come closer to a glorified form of "come from" (and people thought goto was the one considered harmful).
I'm the opposite - I really like checked exceptions in Java because it's very easy to see how developers are handling errors and they also form part of the function signature.
Most functions just pass exceptions on verbatim, so it's better than error return values: with those, the entire codebase has to be littered with error handling, whereas you only need a few try/catch blocks.
setjmp, etc. are like unchecked exceptions, so I'm also not a fan, but I do use them occasionally in C anyway.
>I really like checked exceptions in Java because it's very easy to see how developers are handling errors and they also form part of the function signature.
Errors as return values also form part of the function signature in many languages.
>Most functions will just pass on exceptions verbatim so it's better than error return values because with them the entire codebase has to be littered with error handling, compared to fewer try catch blocks.
The question is whether you think that calls that might pass an error up the call chain should be marked as such. I think they should be.
I wouldn't call this "littered with error handling" just because a certain language has decided to do this in a way that resembles industrial style fly-tipping rather than just littering.
Why would errors as return values have to propagate any farther in the codebase compared to errors as exceptions? If exceptions can be handled, so can the value based errors.
The criticism as I understand it isn't about where the errors are actually handled but the ceremony needed in an errors-as-values model, most obviously in Go where you've got to write a couple of lines of test and early return explicitly for each such possible error, compared to C++ where you write nothing.
Rust's Try operator is the minimal ceremony: a single question mark acknowledges that we might not succeed and, if so, we return early - but there is still ceremony for each such early return.
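To make the contrast concrete, here's a rough Go sketch (loadConfig and parseConfig are made-up names) of the ceremony being discussed - each fallible call gets its own explicit check-and-early-return, where Rust needs only a '?' and C++ nothing at all:

    package main

    import (
        "fmt"
        "os"
    )

    // Config and parseConfig are hypothetical stand-ins for illustration.
    type Config struct{ Raw string }

    func parseConfig(data []byte) (Config, error) {
        if len(data) == 0 {
            return Config{}, fmt.Errorf("empty config")
        }
        return Config{Raw: string(data)}, nil
    }

    // loadConfig shows the per-call ceremony: every fallible call gets its
    // own explicit check-and-early-return.
    func loadConfig(path string) (Config, error) {
        data, err := os.ReadFile(path)
        if err != nil {
            return Config{}, fmt.Errorf("reading %s: %w", path, err)
        }
        cfg, err := parseConfig(data)
        if err != nil {
            return Config{}, fmt.Errorf("parsing %s: %w", path, err)
        }
        return cfg, nil
    }

    func main() {
        if _, err := loadConfig("app.conf"); err != nil {
            fmt.Println("error:", err)
        }
    }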
I happen to think exceptions are inherently a bad idea, but for a different reason.
I've worked with a lot of code like this (particularly C libraries and litanies of return codes), and it's fine... But I prefer something like Java-style exceptions. And with Java lambdas or Kotlin the trend is unfortunately away from checked exceptions these days...
I believe that for today's large software, which has multiple engineers involved in the final product, possibly not even working in the same group, the key problem of exceptions (knowing what is or is not an "exceptional" situation for the software) is not solvable in the general case. Exceptions are a clever idea if you're working solo, writing a soup-to-nuts piece of software, say the firmware for a thermostat, but they are too flawed for the library author or the app developer.
"Exceptions should be exceptional" gets to the heart of the problem with the entire concept of Exceptions for non-trivial pieces of software where there's more than a single programmer maintaining the complete system.
Now the library programmer has to guess whether when you - the application programmer try to Wibble a Foozle - will have ensured all the Foozles can be Wibbled, and so not being able to Wibble the Foozle is an exceptional condition, or whether you want to just try to Wibble every Foozle and get an error return if it can't be Wibbled...
One option is to bifurcate the API for every such condition. Instead of three types with a total of eight methods, maybe there are six conditions for two of those methods, and so now it's three types with over 100 methods... ouch. Goodbye documentation and product quality.
Well, this obviously depends on the given programming language/culture, but to my mind, parsing a string to an int is a case where failure is expected, so I would model it as a return type of either an Int or a ParsingError, or something like that.
Meanwhile, for a function doing numerous file copies and network calls I would throw an exception, as the number of possible failure cases is almost limitless. (You surely don't want an ADT that models FileSystemFullError and whatnot.)
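Roughly, in Go terms (a sketch, names hypothetical): the parse failure is a single expected outcome the caller matches on, while the copy routine just propagates whatever opaque error bubbled up rather than enumerating every failure mode.

    package main

    import (
        "fmt"
        "io"
        "os"
        "strconv"
    )

    // Expected, enumerable failure: model it as a value the caller handles.
    func parsePort(s string) (int, error) {
        n, err := strconv.Atoi(s)
        if err != nil {
            return 0, fmt.Errorf("not a number: %q", s)
        }
        return n, nil
    }

    // Nearly limitless failure modes (permissions, disk full, ...):
    // don't enumerate them, just propagate an opaque error - the moral
    // equivalent of throwing.
    func copyFile(dst, src string) error {
        in, err := os.Open(src)
        if err != nil {
            return err
        }
        defer in.Close()
        out, err := os.Create(dst)
        if err != nil {
            return err
        }
        defer out.Close()
        _, err = io.Copy(out, in)
        return err
    }

    func main() {
        if p, err := parsePort("8080"); err == nil {
            fmt.Println("port:", p)
        }
        _ = copyFile("b.txt", "a.txt")
    }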
It so happens that in this particular example one is pure and the other is side-effecting, but I'm not convinced that's a hard rule here.
In a good effect system exceptions as effects are isomorphic to error conditions as data, so the choice comes down to what is more ergonomic for your use case, just like the choice between these three isomorphic functions is down to ergonomics:
frob1 :: Foo -> Bar -> R
frob2 :: (Foo, Bar) -> R
frob3 :: FooBar -> R
data FooBar = FooBar { foo :: Foo, bar :: Bar }
Everybody chooses a favorite depending on their domain.
A function executes, and some error happens:
- Return error value: try to handle the error ASAP. The closer to the error the more detailed the information. Higher probability of recovery. Explicit error code handling throughout the code. Example: maybe you try again in one millisecond because the error is a very low probability but possible event.
- Exception: managing errors requires a high-level overview of the program state. Example: no space left on device, inform the user. You gather the detailed information where the error happened. The information is passed as-is or augmented with more information as it bubbles up the stack until someone decides to take action. Pros: separate error handling and happy path code, cleaner code. Cons: separate error handling and happy path code, unhandled errors.
Worst case scenario: you program in C. You don't have exceptions. You are forbidden to use setjmp because rules. A lot of errors are exposed directly to the programmer because this is a low-level language. You return error codes. Rules force you to handle every possible return code. Your code gets incorporated as an appendix to the Necronomicon.
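To illustrate the first option above - handling the error as close to its source as possible - here's a small Go sketch (writeRegister and errBusy are hypothetical stand-ins):

    package main

    import (
        "errors"
        "fmt"
        "time"
    )

    var errBusy = errors.New("device busy") // hypothetical transient error

    // writeRegister stands in for a low-level operation that fails
    // rarely but recoverably.
    func writeRegister(v byte) error {
        // ... imagine hardware access here ...
        return nil
    }

    // With error values, the caller sitting right next to the failure has
    // enough detail to attempt recovery immediately (the "retry in one
    // millisecond" case), instead of letting it bubble up the stack.
    func writeWithRetry(v byte) error {
        for attempt := 0; attempt < 3; attempt++ {
            err := writeRegister(v)
            if err == nil {
                return nil
            }
            if !errors.Is(err, errBusy) {
                return err // not transient: give up and report upwards
            }
            time.Sleep(time.Millisecond)
        }
        return fmt.Errorf("write failed after retries")
    }

    func main() {
        if err := writeWithRetry(0x2A); err != nil {
            fmt.Println(err)
        }
    }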
Exceptions done well should outperform return value checking. However it’s very difficult to make it perform well and for some reason people prefer -> Result<T, E> instead of -> T throws E which is basically the same thing.
The difference is the monadic capabilities of the result type. Thrown exceptions pepper codebases in ways that are even more unfortunate than monadic composition, which is already kind of iffy, but at least has generic transforms for error recoveries, turning errors to successes of a different type and so on. You end up with far less boilerplate.
Well, PLs with effect systems are basically meant to solve exactly this problem.
E.g. a generic `map` function can become throwing if it handles throwing lambdas, but is otherwise non-throwing - this is pretty much the main pain point with Java's checked exceptions.
I’m not so sure. The page goes over how it works in Scala 3 and it’s a little bit cleaner. But there is some nicety in handling return and exception uniformly in some cases.
Not true: IP addresses are sold on, and from time to time they are even leased (they can be transferred from RIRs, temporarily or permanently). Sometimes IP address ranges are registered with one RIR but used in a different geography / region.
Geo-IP databases are mostly accurate, emphasis on mostly.
In the overwhelming majority of cases, mobile roaming traffic uses "home routing", not "local break-out". This means it is routed to the country where the user normally resides, not where they currently are. This means:
- For people visiting the UK (and potentially staying there for a long time, if on a permissive roaming plan), their IP address won't show up as UK despite long-term residence / citizenship.
- British people visiting other countries will still be subject to OSA, even when they should not be.
- People (including British people!) who buy British Airalo SIMs may not get a British IP. Airalo often uses SIMs registered in a different country than the one you're visiting, and the "exit node" (P-GW) may be located in a different country altogether. I suspect this last option will become quite attractive if VPN bans ever actually come into effect.
This is pretty much unfixable without major changes in how LTE roaming is conducted worldwide, and the UK isn't important enough to make that happen.
"but if there's a cost to providing free support to the community like official container images, then it will get cut.", but here's the kicker, supporting creating docker images when you're on github is close to negligible as to be paper thin.
I would never rely on headers such as "Sec-Fetch-Site"; having security rely on client-generated (correct) responses is just poor security modelling (don't trust the client). I'll stick to time-bounded HMAC cookies; then you're not relying on the client properly implementing any headers, and it will work with any browser that supports cookies.
And TLS v1.3 should be a requirement: no HTTPS, no session, no auth, no form (or API), no cookie. HSTS, again, should be the default, but with encrypted connections and time-bounded CSRF cookies the threat window is very small.
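For reference, here's roughly the shape of the time-bounded HMAC token I mean - a minimal Go sketch, with key management, rotation, and user/session binding omitted:

    package main

    import (
        "crypto/hmac"
        "crypto/sha256"
        "encoding/base64"
        "fmt"
        "strconv"
        "strings"
        "time"
    )

    var secret = []byte("server-side-secret") // never leaves the server

    // mint issues "expiry|mac": the server can verify it later without
    // trusting anything the client claims about itself.
    func mint(ttl time.Duration) string {
        exp := strconv.FormatInt(time.Now().Add(ttl).Unix(), 10)
        mac := hmac.New(sha256.New, secret)
        mac.Write([]byte(exp))
        return exp + "|" + base64.RawURLEncoding.EncodeToString(mac.Sum(nil))
    }

    // verify recomputes the MAC and checks the time bound; a tampered or
    // expired token fails regardless of what headers the client sends.
    func verify(tok string) bool {
        parts := strings.SplitN(tok, "|", 2)
        if len(parts) != 2 {
            return false
        }
        exp, err := strconv.ParseInt(parts[0], 10, 64)
        if err != nil || time.Now().Unix() > exp {
            return false
        }
        mac := hmac.New(sha256.New, secret)
        mac.Write([]byte(parts[0]))
        want := base64.RawURLEncoding.EncodeToString(mac.Sum(nil))
        return hmac.Equal([]byte(want), []byte(parts[1]))
    }

    func main() {
        tok := mint(15 * time.Minute)
        fmt.Println(verify(tok)) // true until the token expires
    }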
No, in CSRF the browser is not the adversary, it is a confused deputy, and it’s perfectly reasonable to collaborate with it against the attacker (which is another site).
You might want to develop some critical thinking skills. The doc is wrong, and will soon be updated to say that Sec-Fetch-Site is sufficient on its own.
CSRF is about preventing other websites from making requests to your page using the credentials (including cookies) stored in the browser. Cookies can't prevent CSRF, in fact they are the problem to be solved.
Auth needs to be done somewhere, somehow, at some point. And this is done with cookies (be it a CSRF token, auth token, JWT, etc). There has to be some mechanism for a client to prove that a) it is the client it claims to be, and therefore b) it has permission to request what it needs from the server.
And, the server shouldn't trust the client "trust me bro" style.
So, at the end of the day it doesn't matter whether it's a "rose by another name", i.e. it doesn't matter whether you call it a CSRF token, auth token, JWT, or whatever; it still needs to satisfy the following: a) the communication is secure (preferably encrypted), b) the server can recognise the token when it sees it (headers, of which cookies are one type, etc), c) the server doesn't need to trust the client (it's easiest if the server creates the token, but it could also be a trusted OOB protocol like TOTP), and d) it identifies a given role (again, it's easiest if it identifies a unique client, like a user or similar).
So a name is just a name, but there needs to be a cookie or a cryptographically secure protocol to ensure that an untrusted client is who it says it is. Cookies are typically easier than crypto secure protocols. Frankly it doesn't really matter what you call it, what matters is that it works and is secure.
I work as a pentester. CSRF is not a problem of the user proving their identity, but instead a problem of the browser as a confused deputy. CSRF makes it so the browser proves the identity of the user to the application server without the user's consent.
You do need a rigid authentication and authorization scheme just as you described. However, this is completely orthogonal to CSRF issues. Some authentication schemes (such as bearer tokens in the authorization header) are not susceptible to CSRF, some are (such as cookies). The reason for that is just how they are implemented in browsers.
I don't mean to be rude, but I urge you to follow the recommendation of the other commenters and read up on what CSRF is and why it is not the same issue as authentication in general.
Clearly knowledgeable people not knowing about the intricacies of (web) security is actually an issue that comes up a lot in my pentesting when I try to explain issues to customers or their developers. While they often know a lot about programming or technology, they frequently don't know enough about (web) security to conceptualize the attack vector, even after we explain it. Web security is a little special because of lots of little details in browser behavior. You truly need to engage your suspension of disbelief sometimes and just accept how things are to navigate that space. And on top of that, things tend to change a lot over the years.
Of course CSRF is a form of authorisation; "should I trust this request? is the client authorised to make this request? i.e. can the client prove that it should be trusted for this request?", it may not be "logging in" in the classic sense of "this user needs to be logged into our user system before i'll accept a form submit request", but it is still a "can i trust this request in order to process it?" model. You can wrap it up in whatever names and/or mechanism you want, it's still a trust issue (web or not, form or not, cookie or not, hidden field or not, header or not).
Servers should not blindly trust clients (and that includes headers passed by a browser claiming they came from such and such a server / page / etc); clients must prove they are trustworthy. And if you're smart your system should be set up such that the costs to attack the system are more expensive than compliance.
And yes, I have worked both red team and blue team.
You say you should "never trust the client". Well, trust has to be established somehow, right? Otherwise you simply cannot allow any actions at all (airgap).
Then, CSRF is preventing a class of attacks directed against a client you actually have decided to trust, in order to fool the client to do bad stuff.
All the things you say about auth: Already done, already checked. CSRF is the next step, protecting against clients you have decided to trust.
You could say that someone makes a CSRF attack that manages to change these headers for an unwitting client, but at that point absolutely all bets are off; you can invent hypothetical attacks against all current CSRF protection mechanisms too, which are all based on data the client sends.
(If HN comments cannot convince you why you are wrong I encourage you to take the thread to ChatGPT or similar as a neutral judge of sorts and ask it why you may be wrong here.)
Yes, this is documenting one particular way of doing CSRF. A specific implementation.
The OP is documenting another implementation to protect against CSRF, which is unsuitable for many since it fails to protect 5% of browsers, but still an interesting look at the road ahead for CSRF and in some years perhaps everyone will change how this is done.
And you say it isn't OK, but in my opinion you have not properly argued why not.
It doesn't actually fail to protect 5%, as the top-line 5% aren't really "browsers". Even things like checkboxes often top out at around 95%!
You can change a setting on caniuse.com so it excludes untracked browsers. Sec-Fetch-Site goes up to 97.6%, with the remainder being a bit of Safari (which will likely update soon) and some people still on ancient versions of Chrome.
It's very complicated and ever evolving. It takes dedicated web app pentesters like you to keep up with it... back in the day, we were all 'generalists'... we knew a little bit about everything, but those days are gone. It's too much and too complicated now to do that.
I don't understand what you are getting at. CSRF is not another name for auth. You always need auth, CSRF is a separate problem.
When the browser sends a request to your server, it includes all the cookies for your domain. Even if that request is coming from a <form> or <img> tag on a different website you don't control. A malicious website could create a form element that sends a request to yourdomain.com/api/delete-my-account and the browser would send along the auth cookie for yourdomain.com.
A cookie only proves that the browser is authorized to act on behalf of the user, not that the request came from your website. That's why you need some non-cookie way to prove the request came from your origin. That's what Sec-Fetch-Site is.
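A rough sketch of what that server-side check can look like in Go (handler names are made up; old browsers and non-browser clients that omit the header are left to other defences):

    package main

    import (
        "fmt"
        "net/http"
    )

    // checkSameSite sketches the rule described above: for state-changing
    // methods, require the browser-supplied fetch metadata to say the
    // request came from our own origin (or a user-initiated navigation).
    func checkSameSite(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            switch r.Method {
            case http.MethodGet, http.MethodHead, http.MethodOptions:
                next.ServeHTTP(w, r) // safe methods: no CSRF concern
                return
            }
            site := r.Header.Get("Sec-Fetch-Site")
            if site != "" && site != "same-origin" && site != "none" {
                http.Error(w, "cross-site request rejected", http.StatusForbidden)
                return
            }
            next.ServeHTTP(w, r)
        })
    }

    func main() {
        mux := http.NewServeMux()
        mux.HandleFunc("/api/delete-my-account", func(w http.ResponseWriter, r *http.Request) {
            fmt.Fprintln(w, "ok")
        })
        http.ListenAndServe(":8080", checkSameSite(mux))
    }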
I don't think this is accurate. As your parent comment said, CSRF defenses (tokens, Origin/Sec-Fetch-Site) serve a different purpose from auth tokens/JWTs. The latter say that your browser is logged in as a user. The former say "the request actually came from a genuine action on your page, rather than pwned.com disguising a link to site.com/delete-account".
You're misunderstanding my point. Sec-Fetch-Site is not a replacement for CSRF tokens - be they cookies (classic CSRF tokens, auth tokens, JWTs; all of these can be made to work for the client to prove to the server that it is allowed to submit a form, and came from the "right" form, some obviously more easily than others), a header (such as X-CSRF-Token - Ruby on Rails, Laravel, Django; X-XSRF-Token - AngularJS; CSRF-Token - Express.js (csurf middleware); X-CSRFToken - Django), a TOTP code, etc. The Sec-Fetch-Site header is a defence-in-depth mechanism, not a replacement for CSRF protection (however that is achieved, the classic cookie mechanism or otherwise).
That’s not correct, or is at least seriously misleading. `Sec-Fetch-Site` is a replacement for CSRF tokens. The sole purpose of CSRF tokens is to prevent CSRF, and enforcing that all unsafe[1]-method requests have a `Sec-Fetch-Site: same-origin` header serves exactly the same purpose – in other words, adding a CSRF token to this policy doesn’t achieve anything. The most relevant difference for most apps is that `Sec-Fetch-Site` isn’t sent by older browsers.
Now, I would actually prefer to make this claim about the `Origin` header, since the spec for `Sec-Fetch-Site`[2] says that “in order to support forward-compatibility with as-yet-unknown request types, servers SHOULD ignore this header if it contains an invalid value.” But given that Go 1.25 is deploying a `Sec-Fetch-Site` check[3] as mentioned in the article and that places are recommending it as defence in depth, the `same-origin` value will probably never change in a way that’s backwards-incompatible with this kind of use.
False. OWASP will now be modifying its doc on the topic to say that Sec-Fetch-Site is sufficient on its own, rather than defense in depth. You really have no idea what you're talking about.
All the voting down but not a single comment as to why. The "Sec-Fetch-Site" primarily protects the browser against Javascript hijacking, but does little to nothing to protect the server.
This is probably apocryphal, but when Willie Sutton was asked why he kept robbing banks, he quipped "that's where the money is". Sure, browser hacking occurs, but it's a means to an end, because the server is where the juicy stuff is.
So headers that can't be accessed by Javascript are way down the food chain and only provide lesser defence in depth if you have proper CSRF tokens (which you should have anyway to protect the far more valuable server resources which are the primary target).
I must be missing something. What does JavaScript have to do with this? My understanding is that csrf is about people getting tricked into clicking a link that makes, for example, a post request to another site/origin that makes an undesired mutation to their account. If the site/origin has some sort of Auth (eg session cookie), it'll get sent along with the request. If the Auth cookie doesn't exist (user isn't logged in/isn't a user) the request will fail as well.
There's server security and there's client security. From what I've seen in these comments, people are focused on client security and are either a) ignoring server security, or b) not understanding it.
But the server security is the primary security, because it's the one with the resources (in the money analogy it's the bank).
So yes, we do want to secure the client, but if the attacker has enough control of your computer to get your cookies then it's already game over. Like I said, you can have time-bounded CSRF tokens (be they cookies or whatever else, URL-encoded, etc, who cares) to prevent replay attacks. But at the end of the day, if an attacker can get your cookies in real time you're done for; it's game over already. If they want to do a man-in-the-middle attack (i.e. get you to click on a fake "proxy" URL), then having the "secure" flag should be enough. The server checking the cookie against the client's IP address, time, HMAC, and other auth attributes will then prevent the attack. If the attacker takes control of your end device, you've already lost.
I, the article, and most comments here quite explicitly talked about server security via auth and CSRF protections.
None of this has anything to do with browser security, such as stealing csrf tokens (which tend to be stored as hidden fields on elements in the html, not cookies). MOREOVER, Sec-Fetch-Site obviates the need for csrf tokens.
"MOREOVER, Sec-Fetch-Site obviates the need for csrf tokens.", you're just posting misinformation, you are flat out wrong.
"It is important to note that Fetch Metadata headers should be implemented as an additional layer defense in depth concept. This attribute should not replace a [sic] CSRF tokens (or equivalent framework protections)." -- OWASP; https://cheatsheetseries.owasp.org/cheatsheets/Cross-Site_Re...
That quote is probably referring to the limitations listed later on the page (https://cheatsheetseries.owasp.org/cheatsheets/Cross-Site_Re...). I think if you understood that this was the caveat, you wouldn’t use the phrasing “flat out wrong” or have brought up all the irrelevant stuff about client/server security earlier in the thread. You have some kind of deeper misunderstanding, but it’s not clear where.
Once again, you have no idea what you're talking about. Moreover, you lack critical thinking skills - the Sec-Fetch-Site section in that doc is senseless and will now be modified to say that Sec-Fetch-Site is sufficient on its own.
I don't understand why your post is flagged. You are 100% right. The point of CSRF protection is that -you can't trust the client-. This new header can just be set in curl, if I understand correctly. Unlimited form submissions, here I come!
CSRF protects the user by not allowing random pages on the web using resources from a target website, without the user being aware of this. It only makes sense when serving people using browsers. It is not a defense against curl or skiddies.
To elaborate/clarify a bit: we defend against curl with normal auth, correct? Be it session cookies or whatever. That plus Origin/Sec-Fetch-Site (and TLS, secure cookies, HSTS) should be reasonably secure, no?
Indeed, you need some form of CSRF protection, but Sec-Fetch-Site is primarily focused on keeping the browser secure, not the server. Having said that, it's nice defence in depth for the server as well, but not strictly required as far as the server is concerned.
I'm confused. In my mind, you only really need to keep the server secure, as that's where the data is. Auth cookies and csrf protections (eg Sec-Fetch-Site) are both used towards protecting the server from invalid requests (not logged in, or not coming from your actual site).
What are you referring to when you talk about keeping the browser secure?
The Sec-Fetch-Site header can't be read / written by JavaScript (or WASM, etc); cookies (or some other tokens), on the other hand, can be. In most circumstances, allowing JavaScript to access these tokens allows for "user friendly" interfaces where a user can log in using XMLHttpRequest / an API rather than a form on a page. OOB tokens, on a one-off auth basis or continuous (i.e. OAuth, TOTP with every request), are more secure, but obviously require more engineering (and come with their own "usability" / "failure mode" trade-offs).
> The Sec-Fetch-Site header can't be read / written by JavaScript
Perfect. It's not even meant or needed to be. The server uses it to validate the request came from the expected site.
As I and others have said in various comments, you seem to be lost. Nothing you're saying has any relevance to the topic at hand. And, in fact, is largely wrong.
"Nothing you're saying has any relevance to the topic at hand. And, in fact, is largely wrong."; your confidence in your opinion doesn't make you right.
This is not what this is supposed to protect, and if you are using http.CrossOriginProtection you don't even need to add any header to the request:
> If neither the Sec-Fetch-Site nor Origin headers are present, then it assumes the request is not coming from web browser and will always allow the request to proceed.
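For example, wiring it up looks roughly like this, assuming Go 1.25's http.CrossOriginProtection behaves as quoted above (a sketch, not a production config):

    package main

    import (
        "fmt"
        "net/http"
    )

    func main() {
        mux := http.NewServeMux()
        mux.HandleFunc("/api/delete-my-account", func(w http.ResponseWriter, r *http.Request) {
            fmt.Fprintln(w, "ok")
        })

        // Wrap the mux: browser requests whose Sec-Fetch-Site (or Origin)
        // indicates another site get rejected, while header-less clients
        // such as curl pass straight through, as the quote says.
        protection := http.NewCrossOriginProtection()
        http.ListenAndServe(":8080", protection.Handler(mux))
    }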
Wait, but if those headers are missing, then isn't there a vulnerability if someone is using an old browser and clicks on a malicious link? Do we need to also check user agent or something else?
Exactly, the post talks about this too: older browsers will be vulnerable. This probably affects only a small share of the population, and it's even lower if you limit the service to TLSv1.3 (for this to be useful you of course need to enable HTTPS, otherwise the attacker can just strip the headers from your request).
If you can't afford to do this you still need to use CSRF tokens.
I suppose that we could just reject anything that doesn't have these tokens, depending on whether you want to allow curl etc... I might just do that, in fact.
This is like a broadband (white noise) EW jammer; i.e. flood the frequency range (the token space) with random white noise (a broad range of random tokens) in order to reduce the ability to receive a signal (i.e. information).
Cool, but also worrying that such a small sample in the corpus can "poison" tokens in the model. Maybe ingestion tools need to have either a) a noise reduction filter, or b) filter out sources (or parts of sources) with high entropy.