For pub/sub, yes, that's correct. For the full picture, though: Redis later added streams (in 5.x) for the don't-want-to-miss case: https://redis.io/topics/streams-intro
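To illustrate the semantic difference being discussed (not the actual Redis API), here's a toy Python sketch: pub/sub is fire-and-forget, so messages published before a consumer connects are simply lost, while a stream is a persistent log that a consumer can read from any offset.

```python
# Toy sketch of the two delivery semantics. Real Redis uses
# SUBSCRIBE/PUBLISH for pub/sub and XADD/XREAD for streams.

class PubSub:
    """Fire-and-forget: subscribers only see messages published while connected."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self):
        inbox = []
        self.subscribers.append(inbox)
        return inbox

    def publish(self, msg):
        for inbox in self.subscribers:
            inbox.append(msg)

class Stream:
    """Log-based: consumers can read history from any offset, so nothing is missed."""
    def __init__(self):
        self.log = []

    def add(self, msg):
        self.log.append(msg)
        return len(self.log) - 1  # simplified entry ID

    def read(self, since=0):
        return self.log[since:]

ps = PubSub()
ps.publish("missed")       # published before anyone subscribed: gone forever
inbox = ps.subscribe()
ps.publish("seen")         # only this one arrives

st = Stream()
st.add("kept")             # added before the consumer shows up
st.add("also kept")        # ...but both are still readable from offset 0
```

The same distinction holds in Redis proper: a `SUBSCRIBE`d client that disconnects misses messages, while `XREAD` from ID `0` replays the stream's history.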
Clarification there: I have a US-8 behind each TV, taking 802.3at in for power. For the TV with an AP or something else PoE behind it, those switches can take in 802.3at and output 802.3af on port 8. The TVs themselves aren’t PoE...I’m crazy, just not that crazy :)
I just wanted to chime in from Stack Overflow here and let people know: we are aware of the issue. And we're NOT okay with it. We're trying to sort out how to kill the audio behavior now. It's not very straightforward to find where it's coming from, but we are working on it. We've also reached out to Google for their assistance in tracking it down. If anyone can offer advice, we'll more than happily take it.
- Nick Craver, Architecture Lead at Stack Overflow
It's ridiculous. It's a text-based ad. At worst, it's a clickable image. At what point did it become okay in your minds to let advertisers run arbitrary code?
I've left ads turned on specifically on StackOverflow because 1) I want to support StackOverflow, and 2) I trust them not to run malicious ads.
I don't even care that they're running ads network-wide. But if they're going to be running these kinds of ads anywhere on the site, they're going right on the ad block list along with everyone else.
It’s completely insane. Can you imagine a TV station receiving ads on tapes and playing them to their audience without looking at them first? Can you imagine TV stations occasionally showing ads containing porn, urging people to kill, showing extreme violence during cartoons, or containing specially crafted audio that blows out your speakers, and the TV station just shrugs and says they try their best to stop these things but they can’t stop everything?
Imagine a TV ad that tries to make your phone call a 1-900 number so they can rip you off, and the station says they don’t know where it came from but they’re trying real hard to put a stop to it. And somehow watching the ads themselves before broadcasting them never crosses their mind.
It’s worse than that. Imagine a TV ad which sends malicious code that gets executed to your television, which profiles the hardware in your TV and sends information about your viewing habits (tied to a unique ID) back to the advertiser.
In any other context we would call this a security vulnerability. I think that label also applies here.
I bought a new wifi router and never told the Vizio the new credentials. If it manages to somehow figure out how to log onto the new router, and transmit the data about how I don't own cable service and mostly use it to play retro games? I'm going to be kind of impressed, really; at that point, Vizio can have the data.
In the post I linked to, the TV in a similar situation was happily connecting to someone else's (open) WiFi network nearby. You can't really block those…
The state of web ads is closer to the public pinboard, only instead of ads for grandma's couch it's Mr. CEO trying every trick to drain your money and track you.
The only reason it’s not the same situation is because they’re willing to throw their users under the bus for a little extra cash. If they wanted to exert more control, they absolutely could. Ads would cost more and we’d see fewer distinct ads as a result.
Digital ads could work where every single one is vetted by people before it’s served to any users. There is no reason it can’t work this way, other than it being a lot cheaper to skip that step.
All creatives (and the root templates of dynamically constructed ones) are actually audited on the advertiser-facing platforms before they ever get to the publisher.
Unfortunately running javascript means these ads can do anything at any time and change into malware. Other than adding some technical guardrails, the best practice would be to ban bad actors (of which many are known and usually the same shady people) but many large adtech companies look the other way because it makes money and they have no consequences.
Malware and adfraud is primarily a business problem, not a technical one.
It's not that simple. There are many layers in the supply chain that currently require JS. Publishers can't disable the JS and they can't demand JS-free creatives either.
Of course it’s that simple. Don’t let ads run JS. Done.
You’re saying that doing this would drastically decrease ad revenue. Which is what I’m saying too: it’s about money, not necessity.
Would a site like SO be unable to survive without ads that run arbitrary JS? I don’t know. Even if the answer is that they must do this to survive, it’s still insane that content companies let randos inject arbitrary code into their pages. If this is so entrenched in the industry that there’s no way around it, that just means the industry is insane.
Money is a necessity, that's how SO exists, and it wouldn't sustain its current size if it required JS-free network campaigns or tried to sell all ad space directly.
Simple doesn't mean it's easy or realistic. Yes, adtech has major problems but they're being slowly worked on and won't change overnight. This applies to any other industry where you think you can just walk in and solve everything if everyone just did X. Reality doesn't work that way.
We know that advertising can work and make money without arbitrary JS. When there’s a clear existence proof, is it really wrong to say that a problem could be solved by not doing the problematic behavior?
Of course reality doesn’t work that way. Ad companies aren’t going to change, because they like money and don’t give a shit about users.
We’re stuck in a local minimum. It’s insane. It could be easily fixed if everyone just stopped doing the insane things. And they won’t stop.
Yet adverts on porn sites do operate as per our wish list:
* adverts are vetted by a human
* adverts are not allowed to inject JavaScript.
There have been a few interesting blog posts from businesses outside of the adult entertainment industry where they discuss just how much work is involved in getting an advert approved on adult sites.
It’s a sad state of affairs when an adblocker is less required on porn sites than it is on Stack Overflow.
All major ad networks audit every single creative. The problem is javascript can change at any time, and the publisher is the most exposed yet the most removed from being able to discover and mitigate it. There have been some movements to whitelist the JS providers but volume is incentivized so most networks look the other way for now.
Adult ads are definitely not better and are served by even looser networks that allow anything. That industry has pioneered things like popunders, clickjacking, and monetizing every possible action on a window while serving as the primary vector for malware and browser bitcoin mining. I'm not sure what blog posts you've read but the only strict standards they would have is on getting paid.
Like everything, it depends on the sites in question. Disreputable adult sites aren’t going to be any better or worse than disreputable sites with any other content. However, adult sites run as a reputable business - of which there are many - most certainly do follow the points I described earlier.
What you’re effectively doing is looking at Source Forge and then arguing that Github, Gitlab and Bitbucket are all probably just as bad.
That sounds nice but is neither realistic nor sensible. There are other solutions like sandboxing to prevent access to features; it's not an unsolvable problem.
Billions? No single creative is seen by that many. In fact, with dynamic creative optimization (DCO) and all the optimization that happens, you can easily get creatives that are custom generated and only seen by a few individuals or even a single person.
I wrote both comments. There are billions of impressions but a single creative is not seen by that many. The point is that the scale is too large to validate on the publisher side.
It seems to me there are two solutions to this problem:
* remove the ability for 3rd parties to abuse their automatic powers (ie disable their ability to inject JavaScript)
* or have a human manually vet every creative
The problem here is you neither want to control their access nor take responsibility for monitoring their access. So the blame equally lies with yourselves for not managing an easily exploitable vector of attack.
If this were any other system, eg VPN, security professionals would tear you a new asshole and point out just how irresponsible your lack of management is.
Your only excuse here is greed, and frankly I’m disgusted.
Major ad networks already vet every creative. The problem is javascript, which can change at any time. Banning javascript in creatives is not a technical problem, it's a business and politics problem. Same with just about every other issue in adtech.
I'm not sure who you think I am or why you're accusing me but none of this is down to a single person.
I think this comment[1] on the linked Meta question explains it pretty well:
> To the people confused why ads need to run their own Javascript (even ones that are just static images): The short answer is that Ad Networks do not and cannot trust website operators. They need to run their own JavaScript served from their own servers in order to verify that a real user saw the ad and for how long, and they can't trust the website operator to tell them. And these pieces of JavaScript tend to be more invasive and privacy-destroying than the website's JS because they care, far more than the actual website does, that the "user" is not a bank of iphones in a sweatshop in China.
Not just arbitrary JavaScript, arbitrary JavaScript where they can’t easily even see where it came from! Sheesh.
Could we require advertisers to sign their ad code to have a trail of where it came from, prevent tampering, and make it easier to pull the plug on bad actors?
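The web already has a tamper-evidence primitive along these lines: Subresource Integrity (SRI), where the embedding page pins a hash of the script and the browser refuses to run it if the bytes ever change. It doesn't provide provenance or revocation on its own, but it's the obvious building block. A sketch of computing the digest (the `creative` bytes are a made-up example):

```python
# Sketch: computing a Subresource Integrity (SRI) digest for a script.
# A page embedding the script with this digest will refuse to execute
# it if the served bytes differ -- so a creative swapped for malware
# after review would simply fail to load.
import base64
import hashlib

def sri_digest(script_bytes: bytes) -> str:
    # SRI digests are "<algo>-<base64 of raw hash>"; sha384 is the common choice.
    digest = hashlib.sha384(script_bytes).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")

creative = b"console.log('hello ad');"  # hypothetical reviewed creative
integrity = sri_digest(creative)
# Used as: <script src="..." integrity="sha384-..." crossorigin="anonymous">
```

The catch, per the surrounding discussion, is that ad networks *want* their scripts to change at any time, which is exactly what SRI forbids.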
The people bearing the costs of the internet ad economy aren’t the people in any position to do anything about it. So there’s very little pressure to fix anything.
Maybe if the US government started threatening to enact something like GDPR unless the ad industry gets its shit together.
Large adtech demand/sell side platforms do not want to remove these bad actors because they make money on percentage of spend. They are incentivized to increase volume and ad spend at all costs, and there is no regulation to stop them from doing otherwise by continuing to deal with shady companies and known malware techniques.
This is not a solution. JS still runs, it just has limited access to certain features.
You also need to somehow <iframe> the ad content (and serve it from somewhere else with the feature policy header set/attribute on the iframe set) or else sacrifice use of these features on your own site.
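A minimal sketch of that iframe approach (the `ads.example.com` origin and file name are hypothetical):

```html
<!-- Serve the creative from a separate, ad-only origin and restrict it.
     sandbox blocks top-level navigation, popups, form submission, etc.
     (allow-scripts re-enables JS only); the allow attribute
     (Feature/Permissions Policy) cuts off specific APIs like audio autoplay. -->
<iframe src="https://ads.example.com/creative.html"
        sandbox="allow-scripts"
        allow="autoplay 'none'; microphone 'none'; geolocation 'none'">
</iframe>
```

Because the creative lives on its own origin, the restrictions don't bleed into the host page's use of those same features - which is the point the comment above is making.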
The solution is to make the ads inert. They do not need to run code.
Sites like StackOverflow require JavaScript to work (or at least, to work in a manner approaching interactivity). So, even someone who disables JavaScript normally, would presumably enable it in order to use this popular and useful site. Furthermore – and importantly – they place trust in StackOverflow not to abuse the privilege of executing arbitrary JavaScript. That is an entirely reasonable thing for a technically savvy modern web user to do.
By serving this ad with JavaScript not vetted to StackOverflow's presumed standard, StackOverflow has violated that trust. Thus the onus is on them, not the user, to remove the offending ad or risk damaging their brand.
Honestly, what you said is like saying "why would you ever not keep a hand on your wallet" after someone got pickpocketed in a nice restaurant. Reasonable people have reasonable expectations of safety in certain places which they trust to provide it for them. No-one should go around being constantly paranoid of pickpockets everywhere, no more than anyone on the web should be constantly paranoid of malicious JavaScript even on sites with established records of safety.
> So, even someone who disables JavaScript normally, would presumably enable it in order to use this popular and useful site.
I agree that StackOverflow is at fault here, but enabling JS is not a binary choice — "allow all JS on this site" vs "block all JS on this site" are not your only options.
Tools like uMatrix allow me to control JS coming from different domains on different domains independently. For example, on SO I have enabled JS from Stack Exchange and related domains, but not from Google or other snoopers.
"The ad is attempting to use the Audio API as one of literally hundreds of pieces of data it is collecting about your browser in an attempt to "fingerprint" it... Your browser may be blocking this particular API, but it's not blocking most of the data."
Seems like killing the audio is the metaphorical finger in the dike of serving arbitrary JavaScript to your users.
> we are aware of the issue.
> We're trying to sort out how to kill the audio behavior now.
Are you really aware of the issue? The issue people have here is not the fact that the ad is trying to access the audio api per se but that it is trying to fingerprint the users.
If you're "NOT okay with it", how about stopping ads completely until you resolve this problem? That should give a bigger impetus to solve it ASAP as the bottom line gets hit for multiple stakeholders.
This is not just about ads, but about third parties fingerprinting and tracking users one way or another. It's plain evil, and not a decent thing to continue foisting on your unsuspecting users once you know about it. Tell management to take an ethical stance and preserve the reputation of SO.
Probably not his call. By "we" he's probably talking about the engineering team, which in many cases is nothing more than a conduit for whims of the marketing and sales teams.
The only time they'd do that is if the marketing team decided that the value-add from taking ads off cancelled out the profit loss from taking the ads off.
I completely understand that it may not be his call. That's why I said "Tell management to take an ethical stance and preserve the reputation of SO."
Maybe he (or someone else in the team) has already given this as a temporary solution but it's been rejected. Since we don't know what's going on in the background, this suggestion being put on a public forum is still worthwhile. It could also help external parties (like HN readers) add more pressure in not letting this kind of surveillance continue just because the company doesn't want to stop making money while they're working on a solution or waiting for Google (or someone else) to help.
Every minute they delay cutting this off puts thousands of people in a position of vulnerability.
It's hard to read the obfuscated code and be sure what's being done with the browser environment information. This script seems to generate some hash and put in some global variables, presumably for some other script to consume. I don't know whether such scripts send it to a server, compare it locally to a previously-known value, or ignore it.
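For context, the general shape of such fingerprinting code is simple even when the shipped version is obfuscated: collect many environment attributes, serialize them canonically, and hash the result into a stable ID. A rough Python sketch (the attribute names are hypothetical stand-ins for navigator/canvas/AudioContext probes):

```python
# Sketch of the fingerprinting pattern: many weak signals hashed into
# one identifier that is stable across visits for the same browser.
import hashlib
import json

def fingerprint(attributes: dict) -> str:
    # Canonical serialization (sorted keys) so the same environment
    # always produces the same hash.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

fp = fingerprint({
    "userAgent": "Mozilla/5.0 ...",
    "screen": "2560x1440x24",
    "timezone": "UTC-5",
    "audioHash": "a1b2c3",   # e.g. output of an AudioContext probe
})
```

Blocking any single API (like audio) just removes one key from the dict; the hash changes but the technique keeps working, which is the commenter's point about "hundreds of pieces of data".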
This is the actual problem at the heart of it all. And even if it were more profitable to take subscription fees than to serve ads, what's stopping you from "double dipping" and serving ads anyway?
ArsTechnica (obviously a very different site compared to SO) has an ad free subscription model where it also removed all trackers for paying subscribers. It's possible to do this in an ethical way. Whether the site publisher is interested or not is a different matter.
You think the NY Times, Linkedin, etc. is going to have the same response as StackOverflow? Good luck even getting in touch with someone who knows what you're talking about.
If LinkedIn (to choose a random example) advertises one of the perks of subscribing is that you won't be tracked, and then tracks you anyway, that's a story for The New York Times et al.
Sure. But my point was that the NYT is an example of a paid service that openly serves you a big pile of invasive ads, even if you're a paying subscriber.
Imagine if all the ads in the print edition were spying on and tracking your every move.
Very likely. I'd pay hundreds of dollars a year to Google if they guaranteed* me, with severe legal repercussions otherwise, that they wouldn't track me, or allow a single bit of my data, anonymized or not, leave their servers, or be used in any other way that wasn't for my own purpose.
Re-selling digital personas as commodities must be far more lucrative.
I actually wonder about this. SO's typical user is tech-savvy, and I would imagine many access the site with adblockers on (I do). So I suspect my value to the site in terms of ads is close to zero. I would happily pay a monthly subscription to know that the service will remain, given how much value I derive from it, if that gave me the assurance that they won't track me with ads/cookies/fingerprinting.
Their other income is from job ads, and I guess the value is that they have lots of data points about their logged in users (with scores high enough to imply they've interacted with the site a fair bit), in the form of what is posted, worth more than the aggregated list of websites that a user sees (as reported by ads).
I'd love to know more about this, as I have very little understanding of the economics of serving targeted ads. How much can they be making from ads?
I hear from multiple sides people reporting that they receive ads about topics they only talked about with friends and never entered into a search engine.
Google is currently as far away as it has ever been from its previously world-famous "don't be evil" corporate culture.
Another example is AMP, with which Google de-individualises URLs. This is being driven to the extent that Chrome on Android makes it harder to edit the URL.
Or games like Ingress or Pokémon Go, which in my opinion help Google constantly update their WiFi-SSID-to-GPS-location database. This database is then furthermore used to track users' locations through a little permission called "WiFi Control", which also cannot be found in the regular App Permissions settings entry.
To me, "WiFi Control" sounds nothing like location tracking. But I have to admit I am not a native speaker, so I might be misunderstanding something.
Just out of curiosity, do those 7.5+ million accepted answers include those closed as duplicates? Because by far my biggest complaint is finding the exact question I have was closed as a duplicate and links to a question that is useless at answering my question.
In that case you can vote to re-open and perhaps even post a bounty. Although bounties tend to invite lots of low-quality, low-effort answers just on the off chance that they might be the top-voted one once the bounty runs out.
I feel you. I taught myself programming between age 13 and, well, I'm now 23; so by the time stackoverflow came around I had figured out how to solve things myself. When I have a question, it's usually either opinion-based (bad fit for SO) or not a common question.
I'd say 1:20 is a good estimate if I ignore answers that didn't read my question (which is most of them), but indeed the facts disagree.
Back then I didn't speak proper English, and how many questions were actually covered on SO in the beginning? It took some years to get to where we are, both for SO and for my English ;)
We could - but the network side isn't the problem. There's a lot of logging, user banning, etc. pieces that need IPv6 love first. We just haven't had the time yet.
There are network bits we'd have to evaluate heavily as well, e.g. firewall rules - basically the very limited benefits don't make it a priority, yet. When things change there, we'll do it.
Split horizon would point you at the same data center, rather than the writeable one. So that's more of a .local than a .internal. We discussed this, but ultimately the AD version we're on (pre-2016 Geo-DNS) doesn't actually support it the way you'd need, and it's a nightmare to debug.
We'd consider it for a .local when the support is properly there in 2016. Even subnet prioritization is busted internally, so that's a bit of an issue. Evidently no one tried to use a wildcard with dual records on 2 subnets before (we prioritize the /16, which is a data center) and it's totally busted. Microsoft has simply said this isn't supported and won't be fixed. A records work, unless they're a wildcard. So specifically, the <star>.stackexchange.com record which we mirror internally at <star>.stackexchange.com.internal for that IP set is particularly problematic.
TL;DR: Microsoft AD DNS is busted and they have no intention of fixing it. It's not worth it to try and work around it.
Those are 1220 bytes. I'm not sure what they'll compress down to, but it's still non-trivial and not near 0 (anyone want to run the numbers?).
The same pair of headers are 969 bytes for facebook.com and 2,772 for gmail.com.
I don't know what ours would be - since we're open-ended on the image domain side it's a bit apples-to-oranges compared to the big players.
When you take into account that you can only send 10 packets down the first response (in almost all cases today) due to TCP congestion window specifications (google: CWND), they get more expensive as a percentage of what you can send. It may be that you can't send enough of the page to render, or the browser isn't getting to a critical stylesheet link until the second wave of packets after the ACK. This can greatly affect load times.
Does HPACK affect this? Yeah absolutely, but I disagree on "negligible". It depends, and if something critical gets pushed to that 11th packet as a result, you can drastically increase actual page render time for users.
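The arithmetic behind that "10 packets" budget is worth making explicit. Assuming a typical ~1460-byte MSS (a simplification; real MSS varies with MTU and options):

```python
# Back-of-envelope for the initial congestion window argument above:
# the server can send initcwnd * MSS bytes before it must wait for the
# client's first ACK, and every response-header byte eats into that budget.
INITCWND_SEGMENTS = 10     # common default since Linux 2.6.39 (RFC 6928)
MSS_BYTES = 1460           # typical for a 1500-byte Ethernet MTU

first_flight = INITCWND_SEGMENTS * MSS_BYTES   # bytes sendable pre-ACK
header_bytes = 1220                            # the header pair quoted above
body_budget = first_flight - header_bytes      # what's left for HTML/CSS

print(first_flight, body_budget)               # 14600 13380
```

So uncompressed, those headers consume roughly 8% of the first flight - enough to push a critical stylesheet reference past the first round trip on a tight page.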
Oh I wasn't clear - I meant that for the same connection headers are not sent for every page but just references for previous values (see [0]). The initial page load is a different matter but that's part of the cost/risk analysis if you need CSP or HPKP (I agree it's not necessary and very easy to mess up).
> When you take into account that you can only send 10 packets down the first response (in almost all cases today) due to TCP congestion window specifications (google: CWND), they get more expensive as a percentage of what you can send. It may be that you can't send enough of the page to render, or the browser isn't getting to a critical stylesheet link until the second wave of packets after the ACK. This can greatly affect load times.
I wonder how much of the page can be rendered in 10 packets...
I explicitly try to ensure that for my sites the first 10kB sent (so less than 10 packets typically) is enough to render all the information above the fold. Anything essential should make it out in the first 2 packets for old TCP slow-start rules. (Lipstick and ads can arrive later, once the user is happy reading or whatever, IMHO.) Has been my policy since about the mid '90s!
Well if it wasn't for someone buying <star>.com back in the day, we probably could have them. Oh and then buying <star>.<star>.com after browsers banned that one, which led to RFC 6125 rule clarifications and restrictions.
Hey, I'm pretty sure that the first real domain name hack was sex.net, which as the proud owner of ex.net [PS: or was it sexnet.com, as we also have exnet.com?] caused some upset for a while, though mainly to disappointed one-handed typists I believe... B^>
BTW, did I blink and miss the "It really is all faster over HTTP/2, even given TLS" bit? My testing for my tiny lightweight sites close to their users (the opposite of what you're dealing with) is that HTTP/2 is slightly slower overall. Even with Cloudflare's advantages such as good DNS. And with the pain of cert management...
Use the Java applet below to search ExNet's main Web pages.
When the "Status" indicator stops flashing and says "Idle", type key words in the "Search for:" box.
The "Results:" box will show you the documents that matched your key words, the best matches coming first in the list. Click on any line in the "Results:" box, and that document should appear in a new browser window in a few seconds. When you are finished with that document, you can close it without killing your browser.
That code did search-by-word from (IIRC before Google existed, ie Netscape 2) right up until Java applets were dropped, across all compliant browsers AFAIK. It did roughly what G's live search now does.
I would imagine the more resources your page has, the more benefit you can get from HTTP/2 because of Server Push. So if you're comparing a tiny lightweight site, I'm guessing you can't benefit as much from Server Push.
I have relatively little that would benefit from push; basically a tiny hand-crafted CSS file that I currently inline because HTTP/1.1 and even HTTP/2 overhead for having it separate may be too high.
Yep - we're aware. I thought about putting in our Content-Security-Policy-Report-Only findings about what all would break, but the post was already a tad long. It's quite a long list of crazy things people do.
As the headers go, here's my current thoughts on each:
- Content-Security-Policy: we're considering it, Report-Only is live on superuser.com today.
- Public-Key-Pins: we are very unlikely to deploy this. Whenever we have to change our certificates it makes life extremely dangerous for little benefit.
- X-XSS-Protection: considering it, but a lot of cross-network many-domain considerations here that most other people don't have or have as many of.
- X-Content-Type-Options: we'll likely deploy this later, there was a quirk with SVG which has passed now.
- Referrer-Policy: probably will not deploy this. We're an open book.
> - Public-Key-Pins: we are very unlikely to deploy this. Whenever we have to change our certificates it makes life extremely dangerous for little benefit.
Is it possible to pin to your CA's root instead of to your own certificate? That would make rotating certs from the same CA easy but changing CAs hard (but changing CAs is already a big undertaking for big orgs).
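For reference, an HPKP pin (RFC 7469) is the base64 of the SHA-256 of a certificate's DER-encoded SubjectPublicKeyInfo - and the spec lets you pin any key in the chain, so pinning a CA's intermediate or root SPKI is exactly what allows rotating leaf certs freely. A sketch of the computation (`spki_der` is placeholder bytes, not a real SPKI):

```python
# Sketch: computing an HPKP pin value per RFC 7469.
# pin-sha256 = base64( SHA-256( DER-encoded SubjectPublicKeyInfo ) )
import base64
import hashlib

def hpkp_pin(spki_der: bytes) -> str:
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")

spki_der = b"\x30\x82..."   # placeholder; extract the real SPKI from the cert
header = (
    'Public-Key-Pins: '
    f'pin-sha256="{hpkp_pin(spki_der)}"; '
    'max-age=5184000'
)
```

The danger the parent comment describes remains even when pinning the CA: if the pinned CA key is compromised or the org is forced to change CAs before `max-age` expires, returning clients are locked out - which is why backup pins are mandatory and why many sites (and eventually browsers) abandoned HPKP.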