Is Google (and mod_pagespeed) the Poor Man's CDN? (codefromaway.net)
46 points by narcissus on Feb 2, 2012 | 35 comments



This article really doesn't have anything to say other than that they found a new feature and think we should install it. What was the speed difference? Did it alleviate load on their servers?

I spoke to my web host about mod_pagespeed before; they said it was far too buggy to use just yet. Just an anecdote, but I happen to like my host a lot.


How is this different from simply creating a sub-domain on your server for CDN-like functions?

For example, Google's "Page Speed, Performance Best Practices" recommends setting up your own sub-domain(s) to serve static content[1]. Yahoo! recommends using a CDN service provider[2], but the general idea of using other domains to serve static content is the same.

Also, some static content can be referenced from third-party hosts, such as jQuery on Google Code or Twitter's Bootstrap CSS on GitHub.
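Concretely, that just means pointing asset URLs at other hostnames, e.g. (static.example.com is a placeholder; the Google URL is their hosted-libraries CDN):

    <!-- your own cookie-free static subdomain (placeholder hostname) -->
    <link rel="stylesheet" href="http://static.example.com/css/site.css">

    <!-- or a third-party host, e.g. jQuery from Google's hosted libraries -->
    <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>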

As an aside, I'm surprised that, seemingly, a large number of web developers have never heard of - or read - the performance best practices from Google or Yahoo!.

1. http://code.google.com/speed/page-speed/docs/rtt.html#Parall...

2. http://developer.yahoo.com/performance/rules.html#cdn


It's different because it seems you're confusing CDNs with parallelizing downloads. These are not one and the same.

They're different because the recommendation to create one or more subdomains for static content has two purposes:

1. So that you avoid the overhead of things like cookies being sent with each request, since it's a separate hostname from the one serving the dynamic content.

2. So that there are multiple hostnames to download static content from, letting browsers work around their maximum-connections-per-host limit and fetch your static content in parallel.

Neither of those points involves a CDN: a geographically dispersed, high-speed network serving content to your viewers from the nearest edge server.

The proposal in the linked article is that you can take advantage of a CDN service for free, without having to pay the likes of Amazon CloudFront.


True, which is why I said "CDN-like functions": for smaller web applications and websites, parallelized downloads would increase performance without having to use a CDN service.

Also, what about hosting your static scripts and style sheets on a service like Google Code or GitHub? Would that give you a "CDN-like" performance increase as well?

I'm advocating non-CDN techniques that can improve performance for smaller web applications and websites.


I don't believe any such static host would necessarily improve your performance over your own host or S3.

CloudFront is surprisingly inexpensive, especially for smaller web applications and websites: http://aws.amazon.com/cloudfront/pricing/

You probably have only a few MBs of static content, and CloudFront can be configured with a custom origin that just fetches from http://yournormalwebsite.com. No S3 bucket involved, no uploading content. Just change your static content URLs to http://cloudfrontURL/static-content.

Even with tens of thousands of visitors a month, you'll owe... < $5.
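A back-of-the-envelope sketch (the traffic figures are assumptions; the rates are roughly CloudFront's published US prices from the page above):

    # Rough CloudFront cost estimate; visitor and asset figures are assumed,
    # rates are approximately the 2012 US-region prices.
    visitors_per_month = 30_000
    static_kb_per_visit = 500      # assumed static payload per visit
    requests_per_visit = 20        # assumed asset count per visit

    gb_out = visitors_per_month * static_kb_per_visit / 1024 / 1024
    requests = visitors_per_month * requests_per_visit
    cost = gb_out * 0.12 + requests / 10_000 * 0.0075
    print(f"~{gb_out:.0f} GB out, ~${cost:.2f}/month")  # ~14 GB, ~$2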

Or, as suggested elsewhere here, just use CloudFlare, which is $0.


IIRC the point of using a subdomain is to get rid of cookies so that the server can cache the static content efficiently.

This is entirely different. Looks like the trick described in this article lets you "host" your static content on Google's CDN.


Funny how I made a couple of busy sites faster recently by taking them off a CDN and just moving them to a faster, better-configured server instead (which, ironically, was less expensive). All the CDN in the world would not have made them as fast as they are now.

Unless your site is heavily image-based, I simply do not understand using a CDN for static content.

If you are trying to trick browsers into accepting more connections, first ask yourself why your site is so poorly designed that it needs so many connections; then, if necessary, use additional domains pointing to your own servers.


A good CDN provides "close" (network latency wise) servers so if there are 30 assets to be loaded, most are done over a 10 ms hop, not a 60 ms hop. Get a complex page with 100+ assets, or try to serve it to another continent and the difference is even more impressive.

Also if you're running a "busy" site off one server, it may not be busy enough to warrant a CDN's traffic off-loading perk. When you start talking about thousands and tens of thousands of requests per second, being able to off-load 80% of those requests makes a huge difference in how much hardware you need to deal with locally.


There should never be 100 externally loaded elements on an initial page load.

Any site that does that needs a better developer.


So say 50 elements. With a 50 ms difference, that's 2.5 seconds of total time. Most browsers will do 2-4 simultaneous connections, so let's say 4; that's still over half a second faster per page load. Now put that end user in Europe or Africa, and you're looking at 200-500 ms latency per call, versus 10 ms. You can see how the math works.
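Spelled out (every number here is an assumption from the paragraph above):

    # Back-of-the-envelope version of the math; every number is an
    # assumption from the comment above, not a measurement.
    assets = 50
    latency_saved_ms = 50       # 60 ms origin hop vs 10 ms edge hop
    parallel_connections = 4    # typical per-host browser limit

    serial_saving_ms = assets * latency_saved_ms               # 2500 ms
    effective_saving_ms = serial_saving_ms / parallel_connections
    print(f"~{effective_saving_ms:.0f} ms faster page load")   # ~625 ms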

Assets loaded on some of your recently submitted URLs:

21 - http://www.mozilla.org/en-US/firefox/10.0/releasenotes/

60 - http://www.adequatelygood.com/2010/2/JavaScript-Scoping-and-...

104 - http://www.hollywoodreporter.com/thr-esq/falling-skies-lawsu...

282 - https://plus.google.com/u/0/111314089359991626869/posts/HQJx...

54 - http://www.computerworld.com/s/article/9223601/Anonymous_dup...

67 - http://abcnews.go.com/blogs/politics/2012/01/rand-paul-in-pa...

70 - http://www.wptavern.com/bad-behavior-in-the-wordpress-commun...

etc....

The impact of reducing latency for each of those calls, and offloading those requests from your server(s) is a big deal for big sites.


Strong upvote for defending your argument with researched examples.

I guess I'm old school; I'd never serve that many elements on an initial page load, CDN or not. That's crazy.

Each additional DNS lookup can add up to 2 seconds if it's a cold cache-miss.

Most modern browsers/servers use pipelining so it's not a 50ms connect each time.

Different continent I might understand the desire. But I have a server in VA that can serve western Europe at 75-100ms connect time, which is not horrible.


Mildly off-topic.

We've been using MaxCDN, since it's not terribly expensive for a full year, and since it's paired with a WordPress site we've coupled it with a caching plugin and also incorporated CloudFlare. MaxCDN seems very good at caching static content such as JS and CSS files, but I don't believe it caches the HTML itself, nor does CloudFlare. That's only done locally.

I do wonder if there's some other service out there, one that's not terribly expensive, that caches HTML too. I've noticed that even with a CDN and services like CloudFlare, if you don't have a dedicated server to handle the HTML load, the setup is going to fail.

I'd be interested to learn if anyone has experience with things like nginx or Varnish for caching their HTML, and what cloud services they're using to achieve this.
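For what it's worth, the nginx side of that is usually a small amount of config. A minimal sketch (the paths, zone name, backend address, and timings are illustrative assumptions, not a tested production setup):

    # Microcaching rendered HTML in front of a backend; all values below
    # are illustrative assumptions.
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=html:10m
                     max_size=1g inactive=10m;

    server {
        listen 80;
        location / {
            proxy_pass http://127.0.0.1:8080;     # your WordPress/PHP backend
            proxy_cache html;
            proxy_cache_valid 200 301 302 1m;     # cache pages for one minute
            proxy_cache_use_stale error timeout;  # serve stale if backend dies
        }
    }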


Unless I misunderstand, it seems Google will cache HTML if you set Cache-Control:

https://code.google.com/speed/pss/docs/cache.html
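For example, sending a response header like "Cache-Control: public, max-age=600" should mark the page as publicly cacheable for ten minutes (that particular max-age is just an illustration).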


Fastly (fastly.com) is designed for exactly this purpose.


I have to admit, I found Fastly's pricing structure very confusing. Are you affiliated with Fastly?

I'd be quite interested in it as a service if it could help me handle high volumes of traffic on a shared hosting account, negating the need to move to a dedicated server.


Alternatively, just use CloudFlare.


We've been using CloudFlare for two weeks now on an image-serving site. It saves us about 1.5 TB of traffic per day.

I still don't understand how they can offer this for free; I'd be happy to pay them a few hundred USD per month, and it would still be a lot cheaper than AWS or any other cloud host out there.


> I still don't understand how they can offer this for free

You allow the following:

- Add tracking codes or affiliate codes to links that do not previously have tracking or affiliate codes.

- Add script to your pages to, for example, add services or perform additional performance tracking.


We only enabled CloudFlare for two of our subdomains. These subdomains only serve images, no HTML or script files.

So I guess we're circumventing their monetization model and will be kicked off anytime now?!


You certainly won't be kicked off CloudFlare and I don't know what the parent is talking about. CloudFlare doesn't modify your pages to make money for itself. It makes money by people paying for premium services.


> I don't know what the parent is talking about.

I'm talking about CloudFlare's ToS. I don't know whether they modify pages to make money or not, but they make it clear that it's a possibility.


That section of the ToS begins: "Depending on the features you select, CloudFlare may modify the content of your site." The examples listed are things that you control, by default it doesn't modify the page.


I see, thanks. Might be worth stating it explicitly ("we won't modify the content unless you allow us to").


I'm very curious to hear about this. I'm planning to use it on a static-content-only domain, and I couldn't find any info on whether that's OK.


Yes, that's just fine.


The types of modifications you mentioned happen only if you opt in by manually enabling the VigLink app. It's entirely up to the site owner, though.


Or Fastly.


They're not free. Just enabling SSL costs $500 for setup and then $100/mo (compared to $20/mo for CloudFlare). And they charge for bandwidth and requests.


Related question: I started playing with mod_pagespeed when it first came out, but at the time it was very buggy and broke RichFaces-based apps, PHP Gallery, and a few WordPress plugins on sites I host, so I had to ditch it. That was ages ago, though. Has anyone been running a recent version of it? What is your feedback?


Call me paranoid, but I would rather not see this utilized too much. Personally I block Google Analytics (along with other tracking scripts), and many people use Adblock. If you start moving your important scripts and stylesheets to Google, it will still be able to track you throughout the web.


Tracking available via Google Analytics != the tracking Google could achieve by having your site use their CDN.

Google Analytics is browser-side JavaScript code.

Static content loaded through them would not send cookies^, and there would be no dynamic URLs with session IDs. The best they could do is track by IP, which they probably wouldn't bother with across a globally dispersed CDN network.

^ Should not


IP address + User-Agent + Referer is a potent combination for tracking, even without cookies, and is available to all CDNs.
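A toy sketch of how those signals combine (all values here are hypothetical):

    import hashlib

    # Toy sketch: IP + User-Agent alone yield a fairly stable pseudo-ID
    # without cookies; the Referer then reveals which page was viewed.
    def visitor_id(ip: str, user_agent: str) -> str:
        return hashlib.sha1(f"{ip}|{user_agent}".encode()).hexdigest()[:16]

    def log_hit(ip: str, user_agent: str, referer: str) -> None:
        print(visitor_id(ip, user_agent), "viewed", referer)

    log_hit("203.0.113.7", "Mozilla/5.0 (Windows NT 6.1; rv:10.0)",
            "http://example.com/article")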


True, this is possible, and they could combine that information with IP + UA when you visit Google properties that also carry the Google Analytics code.

Though depending on the network you're on (how many people are NAT'd behind that IP) and how generic your User-Agent string is, this could be useless. Comparison from Urchin (owned by Google): http://www.urchintools.com/urchin6/configuration/tracking-me...


If you need a CDN, you can most likely afford to pay for one, no? Even if you're just loading your images through S3, unless you're getting some crazy number of hits it's only going to cost you a few bucks a month. I'm all for CDNs, but many people think they need one for a basic website (thanks to YSlow and Google), and that's just not the case. If you're not in a high-traffic situation, that extra hop can actually make your site slower than just serving the files from your own server.

If you're on a PHP/database-driven site (early on), I'd be willing to wager that your optimization time would be better spent pulling on that thread. If your website can only handle 25 hits/s because you're killing MySQL with SELECT * FROM ALL-type queries, a CDN is not going to do you much good. Once you're sure your code is optimized and you feel you need one, sure, have at it! There are plenty of good CDNs out there.


> If your website can only handle 25 hits/s because you're killing MySQL with SELECT * FROM ALL-type queries, a CDN is not going to do you much good.

It will, if you set proper caching headers.
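As a minimal sketch of that idea, assuming a plain Python 3 http.server in front of static files (the handler name and max-age are illustrative choices):

    # Mark responses as cacheable so a CDN/proxy can absorb repeat hits
    # instead of your backend. Handler name and max-age are illustrative.
    from http.server import HTTPServer, SimpleHTTPRequestHandler

    class CachingHandler(SimpleHTTPRequestHandler):
        def end_headers(self):
            # Tell downstream caches this response may be stored and reused
            self.send_header('Cache-Control', 'public, max-age=86400')
            super().end_headers()

    HTTPServer(('', 8000), CachingHandler).serve_forever()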



