This wasn't so much about fine-tuning SQL Server itself as about breaking traditional thinking. The five rules he presents at the start are all replaced by the end.
- Everybody's the DBA
- Do what it takes to get what you want
- Tune later, cache & separate now
- NewEgg your way out of problems
- Share for great good
For each point, he argues why the old rule no longer applies and what the new solution is.
I felt a lot of the presentation was about tuning SQL Server without tuning SQL Server: caching, leaving full-text search to Apache Lucene (because it isn't really relational querying), and using SSDs to speed up performance without having to touch any code.
Yep, absolutely, you nailed it. There's a gazillion presentations out there about tuning databases, but that only takes you so far. I wanted to show that you need to take a step back before you go into query tuning details.
I had no idea StackOverflow was using Microsoft SQL Server. I guess it's just a learned response, but I've come to expect any large networked site to be using a non-Microsoft solution.
I haven't had a chance to watch the video, but I hope to later on. What's interesting, even from the link alone, is that MS SQL is a viable solution for a website like StackOverflow.
We have been using SQL Server in our own company for our projects, and I was really starting to get annoyed with it. I found it to be heavy on resources, slow to respond, and let's not forget the cost. I just completed a project that I had been working on for the last four months, and the majority of the work was within SQL Server. One thing I learned was that it's actually quite a powerful beast.
When used correctly, SQL Server is a very capable solution. I'm glad that we decided to stick with it. There is a lot I learned about SQL Server in the last four months that I had no idea it was capable of.
Big enterprises (banks, insurance companies) basically use one of two options most of the time: Oracle or SQL Server.
It only seems otherwise when you read sites like Hacker News, where most of the posters aren't working in big environments. Facebook is a large-scale operation that uses neither, but their data management and caching are so bad that nobody should be holding them up as a best practice.
Google, Craigslist, Twitter, Wikipedia, YouTube and Netflix are just a few examples that don't use Oracle or MS SQL Server either.
It's not about being big; banks and insurance companies simply have different requirements for how they access and store their data than most websites and services do.
Granted (except Google) - but Twitter, Wikipedia and Craigslist are only big in terms of traffic. Most banks push more data through their networks than most developers can conceive.
If you took all of the votes from the 2000 election in the United States (I'll save you the search - it was just over 100 million), it wouldn't even equal one day's ATM transactions in the U.S., and those transactions are a nonevent.
We all know that a lot of data gets pushed through banks' networks, but your ATM example may not be a very good one. The transactions you are talking about do not occur in a single bank.
Twitter is not only big in terms of traffic. 200 million tweets per day is not small, even if a tweet involves less processing than an ATM transaction.
You can also consider Facebook as an example of a website dealing with big data.
This is correct; it's the heavily regulated companies that have a hard time choosing a NoSQL solution. The flexibility of SQL means you can create a report at any time, which is necessary when the government comes knocking, wanting some specific data pronto.
It's not just regulation. It's a need to report at a granular level and slice data as many ways as possible. That's why Essbase is so popular, even though it's so poorly written.
Oracle, Sybase, DB2, SQL Server and Informix. The last company I worked for had all of them.
- Oracle and SQL Server: chosen by us
- Sybase and Informix: chosen implicitly by choosing applications that preferred them[1]
- DB2: chosen by a company that we acquired
Oh and some MySQL and SQLite too.
[1] Yes they ran on Oracle too - but when 98% of a vendor's customers are on Informix or whatever, sometimes it's just less painful to go with the flow, so you have access to that community.
> I found it to be heavy on resources, slow to respond, and let's not forget the cost
What are you comparing it to?
Admittedly I am biased, since I'm a SQL Server DBA by day, but I have tried several other options and think SQL Server tends to stack up quite well.
I have found it generally more user-friendly, easier to work with, and cheaper than Oracle (though Oracle does seem to have an advantage in certain types of partitioning). I rather like MySQL for certain types of projects, but generally find SQL Server easier to maintain for large projects.
I have only dabbled with NoSQL options, but my general opinion is that for certain problem sets they are great. However, when ACID is even remotely desirable they are not an option, and for certain other tasks they are less desirable.
So, I think that which type of database you use depends largely on the project, but SQL Server tends to stand up quite well for a wide array of projects.
I can't watch the video right now, but the idea that "Everybody's the DBA" is risky in general. I'm hugely in favor of developers writing and optimizing their own SQL, being able to create a normalized schema (and knowing when it's worth the tradeoff to denormalize), knowing how to read a query plan, and generally being as competent as a DBA. But... it's good to have one person who has the global view of the database for things like tuning extent sizes, selecting the optimal types of storage for various partitions, doing reviews on the schema, capacity planning, etc.
The right setup (IMO) of having a DBA in an operational role with developers that are highly proficient/self-sufficient is hard to get right and expensive enough that it probably isn't right for an early stage company. And a bad DBA can be a nightmare. So there are tradeoffs on both sides.
It's about moral authority. The only people touching production should be those whose pagers go off at 3am if it all goes hatstand. Everyone else is the peanut gallery.
Are you sure? Amazon is getting flak for that lately (in e.g. Yegge's rant), from having all of its developers on-call. You should be around to fix something you push if it breaks things, but that's not the same as being on-call at all hours for any given coworker's or manager's standard of "going hatstand".
>You should be around to fix something you push if it breaks things
Which has another advantage over Amazon's pager policy: if you are up at 3am, only semiconscious, you are likely to be more useful fixing code that you broke (because you touched it in the last day or two) than some other coworker's mistake.
Agreed - that is a general rule too. E.g. those that do the actual work make the final call on which technology, language, database, whatever to use - not some "architect".
I think that with a modern database like SQL Server, the old-school "high priest" DBA is obsolete.
But... if you don't have someone dedicated to thinking about database issues, you need to treat database changes just like your code. It needs to be in a repository, it needs to be reviewed, and you need a change management regime.
From an anecdotal POV, I've noticed that many folks have a good process (or at least a consensus approach) for managing their code... but the database is often a red-headed stepchild that doesn't get the attention it deserves.
> but the database is often a red-headed stepchild that doesn't get the attention it deserves
On my last three major projects, I have committed to devoting the proper level of attention to the database, with automatically built databases in some environments, scripted schema changes and seed data loading as part of the mainline code base, etc., and I have found the difference to be immense in practical terms. I can move more quickly, more safely, and have a better quality of life as a developer.
One of the best returns on (effort) invested I have ever seen.
I definitely agree. We have been using Fluent Migrator (https://github.com/schambers/fluentmigrator) for our .NET-based projects at work for database migrations and versioning.
Thanks. Fluent Migrator looks pretty cool. I've been rolling my own stuff as-needed. As a result, it has evolved into something pretty useful for the problems I've encountered so far, but does nothing outside of what I've already thought of. I can see using the migrator as a good jumping off point.
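The roll-your-own approach described above (scripted schema changes applied in order, with the applied versions tracked) can be sketched minimally. This is an illustrative Python outline, not FluentMigrator's actual design; the version-keyed scripts and the `execute` callback are assumptions:

```python
# Minimal migration-runner sketch: apply numbered scripts in version order
# and record which versions have run, so any environment's database can be
# rebuilt from the mainline code base. Re-running is a no-op.

def pending_migrations(available, applied):
    """Return (version, script) pairs not yet applied, lowest version first."""
    return [(v, s) for v, s in sorted(available.items()) if v not in applied]

def run_migrations(available, applied, execute):
    """Run each pending script via `execute`, then mark its version applied."""
    for version, script in pending_migrations(available, applied):
        execute(script)
        applied.add(version)

# Example: scripts keyed by version number (T-SQL here is illustrative).
scripts = {
    1: "CREATE TABLE users (id INT PRIMARY KEY, name NVARCHAR(100));",
    2: "ALTER TABLE users ADD email NVARCHAR(255);",
}
applied_versions = set()
executed = []
run_migrations(scripts, applied_versions, executed.append)
# Both scripts ran once, in version order; running again applies nothing new.
```

In a real setup `applied` would live in a version table in the database itself and `execute` would send the script over a connection, but the ordering-and-bookkeeping core is this small.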
Sure, I wasn't suggesting a high priest. Maybe more of a 'faculty advisor' DBA? Just someone keeping an eye on the higher-level problems (having the right Nagios alerts set up, keeping an eye on how long reindexing is taking and scheduling downtime accordingly, disaster recovery planning, putting in the POs for new drives). Even if you treat queries and schema changes as part of the code, it's still a dynamic system. But I completely agree on treating DB changes like code changes; I think DevOps is a good approach.
You'll want to watch the video, because I explain that in more detail. Of course I'm not against change control or security, but if you want to scale a database, everybody involved has to have DBA-level skills. StackExchange's sysadmins and developers are all keenly aware of the database impact of what they do, and they aggressively try to minimize that impact. That's what I talk about on the video when I say "Everybody's the DBA." It's like saying, "Only you can prevent forest fires."
What do you consider DBA level skills? Because you mention in one case people using SELECT * where they only needed a small subset of the fields. I'm all for your message that everyone should be aware of the database, but I don't consider that a DBA level mistake. I've just worked with some astonishingly good DBAs before, and I while I'm not a slouch it would be impossible to catch up with them without years of intensive study.
Anyway, I'm splitting hairs. Your presentation was great, and the people at SO are very impressive overall. Thanks for posting it.
I consider DBA-level skills as the ability to read an execution plan, figure out what indexes are being used, know the difference between a good plan and a bad one, and figure out how to rewrite your query to get the performance & plan you want without just adding indexes willy-nilly. Another good example would be knowing when it's right to normalize tables versus flattening out the data structures - you can't just knee-jerk and normalize everything because it's the best practice.
I wouldn't say that developers need to know about HA/DR, replication, memory tuning, SAN setups, etc, though. Just the how-queries-work and how-data-is-stored angles.
Thanks for the compliments! I had fun doing that one.
I sincerely doubt that the top poster has Twitter-level problems or resources. He may have a Rails site that's crapping out under load and needs resources for optimizing both application performance and database access.
"Just move everything to Java" is a meaningless answer. It doesn't help him at all, not unless he's got a year and change to retrain/rehire his engineers and completely rebuild his application.
That's not really a fair comparison, however. There's a non-negligible difference between just crunching messages and distributing those 6k messages/sec into a directed graph of users. The 6k messages/sec is just the input; the output is far greater.
It comes down to two things: economics and computer science, in that order.
For most companies, diving deep into your data persistence layer is probably not worth it in the beginning. Your devs need to work on features. (MS SQL does pretty well untuned.) This is the economics part.
The computer science part comes in when you are doing enough traffic that 200 hours of dev time toward a 10% performance improvement becomes good economics. Then you dig into the data layer (and every other layer) and start counting milliseconds.
Which is what we did at Stack O by using Dapper and renting Brent Ozar. :)
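The economics argument above is easy to make concrete with back-of-the-envelope numbers. All figures here are made up for illustration (the thread doesn't give real costs):

```python
# Break-even calculation for the "200 hours toward a 10% improvement" case:
# optimization pays off once the infrastructure it saves outweighs the
# engineering time spent. Every number below is an assumption.
dev_hours = 200
dev_rate = 100                    # $/hour, assumed
optimization_cost = dev_hours * dev_rate            # $20,000 of dev time

servers = 40                      # fleet size, assumed
cost_per_server_month = 500       # $/server/month, assumed
improvement = 0.10                # 10% less capacity needed
monthly_savings = servers * cost_per_server_month * improvement  # $2,000/mo

break_even_months = optimization_cost / monthly_savings
print(break_even_months)          # 10.0 months to break even
```

At 40 servers this pays for itself within a year; run the same numbers with a 4-server fleet and the break-even stretches to 100 months, which is the "not worth it in the beginning" half of the argument.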