Hacker News
How to Deploy All Day yet Deploy Nothing (mr.si)
98 points by mrfoto on Nov 22, 2015 | 43 comments



Two important points the author (re-)discovered here, that are really not as commonly known as they should be:

1. Docker is a first-class citizen on Linux, and a second-class citizen everywhere else. "Setting up Docker" on Linux means installing the Docker daemon. "Setting up Docker" on OSX or Windows means installing a Linux VM containing Docker and getting it to port-forward the docker daemon from that VM to your host. After it's set up, there's no day-to-day usage difference—but that stumbling block is a big one.
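For concreteness, the "Linux VM" dance on OSX/Windows, with the docker-machine tooling current as of this thread, looks roughly like this (the machine name is just the conventional default; treat it as a sketch):

```shell
# OSX/Windows only: create the VirtualBox VM that hosts the real Docker
# daemon, then point this shell's docker CLI at it over the forwarded socket.
docker-machine create --driver virtualbox default
eval "$(docker-machine env default)"   # exports DOCKER_HOST, cert paths, etc.

# From here on, usage is the same as on Linux:
docker run --rm hello-world
```

On Linux the first two lines simply don't exist; the daemon is local and the CLI talks to it directly.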

2. Nothing AWS offers—no matter how much it might look that way—is an end-user platform. It's all infrastructure, the kind of thing sysadmins set up/deal with. Infrastructure-as-a-Service effectively means "our market is bigcorp CIOs making buy-vs-build decisions": decisions of whether to provision more local data-center infrastructure, or provision that same infrastructure "in the cloud."

In other words: Elastic Beanstalk isn't for prototyping an instance of your app! Instead, it's for CIOs considering replacing "our data center containing a bunch of machines with docker daemons on them" with "our AWS VPC containing a bunch of opaque instances with docker daemons on them." The same goes for every other AWS service: the documentation and "semantic models" of AWS services are not targeted at app developers, but rather at the sysadmins who were implementing the CIO's whims by keeping local infrastructure going at-scale, and are now implementing the CIO's whims by porting said infrastructure to the Amazon cloud.

AWS might be intuitive to you as a developer if you have 5+ years of devops experience, deploying your own code and managing the infrastructure it sits on. Otherwise, the choices AWS offers will be pretty opaque to you.

(Side-note: there's a business model [that AWS allows and encourages] in taking an AWS IaaS-level component, and shellacking a coat of UX onto it to make it into a PaaS component, and making your money by arbitraging on the "PaaS premium." Heroku is PaaS-shellacked EC2 instances; Tarsnap (or Dropbox, even) is PaaS-shellacked S3 buckets; etc. I'm surprised more of these businesses don't exist.)


"Nothing AWS offers—no matter how much it might look that way—is an end-user platform"

I don't agree here. I'm definitely in the sysadmin camp but I'd like to think I've advanced from where I was when that was my job title 5 years ago or so.

For us, CloudFormation was the real eye-opener. We leveraged CF to build a rather sophisticated environment, then handed the CF template to our customers, who set it up on their own. We did this with several customers, with AWS experience ranging from zero to expert, and backgrounds ranging from security to networking to IT. All were successful. I'd describe all of them as end-users.
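To make "end-user" concrete: the shape of such a template, cut down to a single instance (the AMI ID and names below are placeholders, not our actual stack):

```json
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Sketch: one stack, reproducible for every customer",
  "Parameters": {
    "KeyName": {
      "Type": "AWS::EC2::KeyPair::KeyName",
      "Description": "The one thing each customer supplies themselves"
    }
  },
  "Resources": {
    "AppServer": {
      "Type": "AWS::EC2::Instance",
      "Properties": {
        "ImageId": "ami-12345678",
        "InstanceType": "t2.micro",
        "KeyName": { "Ref": "KeyName" }
      }
    }
  }
}
```

A customer pastes this into the CloudFormation console, fills in the parameters, and gets the whole environment in one shot; deleting the stack tears it all down again. That's the reproducibility and atomicity I mean.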

The reproducibility, atomicity, and ease of this solution gave us huge gains.


> I'm surprised more of these businesses don't exist

Oh, they do; but because they target niches, they might be small and not be on your radar. Also, the whole field is still pretty green, in the grand scheme of things; as you say, it takes years to be able to properly leverage the AWS ecosystem.


> Once it's set up, there's no day-to-day usage difference

On a laptop you now have an extra OS running, with all the associated costs: talking to disk every now and then, doing os-y things.

If nothing else, it feels like my battery runs out faster with docker-machine on and no containers running. But maybe somebody has actual numbers on this?


Yes, you have one unnecessary OS there, and it isn't the Linux running Docker. :) Personally, I run Windows because, for some unknown reason, that's what I get issued at every company I join. That Windows runs a VirtualBox VM with Linux, where I do my actual work. Docker is amazing when operated directly, and you get all the benefits of your dev and production environments being essentially the same.


> I'm surprised more of these businesses don't exist.

They tend to be small, and only wrap a single aspect of the AWS service.

My simple site wraps Route53, letting you maintain and change DNS via git pushes. Chances are you've not heard of it, because the number of users it suits is pretty small: https://dns-api.com/


That's because most of these tools are built by people who are borderline psychos. I don't mean the murderous kind but the kind that is emotionally stunted, working under intense business pressures to deliver (which is really just another group of psychos higher up the hierarchy), and overworked. The best part is that there is nothing new. It's all just the same old stuff except now it's in the cloud and requires 10 layers of configuration being held together by heroic manual effort. I know you think chef, ansible, salt, puppet, etc. helps. Nope, tried them all, same thing applies and I'm no longer in the mood to learn yet another DSL for shell scripting.

It's not just AWS. This is pretty much true of every infrastructure/cloud as a service company. They either have too many knobs with all sorts of unintended interactions and half-baked integrations or they abstract away way too much and you have no control over anything and can only deploy a nodejs or django application that follows a very specific rigid structure. Oh and did I mention most of this stuff is insecure by default and if you want to do anything in a secure way then your n-layer configuration nightmare is now 10x worse (probably why most places are looking for 10x programmers). Oh and there is also networking/NAT, user management, logging, artifact distributions, service discovery, etc. to worry about as well using tools that are just as half-baked.


> Oh and there is also networking/NAT, user management, logging, artifact distributions, service discovery, etc. to worry about as well using tools that are just as half-baked.

Oh, the good old times when I was learning to program. Configuration was just a simple INI parser written in two hours from scratch. Networking was handled by Apache, user management meant a database table, logging was a function wrapping fprintf(), "artifact distributions" meant a res/ folder, and service discovery was done on paper. And somehow everything worked. It worked better, was infinitely more manageable, and was mostly dependency-free.

I can see how the above would cause problems for e.g. Google. But 99% of companies and open-source projects are not Google.

Web development is insane.


> And somehow everything worked. It worked better and was infinitely more manageable, and was mostly dependency-free.

You and I apparently remember things very differently.


Is there a hello world version of this? I have been trying to pick up IaaS (insanity as a service) and it has been really difficult.

At what point does it make sense to use docker? Is it worth learning all of the above crazyness if you have less than 10K people using your service and are 1 person?


No, it's all just terrible software managed by even more terrible software. It's all very aesthetically unpleasant and when I look for good software I look for a certain "coherence" that I have not found in any of the cloud providers.

If you do things right and stage your artifacting and deployment process properly, then that is 90% of the battle, I think, because none of the cloud providers are magic. Any service that promises to abstract away your deployment and configuration pipeline is a bald-faced lie. Heck, they don't even have a way of automating network topology configuration that isn't a nightmarish maze of JSON (I'm speaking, of course, of CloudFormation). So even if you are using Docker, that is in no way a substitute for a proper deployment pipeline with properly versioned artifacts and dependencies.


so just to be clear, you are a huge fan of bundled cloud services then? can't really praise them enough?

Cheers, will stick with getting better at developing software and not worry about the container craze yet. thanks.


Ya, there is no substitute for fundamentals. If you don't have that down then docker is not gonna help. Like C++ it just makes it easier to shoot yourself in the foot harder.


Can't harp on fundamentals enough. The thing to realize is that everything you do in an automated fashion can also be done manually. If you haven't worked out how to do it manually, then when your automated solution fails to work, it's going to be much easier to troubleshoot if you know the basics.

Hence why I prefer things like OOCSS over Bootstrap, Capistrano over Docker. Bootstrap forces your markup to conform to your framework, drowning you in incoherent class names that are going to flee your mind the second you have to remember why you put them there. OOCSS makes you treat your markup and CSS holistically, so that when it comes time to troubleshoot a browser bug, you can easily apprehend what your markup was supposed to do and why you did it that way.

Docker automates things that don't really need to be automated, unless you've got the kind of problem that really could make use of Docker's abstractions. People think they need Docker when they don't, then get surprised when Docker doesn't magically solve the problem they were hoping it would solve.

Make things work in the simple case, and then iterate towards the complex. If you ignore this, then you will eventually find yourself in the situation where you have a deadline to meet that you simply won't be able to meet because you're writing technical checks that your skills won't be able to cash.


Note, I'm definitely not a web developer. I've put static web pages together and fooled around with Rails & Heroku a little. So, as an outsider, I'm curious: How did you all (the web development community) make DEPLOYMENT so complicated? All these packages and environments and VMs and containers and environment variables. Shouldn't deploying something on the web be a few minutes of FTP and then go do something more productive the rest of the day? What am I missing?


If you have a static website, maybe. If you have a web application with millions of users, several large and complex data stores, and lots of functionality, things get complicated fast. Most critically, it's not a direct relationship. If you double the number of pieces in your infrastructure, your problems become four times as hard if not more.

You might have a database and a web app. But the database gets slow, so you add a caching layer. Then you need to scale up to accommodate more users. You've got to be able to back up all your users' data. And when your app gets complicated, deployments can break things, so you also need to be able to roll back quickly. Once you have a dozen or more servers going, you realize that if one goes down everything breaks, so you have to add redundancy to each part of the system, which adds even more concerns about how backups and recovery are handled. Then you need to add some search features to your growing dataset, which necessitates another kind of database to index your primary database, and more app servers. Oh, and by this point when things break, it becomes really hard to tell why things are broken, so you need monitoring and centralized logging, and whoops, those are two more kinds of databases and the associated web applications to manage them. And when you're at this scale, you have to coordinate multiple pieces of the system for any change you make, so you have additional software to manage that. And by now, you'll realize your original code was terrible, and while you are wise enough to know not to try to rewrite everything, you instead take various pieces of your infrastructure and break them into microservices in new languages, and...

Are you getting the picture?


Great summary and very recognizable points.

It would be useful if there were a guide on how to scale your app wisely. What are the basic steps you will go through? What are the caveats and best practices per step? What are the important choices, and when do you make them?

Could you recommend a resource for answers on these questions?

I took your comment and edited it a bit:

Start: database + web app

Slow database: add a caching layer.

Need backups: start backing up all your users' data.

App gets complicated, deployments break things: make it possible to roll back quickly.

You have a dozen or more servers, and if one goes down everything breaks: add redundancy to each part of the system, and improve backups and recovery.

You need search features: add another kind of database to index your primary database, plus more app servers.

When things break, it's hard to tell why: add monitoring and centralized logging. Those are two more kinds of databases, plus the web applications to manage them.

Every change must be coordinated across multiple pieces of the system: add more software to manage that.

You realize your original code was terrible: break various pieces of your infrastructure into microservices in new languages.

And...


Got it, but this guy was trying to deploy a 15-minute sample app.


And what was preventing him from just uploading it to his AWS instance?

If you want to experiment with an entirely new stack of technologies intended for deploying distributed, multi-tier, scalable apps, that's cool, but don't feign surprise if it takes you more than 30 minutes.


... using a tool designed for the type of app I described. That said, if you want to learn how to use the big complicated tool, it's best to start with a trivial app.


He didn't need to use Docker. FTP deploys for static sites are akin to deploying straight to Heroku for Rails apps.


Was there even a project in which a single person was responsible for all of this? It feels that if you're having scaling issues so bad you need to do all of the things you described, you're already big enough to have 20 people on payroll handling this.


I did all that for some projects two jobs ago. The company had ~50 people elsewhere but only three of us in Europe (and for every responsibility we wanted at least two of us to know about it).

None of it is hard, honestly - it's all well-documented things that a smart person can teach themselves how to do. And I honestly think that having one person take end-to-end responsibility for delivering one piece of functionality results in much higher quality than having twenty people specializing in a single layer as it applies to twenty different deliverables.


Whatsapp probably did.


I'm reading WhatsApp history on Wikipedia[0]; it seems they got themselves a lot of money before having the need to scale beyond what simple solutions would allow.

[0] - https://en.wikipedia.org/wiki/WhatsApp


When your dev teams are strewn all over the world, and the lowest common manager with the people managing the OS is the CTO six layers up, you will find that making things "just work" gets exponentially more difficult. All sorts of workarounds and safeguards have to be put in to ensure that small changes introduced by other teams don't screw up your portion of the app, and vice versa: e.g., not letting the OS team control the JVM, for fear that security and performance decisions by the OS team will cause unexpected deployment problems... which then leads to performance and security problems that have to be worked on without OS team involvement...

Of course this is a slippery slope, and often devs not familiar with, e.g., RPM end up reinventing the wheel, as each deployment failure breeds distrust between teams and more workarounds and safeguards get introduced.


> Shouldn't deploying something on the web be a few minutes of FTP

It should be, but well... that only works if you're using PHP.

If you're using anything hipster (Ruby, IIRC Python, Java), you're out of luck there.


Yes and no. With PHP the complexity is all still there -- you've just delegated to a shared host.

I worked on Cloud Foundry buildpacks for a while. The PHP buildpack is actually one of the more complicated ones, because at staging time it tries to recreate a shared hosting environment in a standalone container.

The Ruby buildpack is complicated for legacy reasons, but for 99% of modern apps all it really does is run bundler and rackup.
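That "run bundler and rackup" claim, spelled out: for a plain Rack app the staged container effectively boils down to something like this (a sketch of the happy path, not the buildpack's actual script):

```shell
# What the Ruby buildpack amounts to for a plain Rack app (sketch):
bundle install --deployment --without development:test   # vendor the gems
bundle exec rackup config.ru -p "$PORT"                  # serve on the assigned port
```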

Meanwhile every PHP app wants to keep on living in the 90s.

Arguably the Java buildpack has the simplest job, if you use a sane build tool. "Here, launch this jar".


Heroku is pretty straightforward for most stuff. Python, for example, is basically "make a requirements.txt, add a Procfile to tell it what command to run, and then push to the Heroku remote".
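The entire deployable surface for a minimal Python app can be two tiny files (module and app names here are illustrative):

```
# requirements.txt
flask
gunicorn

# Procfile -- "web" is the process type Heroku routes HTTP to
web: gunicorn app:app
```

Then `git push heroku master`, and Heroku works out the rest from those two files.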


Tell me, how does uploading some files via FTP provision a database server and create the initial tables required by the application?


FWIW, I used to deploy rails via FTP a decade ago, to unbit hosting (the guys who later did uwsgi).


The sad part is that it's even worse on Windows. The only thing I've got working without too much yak-shaving was Node, but anything devops-y is a pain, including Vagrant, Otto, and even the package manager (Chocolatey).


That's funny because I was about to say the complete opposite.


That's funny, because I run almost everything smoothly (Vagrant, Node (grunt, bower, ...), Python, Chocolatey and now Meteor) on my Intel Atom NETBOOK. It's not perfect and can get laggy, but it works just fine most of the time. That said, I think I'll jump on the Linux bandwagon very soon.


> Intel Atom

I worked that way for two years. It doesn't really matter if everything works smoothly under Vagrant if every day you want to kill yourself at your desk because you can't run a Vagrant VM and a browser at the same time without bringing the system to its knees.


Sounds like I'm doing something very wrong then. I have a corp-issued Windows 7 laptop. Any advice or resources for how I should be setting up my machine? (Chocolatey isn't installing properly at the moment.)


"I’ve wasted the whole day exploring new things and concepts" is a weird thing to say after you set out with the goal of learning some new tools. Definitely seems like he learned something.


I understand the frustration. Reminds me of the first time I tried to run something on a VPS. I wasted at least a day or two fighting with bullshit issues like botched up Unicode support on my Debian installation. I did learn something, but it was not the thing I wanted to be learning.


I wasted a lot of weekends getting Docker or Vagrant to work so the rest of my team could follow what I was doing. After a few weeks of Googling and outdated documentation, I decided to try LXC and throw away all the other fancy tools. I went into each container and ran my provisioning script(s), updating them with each trick I stumbled upon. My sanity is back now, and my team will clone the LXC repos for their development environment.
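For anyone wanting the same escape hatch: with the LXC 1.x userspace tools of this era, the whole workflow is roughly this (the container name, distro, and script path are mine, not a recipe):

```shell
# Create an Ubuntu 14.04 container from the public image server and start it
lxc-create -n devbox -t download -- -d ubuntu -r trusty -a amd64
lxc-start -n devbox -d

# Run the provisioning script inside; the container is just a directory
# tree under /var/lib/lxc/devbox that you can tar up or clone for the team.
lxc-attach -n devbox -- /bin/sh -c 'apt-get update && /root/provision.sh'
```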

After a few nights of trying to get Docker to run on my work laptop, I figured there was an issue with Windows security, as the VM would be created but Docker would fail to initialise. One of my colleagues said a CentOS VM worked for him. By then I had dumped Docker for Vagrant, which had the same issues. Next I was trying to figure out how to run Vagrant on Hyper-V, as VirtualBox was acting up. I grew some grey hair those weekends. I'd rather be conservative with my time: use what works, and wait for some of these tools to mature.


Hah, this is why I don't use Docker. Too much added complexity for too little gain.

I'm extremely conservative these days. I beat my head against Capistrano long enough to learn its ins and outs, so if some new tool doesn't work with minimal Googling, I go right back to good ol' Cap. I have a provisioning procedure I adapt for each cloud host that basically gets the machine ready for cap deploy. If cap deploy doesn't run smoothly, I alter my provisioning procedure once I figure out why. Usually it's OS-level dependencies for gems.
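For flavor, the kind of Capistrano 3 config I mean is small enough to read in one sitting (every value below is illustrative, not my actual setup):

```ruby
# config/deploy.rb -- Capistrano 3 sketch; repo, paths, and app name are made up
lock '3.4.0'

set :application, 'myapp'
set :repo_url,    'git@example.com:me/myapp.git'
set :deploy_to,   '/var/www/myapp'

# Shared across releases, so `cap production deploy:rollback` can flip the
# `current` symlink back to the previous release without touching state.
set :linked_files, %w[config/database.yml]
set :linked_dirs,  %w[log tmp/pids public/uploads]

set :keep_releases, 5
```

Provisioning just gets the server to the point where `cap production deploy` runs clean; everything after that is the symlink dance Capistrano has done reliably for years.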

Any replacement strategy would have to beat the ease and reliability of Capistrano. That's going to be tough.


Once you have the whole system figured out and set up, you have a lot more configuration options and control with EB vs. Heroku. There might be a steeper learning curve, but once you jump the first few hurdles, EB does make sense. Additionally, the author mentions deploying a super simple app. Super simple apps are obviously awesome to deploy on Heroku, because there is nothing out of the ordinary. AWS offers a lot more services for when your application gets off the ground and becomes more complex: it provides the infrastructure, and you provide the configuration.


I guess my question is: why not Heroku?

Or, since I work for Pivotal: why not Pivotal Web Services?

The Ruby buildpack (which is > 90% identical code between the two) is pretty well-tested for deploying Rails at this point.


I'm reminded of the "Command Line Bullshittery" article recently featured on HN [0].

So much effort, just to get to zero. Somehow, somewhere, this has to change.

[0] http://pgbovine.net/command-line-bullshittery.htm



