Not sure how old this post is, 2014 by the looks of it, but modern declarative automation makes this redundant. Ansible, as an example, would abstract away the CLI commands and be very readable in very few lines.
The setup described in this post looks fragile, regardless of whether it's literate. I mean, we used to do these things, but we've moved on.
That's what ansible PR material would have you believe. In reality only very simple things can be translated into ansible/salt, and even then you may hit one of their many bugs. If you try to do complex things, even bash would be a better choice (though obviously not the best one).
There's one thing ansible/salt are good at: making it easy for cheap, replaceable devops to write tons of repetitive boilerplate yaml. It's harder for them to shoot themselves in the foot than with a real programming language. But once you descend into jinja hell, that's not very true either.
I would second the opinion that the techniques demonstrated here are several years behind current best practices.
Automation is the right course, but this takes a very roundabout way toward producing an artifact that can be distributed and applied repeatably, compared to using something like an Ansible playbook applied via Vagrant's built-in Ansible provisioner.
I would recommend anyone reading this thread against recreating this, except as a proof of concept, and instead concentrate on a workflow more similar to this one:
Horses for courses; it depends on what you're doing, but I've built Multi-£Bn production systems for the last decade on Puppet, then Chef, then Ansible (and now none of those), and I can't say that Ansible is very buggy these days, nor does it fail at complexity. We found that provisioning with Packer using Ansible as the engine worked perfectly fine; nothing to do with PR material. Newer definitely isn't better, and I would always go with reliable, battle-tested tech, but the old ways are largely pointless for most clients and developers, who have long since left artisan infrastructure behind. If tons of complexity is needed that baffles Ansible/Salt etc., I would be looking askance at the architects, TBH...
At my employer, we use a sensible bi-modal approach, where every machine gets a base configuration applied through Puppet, and we consider this The Platform, and any per-machine (or per-use-case) changes should be applied through an Ansible repo.
Puppet lays down those things that the platform "promises"* to provide - syslog, time, auth, DNS, etc, and Ansible does application-specific things.
* - Not a strict promise in the Mark Burgess "Promise Theory" sense, but similar in thought.
I have experience with salt. Imperative vs declarative - it depends. Some stuff is simple (and the tutorials and introductions always are), other stuff is complex.
For example, I ended up having to express imperative logic in salt's declarative model. It was not easy. The solution I found was to make one state a prerequisite of another; the chain ended up 8 items long, if my memory serves me correctly.
On the other hand, in the imperative world you have other difficulties, because you have to check the current state first and decide what to do in the different edge cases.
Not sure which one would be more complex in the end.
I always read that ansible can only be used for very simple things. However I never read what things are too complicated with ansible. Can you share an example?
Ansible is good at putting configuration files in the right place and setting their appropriate values to what you want — provided it’s relatively simple, otherwise prepare yourself for complex templates — or installing software from a package manager, etc., etc. As soon as you go outside what its modules provide, it gets more difficult. Modules themselves are actually easy to write, provided you maintain Ansible’s (completely reasonable) maxim of idempotency. Sometimes, however, idempotency is hard to enforce.
The best example I’ve come across is building software from source. When there’s no package, you have to do this. Some may argue that this is not Ansible’s job, but I don’t see how it’s different from `apt install`, nor where else it would fit into the pipeline. Anyway, the way I solved this for example was by pinning the idempotency on the existence of build artefacts. Those are often software-specific, so it’s quite fiddly/non-general to find the right artefact.
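For illustration, a minimal sketch of that approach, with hypothetical paths and names; the `creates:` argument is what pins each step to an artefact:

```yaml
# Hypothetical example: each step is skipped when its artefact already exists.
- name: Unpack source
  unarchive:
    src: myapp-1.2.tar.gz
    dest: /usr/local/src
    creates: /usr/local/src/myapp-1.2

- name: Configure and build
  shell: ./configure && make
  args:
    chdir: /usr/local/src/myapp-1.2
    creates: /usr/local/src/myapp-1.2/myapp   # the build artefact

- name: Install
  command: make install
  args:
    chdir: /usr/local/src/myapp-1.2
    creates: /usr/local/bin/myapp
```

As the parent says, the fiddly part is knowing which artefact paths to pin on; they differ per project.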
I'd be interested in how you solved this, if you don't mind sharing. I tried writing a role to build source RPMs and it kind of works after doing some really dirty regex tricks, but only barely. I kept feeling that there must be a better way that I wasn't seeing.
Don’t say I didn’t warn you ;) This is not used any more — and the repo’s history looks messed up — but it was my solution to the “configure, make, make install” dance in Ansible:
You should be building packages for the software, of course. If it’s some random internal app you’re chucking into a container it’s one thing, but if you’re deploying it across VMs or physical hosts you probably should put the effort into writing a rpmspec or whatever the hell Debian does (this is a large reason why I don’t do Debian-based systems, I have never been able to properly wrap my head around dpkg builds - there’s too much magic in debhelper and not enough documentation).
Here's an example: you can register the output of a task into a variable, but if you use the same registered variable for multiple tasks, even skipped tasks overwrite it. Say you want either task A or task B to run depending on the OS version, and to do something based on that result in task C. Easy enough: just make two different registered variables, and have a separate conditional check for each one on task C.

But then say it's only one of task A, B, C, or D, and task Z needs to know which one ran, along with some other conditional logic. The conditional logic gets really hairy at that point. A way around this is to use the same registered variable repeatedly, and save its output via set_fact after each task, if that task ran, to a second variable. The set_fact module will only define the second variable if the conditional is true, unlike registered variables, which get redefined regardless. Then you can check for just the second variable later in the play and see which of the four tasks defined it. But now your 4 tasks have become 8 tasks of workaround repetition.

This is just one of the quirks that start to come up when a playbook gets more complicated than "install x, write template y, done".
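A minimal sketch of the quirk and the workaround, with invented task bodies and conditions (two tasks instead of four, for brevity):

```yaml
- name: Task A (older OS)
  command: /opt/scripts/do-a          # hypothetical
  register: result
  when: ansible_distribution_major_version | int < 8

- name: Remember that A ran
  set_fact:
    which_ran: "A"
  when: result is not skipped

- name: Task B (newer OS)
  command: /opt/scripts/do-b          # hypothetical
  register: result                    # overwrites result even when skipped
  when: ansible_distribution_major_version | int >= 8

- name: Remember that B ran
  set_fact:
    which_ran: "B"
  when: result is not skipped

- name: Task Z acts on whichever ran
  debug:
    msg: "previous step was {{ which_ran }}"
  when: which_ran is defined
```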
Basically, it's better to stick with tasks that stand on their own. As soon as you stick complex task logic in it becomes kind of a Rube Goldberg machine. That being said, you can pull off quite a bit if you just test rigorously and write defensively.
Worst scenario for me was trying to implement a quasi-hierarchical set of roles for a little over 1000 machines. Meaning I had grouping by datacenters, packages, configuration, domain, apps, etc., but also some differences between them. Inheritance sucks in both salt and ansible.
I wanted a mono repo and I did manage to do it. But it was not elegant and I kind of pity whoever came after me and needed to maintain and extend that without me being there to explain the model.
This is for the experimentation phase before you abstract backwards into configuration for whatever system you're using.
Sometimes it's a lot easier to stay closer to the metal for figuring out what you actually need, then break out the declarative automation to make it truly reproducible.
I don't think "interspersing commands and their output with prose" is meant to provide a robust solution that should replace "automation tool like Puppet or Chef".
This looks to be a step-up from just manually tinkering with commands in some shell to figure out or explore what it is you need to declare.
Some kinds of declarative automation are just bad. Case in point: configuration management.
Ansible was written before we realized immutable infrastructure is the best way to go. Configuration Management tools craft system state dynamically like a drunk sculpting a Roman bust with a Louisville Slugger. You can get them to do what you want, but it takes a lot of work, and even then the outcome is uncertain. CM tools are often complex because a system whose state is constantly shifting requires complexity to handle it.
If instead you make immutable artifacts, you don't need complexity. A simple series of straightforward commands in a single version-controlled file does everything you need. Dockerfiles seem immature at first because of how simple they are, but in practice they're much more reliable than Ansible, not to mention easier to support. Thus, non-declarative automation, when used with immutable infrastructure as code, trumps declarative.
Ansible is also only optionally declarative, which people miss just because it has that bastardized form of YAML for a config file. The simplest tasks, roles and playbooks work when executed sequentially, but not necessarily when out of sequential order. When you do make it super-duper-declarative, it can involve tons of confusing logic that makes it nearly impossible to understand, and is much more verbose (and complex) than a simple script. As soon as automation is more work and cost than the alternative, it should be ditched.
That's an ansible task to do something, but it's not the task solved in the article, or in any of the other linked examples.
Things that you haven't done before are harder than things you do all the time, and I would've reckoned trying to do them in ansible is much harder than doing it interactively, but I'd be interested in learning.
If I find an article like this[1], how do you translate it into the artefact I would want to keep in source control? In org-babel it's this[2], so what would that look like in ansible?
Modifying iptables rules is such a common operation that all configuration management languages support it natively.
Using the iptables module for Ansible, it is straightforward to write the rules as yaml. The documentation has a clear example.
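Something along these lines, if I recall the module correctly (rule values invented):

```yaml
- name: Allow established connections
  iptables:
    chain: INPUT
    ctstate: ["ESTABLISHED", "RELATED"]
    jump: ACCEPT

- name: Allow inbound SSH
  iptables:
    chain: INPUT
    protocol: tcp
    destination_port: "22"
    jump: ACCEPT
```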
However, having used several of these languages for over a decade and realizing this is a contentious issue within the community, I would actually avoid using these tools for individual rules. What I would do is dump the rules to a file, distribute the file, and let Ansible make sure the file matches what is running (by way of a regular rule load when necessary).
The idea here is that I am already familiar with iptables rules and how to write them, and would expect any other ops-ish person to be the same. The source file matches the output, and any historical diffs will be much more straightforward to read, as there are no intermediary source formats that can change.
Also, there is one less Ansible dependency involved, and less syntax to learn (given that one can already read iptables rules).
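A sketch of that arrangement, with hypothetical paths; the source file is in native iptables-save format, so diffs read exactly like the running rules:

```yaml
- name: Distribute the iptables rules file
  copy:
    src: files/iptables.rules       # produced with iptables-save
    dest: /etc/iptables.rules
  notify: reload iptables

# ...and in the role's handlers:
- name: reload iptables
  command: iptables-restore /etc/iptables.rules
```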
That's pretty much what I expected; I think that's pretty common, and that's exactly what the author claims to do (use this process for discovery and learnings, then write some guff for chef once they know what they're doing).
It was common to hear that you can only automate what you do manually, but these days a lot of the underlying tasks have changed. Iptables is an absolute doddle to automate, TBH, and the security coverage that iptables offered has often been superseded by security groups in AWS, sidecar proxies with service meshes, etc., so the whole paradigm has changed. What I meant about the article being quite old is not just that the solutions are old, but that the problems being solved have gone away, or should have. OK, I did come across a client at the beginning of the year doing it the old way, but they were an Internet Exchange that's been around since 1994, so I understood that.
Are you saying that everything (or almost everything) worth automating is already automated by someone else so learning how to do new things isn't important?
> - name: install the latest version of Apache and MariaDB
That's not a name, that's an intention-describing comment, so why isn't it called that (or, for brevity's sake, just "comment")? That might be nit-picking, but IMHO such misnomers cause unnecessary confusion and at the very least make the tool harder to learn.
I think this is really cool and I'm tempted to try it out for my own VPS so I can easily set up a fresh one if I need to.
Modern declarative automation is great for a lot of things, especially at work and for CI/CD pipelines, but it's no substitute for learning what you need to do to reproduce the same setup by hand.
We have a convention that the call should be just "ansible-playbook play.yml" and everything else should live inside ansible. This way there's no need to document how to run the playbook.
And how do you share the context/documentation about which yaml file is intended to be called by which ansible utility, and with which parameters and switches?
I ask because this is normally the problem I run into when things grow, and I found the suggestion in the OP interesting in this regard, as it adds the context while implementing, not by documenting afterwards (and such documentation often becomes stale). Also, such documentation is often just procedural steps (do this, then that, etc.) and doesn't show the original thoughts/ideas behind a certain utility invocation.
I'd be interested to learn a bit more here, so it would be nice if you could share a bit of context from your end.
> And how do you share the context/documentation about which yaml file is intended to be called by which ansible utility, and with which parameters and switches?
Like I said: ideally there are no switches and parameters; the playbook should work as-is. If there are switches, we document them in our internal confluence. And yes, our docs get stale too. That is a problem we haven't fixed yet. :)
Many playbooks also get executed automatically, so there's no need to remember parameters for them. You just start the CI job that then runs the playbook.
Despite being old, I think the value lies in the "principle": you don't have to generate your code out of your markup, but if you at least document your thoughts, reflect upon them, and record what decisions were made and why, that's invaluable on its own IMO.
The process doesn't allow it, either - using Ansible with Packer (e.g.) to build immutable images means that contact between build and compute is very limited.
Installing software on a production machine? Get outta here.
I like the concept, but the problem with embedding random snippets of code into a document is there are too many assumptions about the state of the system they're running in. The operating system, the revision, the installed software, how it's configured, where things are located, orders of operation, what network you're on, what access your user has to what systems, etc.
Rather than embed snippets, write a single simple script that automates your steps, and create an immutable artifact. The simplest way to do this is to use Docker with a base image that mimics the target system, and performs all necessary steps in one go. This allows you to perform a 'test run' in an empty container to make sure the steps work. And you can embed in-line documentation and generate it with Doxygen or some other tool. (I highly recommend you get all your teams to standardize on a tool like this, and use it literally everywhere)
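As a sketch of what's meant, assuming a CentOS-ish target and an invented build script:

```dockerfile
# Base image chosen to mimic the target system.
FROM centos:7

# Everything the build needs, declared up front.
RUN yum install -y gcc make rpm-build

# The single simple script with all the steps (hypothetical).
COPY build.sh /build.sh
RUN /build.sh

# `docker build .` is then the repeatable "test run" in a clean environment.
```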
(Also, pet peeve, but what does DevOps have to do with this? Can we stop over-using this term please? Not everything that's "sysadminny" is DevOps)
Yes, much better. This person should use Docker instead and output the RPM from the container. Reproducible as all assumptions are clearly stated and much faster to execute again if the source changes (using cached layers for most prerequisite steps).
> The simplest way to do this is to use Docker with a base image that mimics the target system, and performs all necessary steps in one go.
Or better yet, run all your code in docker containers so you can run arbitrary slices of your production environment on your local machine.
You’ll eventually hit some host-OS network issue where you need to inspect the machine that’s running your containers, but for artifacts/RPMs as described in this doc you should be able to do everything locally.
This is great. Now, there's quite a bit new to learn here.
I use Markdown notes files with quoted commands and outputs as first exploratory documentation and before it goes into Ansible/Dockerfile/Packer/Whatever. I can add links, images etc and anybody can easily follow and edit/PR since well, it's Markdown.
Seems to me what I'm doing is basically the same idea as this, with total flexibility (no schema) and nothing new to learn; the only thing missing is that I can't run the .md file (how hard would it be to parse a md file, ignore everything except the triple ticks, and execute the rest in, say, Bash?).
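Not hard at all for the simple case; a rough sketch (assumes well-formed, unindented fences and runs only the `sh`-tagged blocks):

```sh
# Print lines between ```sh and the next ```, then pipe them to bash.
awk '/^```sh$/ {run=1; next} /^```/ {run=0} run' notes.md | bash
```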
Well, I consider org-mode friendlier than markdown, but besides the language differences, what you are mostly missing are the Emacs goodies:
- you can tangle this file into one or several code files, even exporting the plain-text parts as comments in the target language if you decide to.
- you can execute the code directly from your local machine on one or several remote machines.
- you can use the output from a previous block inside another block, even preprocessing it in another language before executing it in yet another block.
- you can get the results recorded in the same file you're executing, so you can also document what the results of executing command x would be.
- if you use Emacs as another tool, you can easily send the results via, for example, email, slack, irc, or whatever system you can interact with from Emacs. I think confluence made it harder in recent versions, but previously I was able to generate documentation pages and easily upload them to confluence without worrying about format or even the upload.
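For a taste of the remote-execution and chaining points, a sketch with an invented host; `:dir` with a TRAMP path runs the block remotely, and `:var` feeds one block's result into another:

```org
#+name: disk
#+begin_src sh :dir /ssh:web1.example.com: :results output
df -h / | tail -1
#+end_src

#+begin_src python :var line=disk
# Post-process the remote output in another language.
return line.split()[4]   # the use% column
#+end_src
```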
Maybe even make your code blocks language-aware to pick the execution context: provide a language tidbit or even a shebang, i.e.
```js
console.log('yo, sup')
```
or
```#!/usr/bin/env node
console.log('yo, sup')
```
That way, your md files could be both executable/interactive and be flexible in regards to their execution environment (as long as you are on a unix-like). Trying to resist the urge to go make a VSCode extension to do this right now.
It was kind of more like coming up with organized snippets and notes while understanding a problem.
I'm expressing the wish that, were I better at organizing them, I could have tangled all of those snippets into a working (if ugly) source file and worked from there.
Ohh I see, yeah. I usually draw diagrams or write pseudo-code out on paper (or both combined) when trying to grasp something difficult but "notebook-style" exploration also sounds interesting.
I do this day to day, nice to see that there's a name for it :)
I start every piece of work in a markdown file with vim and proceed to dump my thoughts: what I have tried, what worked and what didn't, plus any links I've found useful.
And as they say "You only learn once you reflect", this proved invaluable for me going forward when faced with "ah. I've done that before but don't quite recall the details"
I feel this to be like literate/log-driven development.
It would be interesting to take the idea in some of these comments and turn out some Ansible or Terraform code from literate org-babel documents. Kind of a 2020 DevOps take on the linked article.
Most of my org-babel stuff is running AWS commands and collating the results into tables for more documentation. I write my Terraform code outside of org-babel.
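For instance, a block of roughly this shape (query invented), where `:results table` turns the tab-separated output into an org table:

```org
#+begin_src sh :results table
aws ec2 describe-instances \
  --query 'Reservations[].Instances[].[InstanceId,State.Name]' \
  --output text
#+end_src
```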
It's not complicated per se, but the industry makes it complicated. Running a basic web app isn't complicated if you throw up a server, but if you use Kubernetes then you've suddenly made things much harder.
As someone who just went from zero to a working Kubernetes implementation, like a lot of things in tech, the problem isn't the tool: it's how others explain it.
The tool is excellent and extremely well-conceived. Being able to spin up a highly scalable fault-tolerant system with a few manifest files is impressive. The only frustration I ran into was trying to understand which files I needed when vs. what was unnecessary and intended for a special use case.
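For the record, the sort of manifest meant here can be as small as this (image and names are illustrative): three self-healing replicas from one file.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                 # fault tolerance: failed pods are rescheduled
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25   # any stateless app image
          ports:
            - containerPort: 80
```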
It depends on your scale, of course. It’s not like “the industry” just wants to make things complicated for the fun of it. There comes a point where you’ll need more than one server to host your web app, and when you do, Kubernetes provides a nice abstraction, in addition to a ton of other useful things at scale, from infrastructure as code to zero-downtime continuous deployments.
> It’s not like “the industry” just wants to make things complicated for the fun of it.
Doesn't it? I think that there are plenty of folks who want to make things more complicated, because complicated things are fun, while just making things work can be a pain but also boring.
Computers are hard to manage as easily as our codebases, for a lot of reasons. But it should also be said that the field is fairly young, and there are a lot of competing ideas while we try to keep the lights on with software from several decades ago that keeps paying the bills. This is not a challenge we necessarily incentivize our best engineers to tackle, because our industry currently values pumping out more features over streamlining existing ones. It's also not clear whether some of these problems can be solved with "better" engineering rather than a completely new way of approaching software systems. From a project / engineer alignment perspective, see https://www.youtube.com/watch?v=2JNXx8VdbAE
I wouldn't call it just DevOps. It's computers. We can all do what we do, we can write our `print "Hello World"` because we leverage millions upon millions of lines of code that other people have written to abstract away the other complications to getting code shipped.
org-babel is a really cool extension of org-mode and Emacs. But it is only useful for Emacs users and therefore useless for any professional use-cases. As much as I love Emacs, expecting people to learn it is a bit too much.
Well, it is possible to run Emacs in batch mode from the command line. Therefore you could have someone create the files as documentation, with instructions on how to run them as a single command.
I haven't tried this specifically with org-babel, and as other commenters in the thread state, Terraform/Ansible may not lend themselves to org-babel quite so easily, but it's certainly possible. Although maybe not worth the effort unless you really push hard for a team to document their infrastructure code in this way.
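For example, assuming the org file tangles cleanly, a one-liner like this (file name invented) extracts the code without opening an Emacs UI:

```sh
# Visit setup.org in batch mode and tangle its code blocks out to files.
emacs --batch -l org setup.org -f org-babel-tangle
```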
I have used emacs with make for this purpose. It works, but the main issue is white space. Any language that has significant white space is liable to break in org mode, and a ton of devops tools are python based, or python like.
I've been using guile and noweb, with a tangle I wrote myself, for literate devops on top of Guix and it works better than anything else I've tried.
If you look at Emacs/org-babel as a tool, why is it useless for professional use-case? It is just like saying: it is too much for people to learn to use Rstudio/Shiny, yet this is frequently a go-to tool in the scientific community.
Why would it be useless for non-Emacs users? It's an integrated tool, just like any other tool which you learn to use. You can use it for this purpose while you use some other editor for other tasks.
You don't have to use Emacs for everything just because you use a certain application of it.
This isn't a system you'd deploy and expect others to maintain. It's a system for developing your playbook and Ansible / Puppet / etc configurations.
Emacs is absolutely suitable for the use case of a professional developer writing these things.
The literate programming notebook approach is very interesting. Better IMO is to reverse it and write copious comments throughout an idempotent script that you can continually edit and re-execute.
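A fragment of what that style looks like in practice, with invented specifics; every step checks the current state before acting, so the whole script can be re-run at any point:

```sh
# Create the deploy user only if it doesn't exist yet.
id -u deploy >/dev/null 2>&1 || useradd --create-home deploy

# Install nginx only if it's missing.
dpkg -s nginx >/dev/null 2>&1 || apt-get install -y nginx

# Append the sshd setting only once (exact whole-line match).
grep -qx 'PermitRootLogin no' /etc/ssh/sshd_config ||
  echo 'PermitRootLogin no' >> /etc/ssh/sshd_config
```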
In either case, the final product that you share with others are your notes and an actual deployed configuration for a real system, not a config file for an IDE.
So your point is that org-mode lives inside one tool, and therefore one cannot expect everyone else on the team to learn it?
I think we can make the same argument for every single tool out there, like one specific IDE or whatever you have. It's just a question of how many people already use Emacs on the team, to make a democratic decision.
Or is there some criterion by which I can sort tools into two camps, "OK to force upon coworkers" and "not OK to force upon coworkers"?
I suspect a mandate to use Org could be counter-productive.
One of Org's unique advantages is the pliability it has by virtue of running in Emacs. I doubt whether someone who was forced to use it would have the intrinsic motivation to climb the notoriously steep learning curves for both Elisp and Emacs. Subtract Elisp (and with it, the ability to do any debugging or customization) and you're left with stock Org mode running in an inferior text-editor.
Maybe stock Org is still enough of an improvement over Markdown that it's worth the overhead of learning Emacs, but I'd have a hard time making that pitch to my coworkers.
Someone tried to liberate Org from Emacs by writing a parser in Rust (see previous discussion on HN [1]), but it looks like the project died unfinished [2].
You may want to instead do:
and add: to ~/.ssh/config. Otherwise if you delete your VM and make a new one, the entry from the old one is going to stay in your main ssh config, so it's going to fill up with cruft over time.
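One pattern that achieves this (my own suggestion, not necessarily what the parent had in mind) is OpenSSH's `Include` directive, so each throwaway VM gets its own file that can be deleted along with it:

```
# ~/.ssh/config -- relative Include paths resolve under ~/.ssh
Include config.d/*

# ~/.ssh/config.d/devvm (hypothetical VM entry; delete with the VM)
Host devvm
    HostName 192.168.56.10
    User vagrant
```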