Batsd: 37 Signals' Ruby Statsd implementation persisting to Redis (github.com/noahhl)
69 points by uggedal on May 30, 2012 | 28 comments


Those interested in lower memory consumption and better efficiency should take a look at https://github.com/armon/statsite, another statsd "clone". Originally a Python implementation, statsite has since been rewritten in C.


I wouldn't exactly call that a clone since it doesn't support key features of statsd (e.g. gauges) and has unique features of its own (e.g. key/value pairs).

Looks brilliant otherwise.


It is actually just a difference in naming. Gauges in statsd operate the same way as kv data in statsite. Originally, both projects referred to them as kv; the "gauge" name seems to be a more recent change in statsd.


+1 for saying "key/value" instead of "key-value".


So what can I use this for? It's not obvious from the docs on GitHub, and I had a quick look at the docs for statsd as well, which don't shed much light either. (And shouldn't it be ratsd? :)


(I work for 37signals and wrote batsd)

This is really just one piece in a bigger set of things to track performance, usage, etc.

You can think of it as: Emitters --> Statsd (or in this case, Batsd) --> dashboards, alerts, etc.

We have emitters that read log files from Nginx, HAProxy, bluepill, postfix, etc., a gem within all of our Rails apps, and a variety of other scripts that gather data. Those all point to batsd, which aggregates and stores the data. We then extract it into graphs on our dashboard, and use it extensively for Nagios alerting as well. There's a basic sample client included in this repository that we use for those purposes, though you're right, it just gets you raw numbers out of the box.

We're planning on releasing more of both the "emitters" that gather data, as well as a major part of our graphing/dashboard interface "soon".

And point well taken about making it more obvious how to get started and what you can use it for. I'll work on improving the documentation.
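As a rough illustration of what the "aggregates and stores" step in the middle of that pipeline involves, here is a minimal sketch in Python. It is not Batsd's actual code: the function names and metric names are hypothetical, it assumes measurements arrive as statsd-style `name:value|type` lines, and it ignores sample rates entirely.

```python
from collections import defaultdict


def parse_line(line):
    """Split 'name:value|type' into (name, numeric value, type code)."""
    name, rest = line.split(":", 1)
    # Anything after the type (e.g. a sample rate) is ignored in this sketch.
    value, kind = rest.split("|")[:2]
    return name, float(value), kind


def aggregate(lines):
    """Summarise the measurements received during one flush interval."""
    counters = defaultdict(float)
    timers = defaultdict(list)
    gauges = {}
    for line in lines:
        name, value, kind = parse_line(line)
        if kind == "c":        # counters are summed
            counters[name] += value
        elif kind == "ms":     # timers are collected, then summarised
            timers[name].append(value)
        elif kind == "g":      # gauges keep the last value seen
            gauges[name] = value
    timer_stats = {
        name: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": sum(values) / len(values),
        }
        for name, values in timers.items()
    }
    return dict(counters), timer_stats, gauges


if __name__ == "__main__":
    received = ["hits:1|c", "hits:2|c", "req:10|ms", "req:30|ms", "depth:7|g"]
    print(aggregate(received))
```

A dashboard or alerting check would then read these per-interval summaries rather than the raw measurements.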


Could you explain briefly why you chose to write a replacement for statsd, rather than improve on it? What aspects of statsd were you not happy with?

(I don't have a horse in this race, I haven't used statsd before -- but I am planning to deploy some sort of statistics gathering soon and I wonder why I would choose your implementation over Etsy's, apart from the obvious appeal of the 37signals brand.)


Briefly, probably not (everyone here at 37signals got treated to a 3000 word treatise on our statsd journey a few weeks ago). I did write up a few reasons at https://github.com/noahhl/batsd/blob/master/doc/why-not.md.

In short: we as a company have a ton of Ruby experience and comparatively little Python/Node.js experience, both in terms of understanding the tools that we use (which we like to do) and simply in being able to confidently manage dependencies, etc. We also knew we were going to want to build our own UI eventually anyway, which limited the utility of Graphite itself.

Edited to add: I can't say it enough, Etsy's statsd and Graphite are both fantastic pieces of engineering, with fantastic communities and support behind them (there's a fascinating writeup about Graphite in particular at http://www.aosabook.org/en/graphite.html).


I briefly read the chapter on persistence -- basically you're doing what RRD originally did (one file per metric), except without actual round-robin storage, before RRDcache was born. The long-term performance implications could be worrisome. Unless you're backing this with solid-state storage, if you have many thousands of metrics, the seek capability of the disk may not be able to keep up with the I/O flush rate.
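For readers unfamiliar with the term, "round-robin storage" means a fixed number of slots that are overwritten as time advances, so a metric's store never grows. A minimal in-memory sketch of the idea follows. It is not RRDTool's or Whisper's actual on-disk format, and the class and function names are hypothetical.

```python
def slot_for(timestamp, step, slots):
    """Index of the fixed-size slot that a timestamp falls into."""
    return (timestamp // step) % slots


class RoundRobinSeries:
    """Fixed-capacity series: new points overwrite the oldest slots."""

    def __init__(self, step, slots):
        self.step = step                # seconds covered by one slot
        self.values = [None] * slots    # stored data points
        self.stamps = [None] * slots    # aligned timestamp held by each slot

    def write(self, timestamp, value):
        i = slot_for(timestamp, self.step, len(self.values))
        self.values[i] = value
        self.stamps[i] = timestamp - timestamp % self.step

    def read(self, timestamp):
        i = slot_for(timestamp, self.step, len(self.values))
        aligned = timestamp - timestamp % self.step
        # Return None if the slot has since been reused for a newer interval.
        return self.values[i] if self.stamps[i] == aligned else None


if __name__ == "__main__":
    series = RoundRobinSeries(step=10, slots=3)
    for ts, value in [(0, 1.0), (10, 2.0), (20, 3.0), (30, 4.0)]:
        series.write(ts, value)
    print(series.read(30), series.read(0))
```

Because the slot count is fixed, the write position for any timestamp can be computed directly, whereas an append-only text file per metric keeps growing.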


You're witnessing second degree dilettantism at work.

Remember, we started out with a rock-solid reference impl called RRDTool. RRDTool is 13 years old and about as mature as it gets. It's also surprisingly usable and relatively wart-free.

However, its documentation is not written as a narrative "guide", so inevitably some kid eventually found it too complicated and decided to reinvent it, without realizing how far out of his depth he went. That's how graphite happened.

Now 37signals sees graphite, and goes full Dunning-Kruger with yet another knock-off, this time leaving out everything that would acknowledge the slightest understanding of the problem domain. While graphite at least tried to mimic the RRDTool file format, 37signals just skips over that whole "complicated binary-stuff" and writes the data as newline-delimited ASCII text...


I believe Graphite/Whisper were created to address some inabilities in RRDTool: http://readthedocs.org/docs/graphite/en/latest/whisper.html#...

Are you saying that graphite is somehow deficient? How is/was the author "out of his depth"?


While graphite at least tried to mimic the RRDTool file format, 37signals just skips over that whole "complicated binary-stuff" and writes the data as newline-delimited ASCII text...

What benefit lies in trying to mimic RRDTool's file format?


Scalability.


That makes sense, and speaks to me (I'm more of a Ruby guy myself.) Thanks for taking the time to reply.

Edit: and, "it looked like it would be easy" made my day :-)


Thank you, that's very helpful. Looking forward to the emitters and dashboard whenever they're ready - I suspect they will help drive adoption of Batsd and encourage development of additional emitters.


Definitely. There's a teaser screenshot of Flyash (the big, reusable chunk of our dashboard) towards the bottom of http://37signals.com/svn/posts/3091-pssst-your-rails-applica... (that post also details some of the major emitter components we use).


"457,739 different metrics in Flyash" Oh...kay... Flyash looks very nice from the screenshots, but I'm interested to learn how you solve the discoverability problem with that much data. (Too much of a good thing?)


Looks great, can't wait to see it open sourced. I've been dealing with the clunkiness of statsd/data -> graphite -> graphene for a dashboard, and more than a handful of times I've almost started writing exactly what it looks like you have already built.

Any idea when we'll be able to use/contribute to Flyash?


A couple of weeks, probably, depending on how much I feel like working on it. It was designed to be modular and easily extracted, but still needs some cleanup work and has a few nasty bugs I'd like to fix first.


The Etsy blog post that introduced StatsD is helpful:

http://codeascraft.etsy.com/2011/02/15/measure-anything-meas...


Similar, but backed by Graphite: https://github.com/mojodna/metricsd

It's awesome to see the community standardizing around a simple way of transporting measurements into a storage and analytics system.




I would like to see Flyash, which you mentioned in the README, as soon as possible. It is a crucial part of a monitoring system.


What does "'wireline' compatibility" mean?


We don't use the same backend as Etsy's original implementation, but you should be able to take any script, etc. that emits statsd measurements for use with the original implementation and point it at Batsd. This means that the 50+ statsd clients that are out there, as well as lots of custom instrumentation, should "just work", even though the data is stored and accessed in different ways after it's received.
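To make "wireline" concrete: statsd clients send short plain-text datagrams over UDP in the form `name:value|type`, optionally followed by `|@rate` for sampled counters. The sketch below is an illustration in Python, not code from Batsd or Etsy's statsd. The metric names are made up, and the host and port (8125 is the usual statsd default) are assumptions to adjust for your own setup.

```python
import socket

# Type codes used on the statsd wire: counter, timer (milliseconds), gauge.
TYPE_CODES = {"counter": "c", "timer": "ms", "gauge": "g"}


def format_metric(name, value, kind, sample_rate=None):
    """Build one statsd datagram payload, e.g. 'page.views:1|c'."""
    line = f"{name}:{value}|{TYPE_CODES[kind]}"
    if sample_rate is not None:
        # Sampled counters carry the rate so the server can scale them up.
        line += f"|@{sample_rate}"
    return line


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; host and port are assumptions."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric names, printed rather than sent.
    print(format_metric("page.views", 1, "counter"))
    print(format_metric("db.query", 320, "timer"))
    print(format_metric("queue.depth", 42, "gauge"))
```

Any server that accepts this text format on its UDP port can sit behind existing clients, which is the sense in which a client should not need to know whether it is talking to statsd or Batsd.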


Why should I use this over http://collectd.org?


You don't use this instead of collectd; you can use Batsd with it. These tools are complementary.

collectd is very good at collecting system-level metrics and sending them somewhere. Batsd is a system for receiving those metrics and storing them.

(Note that collectd has a graphite plugin, but not one for (B|St)atsD. You'd need to write a proxy.)


YEAAAAAAAAH!



