If you're using Consul for web services, I really recommend the Traefik web server: https://traefik.io
Traefik replaces Nginx: it's the reverse proxy that maps the incoming requests to your various services, which are advertising on some arbitrary localhost port.
The amazing thing is that Traefik integrates with Consul: you only need to point it to your Consul endpoint, and it can automatically publish your services! You can also do other dynamic configuration of Traefik, e.g. by publishing a service via a REST API, via Kubernetes, etc.
I've struggled for years to get Nginx configured correctly, and it's been frustrating to have no alternative. In Nginx, dynamic binding is a premium feature. In the free version, you have to rewrite the config and restart the service. That's not fun if you expect services to come and go as part of your natural life-cycle.
Traefik's still pretty young. Notably, the docs are shit. But the application is well-designed, and it compiles as a single self-contained binary (as it's written in Go).
Traefik isn't worth the trouble. We used it in production over the course of 2 years, and it was consistently our only source of downtime due to ridiculous bugs: not closing file descriptors, breaking changes, and silent failures with no log output, panic or exit, even with debug logs enabled.
As you mentioned, the docs are terrible. What makes that worse are the undocumented breaking changes between each release. They don't even pretend to follow semver, so v1.5 broke v1.4, and v1.6 broke v1.5. With each update you pray that it doesn't take your whole setup down. If anything goes wrong, since nothing is documented and there are often no logs explaining what went wrong, you might be down for an extended period while you make 100 best-guess changes to a config that worked in staging but, for whatever reason, isn't working in production. May the odds be ever in your favor.
Last I checked, Traefik was 988,000 (!!!) lines of code. That's 20x the size of my very complex web application. I replaced it with 500 lines of Go providing all the features that are essential for me. Higher reliability, way fewer bugs, no breaking changes.
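For anyone wondering what the core of a small hand-rolled proxy could look like, here is a minimal sketch using only Go's standard library. It is not the code from the comment above, just an illustration of host-based routing with backends that can be swapped at runtime. The `Router` type, its methods, and the `app.example.test` hostname are all hypothetical names made up for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"sync"
)

// Router maps an incoming Host header to a backend URL.
// (Hypothetical type for illustration only.)
type Router struct {
	mu       sync.RWMutex
	backends map[string]*url.URL
}

func NewRouter() *Router {
	return &Router{backends: make(map[string]*url.URL)}
}

// Set registers or replaces the backend for a host while the proxy is running.
func (r *Router) Set(host, target string) error {
	u, err := url.Parse(target)
	if err != nil {
		return err
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	r.backends[host] = u
	return nil
}

// Lookup returns the backend registered for a host, if any.
func (r *Router) Lookup(host string) (*url.URL, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	u, ok := r.backends[host]
	return u, ok
}

// ServeHTTP forwards the request to the backend registered for req.Host.
func (r *Router) ServeHTTP(w http.ResponseWriter, req *http.Request) {
	u, ok := r.Lookup(req.Host)
	if !ok {
		http.Error(w, "no backend for host", http.StatusBadGateway)
		return
	}
	httputil.NewSingleHostReverseProxy(u).ServeHTTP(w, req)
}

func main() {
	// A throwaway backend standing in for a real service.
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "hello from backend")
	}))
	defer backend.Close()

	r := NewRouter()
	if err := r.Set("app.example.test", backend.URL); err != nil {
		panic(err)
	}

	// Send one request through the router without opening a listening port.
	rec := httptest.NewRecorder()
	r.ServeHTTP(rec, httptest.NewRequest("GET", "http://app.example.test/", nil))
	fmt.Println(rec.Code, rec.Body.String())
}
```

A real deployment would serve the router with `http.ListenAndServe` and would still need TLS, health checks, timeouts and so on, which is presumably where the remaining lines go.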
You may have encountered issues while using Traefik, so giving your opinion is totally fine, but I don't think it's fair to overreact.
Many users (and I mean big companies) have been using Traefik in production for years without issue. I'm not saying there are no bugs (what software can claim that?), I'm just saying that many users have a good opinion of Traefik's stability.
We follow semver, so there shouldn't be any breaking change between 2 minor versions. But, yes, it can happen: sometimes we may have forgotten to check a specific use case. But hey, again, let's be fair, we don't want it. We are just human. And no, this does not happen at every minor version; it is pretty uncommon...
Finally, on Traefik's size. You are including Traefik's dependencies, in vendor/, which is a bit weird. In Go, the convention is to commit the dependencies to your repository to get reproducible builds, so that's not a good way to count.
If you exclude vendor/:
golocc --no-vendor ./...
Lines of Code: 58532 (2987 CLOC, 55545 NCLOC)
Which is rather tiny.
So all in all, I regret you had such a bad experience with Traefik, but I just wanted to express the fact that many users are using it without any issue :) I would be happy to discuss further on this.
Re LOC: Found the 988k LOC number hard to believe, so I checked. If you count the vendored dependencies, then it's indeed a lot, about 1M LOC. Traefik itself though is comparatively a lot less at about 60k lines of Go code with another 8k or so of scripts, config etc.
Any suggestion? I'm stuck on this problem right now at work, don't like Traefik at all for the same reason as you, liked Docker Flow Proxy but it's not configurable enough and, similarly to Traefik, the documentation is quite bad if you need to do something a little more complicated (like, IP access control!).
Fabio might be a good alternative, but we don't need Consul right now, so I don't want to have to manage yet another puzzle piece.
With nginx, you can do dynamic binding in the free version in at least two ways:
1. If you "just" want to map a hostname to a private IP, you can assign the hostname to a variable, and use the variable in your backend config. This works because Nginx resolves static addresses at start, but resolves names pulled from variables at runtime.
2. If you need to map ports as well, you can use a Lua or Mruby script. E.g. I have a blog post on doing it with mruby here [1], and it's run production sites for a couple of years. This option lets you integrate against pretty much whatever you want.
For those interested in a lua example, from this article [1], I found & cherry-picked bits from here [2]. The key ingredients (to aid searching) are init_worker_by_lua_block & balancer_by_lua_block.
> In the free version, you have to rewrite the config and restart the service. That's not fun if you expect services to come and go as part of your natural life-cycle
If that's happening super-frequently I could see the problem, but in practice I'd expect you to use `consul-template` and issue a `reload` to nginx to make it reload configurations with no downtime. This is the solution I've used and it works pretty well.
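To make the render-then-reload idea concrete, here is a rough sketch in Go of the rendering half: turning a list of service instances into an nginx `upstream` block. This is not how `consul-template` is implemented; it only illustrates the pattern. The `Instance` type, the `RenderUpstream` function and the addresses are hypothetical. The sketch refuses to render an empty upstream because, as far as I know, nginx rejects an upstream block with no servers, and a rejected config blocks the reload.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"text/template"
)

// Instance is one healthy backend, e.g. as reported by service discovery.
// (Hypothetical type for illustration only.)
type Instance struct {
	Address string
	Port    int
}

// upstreamTmpl renders one "server" line per instance.
const upstreamTmpl = `upstream {{.Name}} {
{{- range .Instances}}
    server {{.Address}}:{{.Port}};
{{- end}}
}
`

// RenderUpstream returns the nginx upstream block for the given instances.
func RenderUpstream(name string, instances []Instance) (string, error) {
	if len(instances) == 0 {
		// Refuse to emit an upstream with no servers rather than write a bad config.
		return "", errors.New("no instances for upstream " + name)
	}
	t, err := template.New("upstream").Parse(upstreamTmpl)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	data := struct {
		Name      string
		Instances []Instance
	}{name, instances}
	if err := t.Execute(&b, data); err != nil {
		return "", err
	}
	return b.String(), nil
}

func main() {
	out, err := RenderUpstream("web", []Instance{
		{Address: "10.0.0.1", Port: 8080},
		{Address: "10.0.0.2", Port: 8081},
	})
	if err != nil {
		panic(err)
	}
	// In a real setup this would be written to a config file, checked with
	// `nginx -t`, and applied with `nginx -s reload`.
	fmt.Print(out)
}
```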
nginx does that already. The problem is it'll stop you being able to reload configuration on other bits of config. Once there's a config error nothing can reload.
Limited blast radius doesn't really apply when it comes to nginx config errors (which given its origins is understandable). One error in one bit of config and it all stops.
Old workers stay around until all their connections have closed or their grace period has expired. So I hope you're not too close to the resource limit to handle many generations of workers sticking around. Also, if you have many services behind a single port with path-based routing and do a deploy, the connections will stick around on the old workers until everything dies.
You can also use the consul k-v store as your source of traefik config and then update the config in consul and it updates in traefik instantly without any restart.
We tried Traefik, but ultimately went with Fabio, which also integrates with Consul nicely; plus it does tcp based stuff for unencrypted gRPC services we shuffle around (which at the time, traefik didn't have.)
It's okay for our use case. Honestly, we were just wanting to get a load balancer up and working with consul, and we had grpc services and internal http apps to balance.
I'm not going to lie; it works and does failover correctly, quickly, and plays nice with consul. Not to mention, the name brings back memories...
If I had my way though, we would use HAProxy, but that's just because it has more options, more ways to properly route, and is more battle-tested.
Could you elaborate? I would never say that the code is perfect, like any other "big" open source project, but we follow a pretty strict review process on each PR (3 LGTM from maintainers).
I'm curious to know why you are so negative.
If you want to try the new Connect feature from Consul yourself, we've put up an interactive tutorial on our Instruqt learning platform, together with the nice folks at HashiCorp: https://play.instruqt.com/hashicorp/tracks/connect
This Instruqt thingy is pretty cool, it's actually addictive. I started with the Connect course and I'm now going for Istio:
> Please wait while we setup a Kubernetes cluster with Istio preinstalled. In the meantime, browse through these notes to learn more about the sample application.
I've been using Consul as part of micro (https://github.com/micro/micro) for 3 years now. It's a great mechanism for service discovery. This additional feature is going to be seamlessly integrated. They've done a fantastic job of pushing forward Consul as a whole.
If I use Kubernetes, this service is superfluous, right?
Instead, it's useful if you use Docker containers in some other fashion, since services still need to communicate with each other.
I think the answer is yes and no, depending on your needs. I don't have a lot of experience with the Kubernetes NetworkPolicy, which does support selector-based allow/block of communication between pods, but I believe it does not encrypt the traffic itself (although you could always do so on top of the network layer). It is also constrained to controlling communications within Kubernetes and requires an actual controller to implement the networking. Consul Connect does use a sidecar proxy for intra-cluster communication, but in addition to authorization it also does mutual TLS and can allow that secure communication to endpoints outside the cluster. It now occupies a space very similar to Istio: https://www.consul.io/intro/vs/istio.html
I hope HashiCorp seriously maims many of the other devops solutions. Their stuff has grown on me slowly, but I'm appreciating the hell out of their focused tools. Smells like Unix spirit.
Could someone elaborate on the main differences between Consul and Istio?
What would be the primary reasons to choose one service mesh over the other?
Well, I don't know how mature the service mesh aspects of Consul will be now, but the rest of it is a very mature product that provides a distributed K/V store like etcd and also provides service discovery via either REST or DNS.
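As a rough illustration of the REST side of that, here is a small Go sketch that turns a catalog response into dialable `host:port` strings. It assumes the JSON shape returned by Consul's `/v1/catalog/service/<name>` endpoint, and only the handful of fields used here. The `CatalogEntry` type, the `Endpoints` function and the sample data are made up for the example. The fallback to the node address reflects my understanding that an empty `ServiceAddress` means "use the node's address".

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"strconv"
)

// CatalogEntry holds the subset of catalog fields this sketch needs.
// (Hypothetical type; the real response carries more fields.)
type CatalogEntry struct {
	Node           string
	Address        string // node address
	ServiceName    string
	ServiceAddress string // may be empty, in which case the node address is used
	ServicePort    int
}

// Endpoints parses a catalog response body and returns host:port strings.
func Endpoints(body []byte) ([]string, error) {
	var entries []CatalogEntry
	if err := json.Unmarshal(body, &entries); err != nil {
		return nil, err
	}
	out := make([]string, 0, len(entries))
	for _, e := range entries {
		host := e.ServiceAddress
		if host == "" {
			host = e.Address
		}
		out = append(out, net.JoinHostPort(host, strconv.Itoa(e.ServicePort)))
	}
	return out, nil
}

func main() {
	// Sample payload standing in for a response from a local agent,
	// e.g. GET http://127.0.0.1:8500/v1/catalog/service/web
	sample := []byte(`[
		{"Node":"n1","Address":"10.0.0.5","ServiceName":"web","ServiceAddress":"","ServicePort":8080},
		{"Node":"n2","Address":"10.0.0.6","ServiceName":"web","ServiceAddress":"192.168.1.9","ServicePort":9090}
	]`)
	eps, err := Endpoints(sample)
	if err != nil {
		panic(err)
	}
	for _, ep := range eps {
		fmt.Println(ep)
	}
}
```

The DNS interface gives you the same information without any parsing, by resolving a name like `web.service.consul` against the agent.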
Consul is not a service mesh. At its core it's a distributed key-value store which is commonly used for service discovery, configuration management and healthchecks.