Hacker News | hyades's comments

Hey, I wanted to apply at Shopify. I have almost 4 years of experience in software engineering. My LinkedIn: https://www.linkedin.com/in/aayushahuja/


hardware?


It's from folks visiting his site. (Linked.)


Under what circumstances would one prefer Redis streams over Kafka and vice versa?


I can think of some circumstances: 1 - You already have a Redis infrastructure and don't want to, or don't have the resources to, deploy a full Kafka infrastructure (3 Kafka brokers + 3 ZooKeeper nodes)

2 - Kafka clients are not available (or are poorly maintained) for every programming language. Redis has a simpler protocol, so it has more, and better, clients available, and even if you use an exotic language it is easy to write a client for it (well... easier than for Kafka)

3 - Kafka, AFAIK, does not have any internal cache implementation, so every read is served from disk (plus the page cache). This means that Redis Streams will (probably) perform much better for use cases where consumers need to fetch data from old offsets.

edit: added reason number 3.
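A toy model of the replay case in point 3 (a sketch only: real Redis Streams are backed by a radix tree inside the server, and the methods below merely mimic XADD/XRANGE in plain Python rather than talk to Redis):

```python
# Toy in-memory model of Redis Streams semantics (illustrative only):
# entries get monotonically increasing IDs, and a lagging consumer can
# replay from any old ID with a cheap index lookup instead of a disk seek.
import bisect

class ToyStream:
    def __init__(self):
        self.ids = []      # sorted entry IDs
        self.entries = []  # payloads, parallel to self.ids

    def xadd(self, payload):
        """Append an entry, returning its auto-generated ID (like XADD ... *)."""
        entry_id = len(self.ids)
        self.ids.append(entry_id)
        self.entries.append(payload)
        return entry_id

    def xrange(self, start_id):
        """Return all entries with ID >= start_id (like XRANGE stream start +)."""
        i = bisect.bisect_left(self.ids, start_id)
        return list(zip(self.ids[i:], self.entries[i:]))

s = ToyStream()
first = s.xadd({"sensor": "temp", "value": 21})
s.xadd({"sensor": "temp", "value": 22})
s.xadd({"sensor": "temp", "value": 23})

# A consumer that fell behind replays from an old offset; while the data
# is still in RAM, this is just a scan, which is the point being made above.
replayed = s.xrange(first)
```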


Regarding point 3: unless your system is under massive memory pressure, no caught-up Kafka consumer should be served from disk. Old offsets that have been flushed out of memory (because you do not have enough of it) obviously are served from disk, but as essentially linear reads of the requested file's blocks, at consecutive logical addresses if flush sizes are large enough (yes, those can end up laid out on disk in any number of ways depending on how much the firmware lies, I know).

I really cannot see how Redis is going to perform "much better" reading from disk once the entries are no longer in RAM. At that point both Kafka and Redis have to read from disk, and you either have the IOPS to serve all the lagging consumers or you don't. Maybe you have enough IOPS to service 1 or 2 concurrent reads, maybe 10-12. But for the same message counts, sizes, and concurrent consumers, your workload will become IOPS-bound rather fast.

Note: "much better" to me implies 10x+ better, not "my C library's read() is 2.3% faster than your Java's".
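The IOPS argument is easy to sanity-check with back-of-envelope numbers (every figure below is an assumption for illustration, not a measurement of any real deployment):

```python
# Back-of-envelope check of the "IOPS bound" claim: how many lagging
# consumers can one disk sustain? All numbers are made up but plausible.
DISK_IOPS = 500            # random reads/s for a modest spinning disk (assumed)
READ_SIZE = 1 << 20        # 1 MiB fetched per read request (assumed)
MSG_SIZE = 1 << 10         # 1 KiB average message (assumed)
CONSUMER_RATE = 50_000     # messages/s each lagging consumer wants (assumed)

# Disk reads per second one consumer needs in order to keep up.
reads_per_consumer = CONSUMER_RATE * MSG_SIZE / READ_SIZE  # ~48.8 reads/s

# How many such consumers the disk can serve before it saturates.
max_lagging_consumers = DISK_IOPS / reads_per_consumer     # ~10 consumers
```

With these assumptions the disk tops out at roughly ten lagging consumers, which lines up with the "maybe 10-12" intuition above; larger sequential read sizes push the ceiling up, smaller ones pull it down.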


re: client support - I dunno, this seems like a pretty comprehensive list to me? I mean, there's even a rust client: https://cwiki.apache.org/confluence/display/KAFKA/Clients


Confluent only officially supports the Java client (plus, as I now learn, Python, Go, and .NET clients), and it is really recommended to use a client version matching your broker's due to protocol incompatibilities.

Most Kafka client implementations are open-source projects of their own; this is also true of most Redis client implementations, but again: the Kafka protocol is much more complicated than the Redis one.

I haven't used Kafka from languages other than Java or Scala, so I can't really say how mature the other clients are.

But my point about how easy it is to implement a Redis client if needed still stands. =)
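The "easier to write a client" point is visible in the wire format itself: RESP frames every command as a counted array of bulk strings, so an encoder fits in a few lines (a sketch only; a real client would also need a socket and reply parsing):

```python
def encode_resp(*args):
    """Encode a Redis command in RESP: an array ("*N") of bulk strings ("$len")."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)

# The exact wire bytes for "XADD mystream * field value":
wire = encode_resp("XADD", "mystream", "*", "field", "value")
# -> b'*5\r\n$4\r\nXADD\r\n$8\r\nmystream\r\n$1\r\n*\r\n$5\r\nfield\r\n$5\r\nvalue\r\n'
```

Compare that with Kafka's binary protocol, which requires versioned request/response schemas, correlation IDs, and broker metadata handling before you can send your first message.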


They also support a C reference implementation (which is how they get the others).


As someone who has been burned by non-Java Kafka drivers, beware the perception of ecosystem support here. The Kafka design pushes a huge amount of complexity onto the client, and in our experience only the Java client deals with this complexity well. We started out using Python clients but eventually moved to Confluent's REST API (which wraps the Java driver) because we had so many problems with them.


One that immediately comes to mind is cases where Kafka is overkill. Kafka is a great tool, but there's a lot of overhead in setting up and maintaining it (e.g. Zookeeper), so if your throughput needs are low, it's a poor fit. Spinning up a Redis server is dead simple, and if you're already using Redis for other things, then there's no need to bring an additional tool into the mix.


Genuine question - Why does everyone seem to think running a zookeeper cluster is so hard? You can run it on three small VMs and basically forget about it. We didn't have any zookeeper experience at my last startup before we started using it for Kafka and we used a very simple puppet module to install it on three instances in each of our AWS regions. It really never gave us many problems in the several years since.

Also, all the tooling around it is quite mature: there are great monitoring and management tools for probing at the internals, which helped when we were doing more exotic Kafka surgery.


For some reason ZooKeeper is unjustly seen as uncool technology. I have even seen it blamed for issues that it had nothing to do with.

People say that setting up a ZK cluster is a huge issue, yet they see no problem spinning up etcd, or Sentinel nodes in the case of Redis.

When I learned about ZK I was skeptical, and I didn't like that it was written in Java, but ZK proved to be extremely robust.


I'm not sure about it being "so hard", but it's extra stuff to deploy, maintain, monitor, and pay for. Very small teams benefit from keeping things small and simple. Again, if you actually need Kafka, then it's worth it. If you just need something similar, but can deal with the limitations of redis streams, then it's an easy choice.


Because nothing in the current hype cycle depends on zookeeper so people use that as a crutch for following hype.


Simplicity. Redis 4.0 with the rxlists module [1] provides a fast queue system with full persistence. Unless you need replay, Redis is often easier and has plenty of throughput. The new Streams feature now removes the replay disadvantage.

1. http://redismodules.com/modules/rxlists/


Interesting. Do you have the links of these papers so that I can read more?


The process of updating Insomnia on Linux is quite painful; the only option is downloading a new .deb every time.


On Arch it's definitely not bad.


Ya, Arch is easier because all you need is a Git repo with a file that links to the package. For apt, you have to run a full web server with a specific directory structure, AFAIK.
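The "specific structure" is less scary than it sounds: a flat apt repo is just the .deb files plus a Packages index served as static files. A sketch of one index stanza (the file name, package name, and version below are made up; real indexes are generated by tools like dpkg-scanpackages, and modern apt also expects a signed Release file):

```python
# Render one stanza of a flat apt repo's Packages index for a .deb file.
# Field names (Package, Version, Filename, Size, SHA256) are the real
# index fields; the example values are invented for illustration.
import hashlib

def packages_stanza(filename, deb_bytes, package, version, arch="amd64"):
    """Build the Packages-index entry apt uses to locate and verify a .deb."""
    return (
        f"Package: {package}\n"
        f"Version: {version}\n"
        f"Architecture: {arch}\n"
        f"Filename: {filename}\n"
        f"Size: {len(deb_bytes)}\n"
        f"SHA256: {hashlib.sha256(deb_bytes).hexdigest()}\n"
    )

stanza = packages_stanza("insomnia_5.0.0_amd64.deb", b"fake-deb-contents",
                         package="insomnia", version="5.0.0")
```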


Tools such as aptly will happily generate that structure for you. Hosting can happen on an S3 bucket or any web server that can serve static files.


I haven't had the time yet to learn how to host an apt repository. Maybe one day :)


It’s not that hard. I recommend reprepro.


You can use the Open Build Service (build.opensuse.org) to build Debian/Ubuntu packages and I believe it also supports publishing an apt repo.


Now that it's open source, we could create an apt repository for the open-source version.

I love Insomnia, use it on Ubuntu, and love updates... but hate updating :).


That would be awesome. Here's the issue for it if you want to talk more about it: https://github.com/getinsomnia/insomnia/issues/182


Look into setting up a PPA on launchpad.net, or use Bintray. Both are easier to operate than reprepro, especially if you don't care to become an expert in Debian package hosting.

