
Hi, author of diesel here.

I can only assume you're probably referring to some of these things:

1. Response latency, esp at 99%-ish mark, under load
2. Memory usage per connection under many idle connections
3. Scalability wrt. data sharing, backing persistent data, replication/redundancy strategies, etc

I'm not sure if you were referring to us, the diesel authors, when you indicated that someone didn't understand something about Comet scaling, but I assure you that diesel does 1 and 2 quite nicely, as do most sensibly-written things based on epoll, kqueue, etc. The benchmark page is with 1k concurrent connections, and it does well with more than that, too.

If you're referring to item 3, I'm afraid diesel doesn't tackle that element of scalability yet. It's more of an I/O library, not really a framework with aspirations of providing high availability. We have some plans and quite a bit of mostly-working code that implements a paxos-based framework for achieving those goals as well, but release of that part of the framework is some months out.

Writing good unit tests for this stuff is a higher priority--unit testing async code is a PITA. We're probably going to need to steal ideas from twisted.trial or something.

Thanks for checking out diesel.




3. Scalability wrt. data sharing, backing persistent data, replication/redundancy strategies, etc

...

We have some plans and quite a bit of mostly-working code that implements a paxos-based framework

Don't. Seriously, don't go that route, it's a huge waste of time. Instead: keep your server shared-nothing and make it interface with as many different breeds of message queues on the backend as possible (AMQP, STOMP, etc.).

That's what any non-trivial comet app needs to do anyway, and any kind of intelligence inside the comet server beyond dispatching between the backend queue and the browser only gets in the way.

The canonical setup is:

    Browser <-> Diesel <-> Message Queue

Diesel should be able to maintain connections to multiple brokers on the backend in parallel, for failover and load distribution - and that's it.
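
To make the failover and load distribution part concrete, here is a rough sketch of what I mean. It is not diesel code and not a real AMQP/STOMP client: `FakeBroker`, `BrokerUnavailable` and `Dispatcher` are made-up names, and the in-memory broker only stands in for a real connection.

```python
class BrokerUnavailable(Exception):
    """Raised by the (hypothetical) broker client when a publish fails."""


class FakeBroker:
    """In-memory stand-in for a real AMQP/STOMP connection (assumption)."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.delivered = []

    def publish(self, message):
        if not self.up:
            raise BrokerUnavailable(self.name)
        self.delivered.append(message)


class Dispatcher:
    """Shared-nothing relay: round-robins messages across brokers and
    skips to the next broker when one is unavailable."""

    def __init__(self, brokers):
        self.brokers = list(brokers)
        self._next = 0  # index of the broker to try first

    def publish(self, message):
        n = len(self.brokers)
        for offset in range(n):
            broker = self.brokers[(self._next + offset) % n]
            try:
                broker.publish(message)
            except BrokerUnavailable:
                continue  # failover: try the next broker in the ring
            # Load distribution: start after the broker that just succeeded.
            self._next = (self._next + offset + 1) % n
            return broker.name
        raise BrokerUnavailable("no broker reachable")
```

The relay keeps no message state of its own; anything durable stays on the broker side, which is the whole point of the setup.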


Right, but those message queues themselves implement Paxos or have some master election scheme, etc (if they promise replication/failover). You have to get that durability somewhere.

So, we're internalizing that queueing behavior into diesel.

(You can tell us not to do it, but it's pretty much done. :-)


You have to get that durability somewhere.

Yes, and the message broker is pretty much the only place where it makes sense.

So, we're internalizing that queueing behavior into diesel.

An exercise in futility.

The main application dealing with the actual messages will still need to know which Diesel instance to talk to, in order to reach a particular subscriber.

How do you solve that?


I'm not sure what main application you're referring to.

Every diesel node is an instance of an application that is willing to provide a service that acts as the master/router for a certain class of messages. Paxos ensures there is only one master elected for every message class, and this router is re-elected should the router go down. The routing table of who is master for what class of messages is kept in sync across all nodes.

It's almost exactly like registered processes in an erlang cluster--reserving the exclusive right to handle certain messages. Simpler master/slave relationships can take over from there.
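
A toy illustration of the routing-table idea, for anyone following along. This is not the diesel implementation and it is not Paxos: the "election" here is just a deterministic pick from the set of live nodes, which only gives the same answer everywhere if every node already agrees on who is alive (the hard part that Paxos actually solves). `RoutingTable` and its methods are hypothetical names.

```python
import zlib


class RoutingTable:
    """Maps each message class to the one node acting as its master."""

    def __init__(self, nodes):
        self.live = set(nodes)
        self.masters = {}  # message class -> master node name

    def elect(self, message_class):
        # Stand-in for a real consensus round (assumption): choose a node
        # deterministically from the sorted live set, so any node holding
        # the same live set computes the same master.
        if not self.live:
            raise RuntimeError("no live nodes")
        candidates = sorted(self.live)
        index = zlib.crc32(message_class.encode("utf-8")) % len(candidates)
        self.masters[message_class] = candidates[index]
        return candidates[index]

    def master_for(self, message_class):
        master = self.masters.get(message_class)
        if master is None or master not in self.live:
            master = self.elect(message_class)
        return master

    def node_down(self, node):
        # Re-elect a master for every message class the dead node owned.
        self.live.discard(node)
        for message_class, master in list(self.masters.items()):
            if master == node:
                self.elect(message_class)
```

Usage is along the lines of `table.master_for("chat")` to find the router for a message class, and `table.node_down(name)` when a node is detected as dead.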


Every diesel node is an instance of an application

You mean one that the user writes (starting with "import diesel")? If so then I'd think such a tight coupling is a bad idea. Why prevent non-Python users from using Diesel as a comet-broker? Why even force Python users to tightly couple their app with Diesel when it'd be so much easier to abstract out the interface they need?

provide a service that acts as the master/router for a certain class of messages. Paxos ensures there is only one master elected for every message class, and this router is re-elected should the router go down. The routing table of who is master for what class of messages is kept in sync across all nodes.

If you're going to all these lengths then why on earth couple it to a comet server? All these features belong in a message broker, not in a protocol endpoint. You'd make many people happy by building a STOMP or AMQP broker with these features, even people that are not interested in comet at all.

Wrt your other reply: No, we don't agree. But at least I have a better idea of what you have in mind now, thanks for that. Also, this is all of course just my humble opinion. It's your project and you're free to overengineer at your peril ;-)


Okay, the "comet" bit, I can see that--why conflate those?

To clarify--we're going to build a comet framework _on_ diesel, but diesel itself is more of a general async I/O system with cluster messaging features. We intend it to be applicable for building arbitrary networked applications using patterns similar to what you'd do in erlang--message passing.

We just _happen_ to be focusing on building a comet framework first and foremost on it. That will probably be called "dieselweb".

So, the "comet server" portion of our framework may not, in fact, utilize any of this message-broker stuff. But other components might. We're actually going to try to build something fairly unique here, but I don't have all the details ready to put out there yet.


Good luck :-)


So, maybe we agree? Because I'm not disputing (and never have) that the message broker is the place to do that. But you seem to be conceptualizing the message broker as a separate _service_ or application, and I'm saying it's just a function, a role. It doesn't matter so much what particular process (or group of processes) it runs in. That's what I meant by "internalize".


Thanks for the response, and good to see some of the other issues understood. It wasn't anything personal; I haven't checked out Diesel fully yet. I just saw the page at http://dieselweb.org/lib/benchmarks/ and thought it could do with exploring all the other scalability issues...

I guess I just have "Here's a graph showing req/s for twisted/tornado/etc etc" overload lately.

It's definitely nice to see more open source options in the Comet arena :)

Oh and http://shoptalkapp.com/ looks very interesting also


Writing unit tests should be your highest priority. I am far from a TDD advocate, and compared to many I am terrible at maintaining my own tests, but I could never justify using an IO package that contains zero tests.

IO errors are some of the most annoying errors to debug, and the whole point of using a preexisting IO package is to not have to worry about errors in it. I would say you need 100% test coverage of your core library before I could consider using Diesel for a major project.

Just my $0.02...



