lorendsr's comments

Temporal is designed to handle lower-latency use cases than data pipeline systems like Airflow. It also recently added a feature called Update, designed for request-response-style interactions, that allows communication with Workflows on the order of tens of milliseconds.


There are certainly use cases for which it's more than is required. The simplest would be adding a cron string to a GitHub Action or Vercel function. But in most cases, and certainly Slack's case, you want more reliability, scalability, flexibility, and/or observability. Temporal has extreme levels of those things, including pausing & editing schedules, seeing the current status of all triggered scripts/functions and all the steps a function has taken so far, and guaranteeing that the triggered function completes, ensuring that if the process or container dies, the function continues running on another one. Even if you don't care about all those things, you might care about some of them in the future, and it doesn't hurt to run a system with capabilities you don't use.



Using Temporal is in the category of building it yourselves, not a billing service. But it takes much less time to build, because Temporal makes changes, scaling, and grandfathering quick to do. Especially with the new Scheduled Workflows feature: https://github.com/temporalio/samples-typescript/tree/main/s...


Here's a Temporal v Prefect comparison I wrote: https://community.temporal.io/t/what-are-the-pros-and-cons-o...

tl;dr: Temporal is more general-purpose: it's for reliable programming in general, vs. data pipelines specifically. It supports many languages (and combining languages), has features like querying & signaling, and can run at very high scale.

CI/CD is a common use case for Temporal—used by HashiCorp, Flightcontrol, Netflix: https://www.youtube.com/watch?v=LliBP7YMGyA


Thanks!

You do need to run it on the same code version. There are different ways to deploy code changes. If you use one of our built-in versioning systems, the version is recorded in the workflow history (and you can either keep track of the version-to-SHA mapping or use a SHA as the version). Otherwise, you can add code that records the current code version as workflow metadata.


Author here, curious if there are any major TTD debuggers I missed? Also let me know if anything in the post didn't make sense, and I'll edit to clarify!


2PC tends to have limited throughput because the participants need to hold a lock between the voting and commit phases, and all the participants need to support the protocol. Sagas work across different services and data stores and can have high throughput.

However, if all of the data you need to update is in a single database that supports atomic commits, I'd go with that over sagas.


> 2PC tends to have limited throughput due to the participants needing to hold a lock between the voting and commit phase

Scalability depends on lock granularity, what's more...

> Sagas [...] can have high throughput

There is no real difference as sagas in practice implement locking in disguise - take the scenario of flight/hotel/car booking:

Once you book a hotel, this particular resource (a room at a particular time) is effectively locked. Cancelling the booking (because other participants in the saga failed) is effectively releasing the lock. The room (resource) is locked for the duration of the whole process anyway, as no other customer can book it during this time.

The downside of sagas is that a programmer is forced to explicitly handle all failure scenarios, which costs development time, is error-prone, etc.


At some point, if you can't automatically fix something, you have to stop and report to a human for manual intervention/repair. While a saga doesn't guarantee that you avoid manual repair, it significantly reduces the need for it. If each of these has a 1% chance of non-retryable failure:

  Step1
  Step2
  Step1Undo

then this has a 1% chance of needing manual repair (it's okay if Step1 fails, but if Step1 succeeds and Step2 fails, we need to repair):

  do Step1
  do Step2

and this has a .01% chance (we only need repair if Step2 fails and then Step1Undo also fails, 1% * 1%):

  do Step1
  try {
    do Step2
  } catch {
    do Step1Undo
  }
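For concreteness, here's a minimal sketch of that second shape in plain TypeScript (not the Temporal SDK; `runSaga` and the step names are illustrative). It takes the set of steps that fail on a given run and reports the outcome:

```typescript
// Saga-with-compensation sketch. `fail` is the set of step names
// that throw on this run, so outcomes are deterministic.
function runSaga(fail: Set<string>): string {
  const step = (name: string): void => {
    if (fail.has(name)) throw new Error(`${name} failed`);
  };
  try {
    step("Step1");
  } catch {
    return "failed cleanly"; // Step1 failed: nothing to undo
  }
  try {
    step("Step2");
  } catch {
    try {
      step("Step1Undo"); // compensation for Step1
    } catch {
      // Step2 failed AND its compensation failed: ~1% * 1% of runs.
      return "needs manual repair";
    }
    return "rolled back";
  }
  return "ok";
}
```

Only the path where both Step2 and Step1Undo fail escalates to a human, which is where the .01% comes from.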


There is also the case where Step1 was successful, but the Saga Orchestrator (or Saga participant, in the case of Choreography) for some reason (like a communication error) doesn't know about it.

If Step1's service doesn't expose an API to poll its status, then the only recourse is to execute it again (with the same input key, assuming it's idempotent ;)
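A sketch of what that idempotency could look like on the service side, assuming the service keeps a durable record keyed by the request key (the `Map`, `bookHotel`, and the key format are illustrative stand-ins):

```typescript
// The service deduplicates by request key, so re-running Step1 after a
// lost acknowledgment is safe. The in-memory Map stands in for the
// service's durable store (an assumption for the sketch).
const results = new Map<string, string>();

function bookHotel(requestKey: string, room: string): string {
  const existing = results.get(requestKey);
  if (existing !== undefined) {
    return existing; // already executed: return the prior result, no side effect
  }
  const confirmation = `confirmed:${room}`; // the real booking would happen here
  results.set(requestKey, confirmation);
  return confirmation;
}
```

Calling it twice with the same key returns the same confirmation without booking the room twice, which is exactly what the retrying orchestrator needs.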


Sagas are for when you can't do an update in an ACID transaction, for example when updating state across different types of data stores.

If you're asking whether the catch clause in a Temporal Workflow saga is guaranteed to execute, the answer is yes. The way it's able to guarantee this is by persisting each step the code takes and recovering program state if a process crashes, server loses power, etc. For an explanation of how this works, see: https://temporal.io/blog/building-reliable-distributed-syste...
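A toy sketch of the persist-and-replay idea in plain TypeScript (the array-based history and `durableStep` helper are illustrative; Temporal's actual event-history mechanism is more involved):

```typescript
// Each completed step's result is recorded. If the process crashes and the
// function re-runs, recorded steps return their saved results instead of
// re-executing side effects, so execution resumes where it left off.
const history: string[] = [];
let effects = 0; // counts real side effects, for illustration

function durableStep(index: number, fn: () => string): string {
  if (index < history.length) {
    return history[index]; // replay: use the recorded result
  }
  const result = fn(); // first execution: actually run the side effect
  effects++;
  history.push(result);
  return result;
}

function workflow(): string {
  const a = durableStep(0, () => "charged-card");
  const b = durableStep(1, () => "sent-email");
  return `${a},${b}`;
}
```

Running `workflow()` a second time replays from history and performs no new side effects, which is why a catch clause partway through the function is guaranteed to eventually execute.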


Seems like a distributed stack unwind


Author of the article here. Yeah, @agumonkey I like that analogy a lot!

Generally, that's a good way of thinking about it. The one additional bit of nuance is it's like a "safe" stack unwind while other processes could be still modifying databases at the same time, so it's not a complete "rollback" of the whole world if that makes sense.


Thanks, you're probably right; there's more to it. It's a thrilling topic. Are there any other patterns or abstractions for controlling "distributed state" (apologies if I'm twisting things too much again) between agents to keep things in a correct order?


One pattern is having a Workflow that runs for the lifetime of a domain object and holds a conceptual lock on that object: it receives requests to modify the object and makes sure to only perform one operation at a time (like the state pattern on a particular agent).

Also related: Signals are events that you can send to Workflows and between Workflows, and they’re always delivered in the order they’re received.
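The shape of that pattern, sketched in plain TypeScript (the class and method names are illustrative; in Temporal this would be a long-running Workflow receiving Signals):

```typescript
// One long-lived "workflow" owns a domain object's state. Requests are
// queued as they arrive and applied one at a time, so the object always
// sees a serial history of operations.
type Op = (state: number) => number;

class AccountWorkflow {
  private state = 0;
  private queue: Op[] = [];
  private draining = false;

  // Analogous to sending a Signal: enqueue in arrival order.
  signal(op: Op): void {
    this.queue.push(op);
    this.drain();
  }

  private drain(): void {
    if (this.draining) return; // only one drain loop runs at a time
    this.draining = true;
    while (this.queue.length > 0) {
      const op = this.queue.shift()!;
      this.state = op(this.state); // apply exactly one operation at a time
    }
    this.draining = false;
  }

  get current(): number {
    return this.state;
  }
}
```

Because everything funnels through one queue, there's no race between concurrent writers to the same object, which is the "conceptual lock".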

More generally, for a handy reference of distributed systems patterns, check out https://microservices.io/patterns/data/saga.html (though I personally find his diagrams a bit...overwhelming) and the Microsoft writeups: https://learn.microsoft.com/en-us/azure/architecture/pattern...


oh that's cool, thanks a lot


If I'm getting your point right, I agree! If you have the workflow / durable execution primitive to depend on (a durable function is guaranteed to complete executing), then there are a lot of pieces of distributed systems stacks that you no longer need to use. Your durable code is automatically backed by the event sourcing, timers, task queues, transfer queues, etc. that Temporal internally uses to provide the guarantee, so you don't need to build them yourself.

