Folks are using us both for long-lived tasks traditionally handled as background jobs and for near-real-time work: our latency is low enough for requests where a user may still be waiting on the result, such as LLM/GPU inference. Some concrete examples:
1. Repository/document ingestion and indexing fanout for applications like code generation or legal tech LLM agents (sketched below)
2. Orchestrating cloud deployment pipelines
3. Web scraping and post-processing
4. GPU inference jobs requiring multiple steps, compute classes, or batches
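
To make the fanout in (1) concrete, here's a minimal sketch of the shape of that pipeline. It uses plain asyncio as a stand-in for the task runner, and the function names (`fetch_documents`, `index_document`, `ingest_repository`) are illustrative placeholders; in production each step would be a durable task running on a worker:

```python
import asyncio

async def fetch_documents(repo_url: str) -> list[str]:
    # Placeholder: enumerate the documents/files in the repository.
    await asyncio.sleep(0.1)  # simulate I/O
    return [f"{repo_url}/doc_{i}.md" for i in range(5)]

async def index_document(path: str) -> dict:
    # Placeholder: chunk, embed, and write one document to the index.
    await asyncio.sleep(0.2)  # simulate embedding + DB write
    return {"path": path, "status": "indexed"}

async def ingest_repository(repo_url: str) -> list[dict]:
    docs = await fetch_documents(repo_url)
    # The fanout: spawn one child task per document and run them
    # concurrently, collecting all results before returning.
    return await asyncio.gather(*(index_document(d) for d in docs))

if __name__ == "__main__":
    print(asyncio.run(ingest_repository("https://example.com/repo")))
```

The same parent/child shape applies at larger scale: a task queue replaces `asyncio.gather` with durable child tasks, so a single failed document can be retried without re-running the whole ingestion.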