I've built a production-ready Rust SDK for OpenRouter.ai that makes it trivial to work with 100+ LLMs (Claude 3, GPT-4, Mistral, etc.). Key features:
- **Zero-Cost LLM Orchestration** – Automatic request batching and model fallback (sketched below)
- **Multi-Modal First** – Clean API for text+image prompts (e.g., GPT-4 Vision)
- **Streaming Done Right** – Tokio-based async with backpressure support
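To make the fallback behavior concrete, here's a minimal sketch of the kind of loop the SDK automates, written directly against OpenRouter's chat-completions HTTP endpoint with `reqwest`. The helper name `complete_with_fallback` and the error handling are illustrative, not the SDK's actual API surface:

```rust
// deps: reqwest (with the "json" feature), serde_json, anyhow
use anyhow::{bail, Result};
use reqwest::Client;
use serde_json::{json, Value};

/// Try each model in order, returning the first successful completion.
/// Roughly what automatic model fallback does under the hood.
async fn complete_with_fallback(
    client: &Client,
    api_key: &str,
    models: &[&str],
    prompt: &str,
) -> Result<String> {
    for model in models {
        let resp = client
            .post("https://openrouter.ai/api/v1/chat/completions")
            .bearer_auth(api_key)
            .json(&json!({
                "model": model,
                "messages": [{ "role": "user", "content": prompt }],
            }))
            .send()
            .await;

        // On a transport error or non-2xx status, fall through to the next model.
        if let Ok(r) = resp {
            if r.status().is_success() {
                let body: Value = r.json().await?;
                // Pull the assistant message out of the first choice.
                return Ok(body["choices"][0]["message"]["content"]
                    .as_str()
                    .unwrap_or_default()
                    .to_owned());
            }
        }
    }
    bail!("all models in the fallback chain failed")
}
```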
Why Rust? After wrestling with Python's GIL in production, we needed:
- Memory safety for long-running inference servers
- Easy WASM compilation for edge deployments
- Fearless concurrency for parallel model queries (see the sketch after this list)
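To illustrate that last point: with Tokio, several models can be queried concurrently from a single task, and the borrow checker guarantees the shared client is used safely. This reuses the hypothetical `complete_with_fallback` helper from the sketch above; the model IDs and prompt are just examples:

```rust
use reqwest::Client;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = Client::new();
    let key = std::env::var("OPENROUTER_API_KEY")?;
    let prompt = "Summarize Rust's ownership model in one paragraph.";

    // Fire both requests concurrently; neither blocks the other.
    let (claude, gpt4) = tokio::join!(
        complete_with_fallback(&client, &key, &["anthropic/claude-3-opus"], prompt),
        complete_with_fallback(&client, &key, &["openai/gpt-4"], prompt),
    );

    println!("claude: {}", claude?);
    println!("gpt-4:  {}", gpt4?);
    Ok(())
}
```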