Using an ASGI server that supports async/await, such as Uvicorn, instead of green threads, forking, etc., seems like a good idea these days. It also means you can use Starlette, which IMO has a much nicer design than some of the old frameworks.
Any of the above (or Sanic) can do ~3K RPS on a single core on a Raspberry Pi (which is where I test things for portability, optimisation and a little fun), and the RAM overhead is generally not that bad. I just did a little "hello world" uvicorn/blacksheep app and I see 22MB resident/10MB shared per worker, while one of my Clojure servers takes up over four times that...
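For anyone who hasn't looked at ASGI directly, here's a rough sketch of the kind of "hello world" app a server like Uvicorn runs. This is not the uvicorn/blacksheep app measured above, just a bare ASGI callable with no framework. The `call_once` helper and the `BODY` constant are made up for illustration, so the app can be poked at in-process without starting a server.

```python
import asyncio

BODY = b"hello world"  # illustrative response body


async def app(scope, receive, send):
    """Bare ASGI application: answers every HTTP request with plain text."""
    if scope["type"] != "http":
        return  # this sketch ignores lifespan and websocket scopes
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [
            (b"content-type", b"text/plain"),
            (b"content-length", str(len(BODY)).encode()),
        ],
    })
    await send({"type": "http.response.body", "body": BODY})


async def call_once(path="/"):
    """Hypothetical helper: drive the app with a fake GET and collect what it sends."""
    sent = []

    async def receive():
        # A real server would deliver the request body here.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    await app(scope, receive, send)
    return sent


if __name__ == "__main__":
    for message in asyncio.run(call_once()):
        print(message)
```

Assuming this is saved as `hello.py` (a placeholder name), `uvicorn hello:app` should serve it. Starlette and similar frameworks are, roughly speaking, nicer layers on top of this same callable interface.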
- https://www.uvicorn.org/
- https://www.starlette.io/