This is a very useful way of thinking about responses and latency, and a great example of a domain with a lot of specialized knowledge (industrial engineering and logistics) that probably doesn't have enough cross-talk with CS overall, given how much we talk about queues, latency, and workers.
I'm going to have to brush up on my IE one of these days to see whether there are useful insights to be gleaned from models that account for server-level concurrency more directly, along the lines of https://en.wikipedia.org/wiki/M/M/c_queue
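For what it's worth, the core M/M/c result (the Erlang C formula, giving the probability an arrival has to queue, plus the mean waits that follow from it) is simple enough to compute directly. A minimal Python sketch, where the arrival rate, service rate, and worker count in the example call are made-up numbers:

```python
from math import factorial

def mmc_metrics(lam, mu, c):
    """Basic M/M/c metrics for arrival rate lam, per-server service
    rate mu, and c servers: (P(wait), mean queue wait, mean response)."""
    a = lam / mu          # offered load in Erlangs
    rho = a / c           # per-server utilization; must be < 1 for stability
    if rho >= 1:
        raise ValueError("unstable: arrival rate exceeds total service capacity")
    # Erlang C: probability an arriving request finds all servers busy
    top = a**c / (factorial(c) * (1 - rho))
    bottom = sum(a**k / factorial(k) for k in range(c)) + top
    p_wait = top / bottom
    wq = p_wait / (c * mu - lam)  # mean time spent waiting in queue
    w = wq + 1 / mu               # mean response time (queue + service)
    return p_wait, wq, w

# e.g. 8 workers, 100 req/s arriving, each worker serving 15 req/s
p_wait, wq, w = mmc_metrics(lam=100, mu=15, c=8)
```

With c=1 this collapses to the familiar M/M/1 results (P(wait) = ρ, W = 1/(μ − λ)), which is a handy sanity check.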
Instead of giving the entire dataset to the model (which in this case already identifies a performance limit visually), give it a partial set and see whether these algorithms can predict the remaining known values. That would be the real test of whether prediction works.
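A minimal sketch of that holdout idea, using synthetic numbers (generated here from an M/M/1-style curve, standing in for real benchmark data): fit a queueing model on the low-throughput points only, then check its predictions against the held-out high-throughput points.

```python
# Hypothetical measurements: throughput (req/s) vs mean latency (ms).
# Synthetic data following roughly W = 1000/(100 - lam), i.e. an M/M/1
# curve with mu = 100 req/s, standing in for real benchmark numbers.
throughput = [10, 20, 30, 40, 50, 60, 70, 80]
latency = [11.1, 12.5, 14.3, 16.7, 20.0, 25.0, 33.3, 50.0]

# Fit only on the low-throughput half; hold out the rest.
train_x, train_y = throughput[:4], latency[:4]
test_x, test_y = throughput[4:], latency[4:]

# M/M/1 mean response time is W = 1/(mu - lam), so each training point
# yields an estimate mu ~= lam + 1/W (latency converted from ms to s).
mu_est = sum(x + 1000.0 / y for x, y in zip(train_x, train_y)) / len(train_x)

# Predict the held-out points and measure relative error.
pred = [1000.0 / (mu_est - x) for x in test_x]
rel_err = [abs(p - y) / y for p, y in zip(pred, test_y)]
```

The interesting part is that the training points all sit well below saturation, yet the fitted μ still pins down where latency blows up, which is exactly the kind of extrapolation a purely visual read of the full dataset can't validate.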