Not fast enough! I used it to call a llama.cpp server, but it would crash if requests arrived "too fast". Calling the llama.cpp server directly solved the issue.
Did you get "connection reset by peer" when you sent slightly too many requests, perchance? I've never found the source of that in my own programs. Nothing shows up in the server logs; connections are just rejected. None of the docs mention it.
Interesting. I've used FastAPI to serve many thousands of requests a second (per process) in a production system. How were you buffering the requests?
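
For reference, here is a rough sketch of one way to buffer: cap the number of in-flight calls to the backend with an `asyncio.Semaphore`, so extra callers wait in the proxy instead of all hitting the backend at once. This is not anyone's actual setup from this thread; `MAX_IN_FLIGHT`, `call_backend`, and `buffered_call` are made-up names, and the sleep stands in for the real HTTP call to the llama.cpp server.

```python
import asyncio

MAX_IN_FLIGHT = 2  # assumed limit; tune to whatever the backend tolerates


async def call_backend(i, state):
    # Hypothetical stand-in for an HTTP request to the llama.cpp server.
    state["active"] += 1
    state["peak"] = max(state["peak"], state["active"])
    await asyncio.sleep(0.01)  # simulate backend latency
    state["active"] -= 1
    return i * 2


async def buffered_call(i, sem, state):
    # Callers beyond MAX_IN_FLIGHT wait here rather than reaching the backend.
    async with sem:
        return await call_backend(i, state)


async def main(n=10):
    sem = asyncio.Semaphore(MAX_IN_FLIGHT)
    state = {"active": 0, "peak": 0}  # tracks concurrent backend calls
    results = await asyncio.gather(
        *(buffered_call(i, sem, state) for i in range(n))
    )
    return results, state["peak"]


if __name__ == "__main__":
    results, peak = asyncio.run(main())
    print(results, peak)
```

In a FastAPI app the same semaphore could be shared by the async route handlers, so the handlers queue up while the backend only ever sees a bounded number of concurrent requests. Whether that would have avoided the crash described above is a guess.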