Double-checked: it still makes calls to an unregistered port with no backend behind it to check the heartbeat of the inference server, and it still times out. Do you not use streaming responses?
Okay, so there are a lot of errors:
1. Player2 is not responding.
2. A health request is sent to an unknown port on localhost; I assume it is trying to check the health of the AI inference endpoint (:4315/v1/health).
3. ChatManager reports "Could not fetch token usage! 401".
4. When the game determined "Local LLM: OFFLINE", it still gave an "operation timed out" error, i.e. it sent the request anyway, ignoring its own status.
The fourth error shows up 10 seconds after the request, while my endpoint is still generating (I tested with CPU inference; the GPU is unavailable at the moment).
After that, the same kind of error keeps repeating, along with a few missing animations.
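
For context, here's roughly what I mean by a streaming response, and how I check that nothing is listening on :4315. This is just a client-side sketch with placeholder URL, port, and model name for my own endpoint (":8080" below), not what the game actually does:

```python
# Sketch only; the :8080 URL and model name are placeholders for my local
# OpenAI-compatible endpoint, not anything the game is configured with.
import json
import requests

# 1) The health probe: with nothing listening on :4315 this fails
#    immediately with "connection refused" instead of hanging.
try:
    r = requests.get("http://localhost:4315/v1/health", timeout=5)
    print("health:", r.status_code, r.text)
except requests.ConnectionError as e:
    print("health probe failed (no backend on :4315):", e)

# 2) A streaming chat completion: tokens start arriving long before the
#    full answer is finished, so a short read timeout is not hit even
#    with slow CPU inference.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",   # placeholder URL
    json={
        "model": "local-model",                    # placeholder name
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": True,
    },
    stream=True,
    timeout=(5, 30),  # (connect, read); the read timer counts between chunks
)
for line in resp.iter_lines():
    if line.startswith(b"data: ") and line != b"data: [DONE]":
        chunk = json.loads(line[len(b"data: "):])
        if chunk.get("choices"):
            print(chunk["choices"][0]["delta"].get("content", ""),
                  end="", flush=True)
```

With `stream=True` the read timeout only measures the gap between chunks, so slow CPU generation doesn't trip a fixed 10-second limit even if the full reply takes minutes.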