(node:18.4.0-bullseye docker image)

At scale, when the event loop is very busy and/or blocked by large synchronous tasks, timeouts can fire even though the underlying event has already occurred at a lower level.
One example of this is the `connect` timeout. Although the underlying socket connects very quickly, the `setTimeout` that aborts the request fires before the I/O phase of the event loop has a chance to let the request know the connection was actually established.
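The ordering described above can be illustrated outside of got with a minimal sketch. It is not a reproduction of the production issue: it assumes a loopback server as a stand-in for the real host, and the helper names `blockFor` and `raceConnectAgainstTimer` and the 200ms block are made up for illustration.

```javascript
const net = require('node:net');

// Spin synchronously to simulate a large sync task blocking the event loop.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* busy-wait */ }
}

// Resolves with the order in which the timer and the 'connect' event fired.
function raceConnectAgainstTimer(timeoutMs, blockMs) {
  return new Promise((resolve, reject) => {
    const order = [];
    // Loopback server standing in for the remote host (an assumption).
    const server = net.createServer((conn) => conn.destroy());
    server.on('error', reject);
    server.listen(0, '127.0.0.1', () => {
      const socket = net.connect(server.address().port, '127.0.0.1');
      socket.on('error', reject);
      const record = (name) => {
        order.push(name);
        if (order.length === 2) {
          socket.destroy();
          server.close(() => resolve(order));
        }
      };
      socket.once('connect', () => record('connect'));
      setTimeout(() => record('timeout'), timeoutMs);
      // Block on the next tick, after the connection attempt has been queued,
      // so the loop cannot deliver 'connect' until the block ends.
      process.nextTick(blockFor, blockMs);
    });
  });
}
```

With a 50ms timer and a 200ms block, `raceConnectAgainstTimer(50, 200)` should resolve to `['timeout', 'connect']`: the timer callback runs first even though the loopback connection had plenty of time to complete.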
Actual behavior: the request is aborted with `Timeout awaiting 'connect' for 50ms`.
Expected behavior: the connection is successfully established.
There is no easy way to reproduce this. We started observing it at scale.
Here's the patch we've applied locally: main...thegedge:got:main
And here's a timeline of the `Timeout awaiting 'connect' for XYZms` errors from patching:
I understand we can go either way here. Ideally we wouldn't have to go through the event loop and could just ask the underlying socket "hey, are you connected" before aborting, but alas.
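As a rough illustration of the "go through the event loop" option, here is a sketch that defers the abort decision by one `setImmediate` and then checks `socket.connecting`. This is not the patch linked above and not got's API; `armConnectTimeout` and `demo` are hypothetical names, and the loopback server and 200ms block are assumptions made so the example is self-contained.

```javascript
const net = require('node:net');

// Hypothetical helper (not part of got): arm a connect timeout that
// re-checks the socket after one more pass through the I/O phase.
function armConnectTimeout(socket, ms, onTimeout) {
  const timer = setTimeout(() => {
    // setImmediate callbacks run in the check phase, right after the poll
    // (I/O) phase, so a pending 'connect' event is delivered first.
    setImmediate(() => {
      // socket.connecting stays true until Node has processed the connect.
      if (socket.connecting) onTimeout();
    });
  }, ms);
  socket.once('connect', () => clearTimeout(timer));
  socket.once('close', () => clearTimeout(timer));
}

// Demo: block the loop for longer than the timeout and report the outcome.
function demo(timeoutMs = 50, blockMs = 200) {
  return new Promise((resolve, reject) => {
    // Loopback server standing in for the remote host (an assumption).
    const server = net.createServer((conn) => conn.destroy());
    server.on('error', reject);
    server.listen(0, '127.0.0.1', () => {
      const socket = net.connect(server.address().port, '127.0.0.1');
      let timedOut = false;
      socket.on('error', reject);
      armConnectTimeout(socket, timeoutMs, () => { timedOut = true; });
      socket.once('connect', () => {
        // Wait briefly so a late timeout would still be observed.
        setTimeout(() => {
          socket.destroy();
          server.close(() => resolve({ timedOut }));
        }, 10);
      });
      // Simulate a large sync task that outlasts the timeout.
      process.nextTick(() => {
        const end = Date.now() + blockMs;
        while (Date.now() < end) { /* busy-wait */ }
      });
    });
  });
}
```

In this sketch the timer still fires late, but the extra trip through the I/O phase lets the `connect` event land before the check, so `demo()` should resolve with `timedOut` equal to `false`. The trade-off is that a genuine timeout is reported one loop iteration later.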