Unfortunately, we were unable to find a fix for yesterday's failures, so on Thursday, June 7th, between 3:49am CEST and 7:38am CEST (1:49am–5:38am UTC, 6:49pm–10:38pm PT) there was an increased application error rate on our Docker host instances in the US region.
The servers have now stabilized, and it is safe to restart the failed jobs.
We're still investigating the root cause. We have started additional instances to help with the load, and we'll examine the hardware architecture of the instances to determine what is causing the issue. In the meantime, we'll try to implement an automatic retry for jobs that fail this way.
We're sorry for this inconvenience.