Stuck jobs in AWS us-east-1 [resolved]

2022-04-15 10:30 UTC - We are investigating stuck jobs on the new queue. Some jobs will probably end with an internal error. Next update in 15 minutes.

2022-04-15 10:50 UTC - The situation has been resolved; however, 6 jobs were terminated with an internal error and were not restarted automatically. We apologize for the inconvenience. Please feel free to restart your jobs manually.

Affected jobs:

  • job-839828541
  • job-839828546
  • job-839830974
  • job-839831721
  • job-839831773
  • job-839831782



Longer job runtime on new queue in AWS eu-central-1

2022-04-11 15:04 UTC - We are investigating transient delays in job processing. The issue manifests as a two-hour gap without any activity in job events. It happens randomly across projects and configurations; most occurrences are around 04:00 UTC. Only jobs running on the new queue are affected. Next update in three hours or when new information is available.

2022-04-11 16:54 UTC - We have increased the minimum number of nodes, which might help prevent the issue from happening again. Meanwhile, we are investigating the root cause of the timeouts. We are also working on decreasing the timeouts from two hours to a much lower value to prevent unnecessary increases in job runtime in case of networking issues. Next update when new information is available.

2022-04-14 12:54 UTC - We have reduced the timeouts from two hours to two minutes. This will prevent a job from getting stuck for such a long time when a connection issue occurs. We are still investigating the root cause of the networking problem. Next update when new information is available.
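To illustrate why a shorter timeout limits how long a job can hang on a silent connection, here is a minimal sketch using Python's standard `socket` module. The helper name `read_with_timeout` is hypothetical, and this is not the platform's actual implementation; it only shows a read that gives up after a fixed number of seconds instead of waiting indefinitely.

```python
import socket


def read_with_timeout(sock, timeout_seconds, max_bytes=4096):
    """Return received bytes, or None if nothing arrives within the timeout.

    Hypothetical helper for illustration only.
    """
    # Bound every blocking operation on this socket.
    sock.settimeout(timeout_seconds)
    try:
        return sock.recv(max_bytes)
    except socket.timeout:
        # The peer stayed silent for the whole timeout; let the caller decide
        # whether to retry or fail the job instead of hanging.
        return None
```

With a timeout of 120 seconds, a stalled connection would cost the caller at most about two minutes per read rather than two hours.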

Stuck Orchestrations [resolved]

2022-04-05 05:28 UTC - We are investigating stuck orchestration jobs on the new queue. Next update in 15 minutes.

2022-04-05 05:50 UTC - We can see that the problem occurs in AWS regions; however, we haven't found the root cause yet and are continuing the investigation. Next update in 15 minutes.

2022-04-05 06:10 UTC - We rolled back the previously deployed version of an internal queue component, and this seems to have unblocked the stuck orchestration jobs. We don't see any stuck orchestrations at the moment. We continue to monitor the situation and investigate the root cause.

2022-04-05 09:10 UTC - We have identified the root cause and are now preparing a fix. However, since the previous quick fix, we are no longer seeing stuck orchestrations.

2022-04-05 11:40 UTC - We have deployed the fix and everything is now operational. The root cause was misconfigured network access for an internal queue component.

Delayed processing of jobs in AWS EU stack

2022-04-01 07:36 UTC We are seeing a higher than usual number of jobs in the waiting state. We are investigating the issue. Next update in one hour or when new information is available.

2022-04-01 08:36 UTC The backlog is cleared; the delays were caused by increased traffic.

Scheduled Maintenance (Azure North Europe Region)

Scheduled maintenance of Azure North Europe Keboola Connection (https://connection.north-europe.azure.keboola.com) will take place on Thursday, Mar 31st 2022 at 05:00 pm UTC and should take less than one hour.

It should not affect projects' running jobs; these will be queued during the short project maintenance. Orchestrations and running transformations will generally be delayed, but not interrupted.

During the maintenance, you won't be able to access your data and projects. All network connections will be terminated with an "HTTP 503 - down for maintenance" status message.
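For scripts that call the API during such a window, one possible way to handle the 503 response is to wait and retry. The sketch below is an assumption-laden illustration, not an official client: `wait_out_maintenance` and `send_request` are hypothetical names, and `send_request` stands for any callable that performs the request and returns a `(status_code, body)` tuple.

```python
import time


def wait_out_maintenance(send_request, max_attempts=5, delay_seconds=60.0,
                         sleep=time.sleep):
    """Call send_request until it stops returning HTTP 503 or attempts run out.

    Hypothetical helper: send_request is any callable returning
    (status_code, body). The last response is returned either way.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send_request()
        if status != 503:
            # Anything other than "down for maintenance" is handed back as-is.
            return status, body
        if attempt < max_attempts:
            # Still in maintenance; pause before asking again.
            sleep(delay_seconds)
    return status, body
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting; the attempt count and delay are placeholders to tune for the expected maintenance length.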

We'll update this status with the progress.


Update Mar 31st, 16:57 UTC - All changes are done; in the end, no maintenance was necessary.

Some Python Transformations Failing on Date Parsing Error

2022-03-16 12:45 UTC
We have discovered some failing Python transformations that throw an exception with the message "bad escape \d at position". This error was caused by a breaking change in the underlying regular expression library.

If your transformation is failing in this way, you can fix it by specifying "regex==2022.03.02" in the packages input.
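For context, the sketch below reproduces the same kind of message using Python's built-in `re` module, which rejects unknown letter escapes such as `\d` in a replacement template. It is an assumption that the failing transformations hit an analogous code path in the third-party `regex` package; the helper names `template_error` and `replace_literal` are hypothetical.

```python
import re


def template_error(pattern, template, text):
    """Return the error message raised by re.sub for this template, or None."""
    try:
        re.sub(pattern, template, text)
    except re.error as exc:
        return str(exc)
    return None


def replace_literal(pattern, replacement, text):
    """Substitute the replacement verbatim, without parsing backslash escapes."""
    # A callable replacement is inserted as-is, so "\d" stays a literal "\d".
    return re.sub(pattern, lambda match: replacement, text)
```

Here `template_error("DAY", r"\d+", "on DAY")` returns a message beginning with "bad escape \d", while `replace_literal("DAY", r"\d+", "on DAY")` inserts the text unchanged. If the failing call sits inside a dependency rather than your own code, pinning the package version as described above remains the practical workaround.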

2022-03-18 15:00 UTC The regex package is not part of our base images, so no further action will be taken on our side. Affected users have set a working version of the package in their dependencies.


High error rate from Sklik API

We are seeing a high rate of error responses from the Sklik API.

Since 2022-03-15 00:00 UTC we have been experiencing job failures due to an outage of the Sklik API. We are going to monitor the situation and keep you posted.


2022-03-15 09:30 UTC - The last error from the Sklik API occurred at 08:34 UTC; for now, the Sklik API appears to be operating normally.

Errors When loading Tables in AWS US region

We're seeing failed table loads; the cause appears to be a problem with AWS services.

Update 21:22 UTC (Resolved): AWS confirmed recovery of the services experiencing errors, so we consider the issue resolved as well.

Update 21:10 UTC: AWS reported issues in the us-east-1 region and started an investigation; however, we are no longer seeing any issues on our side and the US multitenant stack seems to be fully operational now. We continue to monitor the situation.

Update 20:56 UTC: The issues started occurring at approximately 20:44 UTC. The issue is isolated to the US multitenant stack https://connection.keboola.com/.