Delayed orchestrations on Azure North Europe stack

2022-09-07 11:35 UTC - We are investigating delayed orchestrations on Azure North Europe Keboola Connection stack (https://connection.north-europe.azure.keboola.com). Next update in 30 minutes.

2022-09-07 12:25 UTC - Our investigation shows that the problem started at 10:00 UTC. It is caused by an outage of our Billing Api. Orchestrations that should have been scheduled after this time are delayed by minutes or tens of minutes. This applies to standard projects.

Orchestrations of Pay As You Go projects are not scheduled at the moment.

Next update in 60 minutes.

2022-09-07 14:00 UTC - We implemented a hotfix and the situation is returning to normal. Orchestrations should be scheduled properly now. The problems were caused by an outage of Azure CosmosDB (more information at https://status.azure.com/en-gb/status).

We are monitoring the situation. Next update in two hours.

2022-09-07 17:10 UTC - All orchestrations are executed without delay. We apologize for any inconvenience.


Azure North Europe stack - Pay-as-you-go projects - Billing api problems

2022-09-07 10:55 UTC - We are investigating problems with our Billing Api on Azure North Europe stack (https://connection.north-europe.azure.keboola.com/). This is caused by degraded performance of one of the Azure services. Non Pay As You Go projects should not be affected.

Next update in 30 minutes.

2022-09-07 12:10 UTC - Problems with one of the Azure services still persist and our Billing Api is still unavailable. We are waiting for service recovery by Azure Support.

All Pay As You Go projects are affected:

  • Your credit balance temporarily shows a zero value
  • Orchestration jobs are not scheduled at the moment
  • You cannot manually execute any component

Next update in 60 minutes.

2022-09-07 13:15 UTC - Our investigation shows that the problem started at 10:00 UTC. The Azure service is still not working properly, but we implemented a fix to work around the situation until Azure Support resolves the root cause.

We keep monitoring the situation closely. At the moment the Billing Api service is available, and jobs and orchestrations are running.

Next update in 120 minutes.

2022-09-07 17:10 UTC - All Keboola Connection services are running normally. The incident is resolved. The problems were caused by an outage of Azure CosmosDB (more information at https://status.azure.com/en-gb/status).

We apologize for any inconvenience.

Jobs Failing on Internal Errors in Azure North Europe stack

2022-08-30 9:10 UTC - We are investigating some jobs that have been failing with internal errors since 6:30 UTC. Next update in 30 minutes.

2022-08-30 9:20 UTC - We traced the root cause to DNS resolution errors in Azure VMs; see Azure status for details: https://status.azure.com/en-us/status. We are restarting the running nodes, which should solve the problem. Next update in 30 minutes.

2022-08-30 9:40 UTC - The Kubernetes nodes that run containers with jobs were restarted, and no errors have appeared in the logs since then.

Failing Synchronous Actions

Synchronous actions, a mechanism behind some UI features such as testing credentials or listing available databases on a remote server, were affected by a bug and did not work properly from 12:50 UTC until 13:25 UTC, when the revert of the defective release was finished. The functionality has been back to normal since then.

Jobs Failing on Internal Errors in Azure North Europe stack

2022-08-20 14:57 UTC - We are investigating some jobs failing with internal errors. Next update in 30 minutes.

2022-08-20 15:30 UTC - We see that the internal error is caused by one of the internal components failing to call the Azure API, but we do not know the root cause yet. We restarted one of the instances and see no errors for now. We continue monitoring the issue.

Scheduled Maintenance of AWS EU Stack 2022-09-24

On 24th September 2022 between 10:00 CET and 10:30 CET, https://connection.eu-central-1.keboola.com will undergo planned maintenance during which one of our internal databases will be upgraded. As a result, any loads or unloads to storage will be paused during that time for up to 30 minutes. Any currently running jobs will continue to run. However, they may be delayed by up to 30 minutes as well.

Delayed processing of jobs in AWS US stack

2022-08-15 10:40 UTC - We are investigating component job delays in connection.keboola.com. Next update when new information is available or in an hour.

2022-08-15 11:14 UTC - We have identified the root cause of the problem and we are working on a solution. Next update when new information is available or in an hour.

2022-08-15 14:04 UTC - We've added another worker to help with the workload and the processing times have returned to normal.

2022-08-15 15:32 UTC - We are investigating a recurrence of the issue that causes jobs to get stuck in the Created state. Next update when new information is available or in an hour.

2022-08-15 15:59 UTC - The recurrence was resolved. We keep monitoring the situation closely, but at the moment job runtimes should be back to normal.

AWS us-east-1 internal job errors [resolved]

2022-08-04 12:05 UTC We are investigating internal job errors. Next update in 30 minutes.

2022-08-04 12:36 UTC We experienced two waves of internal errors, between 11:50-11:57 and 12:00-12:07 UTC. The root cause has been removed and we see no further errors in the reporting. We continue to monitor the situation. Next update in 30 minutes.

2022-08-04 13:06 UTC All Keboola Connection services are running normally. The incident is resolved. 

We apologize for any inconvenience.