Delayed orchestrations on Azure North Europe stack

2022-07-02 7:45 UTC We are investigating delayed orchestrations on Azure North Europe Keboola Connection stack (https://connection.north-europe.azure.keboola.com). Next update in 30 minutes.

2022-07-02 8:40 UTC Our investigation shows that the problem started at 6:40 UTC. Orchestrations that should have been scheduled after this time were delayed by minutes or tens of minutes. The situation returned to normal at 7:45 UTC and all orchestrations are now scheduled properly.

We are sorry for the inconvenience.

Stuck Jobs in US Stack

2022-06-23 22:30 UTC: We're investigating cases of jobs not processing in the US stack (https://connection.keboola.com/). Next update in 30 minutes.

2022-06-23 23:00 UTC: The situation is now under control and our services are running normally. Job processing will gradually speed up.

We are sorry for the inconvenience.

Increased Error Rate of Components Using Specific Processor

There was an increased error rate for components using keboola.processor-create-manifest in their configurations. The error affected only component configurations that used an empty enclosure in CSV Settings.

Affected components may include data sources such as AWS S3, FTP, etc.

We are sorry for any inconvenience. Please feel free to restart your jobs manually.

2022-06-20 16:44 UTC - We are investigating an increased error rate of some components using keboola.processor-create-manifest.

2022-06-21 07:22 UTC - We have identified the root cause and continue working on a fix.

2022-06-21 12:30 UTC - The incident is resolved. The last occurrence of the error was at 12:13 UTC.

Higher jobs error rate in AWS eu-central-1 stack

2022-06-20 06:40 UTC - We are investigating a higher rate of jobs ending with the internal error Job "XYZ" terminated unexpectedly in the connection.eu-central-1.keboola.com stack since June 18th. At the moment the situation is stabilised and we don't see any more internal errors. We are investigating the root causes. Next update when new information is available.

2022-06-20 14:36 UTC - The incident is resolved. The last occurrence of the error was at 05:35 UTC. We have found the jobs causing the errors and notified the project owners.

Job queue high error rate on Azure stacks

2022-06-16 13:09 UTC - We are investigating high job queue error rates on Azure stacks. Next update when new information is available or in 30 minutes.

2022-06-16 13:28 UTC - We have found the root cause and implemented a fix. The API no longer returns the errors and the platform should stabilize. We continue to monitor the situation. Next update when new information is available or in 30 minutes.

2022-06-16 13:51 UTC - The incident has been resolved. Since 13:28 UTC we haven't seen an increased API error rate and the platform is stable.

Increased API Error rate in AWS us-east-1 connection.keboola.com stack

2022-06-10 03:40 UTC - Since 2:50 UTC today we have been experiencing increased Connection API internal error rates and increased latency due to an incident in the AWS us-east-1 region. Job processing shouldn't be affected, but the UI may respond more slowly.

2022-06-10 04:20 UTC - AWS still hasn't resolved the incident. Furthermore, we are noticing that some job processing might be affected, with jobs ending in an internal error due to failed connection retries to the AWS API.

2022-06-10 08:15 UTC - The issue has been resolved and we haven't seen an increased API error rate since 04:30 UTC.

Stuck jobs in AWS us-east-1 connection.keboola.com stack

2022-06-06 21:00 UTC We are investigating a problem with jobs not being processed in the AWS US stack https://connection.keboola.com/. Some jobs will probably end with an internal error.

2022-06-06 21:25 UTC Correction - jobs are only delayed and all jobs end successfully.

2022-06-06 21:40 UTC Unfortunately, the problem still persists. We are trying to find the cause.

2022-06-06 22:30 UTC We will update this post when we have more information.

2022-06-06 23:25 UTC We have managed to improve the situation. We will continue monitoring it and will provide more information about the problem tomorrow.

2022-06-07 12:40 UTC The situation is now under control and our services are running normally.

We are sorry for the inconvenience.

Job Processing Slowing down

Since approximately 13:30 we're seeing some jobs taking longer than usual to process in the us-east and eu-central stacks (connection.keboola.com and connection.eu-central-1.keboola.com). We're investigating the issue. Next update in 30 minutes.

Update 14:30 UTC: We have identified the root cause of the issue. We're actively working on fixing the issue. Next update in 30 minutes.

Update 15:20 UTC: We are still working on the fix. Next update in 30 minutes.

Update 15:55 UTC: We have deployed the fix to both affected stacks and the issues should be resolved now. We will continue monitoring the situation.

Increased error rate in Pay-as-you-go projects

Since 2022-06-03 23:30 UTC we're seeing a higher error rate in Pay As You Go projects in the Azure stack (https://connection.north-europe.azure.keboola.com/). This is caused by degraded performance of one of the Azure services. You may encounter longer loading times of the main and billing pages and occasional errors when running jobs. Non Pay As You Go projects are not affected.

Update 2022-06-04 7:20 UTC: The Azure service performance returned to normal at approximately 2022-06-04 2:40, and our services recovered shortly afterwards. The issue is now resolved.

Next update in 8 hours.