Limited service disruption for AWS US and EU stacks on March 21st and 22nd

Due to necessary database upgrades to our AWS US and EU stacks, a limited service disruption will take place on March 21st and 22nd.

  • On Tuesday March 21st at 12:00 pm UTC, the disruption will begin for AWS EU, and
  • on Wednesday March 22nd at 10:00 am UTC, it will begin for AWS US.

We anticipate that the limited service disruption will take approximately 15 minutes, but it should not exceed 60 minutes. Hopefully, this will be resolved before you return from your lunch or coffee break.
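To check how the windows above map onto your own working hours, the start times can be converted from UTC as in the rough sketch below. The year 2023, the fixed CET offset (UTC+1), and the names `WINDOW_STARTS_UTC` and `local_window` are assumptions made for illustration only; they are not part of any Keboola tooling.

```python
from datetime import datetime, timedelta, timezone

# Start times taken from the announcement; the year 2023 is an assumption.
WINDOW_STARTS_UTC = {
    "AWS EU": datetime(2023, 3, 21, 12, 0, tzinfo=timezone.utc),
    "AWS US": datetime(2023, 3, 22, 10, 0, tzinfo=timezone.utc),
}

EXPECTED_DURATION = timedelta(minutes=15)  # anticipated length of the disruption
MAXIMUM_DURATION = timedelta(minutes=60)   # stated upper bound

# Assumed example zone: CET as a fixed UTC+1 offset (no daylight saving handling).
CET = timezone(timedelta(hours=1), "CET")


def local_window(stack, tz):
    """Return (start, expected_end, latest_end) for a stack in the given time zone."""
    start = WINDOW_STARTS_UTC[stack].astimezone(tz)
    return start, start + EXPECTED_DURATION, start + MAXIMUM_DURATION


if __name__ == "__main__":
    for stack in WINDOW_STARTS_UTC:
        start, expected_end, latest_end = local_window(stack, CET)
        print(
            f"{stack}: starts {start:%Y-%m-%d %H:%M %Z}, "
            f"expected end {expected_end:%H:%M}, latest end {latest_end:%H:%M}"
        )
```

Replace `CET` with your own offset to see the windows in local time.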

During this period, processing of Storage jobs, Queue v1 jobs, and Orchestrations (in projects with Queue v1) will stop, and new jobs will be delayed until the upgrade is completed. All running jobs will be cancelled, but will resume after the upgrade.

All APIs and other unaffected services, such as Workspaces and Queue v2 jobs, will remain operational, though their operations may be delayed due to the Storage job delays.

We apologize for any inconvenience caused and thank you for your understanding.



Hidden Transformations v2 configurations in UI

2023-03-01 16:00 CET - We are investigating hidden "Transformation v2" configurations in the UI. Next update in 15 minutes or when more information is available.

2023-03-01 16:10 CET - We have identified the root cause and prepared a fix, which will be deployed within 10 minutes.

(Resolved) 2023-03-01 16:24 CET - The fix has been deployed and Transformations v2 are no longer hidden in the UI. We advise users to reload their browsers, as this was a UI issue.

Job failures in AWS EU stack

2023-02-20 15:20 UTC - During an incident between Feb 19 15:10 UTC and Feb 20 14:00 UTC, a small number of jobs on the connection.eu-central-1.keboola.com stack either timed out or failed with the error message "Component terminated. Possibly due to out of memory error". The failures were caused by an underlying node failure. We're actively investigating the cause and taking measures to prevent this from happening again.

2023-02-20 15:56 UTC - The incident has been resolved, with the last occurrence of the error on Feb 20 at 14:35 UTC. We are continuing to monitor the situation closely to prevent any recurrence.

Failing jobs on all stacks

2023-02-10 09:20 UTC - We are currently investigating failing jobs on all stacks that occurred on 2023-02-09 08:48 UTC. The failure shows up as the error message "K8S request has failed: events is forbidden: User "system:serviceaccount:job-queue-jobs:daemon-service-account" cannot list resource "events" in API group "" in the namespace "job-queue-jobs"".

UPDATE 09:41 UTC: We have identified the problem and rolled back to the previous version of our service. All services are now operating normally.

UPDATE 10:35 UTC: After further investigation, we found that this problem affected only a small fraction of jobs.

We're sorry for this inconvenience. 

Storage job restarts

2023-02-09 10:07 - We are currently investigating storage job restarts that occurred on 2023-02-09 07:35 UTC and 2023-02-07 08:04 UTC. These restarts have caused longer job run times or errors such as "table already exists" during transformation executions. We will provide another update when new information is available.

2023-02-09 10:57 - We have identified the root cause. We will deploy a fix within two hours, which might cause another occurrence of these restarts for some jobs.

2023-02-09 13:53 - We deployed a fix at 13:20 UTC, which caused the last occurrences of restarts. The issue is now resolved and you should not experience any more job restarts.

Templates & Keboola CLI errors

10:50 UTC Due to recent changes in the Storage API, the Templates API and Keboola CLI have been returning errors in multiple situations since approximately 9:00 UTC. As a result, you might see unexpected errors when working with the Keboola CLI or when trying to apply templates. We're working on a fix, which is expected to be released today (ETA 15:00 UTC).

13:05 UTC The issue in the Storage API has been fixed. All services are now operating normally. We apologize for any inconvenience this may have caused.

Service disruption in Azure and Snowflake (in Azure regions)

Azure and Snowflake (in Azure regions) are reporting general service disruptions. We are closely monitoring the situation. So far, we have observed only a few symptoms of the issues, and platform operations have not been impacted. Please refer to the status updates of the affected services for more information.

We're sorry for this inconvenience. 

Delayed jobs on Azure North Europe stack

2023-01-23 21:50 UTC We're investigating increased job wait times in the Azure North Europe stack (connection.north-europe.azure.keboola.com). Next update in 15 minutes or when new information is available.

2023-01-23 22:10 UTC The root cause was fixed and all operations are back to normal.

Increased job wait times in AWS US and EU stacks

We're investigating increased job wait times in the AWS US stack (connection.keboola.com) and the AWS EU stack (connection.eu-central-1.keboola.com). Next update in 15 minutes or when new information is available.

UPDATE 12:55 UTC: We have identified the problem and rolled back to the previous version of our service.

UPDATE 13:05 UTC: All services are now operating normally.