Storage job restarts

2023-02-09 10:07 - We are currently investigating storage job restarts that occurred on 2023-02-09 07:35 UTC and 2023-02-07 08:04 UTC. These restarts have caused longer job run times or errors such as "table already exists" during transformation executions. We will provide another update when new information is available.

2023-02-09 10:57 - We have identified the root cause. We will deploy a fix within two hours, which might cause another occurrence of these restarts for some jobs.

2023-02-09 13:53 - We deployed a fix at 13:20 UTC, which caused the last occurrence of restarts. The issue is now resolved and you should not experience any more job restarts.

Templates & Keboola CLI errors

10:50 UTC Due to recent changes in the Storage API, the Templates API and Keboola CLI have been returning errors in multiple situations since approximately 9:00 UTC. As a result, you might see unexpected errors when working with the Keboola CLI or when trying to apply templates. We're working on a fix, which is expected to be released today, with an ETA of 15:00 UTC.

13:05 UTC The issue in the Storage API has been fixed. All services are now operating normally. We apologize for any inconvenience this may have caused.

Delayed jobs on Azure North Europe stack

2023-01-23 21:50 UTC We're investigating increased job wait times in the Azure North Europe stack (connection.north-europe.azure.keboola.com). Next update in 15 minutes or when new information is available.

2023-01-23 22:10 UTC The root cause was fixed and all operations are back to normal.

Increased job wait times in AWS US and EU stack

We're investigating increased job wait times in the AWS US stack (connection.keboola.com) and the AWS EU stack (connection.eu-central-1.keboola.com). Next update in 15 minutes or when new information is available.


UPDATE 12:55 UTC: We have identified the problem and rolled back to the previous version of our service.

UPDATE 13:05 UTC: All services are now operating normally.

Broken output mapping on legacy queue

2023-01-13 22:45 UTC We have identified an issue with the legacy queue system. Specifically, during Snowflake transformations, the incremental output mapping could ignore filters configured in the "Delete Rows" process, resulting in all rows in the target table being deleted.
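The failure mode described above can be illustrated with a minimal Python sketch. It is not the actual legacy queue code: the function `apply_incremental_load`, its parameters, and the sample rows are all hypothetical and exist only to show the difference between a "Delete Rows" step that honors its filter and one that ignores it.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


def apply_incremental_load(
    target: List[Row],
    new_rows: List[Row],
    delete_where: Optional[Callable[[Row], bool]] = None,
    honor_filter: bool = True,
) -> List[Row]:
    """Hypothetical model of an incremental load with a "Delete Rows" filter.

    delete_where selects the rows to delete from the target before the new
    rows are appended. honor_filter=False simulates the faulty behavior, in
    which the filter is dropped and the delete hits every row.
    """
    if delete_where is None:
        kept = list(target)  # no delete configured: keep everything
    elif honor_filter:
        kept = [row for row in target if not delete_where(row)]
    else:
        kept = []  # filter ignored: all existing rows are deleted
    return kept + list(new_rows)


target = [{"id": 1, "region": "EU"}, {"id": 2, "region": "US"}]
new_rows = [{"id": 3, "region": "EU"}]


def is_eu(row: Row) -> bool:
    return row["region"] == "EU"


# Expected behavior: only EU rows are replaced, the US row survives.
expected = apply_incremental_load(target, new_rows, delete_where=is_eu)

# Faulty behavior: the US row is lost as well.
faulty = apply_incremental_load(
    target, new_rows, delete_where=is_eu, honor_filter=False
)
```

In this sketch, `expected` keeps the row with `id` 2 and adds the row with `id` 3, while `faulty` contains only the newly loaded row.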

The problem began with a release that took place today at 9:30 UTC. At 22:15 UTC we rolled back to a previous version, which has resolved the issue for the time being.

We are still investigating the root cause of the problem and apologize for any inconvenience this may have caused.

2023-01-18 08:35 UTC We found the root cause of the problem and deployed a fixed version.

Invalid credits balance on PayAsYouGo projects

12:10 UTC On PayAsYouGo projects on the https://connection.north-europe.azure.keboola.com/ stack, an incorrect credit balance may be displayed. The situation will be fixed shortly.

12:15 UTC The credits are now reported correctly again. If you attempted to run a job during the incident, it failed erroneously with "You do not have credits to run a job". Please restart such jobs. We sincerely apologize for the trouble.


Failed jobs on eu-central-1 stack (AWS EU)

2023-01-09 07:45 UTC - We have identified an issue on one of the servers running Queue jobs on the EU Central 1 (AWS EU) stack. Numerous jobs are stuck in a terminating state and we are currently investigating the cause of the issue.

2023-01-09 08:05 UTC - We have unblocked the stuck jobs, which were unexpectedly terminated. We are investigating the root cause of the node failure.


Job failures on eu-central-1 stack (AWS EU)

2022-12-30 08:15 UTC - We are investigating occasional job failures that started on 2022-12-29 at 23:00 UTC. We will provide an update with new information when it becomes available.

2022-12-30 09:12 UTC - The error rate is lower, but there are still some occurrences of errors. We are investigating the root cause and will provide an update with new information when it becomes available.

2022-12-30 10:38 UTC - We have identified and fixed the problem, which was caused by rate limiting on the container registry. The last error occurred at 10:08 UTC. We are monitoring all systems closely.

2022-12-30 11:23 UTC - We don't see any new occurrences of errors. The platform is fully operational and the incident is resolved.

Failed jobs on eu-central-1 stack (AWS EU)

We have discovered a problem on one of the servers running Queue jobs on the eu-central-1 stack (AWS EU). Jobs were terminated unexpectedly between 08:20 UTC and 09:20 UTC. The problem has been resolved and all jobs should now be running normally again. We are still looking for the root cause to prevent it from happening again in the future. We apologize for any inconvenience this may have caused.

Failed jobs on eu-central-1 stack (AWS EU)

We have discovered a problem on one of the servers running Queue jobs on the eu-central-1 stack (AWS EU). Jobs were terminated unexpectedly from 12:00 AM CET. We are investigating the cause of the problem.

Update 12:50 PM CET - The problem has been resolved and all jobs should now be running normally again. We are still looking for the root cause to prevent it from happening again in the future.

We apologize for any inconvenience this may have caused.