2023-06-06 02:08 UTC We experienced incident on connection.eu-central-1.keboola.com. Some jobs ended in error due to an underlying node failure. We're still investigating the root cause.
Update 2023-06-06 02:38 UTC The incident has been resolved. A small number of jobs on the connection.eu-central-1.keboola.com stack either ended by timeout or with a "Component terminated. Possibly due to out of memory error" error message during a recent incident.
We are continuing to monitor the situation closely to prevent any reoccurrence.
Beginning June 1st, 2023, we are transitioning our platform logs monitoring system from Papertrail to Datadog. This is a platform-level change and does not affect user experience or functionality. Regular users are not affected by this change.
For our 3rd party Keboola component vendors, this change modifies the way you receive application error notifications:
Email Notifications Only: Notifications will now be sent exclusively via email. Webhook support may be considered in the future.
Notification Email Address: Vendors previously notified via Papertrail or generic webhook will now receive notifications to the email address specified in their vendor profile. Vendors who were already receiving notifications via email will continue to do so at the same email address.
New Sender Email Address: All notifications will come from alert@dtdg.eu.
Should our vendors have any questions or concerns regarding this change, please contact us at support@keboola.com.
]]>Since 09:39 UTC we're seeing job starting with delays on https://connection.north-europe.azure.keboola.com/ We're investigating the situation. Next update in 30 minutes.
UPDATE 10:30 UTC we managed to find the root cause, new worker nodes have a problem authorization accessing the container registry, we are working on a fix. Next update in 30 minutes.
UPDATE 10:57UTC The problem with authorization to container registry is now solved. All systems are now operating normally.
We apologize for any inconvenience caused.
Since 13:40 UTC we're seeing job starting with delays on https://connection.north-europe.azure.keboola.com/ We're investigating the situation. Next update in 30 minutes.
14:14 UTC - All systems are now operating normally.
If your project run out of credits and you have enabled automatic top-up, this would have failed between approximately 13:40 to 14:10. Restarting the job will trigger automatic top-up correctly now.
We apologize for any inconvenience caused.
2023-04-26 11:00 UTC - We have discovered a problem with orchestrations not starting on the legacy queue. We are currently investigating possible causes.
2023-04-26 11:30 UTC - The problem was caused by a release earlier today, and as a result, no orchestrations on the legacy queue were run since 08:10 UTC. We have done rollback of the release and orchestrations should be functioning properly again as of 11:30 UTC. We apologize for any inconvenience caused.
]]>We are adding new outbound IP addresses for the connection.keboola.com and connection.eu-central-1.keboola.com stacks for Queue V2. These new addresses are available now, but are not yet being used automatically.
This update is important for Keboola Connection customers. It may affect their ability to connect to their resources, particularly if they are behind a firewall.
Outbound IP addresses are unique addresses assigned to a device for the purpose of identifying it and sending information over the Internet. When Keboola Connection customers connect to their resources (typically databases), those resources are usually behind a firewall. In order for Keboola Connection to connect to those resources, customers whitelist our outbound IP addresses.
If your resources are behind a firewall, ensure that all the new IP addresses are added to the whitelist, so as to enable connection to your system through Keboola Connection.
Use the “Test with new outbound IPs” feature to check the connectivity for any or all configurations in the credentials section. This will verify that your resources are accessible from the new addresses.
If you have multiple projects in your organization and have already tested the connection from the new IP addresses, you can ask our support team for help. They can switch all your projects at once, so you don’t have to do it individually for each one.
If you are not yet making use of Queue V2 for your projects, don’t hesitate to whitelist, as this will speed up your migration to the new queue in the future.
To ensure uninterrupted connectivity, the new IP addresses must be whitelisted by June 30, 2023. Otherwise, you run the risk of your connection not working. If the above update is not done manually by this date, Keboola Connection will perform the switch globally. To make sure of a smooth change, please add the new IP addresses to your whitelist and switch your projects as soon as you can.
52.7.83.136
52.20.72.254
3.222.3.15
(new)34.206.78.206
(new)3.213.250.110
(new)107.22.113.103
(new)54.144.9.113
(new)54.204.61.145
(new)34.239.7.70
(new)3.217.232.144
(new)149.72.196.5
34.224.0.188
34.200.169.177
52.206.109.126
34.203.87.137
3.66.248.180
3.64.150.30
35.157.62.225
(new)3.71.156.204
(new)3.74.28.187
(new)18.158.155.128
(new)35.157.208.189
(new)3.72.243.47
(new)18.193.225.37
(new)3.127.158.56
(new)149.72.196.5
35.157.170.229
35.157.93.175
For your convenience, you can programmatically fetch and process the list of existing IP addresses in JSON format. Read more about outbound IP addresses in documentation.
We appreciate your cooperation in making this transition as smooth as possible.
]]>2023-04-21 14:38 UTC - We experienced another incident on connection.eu-central-1.keboola.com. Some jobs ended in error due to an underlying node failure. We're still investigating the root cause and taking measures to prevent future incidents.
A limited service disruption on AWS US stack will start at 09:00 a.m. UTC today, as announced earlier. Storage jobs processing will stop and new jobs will be delayed until the upgrade is completed. All running jobs will be cancelled, but will resume after the upgrade.
All APIs and other unaffected services, such as Workspaces and Jobs, will remain operational, though their operations may be delayed due to the Storage job delays. We will provide an update when the service disruption starts and ends.
We apologize for any inconvenience caused and thank you for your understanding.
Update 08:50 a.m. UTC: The limited service disruption has begun.
Update 09:20 a.m. UTC: The service disruption has been resolved and the stack is now fully operational.
Thank you for your patience.
A limited service disruption on AWS EU stack will start at 07:00 a.m. UTC today, as announced earlier. Storage jobs processing will stop and new jobs will be delayed until the upgrade is completed. All running jobs will be cancelled, but will resume after the upgrade.
All APIs and other unaffected services, such as Workspaces and Jobs, will remain operational, though their operations may be delayed due to the Storage job delays. We will provide an update when the service disruption starts and ends.
We apologize for any inconvenience caused and thank you for your understanding.
Update 6:55 a.m. UTC: The limited service disruption has begun.
Update 07:45 a.m. UTC: The service disruption has been resolved and the stack is now fully operational.
Thank you for your patience.
2023-04-14 02:11 UTC Due to internal incident few jobs ended with incorrect user exception `Component terminated. Possibly due to out of memory error`.
We are very sorry for the inconvenience.
2023-04-13 07:40 UTC We are investigating failing workspace creation on EU stack (connection.eu-central-1.keboola.com). The issue is manifesting as Loading data to workspace failed: Client error:...
when you try to create a workspace from python transformation. More information within the hour.
2023-04-13 08:40 UTC We are still investigating root cause. This issue happens only when creating new workspace from python transformation with empty input mapping on connection.eu-central-1.keboola.com. More information within the hour.
2023-04-13 09:15 UTC The service disruption has been resolved and the stack is now fully operational.
We are very sorry for the inconvenience.
2023-04-12 09:50 UTC Table dependency graph shows error message on projects with queue v1. More information within the hour.
Update 2023-04-12 10:43 UTC We have deployed the latest functional version, the problem should be solved by now.
We apologize for the inconvenience.
2023-04-06 12:45 UTC We are investigating failing workspace creation on all stacks. The issue is manifesting as na Application error when you try to create a workspace. The fix is already on the way, we expect the operations to resume in 20 minutes.
2023-04-06 13:05 UTC The service disruption has been resolved and the stack is now fully operational.
We're sorry for this inconvenience.
2023-04-03 10:15 UTC - We are investigating delayed telemetry data. More information within the hour.
2023-04-03 11:30 UTC - Delayed telemetry data on all Keboola Connection stacks have been recorded since approximately 20:00 UTC on March 29. We were able to determine the root cause and perform a backfill. Now all telemetry data tables are up-to-date.
We are very sorry for the inconvenience. If you encounter any discrepancies, please contact us immediately.
2023-03-31 09:50 We are investigating failing workspace (Python, R, SQL) loads on all stacks.
2023-03-31 09:58 Affected are all newly created user workspaces (Python, R, SQL,.. ). A fix will be available soon. The next update will be provided in 30 minutes or as soon as new information becomes available.
2023-03-31 10:24 Problem occurred on when new table was added into workspace. This issue was resolved, we are now working to fix already corrupted workspaces. Workaround now is to remove tables from input mapping and add them again. The next update will be provided in 60 minutes or as soon as new information becomes available.
2023-03-31 11:00 We fixed remaining workspaces and preparing fix to prevent this problem in the future. If you encounter this issue please contact our support and mention this status post.
We sincerely apologize for any inconvenience caused and appreciate your understanding.
2023-03-30 16:22 UTC - We are investigating the delays in job start-up within the https://connection.north-europe.azure.keboola.com stack. The next update will be provided in 30 minutes or as soon as new information becomes available.
2023-03-30 16:54 UTC - The investigation into the cause of the issue is still ongoing. The next update will be provided in 30 minutes or as soon as new information becomes available.
2023-03-30 17:56 UTC - The investigation into the cause of the issue is still ongoing. The next update will be provided in 30 minutes or as soon as new information becomes available.
2023-03-30 18:55 UTC - We have identified and fixed the root cause of the issue. The job backlog has now been cleared. We will continue to monitor the situation to ensure that everything remains stable.
2023-03-30 19:18 UTC - The service disruption has been resolved and the stack is now fully operational.
Regrettably, we were unable to upgrade all necessary databases during the previous planned service disruption. Due to a strict deadline imposed by our service provider (AWS), we must carry out another service disruption for maintenance purposes.
This maintenance will impact both the AWS US and AWS EU stacks.
It is scheduled for Saturday, April 15, 2023,
During this time, Storage jobs will be paused or delayed, and the platform will be unavailable for a brief period (approximately 5 minutes). The platform will then generate a 500 HTTP response for the majority of API requests. Throughout the remainder of the maintenance window, the platform will be fully accessible but will not process any new or existing Storage jobs.
We sincerely apologize for any inconvenience caused and appreciate your understanding.
]]>A limited service disruption on AWS US stack will start at 10:00 a.m. UTC today, as announced earlier. Storage jobs, Queue v1, and Orchestration (in projects with Queue v1) processing will stop and new jobs will be delayed until the upgrade is completed. All running jobs will be cancelled, but will resume after the upgrade.
All APIs and other unaffected services, such as Workspaces and Queue v2 jobs, will remain operational, though their operations may be delayed due to the Storage job delays. We will provide an update when the service disruption starts and ends.
We apologize for any inconvenience caused and thank you for your understanding.
Update 10:00 a.m. UTC: The limited service disruption has begun.
Update 10:10 a.m. UTC: The service disruption has been resolved and the stack is now fully operational.
Thank you for your patience.
We have encountered a brief metadata DB outage in AWS US at 15:07 UTC. Affected services are
This outage may cause some jobs being executed during the outage fail or run twice in parallel.
We're sorry for this inconvenience.
UPDATE 15:39 UTC: All affected jobs were restarted and any duplicate executions were terminated.
]]>A limited service disruption on AWS EU stack will start at 12:00 p.m. UTC today, as announced earlier. Storage jobs, Queue v1, and Orchestration (in projects with Queue v1) processing will stop and new jobs will be delayed until the upgrade is completed. All running jobs will be cancelled, but will resume after the upgrade.
All APIs and other unaffected services, such as Workspaces and Queue v2 jobs, will remain operational, though their operations may be delayed due to the Storage job delays. We will provide an update when the service disruption starts and ends.
We apologize for any inconvenience caused and thank you for your understanding.
Update 12:00 p.m. UTC: The limited service disruption has begun.
2023-03-14 14:15 UTC - Since Sunday, we have been experiencing component failures when communicating with the Facebook Graph API on our Azure North Europe stack https://connection.north-europe.azure.keboola.com
Following components failing with application error:
keboola.ex-facebook-ads2023-03-14 8:30 UTC - We are investigating delayed orchestrations on AWS US Keboola Connection stack (https://connection.keboola.com/).
2023-03-14 9:25 UTC - The issue is resolved. Thank you for patience and understanding.
We apologize for any inconvenience caused and thank you for your understanding.
Due to necessary database upgrades to our AWS US and EU stacks, a limited service disruption will take place on March 21st and 22nd.
We anticipate that the limited service disruption will take approximately 15 minutes, but it should not exceed 60 minutes. Hopefully, this will be resolved before you return from your lunch or coffee break.
During this period, Storage jobs, Queue v1 and Orchestration (in projects with Queue v1) processing will stop, and new jobs will be delayed until the upgrade is completed. All running jobs will be cancelled, but will resume after the upgrade.
All APIs and other unaffected services, such as Workspaces and Queue v2 jobs, will remain operational, though their operations may be delayed due to the Storage job delays.
We apologize for any inconvenience caused and thank you for your understanding.
2023-03-01 16:00 CET - We are investigating hidden "Transformation v2" configurations on the UI. The next update in 15 minutes or when more info is available.
2023-03-01 16:10 CET - We have identified the root cause and prepared a fix which will be deployed within 10 minutes
(Resolved) 2023-03-01 16:24 CET - The fix has been deployed and transformations v2 are no more hidden in the UI. We advise users to reload their browsers as this was an UI issue.
]]>2023-02-21 07:41 UTC - We experienced another incident on connection.eu-central-1.keboola.com between Feb 20 18:00 UTC and Feb 21 06:40 UTC. Some jobs ended in error due to an underlying node failure. We're still investigating the root cause and taking measures to prevent future incidents.
]]>2023-02-20 15:20 UTC - A small number of jobs on the connection.eu-central-1.keboola.com stack either ended by timeout or with a "Component terminated. Possibly due to out of memory error" error message during a recent incident between Feb 19 15:10 UTC and Feb 20 14:00 UTC due to an underlying node failure. We're actively investigating the cause and taking measures to prevent this from happening again.
2023-02-20 15:56 UTC - The incident has been resolved, with the last occurrence of the error happening at Feb 20 14:35 UTC. We are continuing to monitor the situation closely to prevent any reoccurrence.
2023-02-10 09:20 UTC - We are currently investigating the problem of failing jobs on all stacks that occurred on 2023-02-09 08:48 UTC. The error is manifested by the error message "K8S request has failed: events is forbidden: User "system:serviceaccount:job-queue-jobs:daemon-service-account" cannot list resource "events" in API group "" in the namespace "job-queue-jobs"".
UPDATE 09:41 UTC: We have identified the problem and rolled back previous version of our service. All services are now operating normally.
UPDATE 10:35 UTC After a deeper research we found that this problem affected only a small fraction of the jobs.
We're sorry for this inconvenience.
2023-02-09 10:07 - We are currently investigating storage job restarts that occurred on 2023-02-09 07:35 UTC and 2023-02-07 08:04 UTC. These restarts have caused longer job run times or errors such as "table already exists" during transformation executions. We will provide another update when new information is available.
2023-02-09 10:57 - We have identified the root cause. We will deploy a fix within two hours, which might cause another occurrence of these restarts for some jobs.
2023-02-09 13:53 - We have deployed a fix at 13:20 UTC which caused the last occurrences of restarts. The issue is now resolved and you should not experience any more job restarts.
10:50 UTC Due to recent changes in Storage API, the Templates API and Keboola CLI are returning errors in multiple situations since approximately 9:00 UTC. As a result, you might see unexpected errors when working with the Keboola CLI or when trying to apply templates. We're working on the fix, which is expected to be released today ETA 15:00 UTC.
13:05 UTC Issue on Storage API was fixed. All services are now operating normally. We apologize for any inconvenience this may have caused.
]]>