Since May 13, 20:00 UTC, we have been seeing intermittent delays when performing service management operations on Azure Storage accounts hosted in the West Europe region. Storage availability and data-processing workflows remain fully operational; however, you may notice job delays.
Azure’s latest update (not publicly available):
Current Status: Our monitoring shows that our mitigation strategy has worked and less than 5% of traffic is currently impacted; most customer impact should already be mitigated. We continue to monitor our infrastructure and expect the delays to decrease over the next few hours.
We will update this post as new information becomes available. If you have any questions or concerns, please reach out to our support team.
Update May 16, 08:00 UTC: Azure's latest update (not publicly available):
Service restored, and customer impact mitigated.
Based on our findings, we can confirm the issue is resolved.
We apologize for the inconvenience.
We are currently experiencing job failures related to the keboola.legacy-transformation component.
Our team is actively working on reverting to the previous stable release to resolve the issue.
We sincerely apologize for any inconvenience this may have caused and appreciate your patience.
Update 10:56 UTC: The issue has been fixed by reverting to the previous release.
We’ve identified an issue in the Keboola UI where the list of tables in the transformation may render incorrectly after a table has been deleted from the list. This can result in visually corrupted table rows (e.g., mixed or misaligned entries).
This is a UI-only issue — data processing is not affected. Reloading the page restores the table view correctly.
We’ve successfully reproduced the problem and are working on a fix.
Update 13:50 UTC: the issue has been fixed.
We would like to inform you about the planned maintenance of all Keboola stacks hosted on GCP.
During the database upgrades there will be a short service outage on all GCP stacks, including all single-tenant stacks and GCP US and EU multi-tenant stacks (connection.us-east4.gcp.keboola.com, connection.europe-west3.gcp.keboola.com). This will take place on Saturday, May 24, 2025 between 05:30 and 06:30 UTC.
Effects of the Maintenance
During the above period, services will be scaled down and the processing of jobs may be delayed. For a very brief period (at around 06:00 UTC) the service will be unavailable for up to 10 minutes and APIs may respond with a 500 error code. After that, all services will scale up and start processing all jobs. No running jobs, data apps, or workspaces will be affected. Delayed scheduled flows and queued jobs will resume after the maintenance is completed.
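If your own integrations call the APIs during the maintenance window, a short client-side retry is usually enough to ride out the transient 500s. A minimal sketch under stated assumptions: the `request` callable stands in for your own HTTP call, and nothing here is an official Keboola SDK.

```python
import time

def call_with_backoff(request, retries=5, base_delay=1.0, sleep=time.sleep):
    """Retry `request` (a callable returning an HTTP status code) on 500s,
    doubling the wait between attempts."""
    for attempt in range(retries):
        status = request()
        if status != 500:
            return status  # success or a non-retryable status
        sleep(base_delay * (2 ** attempt))  # wait 1s, 2s, 4s, ...
    return status  # still failing after all retries
```

Scheduled flows and queued jobs resume automatically after the maintenance; a retry like this only matters for direct API calls you make yourself.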
Detailed Schedule
We are currently investigating an issue involving missing telemetry data that appears to affect all Azure stacks. The issue began on April 25, 2025, at approximately 08:30 UTC.
We will continue to update this article with additional information as our investigation progresses.
UPDATE 13:00 UTC
Additionally, since April 23, 2025, at approximately 08:00 UTC, we have identified missing billing data for Workspaces, DWH Direct Query, Data Streams, and Data Apps across all stacks (not limited to Azure).
All missing telemetry and billing data will be backfilled within the next few hours.
UPDATE [April 29, 2025, 05:38 UTC]
We apologize for the inconvenience during the incident.
We have identified an issue where configuring schedules via the UI orchestrator/flow leads to an off-by-one-day error. For example, setting an execution for Sunday would incorrectly store it as Saturday. See the included media for how to check a date for this inconsistency.
Incident Timeline:
Start: April 25, 2025 11:42 AM UTC
End: April 26, 2025 10:10 AM UTC
Impact:
Schedules created through the UI during this timeframe were translated into incorrect crontab expressions, causing misaligned execution days.
Action Required:
If you configured any schedules via the UI, please review and correct them manually to ensure they are aligned with the intended execution days.
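When reviewing a schedule's crontab expression, it helps to decode its day-of-week field into weekday names so a shifted day is easy to spot. A small illustrative helper, assuming the common cron convention where 0 and 7 both mean Sunday (`cron_weekdays` is a hypothetical name, not a Keboola API):

```python
# Cron day-of-week convention assumed here: 0 = Sunday ... 6 = Saturday,
# with 7 also accepted for Sunday.
WEEKDAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
            "Thursday", "Friday", "Saturday"]

def cron_weekdays(expr: str) -> list[str]:
    """Return weekday names for the day-of-week field of a 5-field cron expression."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected a 5-field crontab expression")
    dow = fields[4]
    if dow == "*":
        return WEEKDAYS[:]
    # Handle comma-separated day numbers; % 7 folds 7 back to Sunday.
    return [WEEKDAYS[int(part) % 7] for part in dow.split(",")]

# A schedule intended for Sunday 09:00 versus the shifted expression
# the bug would have produced:
print(cron_weekdays("0 9 * * 0"))  # the intended day
print(cron_weekdays("0 9 * * 6"))  # the day actually stored
```

Comparing the decoded day against the day you selected in the UI makes the off-by-one shift immediately visible.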
We apologize for the inconvenience.
We are currently experiencing rare encryption errors across all Azure stacks, resulting in immediate job failures. They occur at a rate of approximately 1 in 50,000 jobs.
Affected jobs have either a very short runtime (approximately 1 second) or no recorded runtime. These jobs typically lack events other than the error message itself. The specific error messages observed include:
These errors manifest in two distinct ways: transient and permanent.
Transient errors occur during the scheduling or initial starting phase of a job. Restarting the affected job resolves these transient issues.
Permanent errors occur during the saving of configurations or state, notably involving OAuth configurations when storing new refresh tokens after a job successfully completes. Subsequent job runs retrieve this improperly encrypted value, causing the job to fail with an application error. Unlike transient errors, restarting does not resolve permanent errors. To correct permanent errors, the configuration itself must be updated to remove or correct the incorrectly encrypted value.
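The distinction above maps naturally onto a client-side retry policy: transient failures are worth an automatic restart, while permanent ones need the configuration fixed first. A hedged sketch; the exception classes and the `run_job` callable are hypothetical names for illustration, not part of the platform:

```python
class TransientEncryptionError(Exception):
    """Raised during scheduling/startup; restarting the job resolves it."""

class PermanentEncryptionError(Exception):
    """Raised when a badly encrypted value was saved to the configuration;
    restarting does not help until the configuration is corrected."""

def run_with_restarts(run_job, max_restarts=2):
    """Restart `run_job` on transient errors; surface permanent ones at once."""
    last = None
    for _ in range(max_restarts + 1):
        try:
            return run_job()
        except TransientEncryptionError as exc:
            last = exc  # a restart is expected to resolve this
        except PermanentEncryptionError:
            raise  # fix the stored configuration value instead of retrying
    raise last
```

The asymmetry mirrors the incident description: retrying a permanently broken configuration would just keep re-reading the improperly encrypted value.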
We are proactively monitoring all occurrences of these errors. In the case of permanent failures, we are directly contacting affected customers to resolve the issue promptly.
We apologize for any inconvenience this may cause. Please do not hesitate to reach out to our support team if you have further questions or require assistance.
UPDATE 2025-04-23
We have identified the root cause of these issues and will be carefully deploying a fix in the coming days. We continue to proactively monitor all occurrences of these errors. In the case of permanent failures, we are directly contacting affected customers to resolve the issue promptly.
Due to a bug introduced in the latest deployment, all Job Queue jobs started between 2025-04-15 10:30 UTC and 12:00 UTC are missing approximately 1/3 of their log events. The issue has been identified and resolved.
Only AWS and Azure stacks were affected — GCP was not impacted.
Audit logs were not affected and remain fully intact.
We sincerely apologize for the disruption this has caused.
The announced partial maintenance has just started. The platform has been scaled down and is no longer accepting new jobs. We expect a brief downtime in 30 minutes, around 12:00 UTC. We will update this post with the progress of the partial maintenance.
Update 12:00 UTC: The announced partial maintenance is ongoing, and we are expecting downtime to begin any minute now. We will continue to update this post as the maintenance progresses.
Update 12:10 UTC: The maintenance has been completed, and all services have been scaled back up. The platform is fully operational, and jobs are now being processed as usual. All delayed jobs will be processed shortly. Thank you for your patience.
2025-04-06 07:21 UTC We are facing a problem with the authorization of the `kds-team.ex-onedrive` component.
The problem prevents authorizing a new component or reauthorizing an existing one; the action ends with an authorization error message. The problem appears on our AWS and Azure stacks. GCP stacks appear to be unaffected.
Update 2025-04-06 09:00 UTC We have determined the cause of the problem and are working on a fix.
Update 2025-04-06 09:30 UTC - We have solved the problem and the component authorization should be functional again on all stacks.
We apologize for the inconvenience.
We would like to inform you about the planned maintenance of all Keboola stacks hosted on Azure.
During the database upgrades there will be a short service outage on all Azure stacks, including all single-tenant stacks and Azure North Europe multi-tenant stack (connection.north-europe.azure.keboola.com). This will take place on Saturday, April 12, 2025 between 11:30 and 12:30 UTC.
Effects of the Maintenance
During the above period, services will be scaled down and the processing of jobs may be delayed. For a very brief period (at around 12:00 UTC) the service will be unavailable for up to 10 minutes and APIs may respond with a 500 error code. After that, all services will scale up and start processing all jobs. No running jobs, data apps, or workspaces will be affected. Delayed scheduled flows and queued jobs will resume after the maintenance is completed.
Detailed Schedule
We are investigating a brief outage of our Billing API on the Azure North Europe stack (https://connection.north-europe.azure.keboola.com/) between 23:03 and 23:19 UTC. Non–Pay-As-You-Go projects were not affected.
Update 23:35 UTC: The service disruption also caused job delays across all projects in this stack during the affected period. All operations have since returned to normal. We apologize for the inconvenience.
On March 12, 2025, our support button was unavailable from 07:15 UTC to 07:50 UTC.
We apologize for any inconvenience this may have caused.
The button is now working again properly.
The announced partial maintenance of connection.keboola.com and connection.eu-central-1.keboola.com will start in one hour at 08:00 UTC. We will update this post with the progress of the partial maintenance for each stack.
Update 07:00 UTC: The scheduled partial maintenance for connection.keboola.com has begun.
Update 07:21 UTC: The maintenance of connection.keboola.com has been completed, and all services have been scaled back up. The platform is fully operational, and jobs are now being processed as usual. All delayed jobs will be processed shortly. Thank you for your patience. Maintenance of connection.eu-central-1.keboola.com will start at 08:00 UTC.
Update 08:00 UTC: The scheduled partial maintenance for connection.eu-central-1.keboola.com has begun.
Update 08:25 UTC: The maintenance of connection.eu-central-1.keboola.com has been completed, and all services have been scaled back up. The platform is fully operational, and jobs are now being processed as usual. All delayed jobs will be processed shortly. Thank you for your patience.
2025-03-13 23:46 CET – We are investigating a service degradation on the https://connection.europe-west3.gcp.keboola.com/ stack.
2025-03-14 00:01 CET – Services on the https://connection.europe-west3.gcp.keboola.com/ stack have resumed normal operations. Component events (level INFO) display only 50% of the logs.
2025-03-14 22:09 CET – To avoid database bottleneck during peak hours this night we have implemented events sampling. Component events (level info) will display only 50% of the logs. All storage events and component events with higher level will not be affected by this change.
2025-03-12 06:00 UTC - We are investigating a possible service degradation on the https://connection.europe-west3.gcp.keboola.com/ stack.
2025-03-12 07:15 UTC - We are still working on the issue. The stack is operational, but jobs are being delayed.
2025-03-12 08:15 UTC - All services are fully operational now; however, you may experience job delays due to limited worker capacity. We have identified a database bottleneck and are working towards restoring full capacity.
2025-03-12 23:40 UTC - To avoid database bottleneck during peak hours this night we have implemented events sampling. Component events (level info) will display only 50% of the logs. All storage events and component events with higher level will not be affected by this change.
2025-03-13 09:15 UTC - Sampling of the component events has been removed and all events created from now on are displaying.
2025-03-11 23:15 UTC - We are investigating a possible service degradation on the https://connection.europe-west3.gcp.keboola.com/ stack.
2025-03-12 00:15 UTC - The stack is now fully operational, and all jobs are being processed.
2025-03-11 10:50 UTC - we are investigating failed jobs for component keboola.ex-telemetry-data
2025-03-11 11:30 UTC - We found an invalid record in the telemetry data which caused the telemetry extractor to fail. The record was fixed, and all telemetry extractors are working as expected.
We apologize for the inconvenience.
2025-03-11 10:00 UTC - We are investigating a possible service degradation on the https://connection.eu-central-1.keboola.com stack.
2025-03-11 11:10 UTC - All services are operating normally now. Only a marginal number of component jobs were affected by delayed processing.
2025-02-08 23:55 UTC We are experiencing issues with the Storage API on our stack. This may cause slower Queue jobs and occasional downtime for the Storage API and Connection administration. The next update will be in 30 minutes.
2025-02-09 00:35 UTC We identified an issue where some parts of our system were overloaded due to a high number of jobs running during the nightly peak after 11 PM. This caused slowdowns and occasional errors in our UI and Storage API, and some Queue jobs may have been delayed or failed.
We have increased system capacity to prevent this issue in the future. The incident is now resolved.
We apologize for any inconvenience caused.
2025-03-05 16:10 UTC We are currently experiencing degraded performance with workspace creation, which may impact all jobs that rely on workspace creation as part of their lifecycle.
Our team is actively investigating the issue, and we will provide updates as we have more information.
We appreciate your patience and apologize for any inconvenience this may cause.
2025-03-05 17:15 UTC We've identified the core issue and deployed a fix. All Storage jobs associated with WorkspaceCreate should now be functioning properly.
2025-03-04 10:00 UTC - We are investigating import failures for the Stream service on connection.eu-central-1.keboola.com due to receiver errors. We will provide an update as soon as we have more information.
2025-03-04 10:10 UTC – We have identified the root cause and resolved the issue. The backlog of imports is now being processed, and we are continuing to monitor the situation.
2025-03-04 12:11 UTC – Some projects still have unprocessed Stream imports. We are continuing the investigation.
2025-03-04 13:21 UTC – The backlog has been cleared, and Stream services are stable and fully operational. However, a few customers' data was not imported during the incident window starting on February 28th. We will contact these customers directly.
The announced partial maintenance has just started. The platform has been scaled down and is no longer accepting new jobs. We expect a brief downtime in 30 minutes, around 12:00 UTC. We will update this post with the progress of the partial maintenance.
Update 12:00 UTC: The announced partial maintenance is ongoing, and we are expecting downtime to begin any minute now. We will continue to update this post as the maintenance progresses.
Update 12:13 UTC: The maintenance has been completed, and all services have been scaled back up. The platform is fully operational, and jobs are now being processed as usual. All delayed jobs will be processed shortly. Thank you for your patience.
Starting at 12:00 UTC, all jobs that were either processing or started at that time began failing due to an error introduced in the latest release. At 12:40 UTC, we reverted to the previous stable version, and we expect job processing to return to normal.
We are actively monitoring the situation and will provide updates as needed.
The deprecated jobs listing endpoint will be temporarily unavailable on the following dates and times:
Please be aware that this endpoint will be permanently shut down on March 1, 2025.
Ensure that your API integrations are migrated to the new endpoint by this date.
We would like to inform you about the planned maintenance of all Keboola stacks hosted on Azure.
During the database upgrades there will be a short service outage on all Azure stacks, including all single-tenant stacks and Azure North Europe multi-tenant stack (connection.north-europe.azure.keboola.com). This will take place on Saturday, March 1, 2025 between 11:30 and 12:30 UTC.
Effects of the Maintenance
During the above period, services will be scaled down and the processing of jobs may be delayed. For a very brief period (at around 12:00 UTC) the service will be unavailable for up to 10 minutes and APIs may respond with a 500 error code. After that, all services will scale up and start processing all jobs. No running jobs, data apps, or workspaces will be affected. Delayed scheduled flows and queued jobs will resume after the maintenance is completed.
Detailed Schedule
2025-02-13 11:01 UTC - We are currently investigating a potential column mismatch issue with the Salesforce extractor, which started occurring around February 12, 2025. The investigation is ongoing, and we will provide more information as soon as we have further details.
2025-02-13 11:48 UTC - We have reverted the affected extractor version to the previous one. Extracted objects with frozen columns that were not aligned with the current Salesforce API version may have had mismatched columns if they were processed using the affected extractor version, which was released on February 11, 2025, at 8:07 UTC. A full load using the reverted extractor is advised.
2025-02-11 1:44 UTC: We are investigating job execution delays at https://connection.north-europe.azure.keboola.com/. We will provide updates as soon as we have more information.
2025-02-11 2:20 UTC: We have identified that the job execution delays were caused by an underlying node issue. We have replaced the node and are monitoring the situation.
2025-02-11 3:05 UTC: Due to issues with a node, the following problems occurred:
All jobs are now running normally, notifications are no longer delayed, and the platform is fully operational.
The incident is resolved. We apologize for any inconvenience.
2025-02-08 11:13 UTC: We are investigating job execution delays at https://connection.us-east4.gcp.keboola.com/. We will provide updates as soon as we have more information.
2025-02-08 11:34 UTC: We have identified the root cause and resolved the issue. All jobs are now running normally, and the platform is fully operational.
The announced partial maintenance has just started. The platform has been scaled down and is no longer accepting new jobs. We expect a brief downtime in 30 minutes, around 8:00 UTC for GCP europe-west3 and us-east4. We will update this post with the progress of the partial maintenance.
2025-02-08 8:00 UTC: The announced partial maintenance of both GCP stacks is ongoing, and we are expecting downtime to begin any minute now. We will continue to update this post as the maintenance progresses.
2025-02-08 8:17 UTC: The maintenance of both stacks has been completed, and all services have been scaled back up. The platform is fully operational, and jobs are now being processed as usual. All delayed jobs will be processed shortly. Thank you for your patience.