tag:status.keboola.com,2013:/posts Keboola Status 2023-06-06T02:37:38Z Keboola Connection - Platform Status
tag:status.keboola.com,2013:Post/1984473 2023-06-06T02:17:07Z 2023-06-06T02:37:38Z Stuck jobs and failures in AWS EU stack

2023-06-06 02:08 UTC We experienced an incident on connection.eu-central-1.keboola.com. Some jobs ended in error due to an underlying node failure. We're still investigating the root cause.

Update 2023-06-06 02:38 UTC The incident has been resolved. A small number of jobs on the connection.eu-central-1.keboola.com stack either ended by timeout or with a "Component terminated. Possibly due to out of memory error" error message during a recent incident. 

We are continuing to monitor the situation closely to prevent any reoccurrence. 

]]>
tag:status.keboola.com,2013:Post/1982132 2023-05-31T12:00:01Z 2023-05-31T12:00:01Z Platform Update: Transition to Datadog for Platform Logs Monitoring - Vendors Only

Beginning June 1st, 2023, we are transitioning our platform logs monitoring system from Papertrail to Datadog. This is a platform-level change and does not affect user experience or functionality. Regular users are not affected by this change.

For our 3rd party Keboola component vendors, this change modifies the way you receive application error notifications:

  1. Email Notifications Only: Notifications will now be sent exclusively via email. Webhook support may be considered in the future.

  2. Notification Email Address: Vendors previously notified via Papertrail or generic webhook will now receive notifications to the email address specified in their vendor profile. Vendors who were already receiving notifications via email will continue to do so at the same email address.

  3. New Sender Email Address: All notifications will come from alert@dtdg.eu.
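Vendors who filter or route alert mail automatically may need to update their rules for the new sender. A minimal sketch of such a check is shown here, assuming the filter has access to the raw `From` header; the function name and the display names in the examples are hypothetical, and only the address alert@dtdg.eu comes from this announcement.

```python
from email.utils import parseaddr

# Sender address stated in this announcement.
NEW_SENDER = "alert@dtdg.eu"


def is_platform_alert(from_header):
    """Return True if the From header's address is exactly the new sender."""
    _, address = parseaddr(from_header)
    return address.lower() == NEW_SENDER


# Hypothetical headers for illustration only.
print(is_platform_alert("Platform Alerts <alert@dtdg.eu>"))
print(is_platform_alert("Someone <alert@dtdg.eu.example.com>"))
```

Comparing the parsed address exactly, rather than searching the header for a substring, avoids matching look-alike domains.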

Should our vendors have any questions or concerns regarding this change, please contact us at support@keboola.com.

]]>
tag:status.keboola.com,2013:Post/1979320 2023-05-24T09:40:19Z 2023-05-24T10:59:21Z Slowdown of processing of jobs on Azure North Europe stack [resolved]

Since 09:39 UTC we have been seeing jobs start with delays on https://connection.north-europe.azure.keboola.com/. We're investigating the situation. Next update in 30 minutes.

UPDATE 10:30 UTC We have found the root cause: new worker nodes are failing to authorize against the container registry. We are working on a fix. Next update in 30 minutes.

UPDATE 10:57 UTC The problem with authorization to the container registry is now solved. All systems are now operating normally.

We apologize for any inconvenience caused.

]]>
Václav Eder
tag:status.keboola.com,2013:Post/1970122 2023-04-27T13:50:15Z 2023-04-27T15:01:54Z Slowdown of processing of jobs on Azure North Europe stack

Since 13:40 UTC we have been seeing jobs start with delays on https://connection.north-europe.azure.keboola.com/. We're investigating the situation. Next update in 30 minutes.

14:14 UTC - All systems are now operating normally.

If your project ran out of credits and you have automatic top-up enabled, the top-up would have failed between approximately 13:40 and 14:10. Restarting the job will now trigger the automatic top-up correctly.

We apologize for any inconvenience caused.

]]>
tag:status.keboola.com,2013:Post/1969737 2023-04-26T11:10:43Z 2023-04-26T11:31:23Z Orchestrations not starting on legacy job queue

2023-04-26 11:00 UTC - We have discovered a problem with orchestrations not starting on the legacy queue. We are currently investigating possible causes.

2023-04-26 11:30 UTC - The problem was caused by a release earlier today. As a result, no orchestrations on the legacy queue have run since 08:10 UTC. We have rolled back the release, and orchestrations should be functioning properly again as of 11:30 UTC. We apologize for any inconvenience caused.

]]>
tag:status.keboola.com,2013:Post/1969585 2023-04-25T16:48:55Z 2023-04-25T16:48:55Z New Outbound IP Addresses for Keboola Connection: Action Required

We are adding new outbound IP addresses for the connection.keboola.com and connection.eu-central-1.keboola.com stacks for Queue V2. These new addresses are available now, but are not yet being used automatically.

This update is important for Keboola Connection customers. It may affect their ability to connect to their resources, particularly if they are behind a firewall.

What are outbound IP addresses?

Outbound IP addresses are unique addresses assigned to a device for the purpose of identifying it and sending information over the Internet. When Keboola Connection customers connect to their resources (typically databases), those resources are usually behind a firewall. In order for Keboola Connection to connect to those resources, customers whitelist our outbound IP addresses.

What must I do?

  1. If your resources are behind a firewall, ensure that all the new IP addresses are added to the whitelist, so as to enable connection to your system through Keboola Connection. 

  2. Use the “Test with new outbound IPs” feature to check the connectivity for any or all configurations in the credentials section. This will verify that your resources are accessible from the new addresses.

  3. If the connection works well, switch the project to the new IP addresses. In the event of problems, you can temporarily revert to your original IP address and contact our support team for assistance.

If you have multiple projects in your organization and have already tested the connection from the new IP addresses, you can ask our support team for help. They can switch all your projects at once, so you don’t have to do it individually for each one.

If you are not yet using Queue V2 for your projects, we still recommend whitelisting the new addresses now, as this will speed up your migration to the new queue in the future.

By when must I do it?

To ensure uninterrupted connectivity, the new IP addresses must be whitelisted by June 30, 2023. Otherwise, you run the risk of your connection not working. If you have not switched your projects manually by this date, Keboola Connection will perform the switch globally. To ensure a smooth transition, please add the new IP addresses to your whitelist and switch your projects as soon as you can.

Current list of outbound IP addresses

connection.keboola.com

  • Queue V2
    • 52.7.83.136
    • 52.20.72.254
    • 3.222.3.15 (new)
    • 34.206.78.206 (new)
    • 3.213.250.110 (new)
    • 107.22.113.103 (new)
    • 54.144.9.113 (new)
    • 54.204.61.145 (new)
    • 34.239.7.70 (new)
    • 3.217.232.144 (new)
  • Email delivery
    • 149.72.196.5
  • Queue V1 - legacy syrup services
    • 34.224.0.188
    • 34.200.169.177
    • 52.206.109.126
    • 34.203.87.137

connection.eu-central-1.keboola.com

  • Queue V2
    • 3.66.248.180
    • 3.64.150.30
    • 35.157.62.225 (new)
    • 3.71.156.204 (new)
    • 3.74.28.187 (new)
    • 18.158.155.128 (new)
    • 35.157.208.189 (new)
    • 3.72.243.47 (new)
    • 18.193.225.37 (new)
    • 3.127.158.56 (new)
  • Email delivery
    • 149.72.196.5
  • Queue V1 - legacy syrup services
    • 35.157.170.229
    • 35.157.93.175

For your convenience, you can programmatically fetch and process the list of existing IP addresses in JSON format. Read more about outbound IP addresses in the documentation.
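Before switching a project, it can help to confirm that every new address is covered by your firewall rules. The following is a minimal sketch, assuming your allowlist can be exported as a list of single addresses or CIDR ranges. The addresses are copied from the list above, while the function name and the example allowlist are hypothetical. It does not use the JSON list mentioned above, whose URL and structure are not given in this post.

```python
import ipaddress

# New Queue V2 outbound addresses, copied from the list in this post.
NEW_QUEUE_V2_IPS = {
    "connection.keboola.com": [
        "3.222.3.15", "34.206.78.206", "3.213.250.110", "107.22.113.103",
        "54.144.9.113", "54.204.61.145", "34.239.7.70", "3.217.232.144",
    ],
    "connection.eu-central-1.keboola.com": [
        "35.157.62.225", "3.71.156.204", "3.74.28.187", "18.158.155.128",
        "35.157.208.189", "3.72.243.47", "18.193.225.37", "3.127.158.56",
    ],
}


def missing_from_allowlist(stack, allowlist):
    """Return the new addresses for `stack` that no allowlist entry covers.

    `allowlist` holds single addresses or CIDR ranges as strings.
    """
    networks = [ipaddress.ip_network(entry, strict=False) for entry in allowlist]
    return [
        ip for ip in NEW_QUEUE_V2_IPS[stack]
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    ]


# Hypothetical firewall allowlist: the two older EU addresses plus one range.
example_allowlist = ["3.66.248.180", "3.64.150.30", "35.157.0.0/16"]
missing = missing_from_allowlist(
    "connection.eu-central-1.keboola.com", example_allowlist
)
print(missing)
```

An empty result means the allowlist already covers all new addresses for that stack. Any address printed still needs to be added before the switch.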

We appreciate your cooperation in making this transition as smooth as possible.

]]>
Václav Eder
tag:status.keboola.com,2013:Post/1968694 2023-04-22T14:39:10Z 2023-04-22T14:39:10Z Stuck jobs and failures in AWS EU stack

2023-04-21 14:38 UTC - We experienced another incident on connection.eu-central-1.keboola.com. Some jobs ended in error due to an underlying node failure. We're still investigating the root cause and taking measures to prevent future incidents.

]]>
Václav Eder
tag:status.keboola.com,2013:Post/1966113 2023-04-15T08:00:15Z 2023-04-15T09:21:41Z Limited service disruption for AWS US

A limited service disruption on AWS US stack will start at 09:00 a.m. UTC today, as announced earlier. Storage jobs processing will stop and new jobs will be delayed until the upgrade is completed. All running jobs will be cancelled, but will resume after the upgrade.

All APIs and other unaffected services, such as Workspaces and Jobs, will remain operational, though their operations may be delayed due to the Storage job delays. We will provide an update when the service disruption starts and ends. 

We apologize for any inconvenience caused and thank you for your understanding.

Update 08:50 a.m. UTC: The limited service disruption has begun.

Update 09:20 a.m. UTC: The service disruption has been resolved and the stack is now fully operational. 

Thank you for your patience.

]]>
tag:status.keboola.com,2013:Post/1966091 2023-04-15T06:06:02Z 2023-04-15T07:45:32Z Limited service disruption for AWS EU

A limited service disruption on AWS EU stack will start at 07:00 a.m. UTC today, as announced earlier. Storage jobs processing will stop and new jobs will be delayed until the upgrade is completed. All running jobs will be cancelled, but will resume after the upgrade.

All APIs and other unaffected services, such as Workspaces and Jobs, will remain operational, though their operations may be delayed due to the Storage job delays. We will provide an update when the service disruption starts and ends. 

We apologize for any inconvenience caused and thank you for your understanding.

Update 06:55 a.m. UTC: The limited service disruption has begun.

Update 07:45 a.m. UTC: The service disruption has been resolved and the stack is now fully operational. 

Thank you for your patience.

]]>
tag:status.keboola.com,2013:Post/1965577 2023-04-14T02:19:36Z 2023-04-14T02:19:36Z Failed jobs in connection.eu-central-1.keboola.com

2023-04-14 02:11 UTC Due to an internal incident, a few jobs ended with an incorrect user exception `Component terminated. Possibly due to out of memory error`. 

We are very sorry for the inconvenience.

]]>
tag:status.keboola.com,2013:Post/1965153 2023-04-13T07:43:46Z 2023-04-13T09:15:43Z Python workspace can't be created from transformation

2023-04-13 07:40 UTC We are investigating failing workspace creation on the EU stack (connection.eu-central-1.keboola.com). The issue manifests as "Loading data to workspace failed: Client error:..." when you try to create a workspace from a Python transformation. More information within the hour.

2023-04-13 08:40 UTC We are still investigating the root cause. The issue happens only when creating a new workspace from a Python transformation with an empty input mapping on connection.eu-central-1.keboola.com. More information within the hour.

2023-04-13 09:15 UTC The service disruption has been resolved and the stack is now fully operational.

We are very sorry for the inconvenience.

]]>
tag:status.keboola.com,2013:Post/1964684 2023-04-12T09:59:21Z 2023-04-12T10:44:00Z Queue v1 table graph shows error

2023-04-12 09:50 UTC The table dependency graph shows an error message in projects with Queue v1. More information within the hour.

Update 2023-04-12 10:43 UTC We have deployed the latest functional version, the problem should be solved by now. 

We apologize for the inconvenience.

]]>
tag:status.keboola.com,2013:Post/1962204 2023-04-06T12:53:20Z 2023-04-06T13:06:24Z Workspace creation failing (all stacks)

2023-04-06 12:45 UTC We are investigating failing workspace creation on all stacks. The issue manifests as an Application error when you try to create a workspace. A fix is already on the way, and we expect operations to resume in 20 minutes.

2023-04-06 13:05 UTC The service disruption has been resolved and the stack is now fully operational.


We're sorry for this inconvenience. 

]]>
tag:status.keboola.com,2013:Post/1960690 2023-04-03T10:15:40Z 2023-04-03T11:31:12Z Delayed telemetry data [resolved]

2023-04-03 10:15 UTC - We are investigating delayed telemetry data. More information within the hour.

2023-04-03 11:30 UTC - Telemetry data on all Keboola Connection stacks has been delayed since approximately 20:00 UTC on March 29. We determined the root cause and performed a backfill. All telemetry data tables are now up to date.

We are very sorry for the inconvenience. If you encounter any discrepancies, please contact us immediately.

]]>
Václav Eder
tag:status.keboola.com,2013:Post/1959572 2023-03-31T09:51:39Z 2023-03-31T11:03:00Z Workspace table load fails (all stacks)

2023-03-31 09:50 We are investigating failing workspace (Python, R, SQL) loads on all stacks.

2023-03-31 09:58 All newly created user workspaces (Python, R, SQL, ...) are affected. A fix will be available soon. The next update will be provided in 30 minutes or as soon as new information becomes available.

2023-03-31 10:24 The problem occurred when a new table was added to a workspace. This issue has been resolved, and we are now working to fix workspaces that are already corrupted. As a workaround, remove the tables from the input mapping and add them again. The next update will be provided in 60 minutes or as soon as new information becomes available.

2023-03-31 11:00 We have fixed the remaining workspaces and are preparing a fix to prevent this problem in the future. If you encounter this issue, please contact our support and mention this status post. 

We sincerely apologize for any inconvenience caused and appreciate your understanding.

]]>
tag:status.keboola.com,2013:Post/1959331 2023-03-30T16:23:08Z 2023-03-31T10:51:50Z Job start-up delays in Azure North Europe

2023-03-30 16:22 UTC - We are investigating the delays in job start-up within the https://connection.north-europe.azure.keboola.com stack. The next update will be provided in 30 minutes or as soon as new information becomes available.

2023-03-30 16:54 UTC - The investigation into the cause of the issue is still ongoing. The next update will be provided in 30 minutes or as soon as new information becomes available.

2023-03-30 17:56 UTC - The investigation into the cause of the issue is still ongoing. The next update will be provided in 30 minutes or as soon as new information becomes available.

2023-03-30 18:55 UTC - We have identified and fixed the root cause of the issue. The job backlog has now been cleared. We will continue to monitor the situation to ensure that everything remains stable.

2023-03-30 19:18 UTC - The service disruption has been resolved and the stack is now fully operational. 

Thank you for your patience.]]>
tag:status.keboola.com,2013:Post/1956994 2023-03-24T21:22:59Z 2023-03-24T21:22:59Z Planned service maintenance on April 15th in AWS US and AWS EU stacks

Regrettably, we were unable to upgrade all necessary databases during the previous planned service disruption. Due to a strict deadline imposed by our service provider (AWS), we must carry out another service disruption for maintenance purposes.

This maintenance will impact both the AWS US and AWS EU stacks.

It is scheduled for Saturday, April 15, 2023,

  • between 07:00 and 08:00 UTC (09:00 and 10:00 CEST) for the AWS EU stack, and
  • between 09:00 and 10:00 UTC (02:00 and 03:00 PDT) for the AWS US stack.
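The local times in the list above can be double-checked with a short sketch. It assumes fixed offsets for the maintenance date (CEST as UTC+2 and PDT as UTC-7) instead of a time zone database, and the helper name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets assumed for April 15, 2023: CEST is UTC+2, PDT is UTC-7.
CEST = timezone(timedelta(hours=2), "CEST")
PDT = timezone(timedelta(hours=-7), "PDT")


def local_window(start_hour_utc, tz):
    """Return the one-hour window starting at `start_hour_utc` as local HH:MM strings."""
    start = datetime(2023, 4, 15, start_hour_utc, 0, tzinfo=timezone.utc)
    end = start + timedelta(hours=1)
    return start.astimezone(tz).strftime("%H:%M"), end.astimezone(tz).strftime("%H:%M")


eu_window = local_window(7, CEST)  # AWS EU stack
us_window = local_window(9, PDT)   # AWS US stack
print(eu_window, us_window)
```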

During this time, Storage jobs will be paused or delayed, and the platform will be unavailable for a brief period (approximately 5 minutes). The platform will then generate a 500 HTTP response for the majority of API requests. Throughout the remainder of the maintenance window, the platform will be fully accessible but will not process any new or existing Storage jobs.
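Clients that call the API during the brief unavailability may want to retry instead of failing on the first 500 response. The following is a minimal sketch of retrying with exponential backoff. The `request` callable, its `(status, body)` return shape, and the default attempt count and delays are all assumptions made for illustration, not part of any Keboola client.

```python
import time


def call_with_retry(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `request()` until it returns a status below 500 or attempts run out.

    `request` is a hypothetical zero-argument callable returning (status, body).
    The wait doubles after each failed attempt: base_delay, 2x, 4x, ...
    """
    for attempt in range(max_attempts):
        status, body = request()
        if status < 500:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    # All attempts returned a 5xx status; hand back the last response.
    return status, body


# Simulated responses, so the example runs without network access or real waiting.
responses = iter([(500, "maintenance"), (500, "maintenance"), (200, "ok")])
waits = []
result = call_with_retry(lambda: next(responses), sleep=waits.append)
print(result, waits)
```

For an outage of about 5 minutes, the defaults here would give up too early, so a real client would need a longer total wait or a cap on the individual delay.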

We sincerely apologize for any inconvenience caused and appreciate your understanding.

]]>
Ondrej Hlavacek
tag:status.keboola.com,2013:Post/1956210 2023-03-22T08:39:50Z 2023-03-22T10:10:26Z Limited service disruption for AWS US

A limited service disruption on AWS US stack will start at 10:00 a.m. UTC today, as announced earlier. Storage jobs, Queue v1, and Orchestration (in projects with Queue v1) processing will stop and new jobs will be delayed until the upgrade is completed. All running jobs will be cancelled, but will resume after the upgrade.

All APIs and other unaffected services, such as Workspaces and Queue v2 jobs, will remain operational, though their operations may be delayed due to the Storage job delays. We will provide an update when the service disruption starts and ends. 

We apologize for any inconvenience caused and thank you for your understanding.

Update 10:00 a.m. UTC: The limited service disruption has begun.

Update 10:10 a.m. UTC: The service disruption has been resolved and the stack is now fully operational. 

Thank you for your patience.


]]>
Ondrej Hlavacek
tag:status.keboola.com,2013:Post/1955997 2023-03-21T15:23:18Z 2023-03-21T15:40:27Z Brief metadata database outage in AWS US

We have encountered a brief metadata DB outage in AWS US at 15:07 UTC. Affected services are

  • legacy Transformations
  • legacy Orchestrations
  • projects and jobs running on old Queue V1

This outage may cause some jobs that were running during the outage to fail or to run twice in parallel.

We're sorry for this inconvenience. 

UPDATE 15:39 UTC: All affected jobs were restarted and any duplicate executions were terminated.

]]>
Ondrej Hlavacek
tag:status.keboola.com,2013:Post/1955893 2023-03-21T09:01:04Z 2023-03-21T12:34:50Z Limited service disruption for AWS EU

A limited service disruption on AWS EU stack will start at 12:00 p.m. UTC today, as announced earlier. Storage jobs, Queue v1, and Orchestration (in projects with Queue v1) processing will stop and new jobs will be delayed until the upgrade is completed. All running jobs will be cancelled, but will resume after the upgrade.

All APIs and other unaffected services, such as Workspaces and Queue v2 jobs, will remain operational, though their operations may be delayed due to the Storage job delays. We will provide an update when the service disruption starts and ends. 

We apologize for any inconvenience caused and thank you for your understanding.

Update 12:00 p.m. UTC: The limited service disruption has begun.

Update 12:33 p.m. UTC: The service disruption has been resolved and the stack is now fully operational. 

Thank you for your patience.]]>
Ondrej Hlavacek
tag:status.keboola.com,2013:Post/1953159 2023-03-14T14:20:48Z 2023-03-20T07:58:15Z Failing Facebook/instagram components on Azure North Europe stack

2023-03-14 14:15 UTC - Since Sunday, we have been experiencing component failures when communicating with the Facebook Graph API on our Azure North Europe stack https://connection.north-europe.azure.keboola.com

The following components are failing with an application error:

keboola.ex-facebook-ads
keboola.ex-facebook
keboola.ex-instagram

We believe the issue is related to the following bug reported in the Facebook API: https://developers.facebook.com/support/bugs/737701844772490

We are monitoring the situation.

2023-03-20 08:00 UTC - We have not received any update from Meta. The last error occurred more than 72 hours ago, the situation looks stable, and we consider the issue resolved. We will continue to monitor it.
]]>
tag:status.keboola.com,2013:Post/1953052 2023-03-14T08:34:35Z 2023-03-14T09:26:47Z Delayed orchestrations on AWS US stack

2023-03-14 8:30 UTC - We are investigating delayed orchestrations on AWS US Keboola Connection stack (https://connection.keboola.com/).

2023-03-14 9:25 UTC - The issue is resolved. Thank you for your patience and understanding.

]]>
Erik Žigo
tag:status.keboola.com,2013:Post/1949952 2023-03-07T13:38:07Z 2023-03-07T14:12:53Z Jobs outage on connection.keboola.com (us-east-1)

2023-03-07 14:37 CET - We are investigating a problem with jobs on connection.keboola.com (us-east-1 stack). 

2023-03-07 14:42 CET - We have identified a problem with one of our internal databases, containing metadata about jobs. As a result, no jobs can be run since 14:30 CET, and the rest of the platform may be behaving abnormally.

2023-03-07 15:05 CET - The database that was affected has been fixed, and operations should be running normally since 15:00 CET.


We apologize for any inconvenience caused and thank you for your understanding.

]]>
tag:status.keboola.com,2013:Post/1947801 2023-03-02T16:22:24Z 2023-03-02T16:22:24Z Limited service disruption for AWS US and EU stacks on March 21st and 22nd

Due to necessary database upgrades to our AWS US and EU stacks, a limited service disruption will take place on March 21st and 22nd.

  • On Tuesday March 21st at 12:00 pm UTC, the disruption will begin for AWS EU, and
  • on Wednesday March 22nd at 10:00 am UTC, it will begin for AWS US.

We anticipate that the limited service disruption will take approximately 15 minutes, but it should not exceed 60 minutes. Hopefully, this will be resolved before you return from your lunch or coffee break.

During this period, Storage jobs, Queue v1 and Orchestration (in projects with Queue v1) processing will stop, and new jobs will be delayed until the upgrade is completed. All running jobs will be cancelled, but will resume after the upgrade.

All APIs and other unaffected services, such as Workspaces and Queue v2 jobs, will remain operational, though their operations may be delayed due to the Storage job delays.

We apologize for any inconvenience caused and thank you for your understanding.



]]>
Ondrej Hlavacek
tag:status.keboola.com,2013:Post/1947262 2023-03-01T15:07:56Z 2023-03-01T15:24:32Z Hidden Transformations v2 configurations in UI

2023-03-01 16:00 CET - We are investigating hidden "Transformation v2" configurations in the UI. Next update in 15 minutes or when more information is available.

2023-03-01 16:10 CET - We have identified the root cause and prepared a fix, which will be deployed within 10 minutes.

(Resolved) 2023-03-01 16:24 CET - The fix has been deployed and Transformations v2 are no longer hidden in the UI. We advise users to reload their browsers, as this was a UI issue.

]]>
tag:status.keboola.com,2013:Post/1943613 2023-02-21T07:45:37Z 2023-02-21T07:45:38Z Stuck jobs and failures in AWS EU stack

2023-02-21 07:41 UTC - We experienced another incident on connection.eu-central-1.keboola.com between Feb 20 18:00 UTC and Feb 21 06:40 UTC. Some jobs ended in error due to an underlying node failure. We're still investigating the root cause and taking measures to prevent future incidents.

]]>
tag:status.keboola.com,2013:Post/1943251 2023-02-20T15:33:20Z 2023-02-20T15:56:53Z Job failures in AWS EU stack

2023-02-20 15:20 UTC - A small number of jobs on the connection.eu-central-1.keboola.com stack either ended by timeout or with a "Component terminated. Possibly due to out of memory error" error message during a recent incident between Feb 19 15:10 UTC and Feb 20 14:00 UTC due to an underlying node failure. We're actively investigating the cause and taking measures to prevent this from happening again. 

2023-02-20 15:56 UTC - The incident has been resolved, with the last occurrence of the error happening at Feb 20 14:35 UTC. We are continuing to monitor the situation closely to prevent any reoccurrence. 

]]>
tag:status.keboola.com,2013:Post/1939284 2023-02-10T09:26:11Z 2023-02-10T10:40:07Z Failing jobs on all stacks

2023-02-10 09:20 UTC - We are currently investigating failing jobs on all stacks that occurred on 2023-02-09 08:48 UTC. The failure shows up as the error message "K8S request has failed: events is forbidden: User "system:serviceaccount:job-queue-jobs:daemon-service-account" cannot list resource "events" in API group "" in the namespace "job-queue-jobs"".

UPDATE 09:41 UTC: We have identified the problem and rolled back to the previous version of our service. All services are now operating normally.

UPDATE 10:35 UTC: After further investigation, we found that this problem affected only a small fraction of the jobs.

We're sorry for this inconvenience. 

]]>
tag:status.keboola.com,2013:Post/1938908 2023-02-09T10:11:45Z 2023-02-09T13:55:30Z Storage jobs restarts

2023-02-09 10:07 - We are currently investigating storage job restarts that occurred on 2023-02-09 07:35 UTC and 2023-02-07 08:04 UTC. These restarts have caused longer job run times or errors such as "table already exists" during transformation executions. We will provide another update when new information is available.

2023-02-09 10:57 - We have identified the root cause. We will deploy a fix within two hours, which might cause another occurrence of these restarts for some jobs.

2023-02-09 13:53 - We have deployed a fix at 13:20 UTC which caused the last occurrences of restarts. The issue is now resolved and you should not experience any more job restarts.

]]>
tag:status.keboola.com,2013:Post/1938563 2023-02-08T10:58:43Z 2023-02-08T13:09:36Z Templates & Keboola CLI errors

10:50 UTC Due to recent changes in the Storage API, the Templates API and Keboola CLI have been returning errors in multiple situations since approximately 09:00 UTC. As a result, you might see unexpected errors when working with the Keboola CLI or when trying to apply templates. We're working on a fix, which we expect to release today around 15:00 UTC.

13:05 UTC The issue in the Storage API has been fixed. All services are now operating normally. We apologize for any inconvenience this may have caused.

]]>