Stuck storage jobs in Azure North Europe Stack

Today, June 16, since 03:03 UTC, jobs have been getting stuck on data import and export. The cause is a Snowflake incident in the Azure West Europe region (https://status.snowflake.com/), where the warehouse of the Azure North Europe stack is located.

We are monitoring the Snowflake incident and will keep you updated here.

UPDATE 6:15 UTC - The Snowflake incident is still ongoing, with the last update at 05:28 UTC: "We've identified an issue with a third-party service provider, and we're coordinating with the provider to develop and implement a fix to restore service. We'll provide another update within 60 minutes." The issue is most likely caused by a problem in Azure, which has reported an incident in the West Europe region; see https://azure.status.microsoft/en-us/status.

UPDATE 7:00 UTC - We see progress: storage import/export jobs are being processed again. However, the Snowflake incident is still open, and we continue to monitor it.

UPDATE 8:00 UTC [resolved] - Snowflake has resolved the incident, stating: "We've coordinated with our third-party service provider to implement the fix for this issue, and we've monitored the environment to confirm that service was restored. If you experience additional issues or have questions, please open a support case via Snowflake Community." We no longer see any stuck jobs, so we consider the incident resolved on our side as well.

Degraded AWS US/EU Stack (connection.keboola.com, connection.eu-central-1.keboola.com)

2023-06-13 19:40 UTC The components.keboola.com service is degraded due to an incident in the AWS US-EAST-1 region (https://health.aws.amazon.com/health/status). We are monitoring the situation.

2023-06-13 20:15 UTC The AWS incident is also affecting our OAuth authorization service in the AWS US stack (connection.keboola.com). All components relying on OAuth authorization could be affected and may fail randomly.

2023-06-13 20:20 UTC We are investigating slower job processing in the AWS US stack (connection.keboola.com).

2023-06-13 20:40 UTC The incident in AWS US-EAST-1 is causing jobs to get stuck on both the AWS US (connection.keboola.com) and AWS EU (connection.eu-central-1.keboola.com) stacks. This includes component jobs and services that run jobs as part of their workflow, such as workspace creation.

2023-06-13 20:55 UTC AWS is reporting the incident as resolved. All services are running normally, but some jobs may still take longer to process due to the large number of jobs waiting in the queue. We are monitoring the situation.

Stuck jobs and failures in AWS EU stack

2023-06-06 02:08 UTC We experienced an incident on connection.eu-central-1.keboola.com. Some jobs ended in error due to an underlying node failure. We're still investigating the root cause.

Update 2023-06-06 02:38 UTC The incident has been resolved. During the incident, a small number of jobs on the connection.eu-central-1.keboola.com stack either timed out or ended with a "Component terminated. Possibly due to out of memory error" error message.

We are continuing to monitor the situation closely to prevent any recurrence.

Platform Update: Transition to Datadog for Platform Logs Monitoring - Vendors Only

Beginning June 1st, 2023, we are transitioning our platform logs monitoring system from Papertrail to Datadog. This is a platform-level change and does not affect user experience or functionality. Regular users are not affected by this change.

For our third-party Keboola component vendors, this change modifies the way you receive application error notifications:

  1. Email Notifications Only: Notifications will now be sent exclusively via email. Webhook support may be considered in the future.

  2. Notification Email Address: Vendors previously notified via Papertrail or generic webhook will now receive notifications to the email address specified in their vendor profile. Vendors who were already receiving notifications via email will continue to do so at the same email address.

  3. New Sender Email Address: All notifications will come from alert@dtdg.eu.

If you are a vendor and have any questions or concerns regarding this change, please contact us at support@keboola.com.

Slowdown of processing of jobs on Azure North Europe stack [resolved]

Since 09:39 UTC, we have been seeing jobs start with delays on https://connection.north-europe.azure.keboola.com/. We're investigating the situation. Next update in 30 minutes.

UPDATE 10:30 UTC We have found the root cause: new worker nodes fail to authorize when accessing the container registry. We are working on a fix. Next update in 30 minutes.

UPDATE 10:57 UTC The problem with authorization to the container registry is now solved. All systems are now operating normally.

We apologize for any inconvenience caused.

Slowdown of processing of jobs on Azure North Europe stack

Since 13:40 UTC, we have been seeing jobs start with delays on https://connection.north-europe.azure.keboola.com/. We're investigating the situation. Next update in 30 minutes.

14:14 UTC - All systems are now operating normally.

If your project ran out of credits and you have automatic top-up enabled, the top-up would have failed between approximately 13:40 and 14:10. Restarting the job will now trigger the automatic top-up correctly.

We apologize for any inconvenience caused.

Orchestrations not starting on legacy job queue

2023-04-26 11:00 UTC - We have discovered a problem with orchestrations not starting on the legacy queue. We are currently investigating possible causes.

2023-04-26 11:30 UTC - The problem was caused by a release earlier today. As a result, no orchestrations on the legacy queue have run since 08:10 UTC. We have rolled back the release, and orchestrations should be functioning properly again as of 11:30 UTC. We apologize for any inconvenience caused.

New Outbound IP Addresses for Keboola Connection: Action Required

We are adding new outbound IP addresses for the connection.keboola.com and connection.eu-central-1.keboola.com stacks for Queue V2. These new addresses are available now, but are not yet being used automatically.

This update is important for Keboola Connection customers. It may affect their ability to connect to their resources, particularly if they are behind a firewall.

What are outbound IP addresses?

Outbound IP addresses are the addresses from which Keboola Connection initiates connections to external systems over the Internet. When Keboola Connection customers connect to their resources (typically databases), those resources are usually behind a firewall. For Keboola Connection to reach those resources, customers whitelist our outbound IP addresses.

What must I do?

  1. If your resources are behind a firewall, ensure that all the new IP addresses are added to the whitelist, so as to enable connection to your system through Keboola Connection. 

  2. Use the “Test with new outbound IPs” feature to check the connectivity for any or all configurations in the credentials section. This will verify that your resources are accessible from the new addresses.

  3. If the connection works well, switch the project to the new IP addresses. In the event of problems, you can temporarily revert to your original IP address and contact our support team for assistance.

If you have multiple projects in your organization and have already tested the connection from the new IP addresses, you can ask our support team for help. They can switch all your projects at once, so you don’t have to do it individually for each one.

If you are not yet using Queue V2 for your projects, we still recommend whitelisting the new addresses now, as this will speed up your migration to the new queue in the future.

By when must I do it?

To ensure uninterrupted connectivity, the new IP addresses must be whitelisted by June 30, 2023. Otherwise, you run the risk of your connections not working. If your projects have not been switched manually by this date, Keboola Connection will perform the switch globally. To ensure a smooth change, please add the new IP addresses to your whitelist and switch your projects as soon as you can.

Current list of outbound IP addresses

connection.keboola.com

  • Queue V2
    • 52.7.83.136
    • 52.20.72.254
    • 3.222.3.15 (new)
    • 34.206.78.206 (new)
    • 3.213.250.110 (new)
    • 107.22.113.103 (new)
    • 54.144.9.113 (new)
    • 54.204.61.145 (new)
    • 34.239.7.70 (new)
    • 3.217.232.144 (new)
  • Email delivery
    • 149.72.196.5
  • Queue V1 - legacy syrup services
    • 34.224.0.188
    • 34.200.169.177
    • 52.206.109.126
    • 34.203.87.137

connection.eu-central-1.keboola.com

  • Queue V2
    • 3.66.248.180
    • 3.64.150.30
    • 35.157.62.225 (new)
    • 3.71.156.204 (new)
    • 3.74.28.187 (new)
    • 18.158.155.128 (new)
    • 35.157.208.189 (new)
    • 3.72.243.47 (new)
    • 18.193.225.37 (new)
    • 3.127.158.56 (new)
  • Email delivery
    • 149.72.196.5
  • Queue V1 - legacy syrup services
    • 35.157.170.229
    • 35.157.93.175
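
Before switching a project, it can help to confirm that your firewall whitelist already covers every new address. The following is a minimal Python sketch of such a check, not an official Keboola tool. The addresses are the new Queue V2 entries for connection.keboola.com copied from the list above; the function name `missing_from_allowlist` and the example whitelist are hypothetical and only illustrate the idea.

```python
import ipaddress

# New Queue V2 addresses for connection.keboola.com, copied from the list above.
NEW_US_QUEUE_V2 = [
    "3.222.3.15", "34.206.78.206", "3.213.250.110", "107.22.113.103",
    "54.144.9.113", "54.204.61.145", "34.239.7.70", "3.217.232.144",
]


def missing_from_allowlist(required, allowlist):
    """Return the required addresses that no whitelist entry covers.

    Whitelist entries may be single addresses ("3.222.3.15") or
    CIDR ranges ("34.206.78.0/24").
    """
    networks = [ipaddress.ip_network(entry, strict=False) for entry in allowlist]
    return [
        ip for ip in required
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    ]


# Hypothetical firewall whitelist, for illustration only.
example_allowlist = ["52.7.83.136", "52.20.72.254", "3.222.3.15", "34.206.78.0/24"]

missing = missing_from_allowlist(NEW_US_QUEUE_V2, example_allowlist)
print("Addresses still to be whitelisted:", missing)
```

An empty result means every new address is covered, either directly or through a CIDR range. The same check applies to the connection.eu-central-1.keboola.com addresses if you substitute that list.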

For your convenience, you can programmatically fetch and process the list of existing IP addresses in JSON format. Read more about outbound IP addresses in the documentation.
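
As a rough illustration of processing such a list, the Python sketch below parses a JSON document and collects the addresses for one stack. The JSON structure shown here (stack hostname, then service name, then a list of addresses) and the service keys are assumptions made for this example; the actual format may differ, so check the documentation before relying on it. No download step is shown because the URL is not given here.

```python
import ipaddress
import json

# Hypothetical payload: the real JSON structure and keys may differ.
SAMPLE_PAYLOAD = """
{
  "connection.eu-central-1.keboola.com": {
    "queue-v2": ["3.66.248.180", "3.64.150.30", "35.157.62.225"],
    "email-delivery": ["149.72.196.5"]
  }
}
"""


def addresses_for_stack(payload_text, stack):
    """Return all unique addresses listed for a stack, in numeric order."""
    data = json.loads(payload_text)
    services = data.get(stack, {})  # unknown stack yields an empty result
    unique = {ip for ips in services.values() for ip in ips}
    return sorted(unique, key=ipaddress.ip_address)


eu_addresses = addresses_for_stack(SAMPLE_PAYLOAD, "connection.eu-central-1.keboola.com")
for address in eu_addresses:
    print(address)
```

Sorting with `ipaddress.ip_address` as the key orders the addresses numerically rather than as text, which makes the output easier to compare against an existing firewall configuration.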

We appreciate your cooperation in making this transition as smooth as possible.