Failing creation of projects for pay-as-you-go customers

An error in the platform has caused project creation to fail for pay-as-you-go customers since yesterday, 16 January 2022, 14:30 UTC. The problem was fixed a moment ago, and we are investigating its cause.


January 19, 2022, 10:00 UTC - All users who registered for pay-as-you-go between January 15, 22:00 UTC and January 16, 2022, 14:30 UTC should now be able to access their projects.

If you are redirected to the last step of the pay-as-you-go registration, clear the cookies for the domain https://connection.north-europe.azure.keboola.com/ (closing the browser should be sufficient).

If closing the browser does not help, clear the cookies by following the instructions for your browser:

Chrome https://support.google.com/chrome/answer/95647?hl=en&co=GENIE.Platform%3DDesktop

Edge https://support.microsoft.com/en-us/microsoft-edge/delete-cookies-in-microsoft-edge-63947406-40ac-c3b8-57b9-2a946a29ae09

Firefox https://support.mozilla.org/en-US/kb/clear-cookies-and-site-data-firefox

Safari https://support.apple.com/guide/safari/manage-cookies-sfri11471/mac


Planned maintenance due to table migration

In order to keep our codebase clean, we occasionally have to make large changes to it. This time we have to migrate all projects on all stacks to use the new code.
This is an internal change only. Afterwards, you will not notice anything different.

The migration will take place between January 17 and January 21, 2022. During the migration, we will disable all projects for a few seconds to convert the data.
All your configurations will remain intact, and your usage of production and development branches will be unchanged.


Delayed processing of jobs in AWS EU stack

2022-01-12 8:40 UTC We are seeing more jobs than usual in the waiting state. We are investigating the issue.

2022-01-12 9:05 UTC Traffic on our EU Snowflake warehouse increased suddenly, so we upgraded it to a larger instance, and the queued jobs were processed immediately. The delay should be resolved soon. We will keep monitoring the warehouse until the traffic settles down.

connection.eu-central-1.keboola.com maintenance

UPDATE 12:05 UTC - The previously announced maintenance of connection.eu-central-1.keboola.com will start in one hour, at 13:00 UTC. During the maintenance, you will not be able to access your data and projects. All network connections will be terminated with an "HTTP 503 - down for maintenance" status message.

UPDATE 13:00 UTC - EU stack (connection.eu-central-1.keboola.com) maintenance started.

UPDATE 14:53 UTC - EU stack (connection.eu-central-1.keboola.com) maintenance is finished. The platform should be stable; we continue to monitor it.

connection.keboola.com maintenance

UPDATE 06:54 UTC - The previously announced maintenance of connection.keboola.com will start in one hour. During the maintenance, you will not be able to access your data and projects. All network connections will be terminated with an "HTTP 503 - down for maintenance" status message.

UPDATE 08:00 UTC - US stack (connection.keboola.com) maintenance started.

UPDATE 10:43 UTC - US stack (connection.keboola.com) maintenance is finished. The platform should be stable; we continue to monitor it.

High API error rate in AWS service

The AWS service in the US-EAST-1 region, where we operate the AWS US stack, is disrupted by network connectivity issues affecting some instances in one availability zone. So far our service does not seem to be directly affected, but the disruption may eventually reach some parts of it. We are monitoring the situation and will let you know about its progress in an hour.

UPDATE: The Keboola Academy site is also down, most probably due to this outage.

UPDATE 14:30 CET: The AWS team identified the problem (a power outage in one data center in the USE1-AZ4 availability zone) and is already restoring power and recovering. Some other services, such as Slack and SolarWinds Papertrail, seem to have been affected too, but Connection seems to be unaffected except for some short job processing delays. We are still monitoring the situation and will post another update in an hour.

UPDATE 15:30 CET: Power to all affected instances and network devices has been restored, and the majority of EC2 instances and EBS volumes within the affected Availability Zone are recovering. The impact on the Connection stack should be almost zero by now.

AWS Stacks Maintenance Announcement

Maintenance of the Keboola Connection AWS stacks will take place on Saturday, January 8, 2022, and should take less than three hours.

During the maintenance, you will not be able to access your data and projects. All network connections will be terminated with an "HTTP 503 - down for maintenance" status message.

We will monitor all running tasks and restart them in case of any interruption. Orchestrations and running transformations will generally be delayed, but not interrupted. However, feel free to reschedule your Saturday orchestrations to avoid this maintenance window.
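
For scripts that call the platform during a maintenance window, the "HTTP 503 - down for maintenance" response mentioned above can be treated as a temporary condition and retried. The following is only a minimal sketch: the function name, the `request` callable (returning a status code and a body), and the retry limits are hypothetical and are not part of any Keboola client.

```python
import time
from typing import Callable, Tuple


def call_with_maintenance_retry(
    request: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `request` until it returns something other than HTTP 503.

    `request` is a hypothetical callable supplied by the caller; it must
    return a (status_code, body) tuple. At most `max_attempts` calls are made.
    """
    status, body = request()
    for attempt in range(1, max_attempts):
        if status != 503:
            break
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))
        status, body = request()
    return status, body
```

If the last attempt still returns 503, the function returns that response unchanged, so the caller can decide whether to wait for the end of the maintenance window.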

Delayed processing of jobs in Azure North Europe stack

2021-12-16 17:40 UTC We are seeing more jobs than usual in the waiting state. We continue to investigate the issue.

2021-12-16 18:45 UTC The issue has been resolved, everything is working as expected. 

2021-12-16 19:10 UTC Further investigation revealed that the parallel execution of configuration rows might have been affected, leaving some jobs stuck. Please review any jobs that ran a configuration in parallel, terminate those that seem to be stuck, and run them again.

Log4j zero-day vulnerability update

Regarding the security issue (CVE-2021-44228) with the Log4j zero-day vulnerability, we have completed all necessary steps to investigate if our system had been compromised.

After a deep investigation, we can say that there were no security issues or breaches on our systems. We don't utilize Log4j for our main services.

We also checked all third-party services we use. Thanks to our very strict security standards, those services are not publicly accessible, run in a separate environment (disconnected from customers' data), and cannot be used as an attack vector. We also haven't received any reports of security issues from our SaaS partners.

We take the security of your data very seriously, so we applied additional threat detection regarding the Log4j security issue.

Please reach out if you have any questions.

Column, Table, and Bucket metadata overwritten – repair

We found a way to repair the overwritten column, table, and bucket user metadata, caused by the incident reported here: Column, table or bucket metadata possibly overwritten

The incident affected column, table, and bucket metadata where two or more entries had the same key but different providers. When the metadata was updated for one provider, the value was changed for all of them. This could have overwritten user-defined metadata for column type, length, or any other key. This metadata is used for input mapping. Existing mappings were not affected, but a newly created input mapping may run into problems if it uses a table with affected metadata, even if that table works in existing mappings. As a temporary solution, you can manually reset the user-defined data type metadata to the correct value.
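
To illustrate which metadata could have been affected, the sketch below finds keys that are stored under more than one provider. The entry shape (`key`, `provider`, `value`) and the sample names are assumptions made for this example only; they do not describe the actual Storage data model.

```python
from collections import defaultdict


def keys_with_multiple_providers(metadata):
    """Return the sorted keys that appear under two or more providers.

    `metadata` is a list of dicts with hypothetical "key" and "provider"
    fields; only such keys match the condition described in the incident.
    """
    providers_by_key = defaultdict(set)
    for entry in metadata:
        providers_by_key[entry["key"]].add(entry["provider"])
    return sorted(
        key for key, providers in providers_by_key.items() if len(providers) > 1
    )


# Illustrative data: "column.type" is set by two providers, "column.length" by one.
sample = [
    {"key": "column.type", "provider": "user", "value": "VARCHAR"},
    {"key": "column.type", "provider": "extractor", "value": "TEXT"},
    {"key": "column.length", "provider": "user", "value": "255"},
]
candidates = keys_with_multiple_providers(sample)
```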

We will find all affected metadata and obtain the correct values by “replaying” the metadata update storage events. For all user metadata we fix, we will also update the timestamp. While repairing the metadata, we will disable the project for a short time (we expect seconds, or a few minutes at most), during which you will be unable to use it. We apologize for any inconvenience. In the following days, we will add a message to the dashboard of each affected project with the expected date when the repair will start.
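
The idea of “replaying” events can be sketched as follows. This is not the actual repair procedure, only an illustration under stated assumptions: each event is a dict with hypothetical `created`, `key`, `provider`, and `value` fields, and `created` holds UTC timestamps in one consistent ISO 8601 format so that they sort correctly as strings.

```python
def replay_metadata_events(events):
    """Rebuild the latest value for every (key, provider) pair.

    Events are applied in chronological order, and each one only touches
    its own (key, provider) pair, so an update from one provider can no
    longer overwrite the value stored for another provider.
    """
    state = {}
    for event in sorted(events, key=lambda e: e["created"]):
        state[(event["key"], event["provider"])] = event["value"]
    return state


# Illustrative events: two providers write the same key at different times.
events = [
    {"created": "2021-12-01T10:00:00Z", "key": "column.type", "provider": "user", "value": "VARCHAR"},
    {"created": "2021-12-02T08:00:00Z", "key": "column.type", "provider": "extractor", "value": "TEXT"},
    {"created": "2021-12-01T09:00:00Z", "key": "column.type", "provider": "user", "value": "STRING"},
]
restored = replay_metadata_events(events)
```

In this example the user-defined value ends up as the last one the user set, regardless of the later update made by the other provider.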

Any changes made to the metadata after the issue was fixed (December 3, 9:03 UTC) will also be taken into account and will not be lost.