Telemetry data extractor failure

2022-05-18 10:00 UTC - We have started investigating Telemetry Data extractor failures.

Resolved - The table kbc_component_configuration_version is now extracted with a new column, kbc_branch_id, which is also part of the primary key but is missing in the existing destination table kbc_component_configuration_version. To fix this, delete the kbc_component_configuration_version table and run the Telemetry Data extractor again. The table will then be extracted without an error.
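As an illustration only, the mismatch described above can be detected by comparing the primary key of the extracted table with that of the destination table. This is a minimal sketch, not part of the extractor: the function name is made up for this example, and "config_version_key" is a hypothetical stand-in for the table's existing key column(s), which are not listed in this post. Only kbc_branch_id comes from the incident description.

```python
def missing_key_columns(extracted_pk: list[str], destination_pk: list[str]) -> list[str]:
    """Return key columns present in the extracted table but absent from the destination."""
    return [col for col in extracted_pk if col not in destination_pk]


# Hypothetical key definitions; only kbc_branch_id is taken from the incident above.
destination_pk = ["config_version_key"]
extracted_pk = ["config_version_key", "kbc_branch_id"]

missing = missing_key_columns(extracted_pk, destination_pk)

# A non-empty result means the destination table has to be deleted and re-extracted.
needs_recreate = bool(missing)
```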

ECB Currency Rates Data Source delayed data updates

2022-05-09 18:10 UTC - We are investigating delayed data updates for ECB Currency Rates Data Source. There have been no updates to the currency rate tables since 2022-05-07 15:00 UTC. Next update when new information is available.

2022-05-09 19:41 UTC - We have identified the root cause and we are working on the fix. Next update when new information is available.

2022-05-10 10:45 UTC - The issue is fixed and the missing data has been backfilled.

New Telemetry Release

Dear Keboola Connection users,

On Monday May 16, 2022, we are releasing a new version of our telemetry.

There will be a couple of changes to the data structure:

  • All platform (kbc_*) tables - Primary Keys will be slightly changed to add stack identifiers, which will help us programmatically manage the script and make implementation of the new stack easier. Example: “kbc_project_id” 9241_us-east-1_aws will be changed to 9241_kbc-us-east-1. This is important only in case you are parsing values from Primary Keys.

  • Usage Metrics Values - the usage_breakdown “Snowflake Writer” will be renamed “DWH Direct Query” so that it is less confusing for the user, who is often anticipating queue jobs of the component called Snowflake Writer (Data Destination).

  • KBC Project - the column “kbc_project_stack” will be added, which uniquely identifies the stack where the project exists.
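If you parse values from Primary Keys, the sketch below shows one way to read the project number from both the legacy and the new format. It assumes that the numeric project ID is always the part before the first underscore, which holds for the single example given above but is not confirmed for every stack. The function name is made up for this illustration.

```python
def split_project_key(key: str) -> tuple[str, str]:
    """Split a kbc_project_id value into (numeric project ID, stack suffix)."""
    project_id, sep, stack = key.partition("_")
    # Assumption: the ID is numeric and followed by an underscore and a stack suffix.
    if not sep or not project_id.isdigit() or not stack:
        raise ValueError(f"unexpected kbc_project_id format: {key!r}")
    return project_id, stack


legacy = split_project_key("9241_us-east-1_aws")    # value before the release
current = split_project_key("9241_kbc-us-east-1")   # value after the release
```

Both calls return the same project number, so only code that depends on the suffix needs to change.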

The major changes are happening on the backend of the Telemetry data processing:

  • All our internal Keboola Connection projects related to telemetry are managed by Keboola CLI, which allows us to easily manage updates, use GitHub reviews of the changes or initiate rollbacks in case of an issue.

  • All our internal Keboola Connection projects related to telemetry are using Queue V2, so they are able to take advantage of all of the new and future Keboola Connection features.

  • Using a new strictly managed multi-project architecture enables better readiness of the entire flow.

  • We are addressing minor issues related to product updates that were not covered in the legacy telemetry.

How the release affects our users:

  • Snowflake usage of Workspaces and Direct Queries will be adjusted, as the legacy telemetry did not correctly identify all of the Snowflake queries related to these usage breakdowns. The vast majority of users will see little to no impact from these changes and all those who will be most impacted have already been provided with further details by our Customer Success Team.

  • We will force FULL load in the Telemetry Data extractor, as changes to Primary Keys would cause duplicates if incremental processing were to be used in your configuration. It also leaves one task for our users - they will need to process the data in full in their subsequent processes after the new telemetry is released in order to align their telemetry with the update.

  • Telemetry data won’t be updated for up to two days so that we can ensure that users are not receiving incorrect data during the new telemetry product implementation.
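The reason for forcing a FULL load can be shown with a small simulation. This is a simplified sketch, not the extractor's actual logic: a table is modeled as a dictionary keyed by its Primary Key, an incremental load upserts rows by key, and a full load replaces the table. The "name" column and the function names are hypothetical.

```python
def incremental_load(table: dict, rows: list[dict], pk: str) -> None:
    """Upsert rows into the table, matching existing rows by primary key."""
    for row in rows:
        table[row[pk]] = row


def full_load(table: dict, rows: list[dict], pk: str) -> None:
    """Replace the table contents with the incoming rows."""
    table.clear()
    incremental_load(table, rows, pk)


PK = "kbc_project_id"
table: dict = {}

# Data loaded before the release, using the legacy key format.
incremental_load(table, [{PK: "9241_us-east-1_aws", "name": "Demo"}], PK)

# The same project after the release, now with the new key format.
new_rows = [{PK: "9241_kbc-us-east-1", "name": "Demo"}]

incremental_load(table, new_rows, PK)
rows_after_incremental = len(table)  # 2: the old and new keys do not match

full_load(table, new_rows, PK)
rows_after_full = len(table)  # 1: the legacy row is gone
```

The same reasoning applies to downstream processes that load telemetry incrementally, which is why they need to process the data in full once after the release.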

In case you encounter any issues after the new telemetry is released, please contact support through the button in your project.

Orchestration jobs end after a second but child jobs are still running

2022-05-05 13:10 CET - Since 9:10 CET, a parent job (orchestration or orchestration phase) may end with success even though its child jobs are still running. The child jobs will finish their work normally, but things may appear out of sync in the case of nested or chained orchestrations.

The issue has been fixed for Azure stacks since 12:50 CET; the fix for AWS stacks is under way.

UPDATE 2022-05-05 14:33 CET - We discovered that one of the manifestations may be that all the phases start at once.

UPDATE 2022-05-05 23:30 CET - The issue has been fixed since 12:50 CET.

AWS Stacks Maintenance Announcement 2022-05-28

Maintenance of Keboola Connection AWS stacks will take place on Saturday, May 28th, 2022 and should take less than three hours. The following stacks will be affected:

During the maintenance, you won’t be able to access your data and projects. All network connections will be terminated with the "HTTP 503 - down for maintenance" status message.

We will monitor all running tasks and restart any that are affected by the interruption. Orchestrations (flows) and running transformations will generally be delayed, but not interrupted. However, feel free to reschedule your Saturday orchestrations to avoid this maintenance window.

Stuck jobs in AWS us-east-1 [resolved]

2022-04-15 10:30 UTC - We are investigating stuck jobs on the new queue. Some jobs will probably end with an internal error. Next update in 15 minutes.

2022-04-15 10:50 UTC - The situation was resolved; however, 6 jobs were terminated by an internal error with no automatic restart. We apologize for the inconvenience. Please feel free to restart your jobs manually.

Affected jobs:

  • job-839828541
  • job-839828546
  • job-839830974
  • job-839831721
  • job-839831773
  • job-839831782

Longer job runtime on new queue in AWS eu-central-1

2022-04-11 15:04 UTC - We are investigating transient delays in job processing. The issue manifests as a two-hour gap without any activity in job events. It happens randomly across projects and configurations, with most occurrences around 04:00 UTC. Only jobs running on the new queue are affected. Next update in three hours or when new information is available.

2022-04-11 16:54 UTC - We have increased the minimum number of nodes, which might help to prevent the issue from happening again. Meanwhile, we are investigating the root causes of the timeouts. We are also working on decreasing the timeouts from two hours to a much lower value to prevent unnecessary increases in job runtime in case of networking issues. Next update when new information is available.

2022-04-14 12:54 UTC - We have reduced the timeouts from two hours to two minutes. This will prevent a job from getting stuck for such a long time when a connection issue occurs. We are still investigating the underlying networking problem. Next update when new information is available.

Stuck Orchestrations [resolved]

2022-04-05 5:28 UTC - We are investigating stuck orchestration jobs on the new queue. Next update in 15 minutes.

2022-04-05 5:50 UTC - We can see that the problem occurs in AWS regions; however, we haven't found the root cause and are continuing to investigate. Next update in 15 minutes.

2022-04-05 6:10 UTC - We rolled back the previously deployed version of an internal queue component, and this seems to have unblocked the stuck orchestration jobs. We don't see any stuck orchestrations for now. We continue to monitor the situation and investigate the root cause.

2022-04-05 9:10 UTC - We have identified the root cause and are now preparing a fix. However, since the earlier rollback, we no longer see any stuck orchestrations.

2022-04-05 11:40 UTC - We deployed the fix and everything is operational now. The root cause was misconfigured network access for an internal Queue component.