Errors in Generic Extractor Post-Mortem

Summary


On April 4, 2020 at 10:07 UTC, we deployed a version of Generic Extractor which contained a bug.
Some Generic Extractor jobs failed with the following error:

CSV file "XXX" file name is not a valid table identifier, either set output mapping for "XXX" or make sure that the file name is a valid Storage table identifier. 

Generic Extractor was reverted to its previous version at 14:08 UTC. The error affected 10% of all Generic Extractor jobs running during the four-hour period. We sincerely apologize for the trouble this may have caused you.

What Happened?

We changed the output generation rules so that tables are always generated, even if they are empty. Table names are normally generated using the outputBucket setting. However, they can also be generated from undocumented alternative settings, namely the ID or name properties. Unfortunately, the new code did not take the alternative settings into account and failed to generate correct table names.
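The fallback described above can be illustrated with a minimal sketch. This is not the Generic Extractor source code. The function names, the lowercase `id` and `name` keys, the order in which they are tried, and the dot-separated identifier format are all assumptions made for illustration.

```python
from typing import Mapping, Optional


def resolve_table_prefix(config: Mapping[str, str]) -> Optional[str]:
    """Pick the prefix used for output table names (hypothetical helper).

    Tries the documented outputBucket setting first, then the
    alternative id and name properties. The fallback order is assumed.
    """
    for key in ("outputBucket", "id", "name"):
        value = config.get(key)
        if value:
            return value
    return None


def table_identifier(config: Mapping[str, str], table_name: str) -> str:
    """Build a table identifier, failing loudly if no prefix is configured."""
    prefix = resolve_table_prefix(config)
    if prefix is None:
        raise ValueError(
            f'cannot build a table identifier for "{table_name}": '
            "none of outputBucket, id, or name is set"
        )
    # Assumed format: prefix and table name joined by a dot.
    return f"{prefix}.{table_name}"
```

A test suite for logic like this would include at least one case per setting, so that a configuration relying only on `id` or `name` is exercised as well as one using `outputBucket`.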

What Are We Doing About This?

We have extended the tests to cover the undocumented settings, though we recommend you stick with the documented ones.

Errors in Generic Extractor jobs

Today we released a version of Generic Extractor that contained a bug. It caused certain configurations to fail with the error:

CSV file XXX file name is not a valid table identifier, either set output mapping for XXX or make sure that the file name is a valid Storage table identifier. 

We have reverted the release. We sincerely apologize for the error. We will publish a post-mortem next week.


Orchestrations API increased error rate in EU

The Orchestrations API is returning an increased number of errors in the EU region. We are investigating and will post more details here within an hour.

UPDATE Apr 2 11:32 CEST - The errors have stopped occurring. We are monitoring the situation and investigating the root cause.

UPDATE Apr 2 12:05 CEST - We found that the API servers were flooded with unexpected bursts of requests. We've upgraded the infrastructure and will look for a way to prevent this from happening again.

Week in Review - March 31st, 2020

UI Improvements

  • Action buttons are now directly accessible when hovering over list items in transformations and components which use generic input or output mappings.
  • We added a new modal to improve the orchestration set-up experience. You can now more easily schedule orchestrations on an hourly, daily or weekly basis. There's still an option to set up a custom schedule.
  • To edit tables or credentials in your database writers, you no longer have to click the “Edit” button; you can edit the values directly and click the “Save” button.
  • We added a new modal for database writers that support provisioned credentials (Redshift, Snowflake). You can now directly create provisioned credentials.

Minor Improvements

  • The Julia transformation and sandbox have been updated to Julia 1.4.


Transformation failures

We’re currently experiencing transformation failures and are investigating the problem. Next update in one hour.

UPDATE March 31, 6:17 AM UTC: We've identified the issue and deployed a rollback. Transformations started after 6:11 AM should run without any issues. We’re monitoring to ensure transformations are running as expected. Next update in one hour.

UPDATE March 31, 7:30 AM UTC: The rollback has finished and no further issues were reported within the last hour. We are going to investigate the root cause and will publish a post-mortem soon.

Transformation errors

Since March 26, 4:00 PM UTC, transformations in the US and EU regions have been failing to start with the error Storage API bucket 'configuration_id' with configuration not found.

The error was caused by an incorrect configuration.

We're investigating the issue and will update this post with our findings.

We apologize for the inconvenience.

UPDATE March 26, 4:35 PM UTC: The problem has been fixed.

Degraded Snowflake Performance (US region) - March 24, 2020

Since March 24, 8:15 AM UTC, we have been seeing decreased Snowflake performance in the US region. This may degrade the performance of jobs and sandbox loading in the US region. We are investigating the causes. Next update in one hour.

UPDATE Mar 24, 10:10 UTC - Performance should be back to normal; we're closely monitoring the situation.

Week in Review - March 19th, 2020

Announcements

New Features

  • Keboola Connection in-app news: we now display important news inside the application. We aim to replace the current status with the in-application news.

  • Search in transformations now searches the individual transformations again.
  • Orchestrations can now be copied (from the configuration versions page); the new orchestrations are created as disabled so that they don't run unexpectedly.
  • An entire bucket can be added in input mapping in transformations.
  • Orchestrator now supports setting the timezone for the orchestration schedule.
  • We're working on making the UI less cluttered; therefore, we're hiding numerous action buttons in an action menu. In the following releases, some of the actions that are now in the action menu will be directly accessible when hovering over the item in the list.

Community News

For Czech-speaking folks - we're participating in the Covid-19 CZ activity.

Minor Improvements

  • DWHM Manager now has the option to reset users' passwords; the password link will be returned in job events. Keep in mind that the link can only be clicked once.

Fixes

  • Microsoft SQL Server writer now correctly creates the primary key when creating a table.

  • ThoughtSpot writer now creates the target database & schema if they do not exist.

Developers

  • Developer portal now shows the link to your development project.


Deprecation of public File Uploads

If you are uploading a file to Storage (manually or automatically), there's an option to upload it with the Public flag. The file can then be accessed publicly outside of Keboola Connection.

Only a minority of Keboola Connection users take advantage of this feature, and they do so in a very non-standard way (e.g., for HTML files). That's why we decided to deprecate it. Also, the new File Storage we have prepared (Azure Blob Storage) doesn't support public File Uploads, and we want to make this behavior consistent across all supported File Storage Backends.

The option to create Public Files from the UI has been removed (effective with the publication of this post).

The option to create Public Files via an API will be removed in about three months, by the end of June 2020.

An alternative solution could be the AWS S3 Writer component, but we don't recommend relying on Public Files at all, not even outside of Keboola Connection.