New S3 Extractor

This one took us a while, but we believe it's worth it. We carefully gathered feedback and made the most commonly used features accessible through a new streamlined UI. And there's even more under the hood.

The original AWS S3 extractor was renamed to Simple AWS S3. It stays fully supported and is not being deprecated. There's no need to migrate your configurations.

There are several major differences between the original and the new extractor. The new AWS S3 extractor

  • can download multiple files/tables using a single set of credentials.
  • fully supports incremental loads.
  • is more flexible.

The UI of the new extractor supports many features, but the extractor is not limited by its UI: it is the first component that openly supports processors. Opening the JSON editor (aka Power User Mode) opens up the configuration to many more possibilities. The extractor itself does only one simple job: it downloads a set of files from S3. All other jobs (decompression, CSV fixing, setting the manifest file, etc.) are delegated to processors. You can order and configure the processors so that they handle the files as required. You can even develop your own processor in case you're missing something. We're fully aware that this is not an easy concept to grasp, but it's intended for advanced users. Not advanced? Use the UI.
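The division of labor described above can be pictured as a download step followed by an ordered chain of small file-to-file steps. The following is only a conceptual sketch in Python, not the extractor's real interface: `decompress_gzip`, `drop_blank_lines` and `run_processors` are hypothetical names invented for this illustration, and the "downloaded" file is simulated in memory rather than fetched from S3.

```python
import gzip
from typing import Callable, Dict, List

# A "file set" is modelled as a mapping of file name -> raw bytes.
Files = Dict[str, bytes]
Processor = Callable[[Files], Files]


def decompress_gzip(files: Files) -> Files:
    """Hypothetical processor: gunzip every *.gz file and drop the suffix."""
    out: Files = {}
    for name, data in files.items():
        if name.endswith(".gz"):
            out[name[:-3]] = gzip.decompress(data)
        else:
            out[name] = data
    return out


def drop_blank_lines(files: Files) -> Files:
    """Hypothetical 'CSV fixing' processor: remove empty lines from *.csv files."""
    out: Files = {}
    for name, data in files.items():
        if name.endswith(".csv"):
            lines = [ln for ln in data.decode("utf-8").splitlines() if ln.strip()]
            out[name] = ("\n".join(lines) + "\n").encode("utf-8")
        else:
            out[name] = data
    return out


def run_processors(files: Files, processors: List[Processor]) -> Files:
    """Apply the processors one after another, in the configured order."""
    for processor in processors:
        files = processor(files)
    return files


# Simulated result of the download step (assumption: one gzipped CSV file).
downloaded = {"orders.csv.gz": gzip.compress(b"id,amount\n\n1,10\n2,20\n")}

# Order matters: the file has to be decompressed before it can be fixed.
result = run_processors(downloaded, [decompress_gzip, drop_blank_lines])
```

Swapping the two processors in the list would leave the blank line in place, because `drop_blank_lines` would only see a `.gz` file; that is the reason the order of processors is part of the configuration.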

The list of available processors will be kept and updated in the Developer Portal list of components. A full description of the extractor is available in our documentation.

This brings us one step closer to replacing the legacy Restbox. The HTTP extractor will follow shortly.

Snowflake Outage in US Region

There was a short Snowflake outage in the US region between 10:30 and 10:35 CET (09:30 and 09:35 UTC).

  • Sandboxes might have lost their data and worksheets
  • Transformation jobs might have finished with an error
  • Async data loads and exports were unaffected

We're investigating the impact and root cause and will update this post as soon as we know more. Snowflake is now fully operational again.

UPDATE Jan 30 2018: Snowflake released their RCA.

Week in Review -- January 9, 2018

You haven't heard from us for a while. We're sorry. Here's what's new.

New Components

Updated Components

Minor Improvements

  • The SSL configuration of all database extractors is now on the same page as the credentials and the SSH tunnel configuration
  • Database extractors give a warning if your connection is invalid
  • Apify Extractor accepts a list of URLs from a table in Storage
  • A new part about ad-hoc data exploration in Jupyter was added to the tutorial on Ad-hoc Data Analysis

Fixes

  • Database extractors automatically change table names to lowercase
  • Fixed a bug affecting non-incremental import of sliced tables in the BigQuery and Snowflake extractors 

Blog

Our developers have published 2 blog posts

Community News

Slower job processing

We're experiencing slower processing of Docker component jobs, and many jobs are stalled in the waiting state. We're looking for the root cause and hope to be back to normal soon.

UPDATE 9:40 AM CET: All operations are back to normal. The stalled jobs were caused by a misbehaving Redshift cluster. We're going to investigate the root cause.

We're very sorry for this inconvenience.


Job Failures Tuesday, November 21, 2017

We experienced temporary technical difficulties today at around 9:15 AM and 9:20 AM CET.

Some component jobs may have failed as a result. We're investigating the issue and will post an update when the root cause is found.

UPDATE 11:35 AM CET: Jobs storage was temporarily unavailable for about two minutes. Job scheduling wasn't affected, and running jobs waited until the storage came up again, so these jobs weren't affected. Unfortunately, a few orchestrations failed. We'll take steps to prevent this in the future.


Deprecating MySQL Storage and Transformations

Support for MySQL in Keboola Connection is coming to its end. Here's what will happen.

Effective immediately

  • New projects and projects without existing MySQL transformations will not be able to create new MySQL transformations.

MySQL Storage Backend (supported until January 2018)

  • The default storage backend for all projects is immediately switched from MySQL to Snowflake.
  • All MySQL buckets in all projects will be migrated to Snowflake in January 2018. This will not affect any operations; only a short maintenance window on the project will be required.
  • No changes in the project are required.
  • You can request an earlier migration at support@keboola.com.

MySQL Transformations (supported until April 2018)

  • Your existing MySQL transformations will need to be migrated to Snowflake by the end of April 2018.
  • If you need any help migrating your MySQL transformations, contact support@keboola.com.

These steps will allow us to deprecate a piece of the legacy infrastructure and focus on state-of-the-art technologies. The Snowflake storage backend and transformations have significant performance and scaling benefits, so your projects will run faster than on MySQL without any extra charge.

Week in Review -- November 6, 2017

New Features

  • Finer granularity in all input mappings. The Days parameter was renamed to Changed in last, and you can set the interval of changed data to as short as 10 minutes. Please note that combining a legacy Days value with a new Changed in last value is not allowed (e.g. in a transformation chain where the same table gets imported multiple times, all input mappings have to use the same filter type).
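The restriction on mixing the two filter types can be checked mechanically before a transformation chain is run. The sketch below is an illustration only: it assumes each input mapping is a dictionary with a `source` table name and either a `days` key (the legacy Days value) or a `changed_since` key (Changed in last). Those key names, the value formats and the `find_mixed_filters` helper are assumptions made for this example, and mappings without any filter are simply ignored.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def filter_type(mapping: Dict) -> Optional[str]:
    """Return which filter a mapping uses, or None if it has no filter."""
    if mapping.get("days"):
        return "days"
    if mapping.get("changed_since"):
        return "changed_since"
    return None


def find_mixed_filters(mappings: List[Dict]) -> List[str]:
    """List source tables that are imported with both filter types."""
    seen = defaultdict(set)
    for mapping in mappings:
        kind = filter_type(mapping)
        if kind is not None:
            seen[mapping["source"]].add(kind)
    return sorted(table for table, kinds in seen.items() if len(kinds) > 1)


# Hypothetical input mappings collected from a chain of transformations.
mappings = [
    {"source": "in.c-main.orders", "days": 1},
    {"source": "in.c-main.orders", "changed_since": "-10 minutes"},
    {"source": "in.c-main.customers", "changed_since": "-1 hours"},
]

conflicts = find_mixed_filters(mappings)
print(conflicts)
```

In this made-up chain, only the orders table would be reported, because it is imported once with a Days value and once with Changed in last.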

Minor Improvements

  • All new transformations come with a predefined boilerplate that will help you at the start. It is especially useful for Python and R transformations.
  • Sliced tables in Storage console (e.g. exports from a Snowflake backend) are now merged when downloaded.
  • Database extractors display the name of the query that failed in their error messages.

Updated Components

  • Oracle DB Extractor and Writer are now available in the EU region.

Fixes

  • Input Mapping modal has autofocus in the Source field when opened.

Job Failures [Not Resolved]

Today, October 4, 2017, we experienced several job failures between 12:07 and 14:40 CEST. One of our worker servers became unresponsive, which led to the job failures. We have shut down the instance and launched a new one. Everything is working properly now and we will further investigate the root cause of the issue.

UPDATE 09:00 PM CEST 
We're still encountering rare job failures. We're monitoring the issue closely and trying to find the root cause.

We are sorry for any inconvenience.