Week in Review -- April 30, 2018

Core

  • Improved the generated descriptions of configuration changes
  • Added the configuration version to job results of Docker-based components (it is not yet available for legacy components like transformation and gooddata-writer)
  • Refreshed Manage API docs with working examples
  • Fixed loading of large tables for RStudio and Jupyter sandboxes
  • Fixed random CSV Import upload errors in EU region

Components

  • Improved "show details" experience for input and output mappings
  • Writers now show columns that do not exist in Storage
  • Increased query timeout for all Keboola Provisioned Snowflake writers from 15 seconds to 15 minutes
  • Added support for unconventional column names to MySQL extractor
  • Removed static state from MongoDB extractor

Processors

  • Added support for the Snappy format to processor-decompress
  • Added processor filter-files
  • Added support for sanitization of invalid UTF-8 in processor-iconv

Developers

A new Debug API call is available (it replaces the very rarely used sandbox, dry-run and input-data calls). It creates a snapshot of the data directory used for running the component and stores it in your KBC project. To learn more, feel free to go through the API Docs or the tutorial. In short, the API call:

  • uses the same calling convention as the Run API,
  • filters encrypted values from the data directory,
  • works with all components (previously only those without encryption were supported),
  • works with Processors,
  • works with Configuration Rows,
  • also works with broken components and configurations (even if the run fails, you'll still get a snapshot of the data directory).
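
The filtering of encrypted values mentioned above can be pictured with the following minimal sketch. It assumes that encrypted values are stored under keys prefixed with "#"; the function name filter_encrypted and the placeholder text are hypothetical, and this is not the actual implementation used by the Debug API.

```python
def filter_encrypted(value, placeholder="[hidden]"):
    """Return a copy of a configuration with encrypted values replaced.

    Assumption: a value counts as encrypted when its key starts with '#'.
    """
    if isinstance(value, dict):
        return {
            key: placeholder
            if isinstance(key, str) and key.startswith("#")
            else filter_encrypted(item, placeholder)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [filter_encrypted(item, placeholder) for item in value]
    # Scalars (strings, numbers, booleans, None) are returned unchanged.
    return value


# Hypothetical configuration used only for illustration.
config = {
    "parameters": {
        "db": {"host": "example.com", "#password": "encrypted-value"},
        "tables": [{"name": "orders", "#token": "another-encrypted-value"}],
    }
}
filtered = filter_encrypted(config)
print(filtered)
```

The function builds a new structure instead of modifying the input, so the original configuration stays intact.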

Python transformations

Pip version 10, released recently, removes the pip.main method (more reading). The recommended way to install packages from within Python is:

import subprocess
import sys
# Run pip as a module of the current interpreter instead of calling pip.main
subprocess.call([sys.executable, '-m', 'pip', 'install', '--disable-pip-version-check', 'PACKAGE_NAME'])

Currently there are 70 transformations using the removed pip methods. If your projects are using them, we'll be contacting you with a list of affected transformations. This breaking change introduced in pip is currently blocking us from upgrading Python to 3.6.5, where pip 10 is used by default.
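
If you want to check your own scripts in the meantime, a rough sketch like the one below can flag code that still calls pip.main. It is a plain text search, so it can also match calls inside comments or strings; the function name uses_removed_pip_api is hypothetical, and this is not the check we use to build the list of affected transformations.

```python
import re

# Matches calls such as pip.main([...]), allowing whitespace around the dot.
PIP_MAIN_CALL = re.compile(r"\bpip\s*\.\s*main\s*\(")


def uses_removed_pip_api(script):
    """Return True if the script text appears to call pip.main()."""
    return bool(PIP_MAIN_CALL.search(script))


old_script = "import pip\npip.main(['install', 'PACKAGE_NAME'])\n"
new_script = (
    "import subprocess\n"
    "import sys\n"
    "subprocess.call([sys.executable, '-m', 'pip', 'install', 'PACKAGE_NAME'])\n"
)

print(uses_removed_pip_api(old_script))  # True
print(uses_removed_pip_api(new_script))  # False
```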

Unexpected Job Failures

On April 28, between 2:30 and 3:15 UTC, there was a high rate of application errors on one of our instances processing component jobs.

The instance was under heavy load and we're investigating the root cause. The instance is now back to normal, and it is safe to restart the jobs.

We're sorry for any inconvenience. 

Degraded performance of Google Sheets Writer

On March 23, 2018, we released a new version of Google Sheets Writer to remove a workaround that resized the sheet's grid. Unfortunately, this version caused a significant performance degradation for tables with a larger number of rows.

We decided to revert this version to bring back the original performance.

We are working on a proper fix, and it will be released soon.

Orchestration Notification Updates Resulted in Deleted Tasks

There was an update to the orchestrator this week that had an unfortunate side effect. If you updated an orchestration's notifications, the orchestration's tasks were deleted.

Thankfully, the orchestrations are versioned, so if this happened to you, we will restore the tasks from the last version.
If you have any concerns about this, please contact us at support@keboola.com.

For what it's worth, updating notifications will no longer delete orchestration tasks. Please accept our humble apologies if you were affected.

Week in Review -- April 09, 2018

Updated Components

Google AdWords Reports

  • This extractor is finally also enabled for customers using the EU instance

Snowflake Writer

  • Added support for the VARIANT data type

Google Drive Extractor/Writer, Google Sheets Writer

  • We added support for Team Drives

Impala Extractor

  • Added support for internal tables

Generic Components

We continue removing so-called "static state" from components. A few weeks ago we removed static state from Transformations, and now the time has come for additional components. JSON configurations are also editable straight away. This includes configurations from templates (e.g. Youtube Extractor) and configurations for Custom Science Apps (e.g. Custom Science Python).

Fixes

  • Python/R transformation sandboxes correctly apply filters in input mappings, so input data will be loaded correctly
  • CSV Import uses server side encryption in S3 stage (before uploading to our storage) by default
  • Gmail Extractor supports "message parts" in more sections and there should no longer be messages without parts
  • ThoughtSpot writer correctly handles the "Test Credentials" action

Deprecations

We are deprecating direct import from URL into Storage. Please use the new HTTP Extractor instead, which gives you much more flexibility.

Jupyter and RStudio Sandboxes are not starting

3:45pm CEST: We're investigating the issue.

3:55pm CEST: The problem is not limited to starting new sandboxes; existing sandboxes do not seem to respond either.

4:10pm CEST: We're shutting down existing sandbox instances and spinning up new ones. It will take a couple of minutes before the sandboxes are available again. Unfortunately, all existing sandboxes will be deleted.

4:20pm CEST: Sandboxes are starting again. All previous sandboxes are deleted. We're sorry for this inconvenience.

Week in Review -- March 29, 2018

New Components

New Features

  • We have released the Guide Mode, an interactive tutorial for Keboola Connection
  • "Sudo" mode - important changes are protected by requiring a password

Updated Components

  • Google Drive Extractor, Google Drive Writer and Google Sheets Writer all support Team Drives
  • Google Sheets Writer preserves formatting when writing into an existing sheet
  • Generic Extractor supports arrays as properties in child jobs

Minor Improvements

  • Encrypted values are now filtered from component events. This prevents accidental leaks of credentials from a component, e.g. when it crashes and prints its stack trace or other internal logs to events
  • keboola.processor-orthogonal is now available to fix malformed CSVs. Handy if you encounter Load error: Line 1 - Extra column(s) found errors
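
The "Extra column(s) found" error typically appears when the rows of a CSV file do not all have the same number of columns. The sketch below shows one possible repair, padding every row to the width of the widest row. It only illustrates the general idea and is an assumption about the kind of fix involved, not the processor's actual implementation; the function name make_orthogonal is hypothetical.

```python
import csv
import io


def make_orthogonal(csv_text, fill=""):
    """Pad every row with `fill` so that all rows have the same number of columns."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    # Width of the widest row; 0 for empty input.
    width = max((len(row) for row in rows), default=0)
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    for row in rows:
        writer.writerow(row + [fill] * (width - len(row)))
    return output.getvalue()


# The second line has one more column than the header and the third line.
malformed = "id,name\n1,alice,extra\n2,bob\n"
fixed = make_orthogonal(malformed)
print(fixed)
```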

New HTTP(S) Extractor

Another one joins the band.

In our effort to replace Restbox with modern components, the next logical step was the HTTP(S) extractor. It allows you to download a single CSV file or a compressed, publicly available file and import it into a single table in Storage. If you have several public files to download from a single domain, the UI allows you to reuse the same base URL for all of them.
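
The idea of one base URL shared by several files can be sketched as follows. The function name build_file_urls and the example.com addresses are hypothetical and serve only as an illustration; the extractor's own configuration format is described in the documentation.

```python
from urllib.parse import urljoin


def build_file_urls(base_url, paths):
    """Combine one base URL with several relative file paths."""
    # Make sure the base ends with a slash so that urljoin appends the path
    # instead of replacing the last segment of the base URL.
    if not base_url.endswith("/"):
        base_url += "/"
    return [urljoin(base_url, path.lstrip("/")) for path in paths]


urls = build_file_urls(
    "https://example.com/exports",
    ["orders.csv", "archive/customers.csv.gz"],
)
for url in urls:
    print(url)
```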

The UI of the new extractor supports many features out of the box, but the extractor is not limited by its UI: it's another component that supports processors. So your CSV file can be invalid, in a weird charset, pivoted or mutilated in some other way, and there's tooling to get that fixed.

The list of available processors will be kept and updated in the Developer Portal list of components. A full description of the extractor is available in our documentation.

Introducing Guide Mode

We are happy to announce the immediate availability of Guide Mode. In Guide Mode, the Keboola Connection user interface switches to an interactive tutorial that guides you through the basics of using Keboola Connection.

Guide Mode is designed for new users and works best on empty projects. Therefore, when you invite a new person to Keboola Connection, they will receive a special link in their invitation email.

The link leads to the try.keboola.com page. By following the link, they will receive a 15-day demo project with Guide Mode activated.

Guide Mode is the very first step in creating a replacement for the old Academy. We are gradually going to fill it with more advanced content, but in the meantime, try it out and let us know what you think.