Docker Jobs Application Errors

Unfortunately, we were unable to find a fix for yesterday's failures, so on Thursday, June 7th, between 3:49am CEST and 7:38am CEST (1:49am–5:38am UTC, 6:49pm–10:38pm PT on June 6th) there was an increased application error rate on our Docker host instances in the US region.

The servers are now stabilized and it is safe to restart the failed jobs.

We're still looking into this issue. We have started additional instances to help with the load, and we'll examine the hardware architecture of the instances to find out what causes the errors. Meanwhile, we'll try to implement a retry for such failed jobs.
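A retry of this kind could be sketched roughly as follows. This is only an illustration in Python, not the platform's implementation: `run_job`, `ApplicationError`, and the attempt and delay values are hypothetical names and numbers chosen for the example.

```python
import time


class ApplicationError(Exception):
    """Hypothetical error type standing in for a transient job failure."""


def run_with_retry(run_job, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call run_job() until it succeeds or max_attempts is reached.

    The wait doubles after every failed attempt (exponential backoff).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_job()
        except ApplicationError:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
```

Only errors of the assumed transient type are retried; any other exception propagates immediately, so a genuinely broken job is not run again.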

We're sorry for this inconvenience.

Docker Jobs Application Errors

On June 6th between 2:15am CEST and 2:35am CEST (12:15am–12:35am UTC, 5:15pm–5:35pm PT on June 5th) there was an increased rate of application errors on one of our Docker host instances in the US region. The instance is now fully operational and the jobs are safe to restart.

Furthermore, one of our EU region Docker host instances went down at 6:56am CEST and caused a few unexpected application errors. A new instance is in place, and we recommend restarting any failed jobs.

We're sorry for this inconvenience. We're working on preventing these errors in the future.


Unexpected Job Failures

On April 28 between 2:30 and 3:15 UTC, there was a high rate of application errors on one of our instances processing component jobs.

The instance was under heavy load and we're investigating the root cause. The instance is now back to normal and it is safe to restart the jobs.

We're sorry for any inconvenience. 

Jupyter and RStudio Sandboxes are not starting

3:45pm CEST: We're investigating the issue.

3:55pm CEST: Besides new sandboxes not starting, existing sandboxes do not seem to respond.

4:10pm CEST: We're shutting down existing sandbox instances and spinning up new ones. It will take a couple of minutes before sandboxes are available again. Unfortunately, all existing sandboxes will be deleted.

4:20pm CEST: Sandboxes are starting again. All previous sandboxes are deleted. We're sorry for this inconvenience.

Week in Review -- March 29, 2018

New Components

New Features

  • We have released the Guide Mode, an interactive tutorial for Keboola Connection
  • "Sudo" mode - important changes are protected by requiring a password

Updated Components

  • Google Drive Extractor, Google Drive Writer and Google Sheets Writer all support Team Drives
  • Google Sheets Writer preserves formatting when writing into an existing sheet
  • Generic Extractor supports arrays as properties in child jobs

Minor Improvements

  • Encrypted values are now filtered from component events. This prevents accidental leaks of credentials from a component, e.g. when it crashes and prints its stack trace or other internal logs to events
  • keboola.processor-orthogonal is now available to fix malformed CSVs. Handy if you encounter Load error: Line 1 - Extra column(s) found errors
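To illustrate the kind of problem behind an "Extra column(s) found" error, here is a small Python sketch that makes every row of a CSV the same width. It is not the processor's implementation and makes no claim about its options: the function name, the `filler` parameter, and the `auto_col_N` naming of added header columns are all assumptions made for this example.

```python
import csv
import io


def orthogonalize(csv_text, filler=""):
    """Return CSV text in which every row has the same number of columns.

    Short rows are padded with `filler`; when a row is longer than the
    header, generated names (auto_col_N, a made-up convention) extend it.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    header = rows[0] + [f"auto_col_{i + 1}" for i in range(len(rows[0]), width)]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for row in rows[1:]:
        writer.writerow(row + [filler] * (width - len(row)))
    return out.getvalue()
```

For example, a file with the header `id,name` and the rows `1,Alice,extra` and `2` comes out with three columns in every row, so a loader that checks rows against the header no longer rejects it.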

New HTTP(S) Extractor

Another one joins the band.

In our effort to replace Restbox with modern components, the next logical step was the HTTP(S) extractor. It allows you to download a single publicly available CSV file or compressed file and import it into a single table in Storage. If you have more public files to download from a single domain, the UI allows you to reuse the same base URL for all of them.
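The idea of one base URL shared by several files can be sketched in a few lines of Python. This only illustrates the concept and says nothing about how the extractor stores or resolves its configuration; `build_file_urls` and the example URLs are hypothetical.

```python
from urllib.parse import urljoin


def build_file_urls(base_url, paths):
    """Combine one base URL with several relative file paths."""
    # urljoin drops the last path segment unless the base ends with "/".
    base = base_url if base_url.endswith("/") else base_url + "/"
    return [urljoin(base, path.lstrip("/")) for path in paths]


urls = build_file_urls(
    "https://example.com/data",
    ["customers.csv", "2018/orders.csv.gz"],
)
```

Each resulting URL points to one file, which would then be downloaded and loaded into its own table.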

The UI of the new extractor supports many features out of the box, but the extractor is not limited by its UI: it's another component that supports processors. So your CSV file can be invalid, in a weird charset, pivoted or mutilated in some other way, and there's tooling to get that fixed.

The list of available processors will be kept and updated in the Developer Portal list of components. A full description of the extractor is available in our documentation.

New S3 Extractor

This one took us a while, but we believe it's worth it. We carefully gathered feedback and made the most commonly used features accessible through a new streamlined UI. And there's even more under the hood.

The original AWS S3 extractor was renamed to Simple AWS S3. It stays fully supported and is not being deprecated. There's no need to migrate your configurations.

There are several major differences between the original and the new extractor. The new AWS S3 extractor

  • can download multiple files/tables using a single set of credentials.
  • fully supports incremental loads.
  • is more flexible.

The UI of the new extractor supports many features, but the extractor is not limited by its UI: it is the first component that openly supports processors. Opening the JSON editor (aka Power User Mode) opens up the configuration to endless possibilities. The extractor itself does only a simple job – downloads a set of files from S3. All other jobs (decompression, CSV fixing, setting the manifest file, etc.) are delegated to processors. You can order and configure the processors so that they handle the files as required. You can even develop your own processor in case you're missing something. We're fully aware that this is not an easy concept to grasp, but it's intended for advanced users. Not advanced? Use the UI.
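If the concept is hard to picture, think of processors as small functions applied to the downloaded files one after another. The Python sketch below shows only that idea. It is not the extractor's code or its configuration format, and the processors `decompress` and `to_utf8`, the file names, and the in-memory dictionary of files are all assumptions made for the example.

```python
import gzip


def decompress(files):
    """Hypothetical processor: gunzip every file whose name ends in .gz."""
    result = {}
    for name, data in files.items():
        if name.endswith(".gz"):
            result[name[:-3]] = gzip.decompress(data)
        else:
            result[name] = data
    return result


def to_utf8(files, source_encoding="latin-1"):
    """Hypothetical processor: re-encode every file to UTF-8."""
    return {
        name: data.decode(source_encoding).encode("utf-8")
        for name, data in files.items()
    }


def run_processors(files, processors):
    """Apply the processors one after another, in the configured order."""
    for processor in processors:
        files = processor(files)
    return files


# Stand-in for what the extractor downloaded: one gzipped latin-1 CSV.
downloaded = {"orders.csv.gz": gzip.compress("id,name\n1,José\n".encode("latin-1"))}
processed = run_processors(downloaded, [decompress, to_utf8])
```

The order matters here: running `to_utf8` before `decompress` would re-encode the compressed bytes and corrupt the archive, which is why the order of processors is part of the configuration.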

The list of available processors will be kept and updated in the Developer Portal list of components. A full description of the extractor is available in our documentation.

One step closer to replacing legacy Restbox. The HTTP extractor will follow shortly. 

Snowflake Outage in US Region

There was a short Snowflake outage between 10:30 and 10:35 CET (9:30am and 9:35am UTC) in the US region.

  • Sandboxes might have lost their data and worksheets
  • Transformation jobs might have finished with an error
  • Async data loads and exports were unaffected

We're investigating the impact and root cause and will update this post as soon as we know more. Snowflake is now fully operational again.

UPDATE Jan 30 2018: Snowflake released their RCA.

Week in Review -- January 9, 2018

You haven't heard from us for a while. We're sorry. Here's what's new.

New Components

Updated Components

Minor Improvements

  • SSL configuration of all database extractors is now on the same page as the credentials and the SSH tunnel configuration
  • Database extractors give a warning if your connection is invalid
  • Apify Extractor accepts a list of URLs from a table in Storage
  • A new part about ad-hoc data exploration in Jupyter was added to the tutorial on Ad-hoc Data Analysis

Fixes

  • Database extractors automatically change table names to lowercase
  • Fixed a bug affecting non-incremental import of sliced tables in the BigQuery and Snowflake extractors 

Blog

Our developers have published two blog posts

Community News