Docker Jobs Application Errors

Unfortunately, we were unable to find a fix for yesterday's failures, so on Thursday, June 7th, between 3:49am CEST and 7:38am CEST (1:49am–5:38am UTC, 6:49pm–10:38pm PT) there was an increased application error rate on our Docker host instances in the US region.

The servers are now stabilized and it is safe to restart the failed jobs.

We're looking into this issue. We have started additional instances to help with the load, and we'll examine the hardware architecture of the instances to help us determine what causes the issue. Meanwhile, we'll try to implement a retry for such failed jobs.

We're sorry for this inconvenience.

Docker Jobs Application Errors

On June 6th between 2:15am CEST and 2:35am CEST (12:15am–12:35am UTC, 5:15pm–5:35pm PT) there was an increased rate of application errors on one of our Docker host instances in the US region. The instance is now fully operational and the jobs are safe to restart.

Furthermore, one of our EU region Docker host instances went down at 6:56am CEST and caused a few unexpected application errors. A new instance is in place, and we recommend restarting any failed jobs.

We're sorry for this inconvenience. We're working on preventing these errors in the future.


Week in Review -- May 28, 2018

New Components

OneDrive Extractor (Beta)
  • You can now download your documents from OneDrive. The component was developed by Jakub Bartel.

Google Big Query Writer (Beta)

  • Write data from KBC to Google Big Query with our new writer. The component was developed by us, Keboola s.r.o. 


Updated Components

Currency Extractor

  • Now offers exchange rates of GBP.

Oracle Extractor

  • Now supports exporting columns of the LOB datatype family.

Gmail Attachments Extractor

  • Added support for processors, see example.

Create Manifest Processor 

  • Now tests whether all slices of a CSV file have the same header columns.
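
As a rough illustration of what such a check involves (not the processor's actual implementation), the sketch below reads the first row of each slice and reports the slices whose header differs from the first one. The function name `find_mismatched_headers` and the in-memory slice contents are hypothetical.

```python
import csv
import io


def find_mismatched_headers(slices):
    """Return names of slices whose header row differs from the first slice.

    `slices` maps a slice name to its CSV text (a hypothetical input shape).
    """
    expected = None
    mismatched = []
    for name, text in slices.items():
        # Only the first row of each slice is read; that row is the header.
        header = next(csv.reader(io.StringIO(text)), [])
        if expected is None:
            expected = header
        elif header != expected:
            mismatched.append(name)
    return mismatched


slices = {
    "part-0.csv": "id,name\n1,Alice\n",
    "part-1.csv": "id,name\n2,Bob\n",
    "part-2.csv": "id,email\n3,carol@example.com\n",
}
print(find_mismatched_headers(slices))
```

Here the third slice would be reported, because its second column is named differently from the first slice's.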

Flatten Folders Processor 

  • Throws a user exception when the flattened filename is too long (longer than 255 characters).
  • Now has configurable flatten strategies. Added the `hash-sha256` strategy, which avoids the 255-character filename limit that the default `concat` strategy can hit.
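
To see why a hash-based strategy sidesteps the length limit, consider the minimal sketch below. It assumes that `concat` joins the path segments into one filename and that `hash-sha256` names the file by the SHA-256 digest of its path; the function names, the `-` separator, and the sample path are hypothetical, and the processor's real naming rules may differ.

```python
import hashlib

MAX_FILENAME_LENGTH = 255  # the limit mentioned in the note above


def flatten_concat(path, separator="-"):
    """Join path segments into one name; the length grows with path depth."""
    return path.strip("/").replace("/", separator)


def flatten_hash_sha256(path):
    """Hash the whole path; the hex digest is always 64 characters long."""
    return hashlib.sha256(path.encode("utf-8")).hexdigest()


# A deeply nested path (hypothetical) that is far longer than 255 characters.
deep_path = "/".join("directory-%03d" % i for i in range(30)) + "/data.csv"

print(len(flatten_concat(deep_path)) > MAX_FILENAME_LENGTH)
print(len(flatten_hash_sha256(deep_path)))
```

The concatenated name grows with the depth of the folder tree, while the digest has a fixed length regardless of the input path.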


New Features

  • Component badges.
  • Validate your SQL in transformations with the new Validate feature, provided by SQLDep.

  • MySQL extractor now supports incremental fetching. You can extract just the most recent records from a database table and write them incrementally into Storage.
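
The general idea behind incremental fetching can be sketched as follows. This is an illustration only, not the extractor's code: it uses an in-memory SQLite database as a stand-in for MySQL, and the `customers` table, the `id` column used as the high-water mark, and the function name `fetch_incrementally` are all assumptions.

```python
import sqlite3


def fetch_incrementally(conn, last_seen_id):
    """Return rows newer than last_seen_id plus the updated high-water mark."""
    rows = conn.execute(
        "SELECT id, name FROM customers WHERE id > ? ORDER BY id",
        (last_seen_id,),
    ).fetchall()
    # Remember the highest id fetched so the next run can start after it.
    new_last_seen_id = rows[-1][0] if rows else last_seen_id
    return rows, new_last_seen_id


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])

# First run: nothing has been seen yet, so every row is fetched.
first_batch, state = fetch_incrementally(conn, last_seen_id=0)

# A new record arrives; the second run returns only that record.
conn.execute("INSERT INTO customers VALUES (3, 'Carol')")
second_batch, state = fetch_incrementally(conn, last_seen_id=state)

print(first_batch)
print(second_batch)
```

Each run stores the last value it saw and the next run asks only for rows beyond it, so the same records are not extracted twice.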


    Minor Improvements

    • Show more / show less button added to the list of inputs and outputs of a job. It now shows all the tables and the view is more compact.

    GoodData Writer Issues

    Today between 2:30 and 7:00 CEST we experienced issues with GoodData Writer. Ironically, it failed to connect to a third-party service used for utilization monitoring. The problem has been fixed, so there should be no further job failures. We are going to inspect the extent of the damage.

    Week in Review -- May 16, 2018

    Core

    Components

    Bugfixes

    • Component configuration state is no longer updated when an attached processor fails. For example, with the AWS S3 extractor and the New Files Only option, files are left as unprocessed if a processor fails, so they can be processed again until the whole pipeline of processors executes successfully.
    • MSSQL Writer - fixed support for Unicode characters
    • Google Sheets Writer - fixed performance issues when writing large tables
    • When specifying transformation output mapping, the bucket name is automatically webalized as it is typed
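
For readers unfamiliar with the term, "webalizing" a name usually means reducing it to lowercase ASCII letters and digits separated by hyphens. The sketch below approximates that behaviour; the exact rules applied to bucket names in the UI are not stated here, so treat the function as an assumption rather than the actual implementation.

```python
import re
import unicodedata


def webalize(text):
    """Approximate webalizing: ASCII-only, lowercase, hyphen-separated."""
    # Decompose accented characters and drop the parts that are not ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    )
    # Replace every run of other characters with a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


print(webalize("My Output Bucket!"))
print(webalize("Příjmy 2018"))
```

With these rules, the two sample names become `my-output-bucket` and `prijmy-2018`.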

    Developers

    We are happy to introduce the first version of the Keboola Storage API JavaScript client.


    SQL Server Extractor Connection Issues

    Between 17:00 on May 15, 2018 and 8:00 on May 16, 2018 CET we experienced issues with the SQL Server database extractor. We recommend reviewing your orchestrations and taking appropriate action if needed.

    If you were affected by this, please accept our sincere apologies.

    SQL Server writer failures

    Between 9 May 2018, 10:53 CEST and 11 May 2018, 09:30 CEST there were job failures for SQL Server Writer configurations that had nullable data types. The issue was caused by a new version of the writer, so we have rolled it back to the previous version while we investigate the root cause.

    We're sorry for any inconvenience. 

    Week in Review -- April 30, 2018

    Core

    • Improved generated configuration changes descriptions
    • Added configuration version to job results of Docker-based components (it is not yet available for legacy components like transformation and gooddata-writer)
    • Refreshed Manage API docs with working examples
    • Fixed loading of large tables for RStudio and Jupyter sandboxes
    • Fixed random CSV Import upload errors in EU region

    Components

    • Improved "show details" experience for input and output mappings
    • Writers now show columns that do not exist in Storage
    • Increased query timeout for all Keboola Provisioned Snowflake writers from 15 seconds to 15 minutes
    • Added support of unconventional column names to MySQL extractor
    • Removed static state from MongoDB extractor

    Processors

    • Added support of snappy format to processor-decompress
    • Added processor filter-files
    • Added support for sanitization of invalid UTF-8 in processor-iconv
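
As a small illustration of what sanitizing invalid UTF-8 means, the Python sketch below decodes a byte string that contains two invalid bytes using two standard error handlers. It shows the general idea only and does not describe how processor-iconv is implemented; the sample bytes are made up.

```python
# "café" encoded as UTF-8, followed by two bytes that are never valid in UTF-8.
raw = b"caf\xc3\xa9 \xff\xfe ok"

# Option 1: replace each undecodable byte with U+FFFD (the replacement character).
replaced = raw.decode("utf-8", errors="replace")

# Option 2: silently drop the undecodable bytes.
ignored = raw.decode("utf-8", errors="ignore")

print(replaced)
print(ignored)
```

Replacing keeps a visible marker where data was damaged, while ignoring yields cleaner text at the cost of hiding that anything was removed.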

    Developers

    A new Debug API call is available (it replaces the very rarely used sandbox, dry-run and input-data calls). It creates a snapshot of the data directory used for running the component and stores it in your KBC project. To learn more, feel free to go through the API Docs or through the tutorial. In short, the API call:

    • uses the same calling convention as the Run API,
    • filters encrypted values from the data directory,
    • works with all components (previously only those without encryption were supported),
    • works with Processors,
    • works with Configuration Rows,
    • works also with broken components and configurations (even if the run fails, you'll still get a snapshot of the data directory).

    Python transformations

    Pip version 10, released recently, removes the pip.main method (more reading). The recommended way to install packages from within Python is:

    import subprocess
    import sys
    subprocess.call([sys.executable, '-m', 'pip', 'install', '--disable-pip-version-check', 'PACKAGE_NAME'])

    Currently there are 70 transformations using the removed pip methods. If your projects use them, we'll contact you with a list of affected transformations. This breaking change in pip is currently blocking us from upgrading Python to 3.6.5, where pip 10 is used by default.

    Unexpected Job Failures

    On April 28 between 2:30 and 3:15 UTC there was a high rate of application errors on one of our instances processing component jobs.

    The instance was under heavy load and we're investigating the root cause. The instance is now back to normal and it is safe to restart the jobs.

    We're sorry for any inconvenience. 

    Degraded performance of Google Sheets Writer

    On March 23, 2018 we released a new version of Google Sheets Writer to remove a workaround that resized the sheet's grid. Unfortunately, this version caused significant performance degradation for tables with a larger number of rows.

    We decided to revert this version to bring back the original performance.

    We are working on a proper fix and it will be released soon.