Generic Extractor Failures

We're encountering a series of "Found orphaned table manifest" errors in Generic Extractor. We have identified the root cause and are reverting the latest changes to restore the extractor to a fully working state. We will restart all affected orchestrations.

We'll update this post once the fix is deployed.

We're sorry for the inconvenience.

UPDATE 7:25pm CEST: The fix has been deployed to production, and we're restarting all failed orchestrations.

Job failures

There were job failures between 10:30 AM and 12:50 PM caused by low disk space on one of the job workers.

We're sorry for the inconvenience.

YouTube Reporting API - Extractor update (v2)

It is my great pleasure to announce another major update to one of the first components I ever built - the YouTube Reporting API extractor.

The YouTube Reporting API offers a very simple way to download daily reports that belong to the Content Owner of a YouTube channel (in other words, Google must authorize your channel account before you can download data from this API). These reports are generated by defined jobs, and all you need to do is download the results - which is exactly what this extractor was built for.

Because the general process is very simple, the first version of this extractor was completed in a very short time. However, while using the extractor in production, we found that Google occasionally triggers background actions that generate additional reports, which broke the original merge logic and produced incorrect results. For that reason, the first version of this extractor was not well suited for production deployment.

Based on that experience, I really wanted to fix the problematic parts of the original version and turn this extractor into a project that is fun to use. I believe I have succeeded, and I am extremely proud of what this update achieves.

You can read the full description in the documentation. In a nutshell, this extractor downloads reports generated by jobs, along with many extra features that help you manage these downloads in a very convenient way. For example, the configuration requirements of the first version have been reduced significantly, and several options for creating backups (to S3) have been added. Most importantly, all data is now downloaded correctly.

This extractor is developed independently by Blue Sky Media. For more information on how to use it, please refer to the documentation. If you run into any issues or have more questions, please contact me directly.

New Segment.io S3 extractor

Imagine you are building a new web app. You want to measure all the events in your app. Maybe you use Segment.io to send those events to many destinations, such as Google Analytics. Then you realize that you want all the data in one place (= Keboola Connection). How do you send events from Segment.io to KBC?

The solution is simple. Just turn on the Segment.io S3 integration, and all your events will be pushed to your own S3 bucket. Since the Segment-S3 integration uses a specific structure in S3 (each day has its own “folder”, and logs are written approximately every hour into separate files), we have developed a custom Segment S3 extractor that saves you time: after a simple configuration, you can get all your events into KBC instantly.
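As a rough illustration of that layout, grouping object keys by their day “folder” might look like the sketch below (the key format shown is an assumption for illustration, not Segment's documented naming scheme):

```python
from collections import defaultdict

# Hypothetical S3 object keys: one "folder" per day,
# with roughly one log file written per hour.
keys = [
    "logs/2016-11-14/events-00.json",
    "logs/2016-11-14/events-01.json",
    "logs/2016-11-15/events-00.json",
]

# Group the hourly files under their day "folder".
files_per_day = defaultdict(list)
for key in keys:
    day = key.split("/")[1]  # the day component of the key
    files_per_day[day].append(key)
```

The extractor handles this grouping for you; the sketch only shows why a purpose-built component is more convenient than downloading the bucket by hand.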

The data is downloaded in JSON format, which might be a bit tricky if you don’t use the Snowflake backend. If you do use Snowflake, processing is very easy: you can extract all the data from JSON into a columnar format and use it in your ETL.
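If you are not on Snowflake, a minimal sketch of turning the JSON event logs into columnar rows might look like this (the field names `event`, `userId`, and `properties` are assumptions based on a typical Segment-style payload, not taken from this post):

```python
import json

# Hypothetical sample of event records, one JSON object per line,
# as they might appear in an hourly log file in the S3 bucket.
raw_lines = [
    '{"event": "Signed Up", "userId": "u1", "properties": {"plan": "free"}}',
    '{"event": "Clicked CTA", "userId": "u2", "properties": {"button": "buy"}}',
]

def flatten(record, parent_key="", sep="_"):
    """Recursively flatten nested dicts into a single-level dict,
    joining nested keys with `sep` (e.g. properties_plan)."""
    items = {}
    for key, value in record.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            items.update(flatten(value, new_key, sep))
        else:
            items[new_key] = value
    return items

rows = [flatten(json.loads(line)) for line in raw_lines]
# The union of all keys gives the header for a columnar (CSV-like) export.
columns = sorted({key for row in rows for key in row})
```

On Snowflake you would do the equivalent flattening directly in SQL over a VARIANT column, which is what makes that backend so convenient here.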

If you have any question contact support@bizztreat.com

Revision of Database Writers, new Impala Writer

We have released new versions of these database writers:

  • MySQL
  • Microsoft SQL Server
  • Redshift
  • Oracle

Also, we are introducing Cloudera Impala database writer.

All these writers run on the container-based architecture and support SSH tunnels.

The original Database writer and the old version of the MSSQL writer are now marked as deprecated. We will continue to support them for at least 3 months from now. After this period, we will migrate any remaining old configurations to the new versions.

We are now preparing a migration tool to help you migrate your existing configurations to the new versions.

If you have any questions or need help, please contact us at support@keboola.com.

Week In Review -- November 14, 2016

Here's what last week was about:

  • Fixed a bug in the Redshift backend: exporting a table with a column of the maximum allowed length (64 KB) no longer causes an error.
  • MariaDB, which is used as the MySQL backend in transformations, was updated to version 5.5.53. This update improves the handling of 4-byte UTF-8 characters. In the previous version (5.5.44), a string was truncated at the first utf8mb4 character; now all such characters are converted to '?'. Unfortunately, full support for 4-byte UTF-8 characters is unavailable in MySQL.
  • Fixed a bug in the MySQL extractor: when a query returned an empty result, the extractor failed with an 'Orphaned manifest' error.

MySQL Transformation Job Failures

During the last 24 hours, some MySQL transformation jobs failed with the following error message:

SQLSTATE[HY093]: Invalid parameter number: no parameters were bound

This was caused by an application update and only affects transformations that contain empty queries (e.g. just a ";" character). The fix will be deployed shortly, and we'll restart all affected orchestrations.
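A simple way to guard against this class of failure is to drop statements that are empty after stripping whitespace before binding parameters - a minimal sketch (the splitting here is naive and would not handle semicolons inside string literals):

```python
def non_empty_statements(sql_script):
    """Split a SQL script on semicolons and drop statements that are
    empty after stripping whitespace (the cause of the binding error)."""
    statements = sql_script.split(";")
    return [s.strip() for s in statements if s.strip()]

# A script containing stray ";" characters yields only the real query.
queries = non_empty_statements("SELECT 1;\n;\n  ;")
```

This is only an illustration of the failure mode; the actual fix lives in the application's query handling.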

We're sorry for the inconvenience.