Week in Review — December 3, 2019

New Components

AppsFlyer Extractor

  • We’re happy to welcome the AppsFlyer Extractor to the family!

  • AppsFlyer is the market leader in mobile advertising attribution and analytics, helping marketers pinpoint their targeting, optimize their ad spend, and boost their ROI.

  • This component was developed by the KDS Team.

DynamoDB Writer

  • Another great addition to our component collection is DynamoDB Writer!

  • This component lets you write data from our Storage into AWS DynamoDB, a key-value and document database. It's fully managed, multi-region, multi-master, and durable, with built-in security, backup and restore, and in-memory caching for internet-scale applications.

  • This component was developed by the KDS Team.

CSV to XLSX Processor

  • The highly requested utility for converting CSV data to XLSX is here!

  • This component was also developed by the KDS Team.


Updated Components

AWS S3 Extractor

  • Now uses parallel downloads and is more than 20 times faster!

Snowflake Extractor

  • Supports incremental load.

  • Supports configuration rows.

Oracle Extractor

  • Supports incremental load.

  • Supports configuration rows.

  • Implemented a retry mechanism for all query types.

Redshift Extractor

  • Supports incremental load.

MySQL Extractor

  • Implemented a retry mechanism for all query types.

PostgreSQL Extractor

  • Added support for PostgreSQL v12.


Improvements

Transformations

  • Transformations are collapsed by default.

Data Catalog

  • Improved "empty" page

  • Type of sharing is now shown and will be improved further. 

  • Improved “initial" modal

  • Bucket description is shown where applicable. 

Storage

  • When creating an alias table, you can now filter data by the EMPTY_STRING or SPACE characters (see the sketch below).
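
For clarity, the effect of these two new filter values is equivalent to a plain SQL predicate like the following sketch (the status column is an illustrative assumption):

SELECT *
FROM source_table
WHERE status = ''    -- EMPTY_STRING filter
   OR status = ' ';  -- SPACE filter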


Fixes

  • Fixed a date-conversion bug in the Darksky extractor that caused dates in the date column to be one day earlier than the actual forecast date for places in UTC+ timezones.


Developers


Table Aliases with your custom SQL

A few weeks ago, we quietly launched the ability to create Storage API Aliases using your own SQL code. These Alias Tables with custom SQL can currently be created only with the Redshift backend.

To set one up, create a new Alias Table and define your own SQL code.

Why?

Alias Tables can help you structure your data. Think of them as a "transform on demand": everything happens on the fly, in real time. Say we have business transactions in a table called "data". Here is an example of how to define a "derived" table with the weekly sum of all transactions that can't be joined to our Customer table (alarm, wrong data!! :-)
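
The SQL for this example was shown as a screenshot in the original post; a minimal sketch of what such an alias query could look like follows (the created_at, amount, and customer_id columns and the "customer" table are illustrative assumptions):

SELECT
    DATE_PART(year, t.created_at) AS "year",
    DATE_PART(week, t.created_at) AS "week",
    SUM(t.amount) AS total
FROM "data" t
LEFT JOIN customer c ON c.id = t.customer_id
WHERE c.id IS NULL   -- keep only transactions with no matching customer
GROUP BY 1, 2
ORDER BY 1, 2;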

Raw Result of this simple alias table:

"year","week","total"
"2014","1","1314788.27"
"2014","2","3719694.16"
"2014","3","3907852.92"
"2014","4","4013945.26"
"2014","5","3884234.84"

Thanks to the almost unlimited power of the Redshift cluster, you can also create much more complex examples. For instance, this one creates a denormalised table of transactions that occurred at night on weekends, in EUR, outside the Czech Republic, and without one specific product code:
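
A hedged sketch of such a query, under assumed column names (currency, country, product_code, created_at) and with 'XYZ' standing in for the excluded product code:

SELECT t.*, c.name AS customer_name, c.country
FROM "data" t
JOIN customer c ON c.id = t.customer_id          -- denormalise customer attributes
WHERE t.currency = 'EUR'
  AND c.country <> 'Czech Republic'
  AND t.product_code <> 'XYZ'                    -- the one excluded product code (placeholder)
  AND DATE_PART(dow, t.created_at) IN (0, 6)     -- weekend (Redshift: Sunday = 0, Saturday = 6)
  AND (DATE_PART(hour, t.created_at) >= 22
       OR DATE_PART(hour, t.created_at) < 6);    -- night hours (assumed 22:00 to 06:00)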

Redshift Integration: Release Preview

For the last couple of months, we've been working hard to deliver this exciting new feature. The day is very close, but first we'd like to invite you to our release preview. Until the end of August, we're running Redshift as a beta (release preview), fixing bugs and delivering the best possible user experience and performance boost.

To get started with Redshift, you need to create a Redshift bucket in your Storage. You can do this in the Storage API console, or contact support@keboola.com. Once you have at least one Redshift bucket in your Storage (you don't need to store any data in it), provisioning, transformations, and other features will unlock.

You can create transformations (and sandboxes) that use AWS Redshift as their backend. Your current MySQL transformations are not compatible, but the Redshift SQL syntax is very similar. The input mappings offer some new Redshift-specific options (SORTKEY, DISTKEY, COPY command options, data types). To take full advantage of our next steps, we recommend that your Redshift transformations use data from Redshift storage buckets; this essentially eliminates all input and output transfers. But you can use data from regular buckets as well.
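
For context, SORTKEY and DISTKEY are standard Redshift table properties. A minimal sketch of a table definition using them (the table and columns are illustrative):

CREATE TABLE transactions (
    id          BIGINT,
    customer_id BIGINT,
    created_at  TIMESTAMP,
    amount      DECIMAL(12,2)
)
DISTKEY (customer_id)   -- distribute rows across nodes by customer
SORTKEY (created_at);   -- sort blocks by time to speed up range scans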

For a Redshift sandbox, Sequel Pro and the native Adminer app are unfortunately useless. You can use the free version of the JackDB web app, the Amazon-recommended SQL Workbench/J, DBeaver, or 0xDBE. Always be careful to use only the schema provided to you in the credentials.

During this Release Preview period, usage is free but comes without any guarantees; contact support@keboola.com for production deployment during this period. Be careful with big data loads to the non-production cluster: as a rule of thumb, don't use it for anything over 10 GB (all tables combined).

SQLdep.com integration for Redshift transformations

All Redshift transformations can now be easily integrated with SQLdep. A single click uploads your current transformation (and all its dependencies) to SQLdep and shows you the big picture of what's happening in the transformation. You can look at it from the table, column, or query perspective.

Note: This is only available for Redshift transformations. For more information about our Redshift backend, please contact support@keboola.com.