Week in Review -- April 26, 2019

Column descriptions and user defined data types

A description and a custom data type can be provided for each table column in Storage. The custom data type allows you to override the data type provided by the system. These data types are then used as defaults for transformation and writer table inputs.

You can explore and edit these values on the Storage table detail page.
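
These values are stored as column metadata, so they can also be set through the Storage API. Below is a minimal sketch with curl, assuming the column metadata endpoint path and the KBC.description / KBC.datatype.basetype metadata keys (verify both in the Storage API documentation); the table in.c-main.orders and column id are placeholders:

# Sketch: set a description and a custom data type on one column.
# Endpoint path and metadata keys are assumptions - check the Storage API docs.
curl --request POST \
  --header "X-StorageApi-Token:storage-token" \
  --data "provider=user" \
  --data "metadata[0][key]=KBC.description" \
  --data "metadata[0][value]=Internal order ID" \
  --data "metadata[1][key]=KBC.datatype.basetype" \
  --data "metadata[1][value]=INTEGER" \
  "https://connection.keboola.com/v2/storage/tables/in.c-main.orders/columns/id/metadata"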

Chained aliases

An alias can be created from another alias. Aliases created in shared buckets are also propagated to linked buckets and can be aliased further. This simplifies data preparation and sharing: tables which don't require additional processing can be aliased directly into shared buckets.

Chaining is supported only for aliases with automatically synchronized columns and without a filter.

Automatic Incremental Processing

With automatic incremental processing, the component receives only the data modified since the last successful run of that component.
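
For illustration, this can be expressed in a component's storage input mapping with an adaptive changed_since value. A minimal sketch, assuming a hypothetical in.c-main.orders source table; the exact shape of the configuration depends on the component:

{
  "storage": {
    "input": {
      "tables": [
        {
          "source": "in.c-main.orders",
          "destination": "orders.csv",
          "changed_since": "adaptive"
        }
      ]
    }
  }
}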

Code Templates for Jupyter and RStudio sandboxes

For Jupyter and RStudio sandboxes, code templates can be defined. Code templates can be set for a given user or for the entire project. A Jupyter template is a notebook file (.ipynb); an RStudio template is a simple text file. If a sandbox is loaded from a transformation, the transformation code is appended after the template code.
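
Since a Jupyter template is just a notebook file, a minimal project-wide template might look like the sketch below; the pandas import is only a placeholder for whatever shared setup code your project needs:

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": ["import pandas as pd  # placeholder: shared setup every sandbox starts with"]
  }
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 2
}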

Google BigQuery

  • A new extractor with Google Service Account authentication was published
  • A new writer with Google Service Account authentication was published
  • The previous version of the writer is deprecated and will be shut down on August 1, 2019. A migration to the new version is available.

Other Updates

  • Create a single task orchestration from component configuration
  • New version of the Zboží.cz Extractor by Medio - get your daily impressions, clicks, cost, and conversion stats for a preset time range or the previous day.
  • Python sandboxes and transformations were upgraded to Python version 3.7.3
  • R sandboxes and transformations were upgraded to R version 3.5.3

Week in Review -- November 30, 2017

Database extractors

Following the recent announcement of new database extractors, we are bringing further improvements to them.

  • By default, a new bucket is created for each extractor configuration. Previously, all extractor configurations shared one bucket, which led to collisions.
  • You can reload the list of tables fetched from the database. This is useful when you are tuning credential permissions or have just added new tables to the database.
  • The primary key is now validated against the table created in Storage. A warning is shown if the configured key differs from the key defined on the table.

Generic extractor

The new ignoreErrors option was introduced. This option allows you to force Generic Extractor to ignore certain extraction errors. Read more in the documentation.
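
A minimal sketch of where the option might sit in a configuration; we are assuming here that ignoreErrors takes a list of HTTP status codes under api.http, so verify the exact placement and format in the documentation:

{
  "parameters": {
    "api": {
      "baseUrl": "https://example.com/api/",
      "http": {
        "ignoreErrors": [404]
      }
    },
    "config": {
      "jobs": [
        {
          "endpoint": "users"
        }
      ]
    }
  }
}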

Fixes

  • Google Drive Writer in append mode was overwriting rows instead of appending new rows under some circumstances
  • Google BigQuery extractor was ignoring files exported to multiple slices
  • Tableau Writer is now available in the EU region


Fixed IP Address Ranges

We have mitigated all the issues that forced us to roll back the fixed IP ranges announced a few weeks ago.

It is our pleasure to announce that as of today, fixed IP ranges are back in production: all our outgoing network connections in the US region use a pool of four static IP addresses.

This can help you meet your company security standards. To find out more, please visit the IP Addresses page in our documentation.

PLEASE NOTE

Some of you were relying on a very old mechanism where we performed on-demand network source routing, which allowed us to force the source IP under syrup-out.keboola.com. This was deprecated almost a year ago.

Today, we are also announcing that this old source routing will be turned off at the end of this month. If you rely on the source IP, please move all your existing firewall rules to our new addresses before July 30, 2017. Then you can remove the legacy 54.85.151.211 (syrup-out.keboola.com) from your firewall.

After July 30, 2017, we'll still hold that IP address, but it won't be used anymore.
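
If you want to confirm what the legacy hostname resolves to before cleaning up your firewall rules, a quick check from any machine with dig installed:

dig +short syrup-out.keboola.com
# per this post, the legacy address is 54.85.151.211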

Snowflake Transformations Query Limits

We have introduced a maximum query execution time limit for Snowflake transformations.

If a query's execution time exceeds 15 minutes, the query is terminated. This limit should not affect any current transformations.

This limit helps us prevent accidental warehouse overloading by inefficient user queries (such as an unintended Cartesian product). One such query was among the causes of this week's failures.

Week in Review -- September 19, 2016

Call for testers: OpenRefine Transformations BETA

Our new OpenRefine transformations need testers. Do you want early access and want to play with OpenRefine in Keboola Connection? Please contact us at support@keboola.com.

Google Analytics Extractor

The Google Analytics Extractor can now parse URLs for queries. You can create your query with the very convenient Google Analytics Query Explorer and simply copy and paste the URL.
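
For illustration, a query URL produced by the Query Explorer looks along these lines (the ga:12345678 view ID is a placeholder for your own):

https://www.googleapis.com/analytics/v3/data/ga?ids=ga%3A12345678&start-date=30daysAgo&end-date=yesterday&metrics=ga%3Asessions&dimensions=ga%3Adate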

Versions management

We have simplified access to the latest version diff.

Announced Redshift maintenance canceled

We are cancelling the previously announced maintenance of Redshift projects.

Other improvements and bugfixes

  • New HubSpot extractor published
  • AdWords extractor - more verbose logging
  • PostgreSQL extractor - fixed the retry mechanism which previously caused invalid CSVs with duplicated headers
  • Storage - fixed table size and row counts for Snowflake-backed projects
  • Storage configurations - numeric IDs are now generated; fixed a bug that created duplicate IDs
  • GoodData Writer - fixed grain settings for tables with custom identifiers
  • Elasticsearch Writer - SSH tunnel support added


New Storage API Importer

We have launched a new version of the Storage API Importer, which replaces the old one running at https://syrup.keboola.com/sapi-importer/run.

The Storage API Importer simplifies the whole process of importing a table into Storage. It allows you to make a single HTTP POST request and import a file directly into a Storage table.

The HTTP request must contain the tableId and data form fields. For example, to upload the new-table.csv file (and replace the table's contents) into the new-table table in the in.c-main bucket, call:

curl --request POST --header "X-StorageApi-Token:storage-token" --form "tableId=in.c-main.new-table" --form "data=@new-table.csv" "https://import.keboola.com/write-table"

The new Importer runs as a standalone service, which gives us more control over scaling, stability, and performance.

Read more details in the New Storage API Importer documentation.

Migration

The services are fully compatible; the only difference is the hostname and path. That means all you need to do is replace https://syrup.keboola.com/sapi-importer/run with https://import.keboola.com/write-table in your scripts.

So the following script:

curl --request POST --header "X-StorageApi-Token:storage-token" --form "tableId=in.c-main.new-table" --form "data=@new-table.csv" "https://syrup.keboola.com/sapi-importer/run"

will become:

curl --request POST --header "X-StorageApi-Token:storage-token" --form "tableId=in.c-main.new-table" --form "data=@new-table.csv" "https://import.keboola.com/write-table"
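
If the old URL appears in many scripts, a one-liner along these lines can do the replacement (a sketch assuming GNU sed and shell scripts in the current directory; back up your files first):

# replace the old importer URL with the new one in all .sh files
sed -i 's#https://syrup.keboola.com/sapi-importer/run#https://import.keboola.com/write-table#g' *.sh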

We will support the old service until December 2016. All customers using the old service will be notified soon.

Not sure whether you are using the old importer service?

You can check it with the following steps:
  • Open the Storage page in your project
  • Search for component:sapi-importer in the events
  • If there are no results, you are not using the old service. If you see events from this component, you are still using it.

Snowflake backend project errors

We are investigating connection errors in projects with a Snowflake backend. They are related to today's Snowflake maintenance; we are in contact with their support.

We will update this post when we have more information. Sorry for any inconvenience.


UPDATE 09:30 PM PDT Storage is fixed and failed orchestrations were restarted. We are working on a fix for Snowflake transformations.

UPDATE 10:15 PM PDT Snowflake transformations are working again as well. Snowflake rolled back the release that caused the problems.

Waiting jobs

We are investigating a problem with waiting orchestration jobs.

We will update this post when we have more information.


UPDATE 12:02 AM PDT We have found and fixed the issue; waiting orchestrations are now starting.

UPDATE 12:26 AM PDT All waiting jobs were processed. Everything should now be working normally. If you encounter any problems, please let us know.

Sorry for any inconvenience.

New Twitter Extractor

We've launched a completely new version of the Twitter extractor, replacing the now slightly outdated previous version.

With the new extractor, you can have Twitter data in Keboola Connection in just a few clicks. Authorize your account, select whether you want a timeline, mentions, followers, or a tweet search, then save the configuration and run the extraction.

There are two types of authorization available: you can authorize instantly if you have access to the target Twitter account, or you can use external authorization and send an authorization link to the Twitter account's user to authorize it for you.

If you are interested in more details, like extractor limits and the output data format, please visit our new documentation.

Feel free to try the new extractor, and if you have any questions or something is missing, please contact us at support@keboola.com.

The old extractor, now named Twitter (DEPRECATED) in Keboola Connection, is deprecated and will be shut down on June 19, 2016.

If you need help with migration please let us know.


Week in Review -- April 18th

Dropbox writer improvements

You can now configure the output settings in more detail. Previously, you couldn't specify the output file name; it was always generated. Now you can choose any file name you want.

Tableau writer improvements

The exported TDE file name can be changed from the default in the writer's table settings.

Transformations

The sandbox load data dialog was simplified; you can now select buckets or tables to load in a single input.


Storage API

Partial import is deprecated as of now. This feature is supported only by the MySQL backend and is no longer used by any of our extractors or writers, so its deprecation and later shutdown should not affect any projects.

Developers

We've open-sourced the GoodData PHP Client under the MIT licence. This library is used by the Keboola Connection GoodData writer.