Snowflake Transformations Query Limits

We have introduced a maximum query execution time limit for Snowflake transformations.

If a query runs longer than 15 minutes, it will be terminated. This limit should not affect any current transformations.

This limit helps us prevent accidental warehouse overloading by inefficient user queries (for example, an unintended cartesian product). Such queries were one of the causes of this week's failures.
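
To illustrate why a cartesian product is so expensive, here is a minimal Python sketch. It is not Snowflake code; the tables, column layout, and function names are made up for the example. A join without a condition returns every combination of rows, so its output grows with the product of the table sizes, while a join on a key stays close to the size of the input.

```python
from itertools import product

# Hypothetical sample data: (order_id, customer_id) and (customer_id, name).
orders = [(1, "c1"), (2, "c2"), (3, "c1")]
customers = [("c1", "Alice"), ("c2", "Bob"), ("c3", "Carol")]


def cross_join(left, right):
    """Join with no condition: every left row is paired with every right row."""
    return [l + r for l, r in product(left, right)]


def keyed_join(left, right):
    """Inner join on customer_id: each order is matched to its customer only."""
    names = {customer_id: name for customer_id, name in right}
    return [
        (order_id, customer_id, names[customer_id])
        for order_id, customer_id in left
        if customer_id in names
    ]


if __name__ == "__main__":
    print(len(cross_join(orders, customers)))  # rows = len(orders) * len(customers)
    print(len(keyed_join(orders, customers)))  # at most one row per order
```

With two tables of one million rows each, the same missing join condition would ask the warehouse for a trillion rows, which is the kind of query the time limit is meant to stop.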

Snowflake backend is down - UPDATED

The Snowflake backend has been down since 11:02 am GMT+2. We're working on a fix; a post-mortem will follow once we recover the platform.

UPDATE:

The dead DWH has been replaced and everything is now running on a new backend. All jobs that were running on the Snowflake backend at the time crashed. Please run them again; they should work now. If you run into any other issues, please contact us at support@keboola.com.

We'll also publish an official post-mortem as soon as the Snowflake team resolves our issue. Stay tuned, and thanks for your patience!

UPDATE II:

If you're experiencing errors in your personal sandboxes, reset your credentials under the "Sandbox" link at the top right of Transformations (https://connection.keboola.com/admin/projects/$pid$/transformations/sandbox).

Failed Jobs

Some jobs failed between 2016-09-26 17:30 CEST and 2016-09-26 17:40 CEST due to a problem with one of our DB servers.

All affected orchestrations have been checked and restarted.

We are very sorry for any inconvenience.

If you have any concerns about this, please contact us at support@keboola.com.

Anti-Sampling for Google Analytics Extractor

As some of you might know, the Google Analytics API doesn't always return precise data. Under certain circumstances, the returned data are sampled. Read more about sampling here.

To work around this problem and get more precise results, we are introducing a much-anticipated feature to the Google Analytics Extractor:

Anti-Sampling

You can choose one of two anti-sampling algorithms: DailyWalk or Adaptive. Both are based on the same principle: dividing the requested date range into smaller chunks.

DailyWalk, as the name suggests, divides the date range into days, so the extractor needs to make as many requests as there are days in the date range.
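
The day-by-day split can be sketched in a few lines of Python. This is only an illustration of the principle, not the extractor's actual code; the function name `daily_chunks` is made up for the example, and the range is assumed to be inclusive on both ends.

```python
from datetime import date, timedelta


def daily_chunks(start, end):
    """Split the inclusive date range [start, end] into one-day ranges.

    Returns a list of (chunk_start, chunk_end) tuples, one per day,
    which is also the number of requests a day-by-day walk would need.
    """
    if end < start:
        raise ValueError("end must not be before start")
    day_count = (end - start).days + 1
    chunks = []
    for offset in range(day_count):
        day = start + timedelta(days=offset)
        chunks.append((day, day))
    return chunks


if __name__ == "__main__":
    for chunk_start, chunk_end in daily_chunks(date(2016, 9, 1), date(2016, 9, 3)):
        print(chunk_start.isoformat(), chunk_end.isoformat())
```

A 30-day range therefore produces 30 requests, which is why this approach tends to be slower on long date ranges.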

The Adaptive algorithm uses a more sophisticated approach and divides the date range into a few smaller date ranges. If you are interested, read the in-depth explanation of the algorithm here.

The DailyWalk algorithm might be more precise in some cases, but you will usually get the same results faster with the Adaptive algorithm.

Experiment with them and use what suits you best.

Google Analytics Extractor Bug

We have encountered a bug in the Google Analytics Extractor. It was downloading data only for one profile, even if the query was set to "All profiles".

This bug was introduced on September 16 and is now fixed.

We are very sorry for any inconvenience. Please adjust your date ranges and download the missing data if needed. 

Week in Review -- September 19, 2016

Call for testers: OpenRefine Transformations BETA

Our new OpenRefine transformations need some testers. Would you like early access to play with OpenRefine in Keboola Connection? Please contact us at support@keboola.com.

Google Analytics Extractor

The Google Analytics Extractor can now parse URLs for queries, so you can create your query with the very convenient Google Analytics Query Explorer and simply copy and paste the URL.
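
As a rough idea of what parsing such a URL involves, here is a minimal Python sketch. It does not show the extractor's real implementation: the function name `query_from_url` and the example URL (with a placeholder host) are assumptions, and only the `start-date`, `end-date`, `metrics` and `dimensions` query string parameters are handled.

```python
from urllib.parse import parse_qs, urlsplit

# Parameters whose values are comma-separated lists.
LIST_PARAMS = ("metrics", "dimensions")


def query_from_url(url):
    """Turn the query string of a copied URL into a simple query definition."""
    params = parse_qs(urlsplit(url).query)
    query = {}
    for name, values in params.items():
        value = values[0]  # use the first occurrence of each parameter
        query[name] = value.split(",") if name in LIST_PARAMS else value
    return query


if __name__ == "__main__":
    # Hypothetical URL; the host is a placeholder.
    example = (
        "https://example.com/query-explorer/"
        "?start-date=2016-09-01&end-date=2016-09-15"
        "&metrics=ga%3Asessions%2Cga%3Ausers&dimensions=ga%3Adate"
    )
    print(query_from_url(example))
```

Percent-encoded characters such as `%3A` are decoded by `parse_qs`, so the metric and dimension names come out in their usual `ga:` form.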

Versions management

Simplified access to the latest version diff.

Announced Redshift maintenance canceled

We are cancelling the previously announced maintenance of Redshift projects.

Other improvements and bugfixes

  • New HubSpot extractor published
  • AdWords extractor - more verbose logging
  • PostgreSQL extractor - fixed the retry mechanism, which previously caused invalid CSVs with a duplicated header
  • Storage - fixed table size and row count for Snowflake-backed projects
  • Storage configurations - numeric IDs are now generated; fixed a bug that created duplicate IDs
  • GoodData Writer - fixed grain settings for tables with custom identifiers
  • Elasticsearch Writer - SSH tunnel support added


New HubSpot Extractor

We've created a new HubSpot extractor that enables you to extract basic data from your CRM.

It is based on our Generic Extractor, with predefined templates that help you download your data easily. This also gives you the option to adjust the configuration to your needs by switching to the JSON editor.

If you want to learn more about the configuration of the extractor, how to get your HubSpot API Token and details of the extractor output tables, look into our new documentation.

Feel free to try the new extractor, and if you have any questions or something is missing, please contact us at support@keboola.com.

Weeks in Review -- September 12, 2016

Google Drive Extractor

The Google Drive Extractor has been revamped. Not much has changed; this was a standard update for our container-based infrastructure. Just one thing to mention -- its UI is much faster. Read more in New incarnation of Google Drive Extractor.

Migration tool for Google Analytics Extractor

Another important addition is the Migration Tool for the new Google Analytics Extractor.

Transformation/Sandboxes Provisioning

We have also changed the way Redshift transformations and sandboxes are provisioned. Migration of existing projects is planned for the week of September 26 to September 30.

Custom SQL Aliases

We decided to drop support for custom SQL aliases because they are rarely used (existing aliases will continue to work). You can still use Simple Aliases, which are now supported for the Redshift backend too.


As always, we made many small improvements:

  • Search in components while creating a new orchestration (UI)
  • AdWords Extractor updated to the new AdWords API v201607 (Extractor Backend)
  • Automated Configuration Adjustment (UI)
  • Show Multi-factor Authentication (MFA) status in the users list -- a warning is shown when MFA is not enabled (UI)

Extractor Failures

We have experienced some extractor failures between 2016-09-09 21:00 CEST and 2016-09-10 12:00 CEST.

Affected extractors were Gmail, Twitter, MongoDB and LinkedIn.

Due to these errors, some of the runs exited with a failure or wrote to the wrong bucket stage.

All affected orchestrations have been checked and started again.

We are sorry for any inconvenience.

If you have any concerns about this, please contact us at support@keboola.com.