New Currency Extractor

We have created a new currency exchange rate extractor. This component allows you to extract currency exchange rates as published by the European Central Bank (ECB). Its configuration is dead simple: all you need to do is select the source currency (USD and EUR are currently supported).


Migration

This extractor provides the same data in the same format as the old extractor (which had to be configured through support). We encourage you to switch to the new one; set it up like any other extractor. We will keep running the old extractor until January 2017. Just let us know once you don't need it, so that we can deactivate it and stop magically pushing data to your project.

As in the old currency extractor, the resulting data contains gaps for bank holidays and weekends. If you would like to fill the gaps with the last known value, you can use a little SQL script we hacked together.
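To illustrate what filling the gaps with the last known value means, here is a minimal Python sketch. It is not the SQL script mentioned above; the function name, the date-to-rate dictionary and the sample rates are all made up for illustration.

```python
from datetime import date, timedelta


def fill_gaps(rates):
    """Return a copy of `rates` with every missing day between the first
    and the last date filled with the last known rate.

    `rates` is a hypothetical structure: a dict mapping date -> rate.
    """
    if not rates:
        return {}
    filled = {}
    day = min(rates)
    last_day = max(rates)
    last_known = None
    while day <= last_day:
        if day in rates:
            last_known = rates[day]  # a published rate exists for this day
        filled[day] = last_known     # otherwise carry the previous rate forward
        day += timedelta(days=1)
    return filled


# Made-up sample values: a Friday and the following Monday.
sample = {
    date(2016, 10, 21): 1.0886,
    date(2016, 10, 24): 1.0891,
}
filled = fill_gaps(sample)
for day in sorted(filled):
    print(day.isoformat(), filled[day])
```

In this sketch, Saturday and Sunday both receive Friday's rate, and Monday keeps its own published value.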

Failed Jobs: Database Server Restart (UPDATE)

On October 21, 2016 at 11:43:58 PM CEST/UTC+2, one of our database servers was restarted due to a hardware failure. Jobs running at that time failed when they tried to reconnect to the database later in the processing.

Update October 21, 2016 at 01:20 AM CEST/UTC+2: Jobs and orchestrations affected by this restart did not stop processing and finished successfully. However, they may incorrectly show the job result as an error instead of a success.

We're sorry for this inconvenience and we're restarting all failed orchestrations.

Failing jobs and API errors: AWS DNS problem

We have been investigating issues with the Amazon DNS service in the us-east region.

They affect job processing and the API availability of our components. Thanks for your patience.

UPDATE: 

AWS confirmed the problem - 5:15 AM PDT We are investigating elevated errors resolving the DNS hostname used to access the EC2 APIs.

AWS status update - 5:51 AM PDT We have identified the root cause of the issue causing errors resolving the DNS hostname used to access the EC2 APIs and are currently working to resolve.

AWS status update -  [RESOLVED] Between 4:31 AM and 6:10 AM PDT, we experienced errors resolving the DNS hostnames used to access some AWS services in the US-EAST-1 Region. During the issue, customers may have experienced failures indicating "hostname unknown" or "unknown host exception" when attempting to resolve the hostnames for AWS services and EC2 instances. This issue has been resolved and the service is operating normally.

Stalled Transformations

Our transformation MySQL server was under heavy load between October 20th 12:00am and 3:00am UTC. Transformation processes were slowed down or halted. 

We have identified the blocking processes, and all operations have returned to normal. In a few cases we terminated stalled transformations and restarted the orchestrations.

We're sorry for this inconvenience.

Weeks in Review -- October 19, 2016

Even though the weekly status has been a little weak recently, it does not mean that we're dawdling. We are working on some quite big internal things, which take much more than a week. For example (in no particular order):

  • Developer Portal, which will enable 3rd party developers to manage their applications.
  • Keboola Connection in EU Region.
  • OpenRefine transformations.
  • Validation and simplification of input and output mapping.
  • Internal changes to transformations and sandboxes using so-called Workspaces.
  • RStudio and Jupyter Sandboxes.
  • Collecting more statistics about running jobs.
  • New Redshift, Oracle and MSSQL Writers and DynamoDB, Facebook and S3 Extractors.
  • Support of LuckyGuess applications (e.g. Anomaly Detection) on Snowflake.
  • Shared buckets between projects (to replace copying data via Restbox).
  • Resolving the problem with ever-growing tables on Redshift.
  • Plus we fixed the annoying problem that yelled "Table already exists" when loading data into sandbox.

All of the above are in various states of completion (except the last item, which is complete). So stay tuned for more announcements when these features are finished.


OpenRefine Transformations: Public Beta

We're opening OpenRefine transformations to the public. You can now use OpenRefine in your transformation pipeline.

No further need to write long string replaces in SQL or study how to open CSV files in Python or R. OpenRefine excels in data cleanup and many other data wrangling tasks. 

To create an OpenRefine transformation, choose OpenRefine (beta) when creating a new transformation.

Learn more about OpenRefine and its functions and about OpenRefine integration in Keboola Connection.

Strict Input/Output Validation

Over the last few days, we have turned on strict input/output mapping validation. Each input/output mapping is now checked against the table in Storage to verify that

  • all columns exist
  • the primary key is equal in both cases 
  • column names used in datatypes, indexes, distkey, sortkey or filters have the same letter case
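
The three checks above can be sketched roughly as follows. This is only an illustration, not the actual validation code: the function name, the dictionary layout (`columns`, `primary_key`, `referenced_columns`) and the sample table are assumptions, and the primary key is compared here without regard to column order.

```python
def validate_mapping(mapping, table):
    """Return a list of problems found; an empty list means the mapping is valid.

    Both arguments are hypothetical dict structures made up for this sketch.
    """
    errors = []
    existing = set(table["columns"])
    by_lowercase = {name.lower(): name for name in table["columns"]}

    # Every column named in the mapping, or referenced by a datatype, index,
    # distkey, sortkey or filter, must exist with exactly the same letter case.
    referenced = mapping.get("columns", []) + mapping.get("referenced_columns", [])
    for name in referenced:
        if name in existing:
            continue
        if name.lower() in by_lowercase:
            errors.append(
                f"column '{name}' differs in letter case from "
                f"'{by_lowercase[name.lower()]}'"
            )
        else:
            errors.append(f"column '{name}' does not exist")

    # The primary key must be the same in the mapping and in Storage.
    if set(mapping.get("primary_key", [])) != set(table.get("primary_key", [])):
        errors.append("primary key does not match the table in Storage")
    return errors


# Made-up sample table and mappings.
table = {"columns": ["id", "date", "rate"], "primary_key": ["id"]}
good = {"columns": ["id", "rate"], "primary_key": ["id"],
        "referenced_columns": ["date"]}
bad = {"columns": ["id", "amount"], "primary_key": ["id", "date"],
       "referenced_columns": ["Date"]}

good_errors = validate_mapping(good, table)
bad_errors = validate_mapping(bad, table)
print(good_errors)
print(bad_errors)
```

The second mapping fails all three checks: `amount` does not exist, `Date` differs in letter case from `date`, and the primary keys differ.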

Although we tried to detect all breaches of this ruleset beforehand and contact project owners, some have unfortunately slipped through. We're closely monitoring all errors and fixing or restarting all failed orchestrations.

In case your project is subject to this issue on a larger scale than a single failure, we're able to remove the validation temporarily. Please contact us at support@keboola.com with any further questions/requests.

We're deeply sorry for any inconvenience. 

Snowflake Issues UPDATED

We're currently investigating issues with Snowflake workspaces (sandboxes, transformations). We'll keep this post updated.

Update 7:36pm CEST: We have passed the information to the Snowflake support team and they're investigating the issue.

Update 10:36pm CEST: The Snowflake team has identified the issue and is working on a fix, which should be deployed later tonight.

Update 05:55am CEST: The issue has been resolved and all operations are back to normal. Due to the high number of affected transformations/orchestrations, we won't be restarting them, to prevent system overload. Please restart your orchestrations manually if needed.

Thanks for your patience and understanding.

Snowflake Transformations Query Limits

We have introduced a maximum query execution time limit for Snowflake transformations.

If the query execution time exceeds 15 minutes, the query will be terminated. This limit should not affect any current transformations.

This limit helps us prevent accidental warehouse overloading by inefficient user queries (such as a cartesian product). Such queries were one of the causes of this week's failures.
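
To show why an accidental cartesian product is so expensive, here is a small Python sketch with made-up data. It compares a join without any condition to a join on a key; the table contents and variable names are purely illustrative.

```python
from itertools import product

# Made-up sample rows: (order_id, customer_id) and (customer_id, name).
orders = [(1, "a"), (2, "b"), (3, "a")]
customers = [("a", "Alice"), ("b", "Bob")]

# A join with no condition pairs every order with every customer.
cross_join = list(product(orders, customers))

# A join on customer_id keeps only the matching pairs.
keyed_join = [(o, c) for o, c in product(orders, customers) if o[1] == c[0]]

print(len(cross_join))  # 3 * 2 rows
print(len(keyed_join))  # one row per order

# The same effect at scale: two tables of one million rows each.
print(1_000_000 * 1_000_000)
```

The unconditioned join grows with the product of the two table sizes, so two tables of a million rows each yield 10^12 rows, which is the kind of query the time limit is meant to stop.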