Week in Review -- January 30, 2018

Plantyst Extractor

If you collect data from production machines in Plantyst, you can use the new extractor made by BizzTreat and start doing complex data analysis.

Stories.BI writer

You can automatically push data to Stories.bi and get automatic insights instead of crunching business data by hand.


Updated Components

  • Sklik extractor has a new variable, accountID
  • YouTube extractor has a new version based on Generic Extractor. The old extractor will be deprecated on March 1, 2018
  • Snowflake extractor is now a bit faster and has better error handling
  • Geneea NLP App is now available in the EU region
  • BingAds extractor is now available in the EU region
  • Facebook extractor with new Page Tokens can now fetch Page Reviews
  • Twitter extractor is now available in the EU region
  • Snowflake and Redshift writers have a fix for a possible column mismatch


Minor Improvements

  • Quick search in the component list was improved and is now more accurate
  • Component name can finally be submitted by pressing ENTER


Fixed IP Address Ranges

It is our pleasure to announce that as of today, all our outgoing network connections in the US region are using a pool of four static IP addresses. 

This can help you meet your company security standards. To find out more, please visit the IP Addresses page in our documentation.

Be aware that IP addresses can change in the future. For your convenience, you can programmatically fetch and parse the list of existing IP addresses in JSON format at https://help.keboola.com/extractors/ip-addresses/kbc-public-ip.json
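As a rough illustration, here is a minimal Python sketch of fetching and parsing that list. The layout of the JSON file is not described in this post, so the sketch makes no assumption about key names: it walks whatever structure it receives and keeps every string that parses as an IP address or CIDR range. The function names are hypothetical, not part of any Keboola tooling.

```python
import ipaddress
import json
from urllib.request import urlopen

IP_LIST_URL = "https://help.keboola.com/extractors/ip-addresses/kbc-public-ip.json"


def collect_ip_ranges(node):
    """Walk a parsed JSON value and return every string that is a valid IP or CIDR range.

    The schema of the file is an assumption-free unknown here, so dicts and
    lists are traversed recursively and non-IP strings are simply skipped.
    """
    found = []
    if isinstance(node, dict):
        for value in node.values():
            found.extend(collect_ip_ranges(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_ip_ranges(item))
    elif isinstance(node, str):
        try:
            # strict=False lets a bare host address become a /32 (or /128) network.
            found.append(ipaddress.ip_network(node, strict=False))
        except ValueError:
            pass  # not an IP address or range, ignore it
    return found


def fetch_ip_ranges(url=IP_LIST_URL, timeout=10):
    """Download the published list and return the networks found in it."""
    with urlopen(url, timeout=timeout) as response:
        return collect_ip_ranges(json.load(response))
```

Calling `fetch_ip_ranges()` on a schedule and comparing the result with your current firewall rules is one way to notice when the published addresses change.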

PLEASE NOTE

Some of you were relying on a very old concept in which we performed on-demand network source routing, which allowed us to force the source IP under syrup-out.keboola.com. This was deprecated almost a year ago. Today, we are also announcing that the old source routing will be turned off at the end of this month. If you rely on the source IP, please move all your existing firewall rules to our new addresses before June 30, 2017.


Week in Review -- April 4, 2017

New Component - Papertrail Extractor

We’re happy to welcome the Papertrail Extractor to the family. Papertrail manages billions of log messages for operations-savvy companies. It has been our log management system of choice for years. If you log to Papertrail and realise that your log messages contain important information, feel free to incorporate this unstructured data into your data strategy. By using our Papertrail Extractor, you can download all records matching your search query within the retention period. The extractor can also incrementally add new records on each run.

Discrete sessions from e-commerce, low-level transactions, developer stack traces or operational data - everything can fit in your Keboola project! 

We will cover this topic in a blog post next week. If you're interested in how we perform complex analytic deep-dives into our logs, follow our Medium account at https://500.keboola.com/

Minor Improvements

  • S3 extractor now displays how many files were downloaded in a given job, which is especially handy with wildcard rules

  • MSSQL Writer now supports the BCP method, which you can activate in the table settings. It can write your data to the desired MSSQL DB at supersonic speed, but note that it doesn't handle unusual UTF-8 characters properly

  • Transformations with the Snowflake backend now support the FLOAT data type in input mapping, so no more hacking with the NUMBER data type



Snowflake backend is down - UPDATED

The Snowflake backend has been down since 11:02 am GMT+2. We're working on a fix, and a post-mortem will follow once we recover the platform.

UPDATE:

The dead DWH was switched and everything is running on a new backend. All jobs that were running on the Snowflake backend crashed. You have to run them again; everything should be OK for now. If you run into any other issues, please contact us at support@keboola.com.

We'll also publish an official post-mortem as soon as the Snowflake team resolves our issue. Stay tuned, and thanks for your patience!

UPDATE II:

If you're experiencing errors in your personal sandboxes, reset your credentials under the "Sandbox" link in the top right of Transformations (https://connection.keboola.com/admin/projects/$pid$/transformations/sandbox).

Project Limits

Today we’re introducing limits to all Keboola Connection projects.  You can find them in the “Users & Settings” section.

It will let you know what your limits are for storage, user licenses, orchestrations, etc. 


If your project or component is over a limit, the metric will be shown in a red box.


Keep in mind that these are just soft quotas which can be easily exceeded. So if you go over a limit, you don’t need to be afraid of anything happening to your project usage (we are all in the cloud after all, lots of room up here :) Our goal is just to keep the red metrics to a minimum, so you may be hearing from us if too many red boxes hang around for too long. The end result should be that you get your desired performance, and we get our profit :)

The project settings will also list your monthly cost, project type and days remaining until project expiration (typically proof-of-concept or demo projects will have expiration conditions).

Project expiration will also be announced on the project homepage (Overview):


Since we’re moving to this new system from our old filing cabinet and fax machine solution, there might be some glitches in the numbers displayed. If you find any discrepancies with the numbers there, please let us know. 

End of Life Announcement for Sardine

Over the past few years, we’ve introduced several standalone Keboola Connection Applications that accomplish specific tasks. With our focus shifting to core Keboola Connection, we’ve made the decision to discontinue support for the Sardine App. This means that as of today, January 15th, 2016, we won't be making any further Sardine improvements or updates.  As of April 15th, 2016, support for Sardine will be completely discontinued.  

What will happen after April 15th, 2016?

You can expect that Sardine will continue to work as normal for some time. However, future updates of related APIs might result in certain features breaking. Since Sardine will no longer be supported, we won't be updating it in these cases.

Suggested Alternative

For clients using Sardine to provision users for their GoodData projects, we suggest switching to the third-party Keboola Connection application made by BizzTreat.com. You can find it in Applications. Please don't hesitate to contact BizzTreat.com with feature suggestions or for help with implementation.

For clients using Sardine for GoodData access via a branded UI, we suggest using http://www.dashboardizer.com/, developed by Slowpath.com. Feel free to contact them directly for a quote.

KBC as a Data Science Brain Interface

The Keboola Data App Store has a fresh new addition. That brings us to a total of 16 currently available apps, three of which are provided by development partners.

This new one is called “aLook Analytics”, and technically it is a clone of our development project, a “Custom Science” app (not available yet, but soon!). It facilitates connection to a GitHub/Bitbucket repository of a specific data science shop, which you can “hire” via the app and enable them to safely work on your project.

This first instance is connected to Adam Votava’s company aLook Analytics (check them out at http://www.alookanalytics.com/).

How does it work?

Let’s imagine you want to build something data-science-complex in your project. You get in touch with aLook and agree on what it is you want them to do for you. You exchange some data, the boys there will do some testing on their side and set up the environment, and once they’re done, they’ll give you a short configuration script that you will enter into their app in KBC. Any business agreement regarding their work is to be made directly between you and aLook; Keboola stays on the sidelines for this one.

When you run the app, your data gets served to aLook’s prepared model, and the scripts saved in aLook’s repository get executed on Keboola servers. All the complex stuff happens and the resulting data gets returned into your project. The app can be (like any other) included in your Orchestrations, which means it can run automatically as a part of your regular workflow.

The user of KBC does not have direct access to the script, protecting aLook’s IP (of course, if you agree with them otherwise, we do not put up any barriers).

Very soon we will enable the generic “Custom Science” app mentioned above. That means that any data science pro can connect their GitHub/Bitbucket themselves - that gives you, our user, the freedom to find the best brain in the world for your job.

Why people and not just machines?

No “Machine Learning Drag&Drop” app provides the same quality as a bit of thought by a seasoned data scientist. We’re talking business analytics here! People can put things in context and be creative, while all machines can do is adjust (sometimes thousands of) parameters and test the results against a training set. That may be awesome for facial recognition or self-driving car AI, but in any specific business application, a trained brain will beat the machine. Often you don’t even have enough of a test sample, so a bit of abstract thinking is critical and irreplaceable.

Amazon AWS - massive error rate in cloud API

Amazon Web Services, our major backend cloud provider, is reporting a massive API error rate in its infrastructure.

More details and updates can be found at their official status page http://status.aws.amazon.com/. AWS Status page archive is also here.

Because of our heavy dependency on the AWS cloud, our Keboola Connection platform is affected by these errors, so please be patient, check the AWS status page and keep your fingers crossed!

For more information, do not hesitate to contact us here in the comments or at support@keboola.com.

UPDATE (2015-09-20 7pm CEST): Amazon API is back in business. 

Media coverage - VentureBeat

Retrying Orchestration Jobs and Warning Notifications

We've heard your cries about how difficult it was to re-run failed jobs in the Orchestrator, so we did something about it:

You can now retry any failed job in your orchestration's job queue. On the (failed) job's detail page you'll see a "Job Retry" button in the upper right corner:

Just click on it and press "run" to re-run failed tasks:

If you need to run just a few tasks (failed or not), click on "Choose orchestration tasks to run" to show the task selection list. Select the ones you want by clicking the grey button in the middle of the window to activate or deactivate the desired tasks.

The run button will create new tasks, so everything will run in the original environment, under the same circumstances and with the same job parameters. Just note that the data underlying the configuration may have been modified by a different process (e.g. someone else working with it) between the last time the job was run and your re-run.

Notifications

If some tasks are prone to fail often (e.g. wrong credentials in a client's Google Analytics), you'll want to activate the "Continue on Failure" flag for the "unstable" tasks. If activated, the Orchestrator will not send an error notification when that specific task fails. Instead, the Orchestrator will send out a message to our new notifications channel for "Warnings". Go ahead and subscribe to receive emails about all Warnings.
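The notification rule described above can be modelled in a few lines of Python. This is only an illustrative sketch of the behaviour as stated in this post, not the Orchestrator's actual code or API; the `TaskResult` class, its field names and the channel labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskResult:
    """Hypothetical record of one finished orchestration task."""
    name: str
    failed: bool
    continue_on_failure: bool = False  # mirrors the "Continue on Failure" flag


def notification_channel(result: TaskResult) -> Optional[str]:
    """Return which channel a task result is reported to, or None if it succeeded.

    A failed task with the flag set goes to "warning"; any other failed
    task goes to "error".
    """
    if not result.failed:
        return None
    return "warning" if result.continue_on_failure else "error"
```

For example, a failing Google Analytics task flagged as "Continue on Failure" would map to the "warning" channel, while the same failure without the flag would map to "error".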

SSL security improvement

Please review the entire post carefully to determine whether your use of the services will be affected.

As of 12:00 AM PDT April 30, 2015, we will discontinue support for the RC4 cipher for securing connections to connection.keboola.com.

Requests that use the RC4 cipher will fail once we disable support for it in Keboola Connection. To avoid interrupted access, you must update any client software (or inform your clients to update their software) that uses the RC4 cipher to connect to our API services.
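If your client is written in Python, one way to make sure it never offers RC4 is to exclude it explicitly in the TLS context and then inspect the resulting cipher list. The sketch below uses only the standard `ssl` module. The helper names are hypothetical, and recent OpenSSL builds typically no longer ship RC4 at all, in which case the exclusion is simply a no-op.

```python
import ssl


def make_context_without_rc4() -> ssl.SSLContext:
    """Build a client TLS context whose cipher list explicitly excludes RC4."""
    context = ssl.create_default_context()
    # OpenSSL cipher-string syntax: start from the defaults, then drop RC4 suites.
    context.set_ciphers("DEFAULT:!RC4")
    return context


def rc4_cipher_names(context: ssl.SSLContext) -> list:
    """Return the names of any RC4-based ciphers still enabled in the context."""
    return [c["name"] for c in context.get_ciphers() if "RC4" in c["name"]]
```

An empty list from `rc4_cipher_names(make_context_without_rc4())` means the context will not negotiate RC4. Other languages and HTTP clients have their own cipher settings, so check their documentation.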