Deleting Projects

To simplify cleanup when a KBC project no longer serves its purpose, we have added a simple way to delete it. You'll find it in the "Users & Settings" tab.

We keep a recoverable backup of each project for 60 days following deletion. After that period, it's gone for good.


YouTube Analytics and Reporting API - Extractor

Google recently released a new Bulk API for retrieving viewing statistics, popularity metrics, and more for YouTube videos and channels more conveniently.

The use case of this API is simple: you can list jobs and schedule processes that generate new data on a daily basis. Since we work with YouTube a lot in our company, we wanted to adopt this new API as quickly as possible, despite its significant limitation (it is not possible to download historical data from before the Bulk API started generating reports).

We implemented and integrated the first iteration of the YouTube Analytics and Reporting API Extractor for Keboola Connection and started using it heavily.

In a nutshell, the extractor expects that the jobs have already been scheduled and data has been generated (the jobs must be configured outside the Keboola extractor). You can specify particular report types as well as a content owner ID and timeframes.

Configuration is done via the Keboola generic GUI, which expects a valid JSON object (in future releases we plan to build a custom GUI for the configuration part and improve the overall user experience). The credential parameters use the new encryption feature and are stored safely on the Keboola Connection backend. Check the documentation to learn more about configuration; it also contains an important note about the current limitations of this extractor.
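To make the shape of such a configuration concrete, here is a hypothetical sketch. The parameter names and values are illustrative assumptions, not the extractor's documented schema; consult the documentation for the real one:

```python
import json

# Hypothetical configuration sketch for the YouTube Analytics and Reporting
# API extractor. All attribute names below are illustrative assumptions.
config = {
    # "#"-prefixed attributes use the new encryption feature and are
    # stored encrypted on the Keboola Connection backend.
    "#private_key": "-----BEGIN PRIVATE KEY-----...",
    "content_owner_id": "EXAMPLE_OWNER_ID",
    "reports": [
        {
            "report_type": "content_owner_basic_a2",
            "date_from": "2016-01-01",
            "date_to": "2016-01-31",
        }
    ],
}

# The generic GUI expects a valid JSON object, so the configuration
# must serialize cleanly.
print(json.dumps(config, indent=2))
```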

If you have any questions or issues, feel free to contact me at any time (radek@bluesky.pro). I am more than happy to help you with configuration or to fix any issues. Thank you and enjoy!


KBC as a Data Science Brain Interface

The Keboola Data App Store has a fresh new addition. That brings us to a total of 16 currently available apps, three of which are provided by development partners.

This new one is called “aLook Analytics”, and technically it is a clone of our development project, a “Custom Science” app (not available yet, but soon!). It facilitates connection to a GitHub/Bitbucket repository of a specific data science shop, which you can “hire” via the app and enable them to safely work on your project.

This first instance is connected to Adam Votava’s company aLook Analytics (check them out at http://www.alookanalytics.com/).

How does it work?

Let’s imagine you want to build something data-science-complex in your project. You get in touch with aLook and agree on what it is you want them to do for you. You exchange some data, the boys there will do some testing on their side, set up the environment and once they’re done, they’ll give you a short configuration script that you will enter into their app in KBC. Any business agreement regarding their work is to be made directly between you and aLook, Keboola stays on the sidelines for this one.

When you run the app, your data gets served to aLook’s prepared model, and the scripts saved in aLook’s repository get executed on Keboola servers. All the complex stuff happens and the resulting data gets returned into your project. The app can be (like any other) included in your Orchestrations, which means it can run automatically as a part of your regular workflow.

The user of KBC does not have direct access to the script, protecting aLook’s IP (of course, if you agree with them otherwise, we do not put up any barriers).

Very soon we will enable the generic “Custom Science” app mentioned above. That means that any data science pro can connect their GitHub/Bitbucket themselves - that gives you, our user, the freedom to find the best brain in the world for your job.

Why people and not just machines?

No “Machine Learning Drag&Drop” app provides the same quality as a bit of thought by a seasoned data scientist. We’re talking business analytics here! People can put things in context and be creative, while all machines can do is adjust (sometimes thousands of) parameters and test the results against a training set. That may be awesome for facial recognition or self-driving car AI, but in any specific business application, a trained brain will beat the machine. Often you don’t even have a large enough test sample, so a bit of abstract thinking is critical and irreplaceable.

Configuration encryption

To address the security of passwords and other values that require stronger protection, KBC now allows you to encrypt certain values in stored configurations. All attributes prefixed with a hash sign (#) are automatically encrypted on save. The key is derived from the component and project, and there is no way in any UI or API to decrypt the value. The original value is available only internally, and only to the app during its runtime.

What does that mean? When you save your password as an encrypted attribute, even you cannot decrypt it. It becomes available only in the application and project in which it was encrypted, and the value cannot be transferred to any other app or project. Your passwords are safe and cannot be retrieved even by a user with admin rights to your KBC project.

We hope this makes you feel safer! :-)

Note to developers and tech partners: The encryption is completely transparent. You only need two simple things:

  1. tell us that your component uses encryption
  2. prefix all encrypted attributes with # (e.g. password => #password)

The infrastructure takes care of the rest. Your application will "see" the decrypted value.
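As a sketch of how the convention looks from a configuration's point of view (attribute names are hypothetical examples; the actual encryption happens on the Keboola infrastructure, not in component code):

```python
# Illustrative sketch only: shows which attributes the "#" convention marks
# for encryption. Real encryption is done by the Keboola infrastructure.
def attributes_to_encrypt(configuration: dict) -> list:
    """Return the attribute names that would be encrypted on save."""
    return [key for key in configuration if key.startswith("#")]

config = {
    "host": "db.example.com",   # stored as plain text
    "#password": "s3cret",      # encrypted on save, decrypted only at runtime
}

print(attributes_to_encrypt(config))  # ['#password']
```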

Stopped Docker jobs

Due to a spike in AWS Spot instance prices, our Docker workers were shut down around 12am UTC. This affects all jobs running on Docker components. We're working on fixing this issue and hope to resume all operations shortly. Thanks for your patience.

Update 04:30am UTC: All operations are back to normal, and all jobs should have resumed execution. There was a minor failure with a Docker image for the Generic Extractor; some of its jobs failed with this error:

User error: Container 'keboola/docker-generic-extractor:latest' failed: no such file or directory Error response from daemon: Cannot start container 08763383d5370bcdd6e1479da00ae369fe5d845c33485df5337239cc7bdd9c90: [8] System error: no such file or directory

This issue is now fixed and if you have encountered this error, please restart the job. 

Thanks for bearing with us and we're sorry for the inconvenience. 

Data Takeout

As a part of our commitment to openness and our total, utter and complete aversion to "customer lock-in" tactics, we introduced a "Data Takeout" functionality. It's been around for a while actually, but now it is right there in the UI. This means that should our customer become less than completely satisfied with us, there is no technical barrier to collecting all their data AND the project structure (including all transformation scripts, queries, etc.) at the push of a button. (And yeah, we took a hint from Google on how to name the service.)

What this button does is export everything to AWS Simple Storage Service (S3).

Several files will be created there, which will contain:

  • All bucket and table metadata (attributes, names, columns, settings)
  • All table data exported to gzipped CSV files
  • All component configurations (e.g. transformation configuration with all queries, database extractor settings and queries, etc.)

The export can be limited to metadata and configurations only, and the "Data Takeout" button can be found on the Users & Settings page.
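Since the table data arrives as gzipped CSV files, here is a minimal Python sketch of reading one back after download. The sample content and file name are assumptions for illustration; only the "gzipped CSV" format comes from the takeout description:

```python
import csv
import gzip
import io

# Build an in-memory stand-in for one exported table file; in a real takeout
# this would be a file such as "in.c-main.customers.csv.gz" fetched from S3
# (file name hypothetical).
sample = io.BytesIO()
with gzip.open(sample, mode="wt", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "name"])
    writer.writerow(["1", "Alice"])

# Reading an exported table back: gunzip, then parse as CSV.
sample.seek(0)
with gzip.open(sample, mode="rt", newline="") as f:
    rows = list(csv.reader(f))

print(rows)  # [['id', 'name'], ['1', 'Alice']]
```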