New Extractor (and Migration Guide)

We developed a new version of the extractor and renamed the current version to (Deprecated).

As some of the changes introduced in the new version are backwards incompatible, the two versions will run side by side for a period of time, and we kindly ask you to migrate your configurations.


  • Extractor now runs asynchronously: fewer cURL errors, more scalability and durability, monitoring via the Jobs app
  • Configuration is stored in sys.c-ex-salesforce (previously in sys.c-SFDC)
  • Data is stored in in.c-ex-salesforce-config (previously in in.c-config), where config is the configuration name
  • Some changes in the UI (mostly the menu on the right)

Migration Guide

As the process includes an OAuth authorization, we can't automate and test it for you. Follow this guide for each extractor configuration in your project. Migration can be performed in 4 easy steps:

  1. Copy configuration
  2. Reauthorize extractor
  3. Run extraction
  4. Integrate 

1. Copy Configuration

Using the UI, create a new configuration in the new extractor and copy and paste all queries and credentials from your (Deprecated) configuration.

2. Reauthorize Extractor

As the extractor runs on a different worker, you need to get new OAuth tokens. To do so, click Authorize SalesForce in the right menu.

3. Run Extraction

You can now run all queries by clicking on Run all queries in the UI. You can monitor the progress in the Jobs application.

If your configuration contains any incremental queries, migrate the data extracted by those queries before running the extraction. Repeat the following for every incremental query:

  • Use the Storage application to make a snapshot of the two original incremental tables (e.g. in.c-SFDC01.User and in.c-SFDC01.User_deleted).
  • Use the Create new table from snapshot function to copy the tables to the new bucket (e.g. in.c-ex-salesforce-SFDC01.User and in.c-ex-salesforce-SFDC01.User_deleted). If the destination bucket does not exist, create it manually or run any non-incremental query.
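The renaming in the examples above can be sketched in a few lines of Python. This is only an illustration of the naming convention, assuming every old bucket follows the in.c-NAME pattern; the function names are hypothetical and the snippet does not call the Storage API.

```python
def new_table_id(old_table_id, prefix="ex-salesforce-"):
    """Map an old table ID such as 'in.c-SFDC01.User' to its new bucket."""
    stage, bucket, table = old_table_id.split(".", 2)
    if not bucket.startswith("c-"):
        raise ValueError(f"unexpected bucket name: {bucket}")
    # Insert the extractor prefix right after the 'c-' part of the bucket name.
    return f"{stage}.c-{prefix}{bucket[2:]}.{table}"


def incremental_copy_plan(old_table_id):
    """Return (source, destination) pairs for a table and its _deleted companion."""
    sources = (old_table_id, old_table_id + "_deleted")
    return [(source, new_table_id(source)) for source in sources]


for source, destination in incremental_copy_plan("in.c-SFDC01.User"):
    print(f"{source} -> {destination}")
# in.c-SFDC01.User -> in.c-ex-salesforce-SFDC01.User
# in.c-SFDC01.User_deleted -> in.c-ex-salesforce-SFDC01.User_deleted
```

Each printed pair corresponds to one snapshot and one Create new table from snapshot operation.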

4. Integrate

Once you have downloaded the initial set of data, you may need to alter some transformations and orchestrations to integrate the new extractor into the whole pipeline.


Create a new orchestration task for the new extractor with the same parameters, and then delete the old (Deprecated) extractor.


There are two ways to migrate the transformations. You can change the input mappings from the old tables to the new tables (the structure and column names remain the same), or you can keep the old names: delete the old tables and create an alias for each deleted table (e.g. delete in.c-SFDC01.User and create the alias in.c-ex-salesforce-SFDC01.User->in.c-SFDC01.User).

End Of Life Announcement

The (Deprecated) extractor will be terminated on January 15th. If you have any trouble migrating your configuration, please contact us.

Pigeon Importer app

Attach your data (a CSV or gzipped CSV file) to an email and send it to a given address. The pigeon will check the inbox and import the received attachment into a Storage API table. The whole workflow can be configured in the Pigeon Importer UI app and then registered as a regular orchestration task.

New App Annie Extractor

We have added a new App Annie extractor to our connectors portfolio.

It is available in the Add Extractor menu in your project. The user interface is underway, but feel free to set up your configuration manually, and to contact us should you encounter any trouble during the process.

Transformation Input Mapping: Views and Tables

We just introduced an icon in input mapping that shows whether the input mapping is created as a view or a table.

When running Redshift transformations that read data from Redshift Storage (currently the recommended and fastest option), you can choose between creating a table or a view in the input mapping. What's the difference?

Views are lightning fast to create. Input mapping just aliases a table to your working schema within the cluster and that's it (including all filters). You can then layer another view on that and another... until you're done and you can set the final view as a source table for an output mapping. All the work is then done when processing the output mapping. That is the catch: it is easier to reach the cluster's limits (memory, disk) with one large query (multiple nested views). And when the cluster runs out of memory, it will also terminate all other queries running on the cluster at the same time.

So please be careful when using views. If you're not sure, feel free to reach out to us for more assistance.
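To make the layering of views more concrete, here is a small Python sketch that only builds SQL strings. All table, view and column names are hypothetical, and this is not how input mapping is implemented internally; it merely shows that each view refers to the previous one, so nothing is computed until the last view is read.

```python
def layered_view_statements(source_table, layers):
    """Build CREATE VIEW statements in which each view reads from the previous one.

    layers is a list of (view_name, select_list, where_clause) tuples;
    where_clause may be None.
    """
    statements = []
    previous = source_table
    for view_name, select_list, where_clause in layers:
        sql = f"CREATE VIEW {view_name} AS SELECT {select_list} FROM {previous}"
        if where_clause:
            sql += f" WHERE {where_clause}"
        statements.append(sql + ";")
        # The next layer selects from the view that was just defined.
        previous = view_name
    return statements


# Hypothetical names, for illustration only.
for statement in layered_view_statements(
    "in_orders",
    [
        ("large_orders", "id, customer_id, amount", "amount > 100"),
        ("large_order_customers", "DISTINCT customer_id", None),
    ],
):
    print(statement)
```

Reading from the last view in such a chain forces the cluster to evaluate every layer in a single query, which is where the memory and disk limits come into play.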

Email notifications

Orchestrator's email notifications have been redesigned.

If anything goes wrong, Orchestrator sends you a brief visual overview. All necessary details are accessible through the UI. We no longer spam you with a long list of logs.

Mobile skin:

Desktop skin:

Orchestrator's Job details

...were redesigned, so debugging should go much more smoothly. This is the redesigned page with all job and task details:

Extractors failures

The Paymo, Facebook, Facebook Ads and Salesforce extractors were returning the curl(60) error in orchestrations from 3 PM to 11 PM PST on November 6th. The error was caused by invalid SSL certificates.

To finish your tasks, just re-run your orchestrations. We're sorry for any inconvenience!

Inaccessible Storage API files

Some files were not accessible between 7 PM and 10 PM PST on November 4. This caused failures of loads to Storage API tables and, as a result, orchestration failures.

Example of failed orchestration:

It was caused by a failed Elasticsearch cluster node. We are still investigating the cause of this issue. However, our whole infrastructure is working smoothly at this time. To finish your tasks, just re-run your orchestrations. We're sorry for any inconvenience!

Direct (r/o) Access to any Redshift Bucket

Today we're announcing a new Storage API feature: Bucket Credentials (described in the Storage API documentation).

If you're using Keboola Connection with the Redshift backend, you can have read-only credentials (direct SQL access) to any Redshift bucket.

In the Storage API Console, go to Bucket Detail > Credentials and press the "Create new credentials" button:

Describe the new credentials (you can have multiple credentials assigned to each bucket!):

When you create credentials, carefully copy and paste them into your SQL client or preferred remote service. Once you close the displayed credentials, you can't display them again:

If you need to re-use credentials you have already created, you have to delete them and create a new combination of username and password. All existing credentials are listed under their bucket:

  1. Credentials can be used for accessing just one bucket
  2. Write access isn't supported

WARNING: Always employ SSL when accessing your data. Generated credentials give direct access to your dedicated AWS Redshift cluster. Please read "Configure Security Options for Connections". The Redshift cluster's CA certificate can be downloaded here.