VPN connection errors

We are investigating VPN connection issues for extractors and writers in both regions. 

UPDATE Jan 01, 10:03 AM (UTC) - Our VPN provider is investigating the issue.

UPDATE Jan 01, 11:54 AM (UTC) - Our VPN provider has found the root cause and fixed the issue. All VPN connections should be working again.

Job termination issue

We are investigating job termination issues for some components. After a termination request is accepted, jobs remain in the terminating state until they finish.

UPDATE Nov 22, 9:28 AM (UTC) - We have identified the root cause and prepared a fix, which should be released in a few hours. Only a minority of job terminations were affected by this issue.

UPDATE Nov 22, 10:49 AM (UTC) - The fix was deployed and job termination is fully functional again for all jobs.

Component failures

We are experiencing failures of a few components, e.g. Salesforce Extractor. The failures are caused by recent infrastructure changes that were deployed on November 8th.

We are rolling back these changes and we'll update this status with new information.

UPDATE Nov 11, 8:37 AM (UTC) - The changes were rolled back, and the affected components should be working again without issues.

UPDATE Nov 11, 8:50 AM (UTC) - We're investigating further issues with Salesforce Extractor. 

UPDATE Nov 11, 9:00 AM (UTC) - Salesforce Extractor is working again. 

We're sorry for the inconvenience. 


GoodData issues in US region

GoodData was experiencing issues that prevented GoodData users from accessing the platform and the projects located in this datacenter. The issue started on 10/30 at 23:43 (UTC) and was resolved on 10/31 at 02:18 (UTC).

Affected jobs in Keboola Connection ended with the error "Running 147946154733.dkr.ecr.us-east-1.amazonaws.com/developer-portal-v2/keboola.gooddata-writer:1.4.4 container exceeded the timeout of 18000 seconds".

Read the original GoodData service status.

Job errors in EU region

We are investigating job failures in the EU region that started at 1:32 UTC.

We will provide an update when we have more information.

UPDATE 06:06 UTC - We have identified the issue and fixed the cause. The backlog is being processed now.

UPDATE 07:54 UTC - There is still a backlog of orchestration jobs. We have increased the processing capacity, and the backlog should be cleared within half an hour.

UPDATE 08:26 UTC - The backlog was cleared. All services are running.

We apologize for the inconvenience. We'll share more details in a post-mortem.

Google Drive extractor authorization in EU [resolved]

We are investigating a Google Drive Extractor authorization verification issue in Keboola Connection EU.

Only creation of new configurations is affected.

We will provide an update as soon as the issue is resolved.

Update 2019-08-14: The verification issue is resolved and new configurations are working again.


Snowflake issues in US region

Update 12:36 UTC:

Everything should be back to normal. We'll keep monitoring our systems.


Update 9:15 UTC:

You should not experience any more errors, but the systems are somewhat overloaded, so longer execution and waiting times are expected. Thank you for your patience.


Update 7:52 UTC:

Snowflake services restored. We're resuming processing of jobs.


Update 6:23 UTC:

Snowflake services are still not restored. We're slowing down job processing, so you'll see a much larger number of queued jobs. The EU region is unaffected.


Original post:

Some Snowflake queries started failing around 03:14 UTC, which is causing errors in job processing. The failures are caused by an incident that is currently being investigated by Snowflake: https://status.snowflake.com/incidents/0sjfn3d5jq2q

We will provide an update as soon as the issue is resolved.


Delayed jobs in EU region

Execution of some table import jobs scheduled after 07:42 UTC was delayed by up to 30 minutes. The delay was caused by a new platform release, which was immediately rolled back. All systems are now operational.

Failed and delayed jobs in EU region

The database that stores locks was restarted at 03:49 UTC, which caused job failures. Some jobs were also queued after this failure.

The backlog of all jobs was cleared at 06:15 UTC. The system is fully operational now. We're working on infrastructure changes that should prevent similar issues.