MySQL Extractor errors

Today we released a new version of the MySQL extractor that contained a bug.

It caused errors in the UI:

Decoding JSON response from component failed: Syntax error

It also affected this extractor's jobs: although they appeared to finish successfully, they didn't process any data.

The flawed version was released at 12:14 and reverted at 15:09 CET.


UPDATE:

Another version, deployed on May 12 at 9:25, introduced a different bug, which affected certain queries and resulted in the error message:

DB query failed: Trying to access array offset on value of type null

We reverted this release today, May 13, at 11:26.

We sincerely apologize for the errors. A postmortem report will follow with further details.

Snowflake Slowdown in the EU Region

7 May 2020 8:30 UTC We're seeing a higher load and longer execution times for EU Snowflake queries. We have added more compute capacity and are investigating the causes. Next update in two hours.

7 May 2020 9:50 UTC Performance should be back to normal; we're monitoring the situation.

7 May 2020 11:00 UTC We're again seeing slower execution; we're working with Snowflake on resolving the issue. Next update in two hours.

7 May 2020 12:13 UTC Snowflake engineering identified the cause of the reduced performance, and we're now processing the backlog. There are still some queued orchestrations, but the run times of individual jobs are back to normal. Both we and Snowflake engineering are monitoring the load. Next update in two hours.

The incident is resolved.


Weeks in review -- April 2020

New Changes in the UI

  • Transformation script editing can now be done in fullscreen mode.

(Screenshots: normal mode and fullscreen mode.)


  • Database writers now have improved input mappings


  • The shared bucket detail now shows who shared it (if applicable)


  • And the sandbox modals have been cleaned up:

New Components:

  • Active Campaign: Use this component to gather information on your campaigns from your Active Campaign account.


Updated Components:

  • MySQL extractor now properly handles utf8mb4 emojis 
  • Data Warehouse Manager now allows password reset for schema users

TLS Security Update

As of May 12, 2020, Transport Layer Security (TLS) 1.0 and 1.1 will no longer be supported for securing connections to Keboola Connection endpoints.

The vast majority of HTTPS connections made to KBC endpoints use TLS 1.2 and will not be affected. This includes every currently shipping browser used by KBC users. 
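
If you want to double-check which protocol version a particular client negotiates, the following minimal Python sketch (standard library only) prints the negotiated TLS version. The host name is shown purely as an example; substitute the Keboola Connection endpoint your project actually uses.

import socket
import ssl

# Example host for illustration only; replace with your project's endpoint.
HOST = "connection.keboola.com"
PORT = 443

# The default context negotiates the highest TLS version both sides support.
context = ssl.create_default_context()

with socket.create_connection((HOST, PORT)) as sock:
    with context.wrap_socket(sock, server_hostname=HOST) as tls:
        # Prints e.g. "TLSv1.2" or "TLSv1.3"; anything below TLSv1.2
        # will stop working after May 12, 2020.
        print(tls.version())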

We have separately contacted all affected projects. If you did not hear from us, no action is required.

If you have any questions or concerns related to this announcement, please don’t hesitate to contact us.

Snowflake Slowdown in the US Region

Friday, 24 April 2020 14:42 UTC We're seeing a higher load and longer execution times for US Snowflake queries. We have added more compute capacity and are investigating the causes. Next update in two hours.

Update 18:16 UTC: We're still seeing degraded performance in Snowflake in US region and we're investigating with Snowflake support. Next update in 2 hours.

Update 20:22 UTC: We are working with Snowflake on reducing the queueing in our warehouse. We had to pause job execution at 20:00 UTC to reduce the influx of queries. When the queue is worked through, we'll re-enable the jobs.

Update 20:51 UTC: We re-enabled the paused job queue with limited throughput, and we're monitoring the Snowflake queue closely. So far we see no queueing. Next update in 2 hours.

Update 22:21 UTC: The job queue is running at full capacity and there are no queries waiting in the Snowflake warehouse. Preliminary analysis suggests that the issue was probably caused by congestion in Snowflake's Cloud Service Layer, but it took the Snowflake team some time to find the root cause and fix it. Some jobs were delayed and some queries timed out, resulting in job failures. Those jobs will need to be restarted. We're sorry for the problems this might have caused.

Snowflake Slowdown in EU

Monday, 20 April 2020 07:39:02 UTC: We're seeing degraded performance of Snowflake in the EU region; we're investigating the cause with Snowflake. Next update in 1 hour.

Update 08:17:25 UTC: We have thrown in more computing power, and the average running times are back to normal. We're still seeing occasional isolated queries that take longer. We're working with Snowflake on identifying and resolving the issue, but Keboola Connection is stable now. Next update in 4 hours.

Update 11:31:30 UTC: We still observe a slight slowdown in some queries, while others run smoothly. From our analytics, it seems that job run times are not affected, as we've offset the slowdown with more computing power. Next update in 4 hours.

Update 15:33:10 UTC: No significant changes; the situation is stable but not resolved. Snowflake is working on identifying the source of the performance issues. We're monitoring the situation, and in case of significant slowdowns we'll offset them with more computing power. Next update tomorrow, or earlier if there are any changes.

Update 21 April 2020: The situation is stable; we're working with Snowflake on maintaining that stability.

Update 22 April 2020: Snowflake engineers improved the performance of the impacted queries, and together we're working on preventing this in the future. We consider the incident closed. A postmortem will be published once the root cause is fully understood.

Snowflake Job Delays in the US Region

In the early morning, Snowflake had an incident in their US West region, which caused a large backlog of job processing in Keboola's US region. All jobs were eventually processed, but they may have taken much longer than you normally experience.

The buildup in our queue began just before 2:00 AM CEST and started to ease after 4:30 AM CEST.

Please refer to the above link for further information, and we will add a link to the RCA when it becomes available.

Transformation failures - Post-Mortem

Summary

Between March 30, 20:58 UTC and March 31, 6:15 UTC, some transformation jobs failed with an internal error. About 2% of all transformation jobs were affected. We sincerely apologize for this incident.

What Happened?

On March 30 at 20:58 UTC, we deployed a new version of the Transformation service containing updated Snowflake ODBC drivers. The update was mandated by Snowflake as a security patch. Unfortunately, the new version of the driver contained a critical bug that caused the driver to crash when queries ran for longer than one hour. This led to failed transformation jobs.

What Are We Doing About This?

We now treat all driver updates as major updates. This means they go through more careful deployment and monitoring so that we can detect possible problems faster. In the long term, we're working with Snowflake to update drivers in a more controlled manner.


Incident with Snowflake in the US Region

We are currently investigating an increased error rate from Snowflake in the US region, starting at approximately 10:00 PM CEST.

We will update here as soon as we know more.

UPDATE 11:05 PM CEST: We are handling the issue with Snowflake support. So far, all Snowflake operations in the US region seem to be failing. Next update at 11:30 PM or sooner if there is any new information or the situation changes.

UPDATE 11:30 PM CEST: Snowflake rolled back the release they made today and everything has returned to working condition.

UPDATE 12:00 PM CEST: We're very sorry for this inconvenience. The error started at 12:58 PST (19:58 UTC) and lasted until 14:24 PST (21:24 UTC). All new Snowflake connections in the US (including those from your DB clients) were failing during this period.

Unfortunately you will need to restart any failed jobs or orchestrations from this time period.

The EU region was not affected by this issue.

Snowflake Slowdown in EU

A scaling script running at 12:00 AM CEST failed to scale up the Snowflake warehouse in the EU region. All storage and transformation jobs in the EU were affected by this issue and were significantly slower than usual.

To help process the queued load, we scaled up the warehouse at 9:45 AM CEST and will keep it running until all of the load is processed.

We're sorry for this inconvenience and we'll be implementing safeguards to prevent this from happening again.